3 Configuring MapReduce in Eclipse

By 阿健在长安 | Published 2017-04-20 11:48

1. Configure the plugin

  • Copy hadoop-eclipse-plugin-1.2.1.jar into Eclipse's plugins directory and restart Eclipse.
  • After the restart, a DFS Locations node appears in Eclipse's Project Explorer on the left. Open the Map/Reduce perspective via Window -> Perspective -> Open Perspective -> Other....
  • Create a new Hadoop Location in the panel at the bottom.
  • Fill in the parameters: Location name can be anything. The Port under Map/Reduce Master should match mapred-site.xml (the author found that both 9001 and 50020 worked); the Port on the right must match core-site.xml, i.e. 9000. (A quick way to double-check these ports is sketched after this list.)
  • After starting the cluster with start-all.sh, you can operate on DFS through the plugin.
  • Under hadoop-wsj, create the folders input/wc and output, and upload a file into wc whose words will be counted; output will hold the results. (A scripted alternative using the HDFS Java API is also sketched after this list.)
    Note: create only output, not output/wc, because wc is generated automatically when the job runs; creating it in advance causes an error instead.
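To double-check which ports belong in the Hadoop Location dialog, you can load the two configuration files and print the relevant properties. This is a minimal sketch; the class name PrintPorts is made up for illustration, and the conf path is the one used by JobRun below (adjust it to your installation):

package test0;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
public class PrintPorts {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/mapred-site.xml"));
        // fs.default.name (e.g. hdfs://localhost:9000) gives the DFS Master port
        System.out.println("fs.default.name = " + conf.get("fs.default.name"));
        // mapred.job.tracker (e.g. localhost:9001) gives the Map/Reduce Master port
        System.out.println("mapred.job.tracker = " + conf.get("mapred.job.tracker"));
    }
}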
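If you would rather create the folders and upload the test file from code instead of clicking through the plugin, the HDFS FileSystem API can do the same thing. This is a sketch under the same assumptions; SetupDfs and the local file name words.txt are placeholders:

package test0;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class SetupDfs {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        fs.mkdirs(new Path("/tmp/hadoop-wsj/input/wc"));  // input folder for the word-count file
        fs.mkdirs(new Path("/tmp/hadoop-wsj/output"));    // output parent only -- no output/wc!
        // upload the file whose words will be counted (local path is a placeholder)
        fs.copyFromLocalFile(new Path("/home/wsj/words.txt"),
                new Path("/tmp/hadoop-wsj/input/wc/"));
        fs.close();
    }
}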

2. Create a new Map/Reduce project

  • Note: when creating the project, you must specify the Hadoop installation path.

Finally, the MapReduce demo.
It consists of three source files: McMapper.java, WcReducer.java, and JobRun.java.

  • McMapper.java:
package test0;
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
public class McMapper extends Mapper<LongWritable, Text, Text, IntWritable> { // input/output (key, value) types are fixed by the generics

    // Each call to map() receives one line of the split: key is the byte offset
    // of that line within the file, value is the line itself.
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            context.write(new Text(word), new IntWritable(1)); // map output: (word, 1)
        }
    }
}
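For an input line such as "hello world hello", this mapper emits (hello, 1), (world, 1), (hello, 1); summing the counts per word is left to the reducer.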
  • WcReducer.java:
package test0;
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
public class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // reduce() is called once per key; values holds every count emitted for that word.
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable i : values) {
            sum = sum + i.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
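Continuing the example, the shuffle groups the mapper's pairs by key, so reduce is called with (hello, [1, 1]) and (world, [1]) and writes out hello 2 and world 1.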
  • JobRun.java:
package test0;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class JobRun {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        /*
         * The two addResource() calls below are important: they point the job
         * at the HDFS file system rather than local paths. This only works if
         * core-site.xml and hdfs-site.xml are written exactly as in the
         * official documentation; in particular, do not change the
         * hadoop.tmp.dir path, or the job will fail.
         */
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/wsj/hadoop121/hadoop-1.2.1/conf/hdfs-site.xml"));
        try {
            Job job = new Job(conf);
            job.setJarByClass(JobRun.class);
            job.setMapperClass(McMapper.class);
            job.setReducerClass(WcReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(IntWritable.class);
            job.setNumReduceTasks(1);

            FileInputFormat.addInputPath(job, new Path("/tmp/hadoop-wsj/input/wc"));
            FileOutputFormat.setOutputPath(job, new Path("/tmp/hadoop-wsj/output/wc"));
            try {
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
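To run the job, right-click JobRun and, with the plugin installed, choose Run As -> Run on Hadoop (or export the project as a jar and launch it with the hadoop jar command). When it finishes, the word counts are in /tmp/hadoop-wsj/output/wc (typically a part-r-00000 file), which you can browse from DFS Locations or print with hadoop fs -cat.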
