MapReduce API (1): Creating a WordCount Program

Author: geekAppke | Published: 2018-11-19 00:11


Distributed application development follows one principle: move the computation to the data.

# Package the program as a jar and run it where the data lives
[root@node001 ~]# hadoop jar MyWordCount.jar com.hadoop.mr.MyWordCount

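package com.hadoop.mr;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyWordCount {
    public static void main(String[] args) throws Exception {
        // true = load the default resources (core-default.xml, core-site.xml)
        Configuration conf = new Configuration(true);

        // Create a job
        Job job = Job.getInstance(conf);
        // Hadoop finds the jar to ship by locating the one that contains this class
        job.setJarByClass(MyWordCount.class);

        // Give the job a name
        job.setJobName("myWordCount");

        // Set the input and output paths
        Path input = new Path("/user/root/test.txt");
        FileInputFormat.addInputPath(job, input); // may be called again to add further input sources

        Path out = new Path("/data/wc/out");
        // The job fails if the output directory already exists, so remove it first
        if (out.getFileSystem(conf).exists(out)) {
            out.getFileSystem(conf).delete(out, true);
        }
        FileOutputFormat.setOutputPath(job, out);

        job.setMapperClass(MyMapper.class);
        // Map output is serialized and deserialized, so the declared types
        // must match what the mapper emits
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setReducerClass(MyReducer.class);
        // Types of the final output written by the reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // Submit the job and wait for it; the exit code reflects success or failure
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}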
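
The driver declares the map output key and value classes because the framework serializes each pair to bytes and later reads the bytes back into an object of the declared type. The plain-Java sketch below illustrates that round trip. `IntBox` is a hypothetical stand-in written for this example, not a Hadoop class; it only mirrors the `write(DataOutput)` / `readFields(DataInput)` shape of Hadoop's `Writable` interface so that it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for an IntWritable-style value; not part of Hadoop.
final class IntBox {
    private int value;

    IntBox() {}
    IntBox(int value) { this.value = value; }

    int get() { return value; }

    // Serialize: write the value as 4 bytes.
    void write(DataOutput out) throws IOException { out.writeInt(value); }

    // Deserialize: overwrite this object's state from the stream.
    void readFields(DataInput in) throws IOException { value = in.readInt(); }

    // Serialize one object and read the bytes back into a fresh one.
    static IntBox roundTrip(IntBox original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            IntBox copy = new IntBox(); // empty object prepared to receive the data
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new IntBox(42)).get()); // prints 42
    }
}
```

The reader has to know in advance which type to instantiate, which is why the writing side and the declared classes need to agree.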

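package com.hadoop.mr;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    // Reused for every record to avoid allocating a new Text per word
    private final Text word = new Text();

    // key is the byte offset of the line; value is the line itself
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        // Split on runs of whitespace so repeated spaces do not produce empty words
        final String[] split = value.toString().trim().split("\\s+");

        for (String token : split) {
            if (token.isEmpty()) {
                continue; // a blank line yields a single empty token
            }
            word.set(token);
            context.write(word, one);
        }

        // Alternative using java.util.StringTokenizer:
//       StringTokenizer itr = new StringTokenizer(value.toString());
//       while (itr.hasMoreTokens()) {
//         word.set(itr.nextToken());
//         context.write(word, one);
//       }
    }
}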
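
Splitting a line on a single space and tokenizing it with `StringTokenizer` do not behave the same way when the input contains repeated spaces, and in a word count that difference shows up as an extra empty "word". The sketch below compares the two in plain Java. `TokenizeDemo` and its method names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Illustrative helper; the class and method names are assumptions of this example.
final class TokenizeDemo {

    // Splits on each single space; two adjacent spaces yield an empty token.
    static List<String> bySplit(String line) {
        return Arrays.asList(line.split(" "));
    }

    // Splits on runs of whitespace; never yields an empty token.
    static List<String> byTokenizer(String line) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            tokens.add(itr.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "hello  hadoop world"; // two spaces after "hello"
        System.out.println(bySplit(line));     // prints [hello, , hadoop, world]
        System.out.println(byTokenizer(line)); // prints [hello, hadoop, world]
    }
}
```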

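package com.hadoop.mr;

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Reused for every key
    private final IntWritable result = new IntWritable();

    // Called once per word, with all the counts emitted for that word
    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}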
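
To see how the mapper and reducer fit together, it can help to trace the data flow without a cluster. The following is a minimal single-process sketch in plain Java, not Hadoop code: `WordCountSimulation` and its methods are hypothetical names for this example, and the real framework distributes these phases across nodes, which the sketch does not attempt to model.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Single-process model of the WordCount data flow; no Hadoop classes involved.
final class WordCountSimulation {

    // Map phase: emit a (word, 1) pair for every token in one line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // Shuffle phase: group all values by key, keeping the keys sorted.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values collected for one key.
    static int reduce(String key, Iterable<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    // Runs map, shuffle, and reduce in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            counts.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("hello hadoop", "hello world")));
        // prints {hadoop=1, hello=2, world=1}
    }
}
```

The reduce step only ever sees one key together with all of its values, which is why the reducer in the job needs nothing more than a running sum.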

It can also be run directly from Eclipse!