MapReduce-API (1): Creating a WordCount Program

Author: geekAppke | Published: 2018-11-19 00:11

Distributed application development: move the computation to the data.

# Package the classes into a jar and run it where the data lives
[root@node001 ~]# hadoop jar MyWordCount.jar com.hadoop.mr.MyWordCount
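import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MyWordCount {
    public static void main(String[] args) throws Exception {
        // true: load the default configuration resources
        Configuration conf = new Configuration(true);

        // Create one job
        Job job = Job.getInstance(conf);
        // Hadoop ships the jar that contains this class
        job.setJarByClass(MyWordCount.class);

        // Give the job a name
        job.setJobName("myWordCount");

        // Set the input and output paths
        Path input = new Path("/user/root/test.txt");
        FileInputFormat.addInputPath(job, input); // call again to add further input sources

        // Remove a stale output directory; the job fails if the output path already exists
        Path out = new Path("/data/wc/out");
        if (out.getFileSystem(conf).exists(out)) {
            out.getFileSystem(conf).delete(out, true);
        }
        FileOutputFormat.setOutputPath(job, out);

        job.setMapperClass(MyMapper.class);
        // Map output is serialized and deserialized, so these types must match
        // what the mapper writes; the framework creates objects of these
        // classes to receive the data.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setReducerClass(MyReducer.class);

        // Submit the job, wait for it to finish, and report the result as the exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}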
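
The map output types have to be declared because the framework writes each key and value to bytes and later reads them back into a freshly created object of the declared class. The sketch below imitates that round trip in plain Java, with no Hadoop on the classpath. `IntBox` and `roundTrip` are hypothetical names invented for this illustration; `IntBox` only mimics the write/read-back pattern of a Hadoop `Writable` and is not the real `IntWritable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WritableSketch {
    /** Hypothetical stand-in for a Writable that holds a single int. */
    static class IntBox {
        private int value;

        IntBox() {}                       // empty constructor: the "object that receives"
        IntBox(int value) { this.value = value; }

        int get() { return value; }

        // Serialize the value as 4 bytes.
        void write(DataOutputStream out) throws IOException { out.writeInt(value); }

        // Overwrite this object's state with the next 4 bytes from the stream.
        void readFields(DataInputStream in) throws IOException { value = in.readInt(); }
    }

    /** Serializes n, then deserializes it into a new, empty IntBox. */
    public static int roundTrip(int n) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                new IntBox(n).write(out);
            }
            IntBox receiver = new IntBox();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                receiver.readFields(in);
            }
            return receiver.get();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1)); // prints 1
    }
}
```

If the reading side created an object of a different type, it would interpret the bytes with the wrong layout, which is why the declared classes and the mapper's actual output types need to agree.
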
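import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
//  private Text word = new Text();

    // key is the offset of the line within the input; value is the line itself
    @Override
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        final String[] split = value.toString().split(" ");

        // Emit (word, 1) for every space-separated token
        for (String word : split) {
            context.write(new Text(word), one);
        }

        // Alternative: reuse one Text object and split on any whitespace
//      StringTokenizer itr = new StringTokenizer(value.toString());
//      while (itr.hasMoreTokens()) {
//        word.set(itr.nextToken());
//        context.write(word, one);
//      }
    }
}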
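
The mapper's active code and its commented-out alternative do not tokenize a line in quite the same way. `split(" ")` breaks only on single spaces, so consecutive spaces produce empty "words" and tabs are not treated as separators. `StringTokenizer` with its default delimiters splits on any run of whitespace. The sketch below compares the two in plain Java; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizeSketch {
    /** Tokenizes the way the active mapper code does. */
    public static List<String> bySplit(String line) {
        List<String> words = new ArrayList<>();
        for (String word : line.split(" ")) {
            words.add(word);
        }
        return words;
    }

    /** Tokenizes the way the commented-out alternative does. */
    public static List<String> byTokenizer(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            words.add(itr.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        String line = "hello  hadoop\thello"; // two spaces, then a tab
        System.out.println(bySplit(line).size());     // prints 3
        System.out.println(byTokenizer(line).size()); // prints 3
        System.out.println(bySplit(line).contains(""));     // prints true
        System.out.println(byTokenizer(line).contains("")); // prints false
    }
}
```

With `split(" ")` the three tokens are `hello`, an empty string, and `hadoop<TAB>hello`, so the word count would include an entry for the empty string. For input that is strictly separated by single spaces, both approaches give the same result.
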
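import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    // Reused for every key to avoid allocating a new object per output record
    private IntWritable result = new IntWritable();

    // Called once per word with all the 1s the mappers emitted for it
    @Override
    public void reduce(Text key, Iterable<IntWritable> values,
                       Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}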
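
To see what the three classes compute together, the whole flow can be imitated on one machine in plain Java. This is only a rough sketch of the map, shuffle, and reduce steps, not how Hadoop executes them; `LocalWordCount` and `wordCount` are hypothetical names, and a `TreeMap` stands in for the shuffle's grouping and sorting by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LocalWordCount {
    public static Map<String, Integer> wordCount(List<String> lines) {
        // Map + shuffle: emit (word, 1) per token and group the 1s by word.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.split(" ")) {
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }

        // Reduce: sum the values collected for each word, as MyReducer does.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int val : entry.getValue()) {
                sum += val;
            }
            counts.put(entry.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello world");
        System.out.println(wordCount(lines)); // prints {hadoop=1, hello=2, world=1}
    }
}
```

In the real job the grouping happens across machines between the map and reduce phases, but the per-word arithmetic is the same.
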

The job can also be run directly from Eclipse.


Only the three classes are needed; nothing else has to be checked.
Source code analysis of client-side job submission
