Developing Against Hadoop in a Virtual Machine from Windows with IDEA


Author: kason_zhang | Published: 2017-07-16 21:41

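    Set up a Hadoop environment in a virtual machine so that it can be used for development from Windows.
    Step 1: Disable the firewall
    Disable the firewall first so that Hadoop's ports, such as 50070, are reachable from outside the virtual machine.
    On CentOS 6.5:
    Stop the firewall for the current session: service iptables stop
    Keep it disabled across reboots: chkconfig iptables off
    Run both commands, then confirm that the firewall is stopped:
    service iptables status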
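Once the firewall is off and the Hadoop daemons from step 2 are running, it can help to confirm from the Windows side that the ports really are reachable. The following is a minimal sketch using only the JDK. The class name `PortCheck` and the 2-second timeout are illustrative choices, and the default address and the ports 50070 and 9000 are simply the ones that appear in this article's own examples.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unroutable: treat all of these as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the VM address used in this article; pass your own as the first argument.
        String host = args.length > 0 ? args[0] : "192.168.137.131";
        // 50070 is the NameNode web UI port named above; 9000 is the HDFS port used in the job's paths.
        int[] ports = {50070, 9000};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = " + isReachable(host, port, 2000));
        }
    }
}
```

If a port shows as not reachable, the usual suspects are the firewall still running or the corresponding daemon not having been started yet.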
    Step 2: Set up a pseudo-distributed environment
    For the detailed setup procedure, see the official Hadoop documentation.

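    Note: so that IDEA on Windows can reach Hadoop inside the virtual machine, use the IP address rather than the hostname in core-site.xml and the other configuration files. Otherwise the Windows side reports a Connection Error.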
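A quick way to double-check this is to read the file back and look at the host part of the filesystem URI. The sketch below uses only the JDK's XML parser. It assumes the address is configured under the `fs.defaultFS` property (the article does not show its core-site.xml, so that is an assumption), and the class and method names are made up for illustration.

```java
import java.io.File;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteCheck {

    /** Returns the value of the named property in a Hadoop *-site.xml file, or null if absent. */
    static String readProperty(File siteXml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(siteXml);
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            NodeList names = prop.getElementsByTagName("name");
            NodeList values = prop.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0
                    && name.equals(names.item(0).getTextContent().trim())) {
                return values.item(0).getTextContent().trim();
            }
        }
        return null;
    }

    /** True if the URI's host part is a dotted IPv4 literal such as 192.168.137.131. */
    static boolean usesIpLiteral(String uri) {
        String host = URI.create(uri).getHost();
        return host != null && host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) throws Exception {
        File file = new File(args.length > 0 ? args[0] : "core-site.xml");
        if (!file.isFile()) {
            System.out.println("File not found: " + file.getPath());
            return;
        }
        // Assumption: the HDFS address lives under fs.defaultFS.
        String fs = readProperty(file, "fs.defaultFS");
        if (fs == null) {
            System.out.println("fs.defaultFS is not set in " + file.getPath());
        } else {
            System.out.println("fs.defaultFS = " + fs + ", uses IP literal = " + usesIpLiteral(fs));
        }
    }
}
```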

    Run bin/hadoop namenode -format to format the NameNode
    Run sbin/start-dfs.sh to start HDFS
    Run sbin/start-yarn.sh to start YARN
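    Step 3: Windows-side configuration
    1. Configure the Hadoop environment variables on Windows.
    2. Copy the few extra files that Hadoop needs on Windows into the bin folder of the Hadoop directory.
    3. In the Windows hosts file (C:\Windows\System32\drivers\etc\hosts), add an entry for the hostname of the Hadoop virtual machine so that it resolves.
    4. Open the project in IDEA and put the Hadoop configuration files into the resources folder.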
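The Windows-side items can be verified with a small preflight check before opening IDEA. This is only a sketch under assumptions: the original screenshots are not available, so it assumes the environment variable is `HADOOP_HOME` and that the extra files are `winutils.exe` and `hadoop.dll`, which are the ones commonly needed on Windows. The class name is hypothetical, and the VM's hostname must be passed in as an argument because the article does not state it.

```java
import java.io.File;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class WindowsPreflight {

    /** True if the bin folder under hadoopHome contains a regular file with the given name. */
    static boolean hasBinFile(String hadoopHome, String fileName) {
        return hadoopHome != null && new File(new File(hadoopHome, "bin"), fileName).isFile();
    }

    /** Resolves a hostname through the OS resolver (which consults the hosts file); null if unknown. */
    static String resolve(String hostname) {
        try {
            return InetAddress.getByName(hostname).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Assumption: the Hadoop environment variable is HADOOP_HOME.
        String hadoopHome = System.getenv("HADOOP_HOME");
        System.out.println("HADOOP_HOME = " + hadoopHome);

        // Assumption: these are the extra files placed in the bin folder.
        for (String name : new String[] {"winutils.exe", "hadoop.dll"}) {
            System.out.println("bin/" + name + " present = " + hasBinFile(hadoopHome, name));
        }

        // Pass the VM's hostname as the first argument to test the hosts file entry.
        if (args.length > 0) {
            System.out.println(args[0] + " resolves to " + resolve(args[0]));
        } else {
            System.out.println("Pass the VM hostname as an argument to check name resolution.");
        }
    }
}
```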

    Step 4: Develop a Hadoop YARN job in IDEA
    WordCount is used as the example here.
    package ComponentApp;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    import java.io.IOException;

    /**
     * Created by IBM on 2017/7/16.
     */
    public class WordCount2 extends Configured implements Tool {

        public int run(String[] args) throws Exception {
            // Configured stores the Configuration that ToolRunner passes in via setConf().
            Configuration conf = getConf();
            // Jar built by IDEA; backslashes must be escaped in a Java string literal.
            conf.set("mapreduce.job.jar", "D:\\java\\idea\\ComponentApp\\out\\artifacts\\ComponentApp_jar\\ComponentApp.jar");
            conf.set("mapreduce.framework.name", "yarn");
            conf.set("yarn.resourcemanager.hostname", "192.168.137.131");
            // Needed when the job is submitted from Windows to a Linux cluster.
            conf.set("mapreduce.app-submission.cross-platform", "true");

            Job job = Job.getInstance(conf);
            job.setJarByClass(WordCount2.class);

            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);

            job.setMapperClass(WcMapper.class);
            job.setReducerClass(WcReducer.class);

            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);

            FileInputFormat.setInputPaths(job, "hdfs://192.168.137.131:9000/kason/myid");
            // The output directory must not exist yet, otherwise the job fails.
            FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.137.131:9000/kason/out4"));

            // Report success or failure through the exit code instead of swallowing exceptions.
            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static class WcMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
                // Split the line on single spaces and emit (word, 1) for each token.
                String[] words = value.toString().split(" ");
                for (String word : words) {
                    System.out.println("data:" + word); // debug output, ends up in the task's stdout log
                    context.write(new Text(word), new LongWritable(1));
                }
            }
        }

        public static class WcReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
                // Sum the counts emitted for this word.
                long sum = 0;
                for (LongWritable count : values) {
                    sum += count.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new WordCount2(), args));
        }
    }
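To sanity-check what the job should produce, the same computation can be reproduced locally without a cluster. The sketch below is not part of the original project: `LocalWordCount` is a hypothetical helper that mirrors the mapper (split each line on single spaces, emit a count of 1 per token) and the reducer (sum the counts per word) in plain Java, and the sample lines in `main` are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class LocalWordCount {

    /** Counts tokens the same way the job does: split on single spaces, add 1 per token. */
    static Map<String, Long> count(Iterable<String> lines) {
        // TreeMap keeps the words sorted, similar to the sorted keys in the reducer output.
        Map<String, Long> totals = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.split(" ")) {   // "map" phase: one (token, 1) pair per token
                totals.merge(token, 1L, Long::sum);   // "reduce" phase: sum the ones per token
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the lines of the HDFS input file.
        List<String> lines = Arrays.asList("hello hadoop", "hello yarn");
        for (Map.Entry<String, Long> entry : count(lines).entrySet()) {
            // Word and count separated by a tab, one pair per line.
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Running this against a copy of the input file's lines gives a reference to compare with the part files in the job's output directory.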
    [Screenshots: IDEA run output, YARN page, HDFS page]
