Setting Up a Local Hadoop Development Environment (Eclipse, IDEA)

Author: 千释炎 | Published: 2018-06-01 15:05

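Normally, after writing a MapReduce program, we have to package it as a JAR, upload the JAR to the Hadoop cluster, and run it from the command line, which is very inconvenient. To speed up development, it is well worth setting up a local Hadoop development environment so that jobs can be submitted to the cluster directly from the IDE. The steps are briefly described below: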

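1. Copy the entire Hadoop installation folder from the cluster to the local machine.

2. Set the Hadoop environment variables locally. My local Hadoop directory is D:\hadoop-2.6.0-cdh5.14.0, and the variables are set as follows: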

    # Create these new system variables
    HADOOP_HOME=D:\hadoop-2.6.0-cdh5.14.0
    HADOOP_PREFIX=D:\hadoop-2.6.0-cdh5.14.0
    HADOOP_BIN_PATH=%HADOOP_HOME%\bin
    
    # Append these entries to the Path environment variable
    %HADOOP_HOME%\bin
    %HADOOP_HOME%\sbin
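
To confirm that the variables above are visible to a Java process (and therefore to the IDE), a small check like the following may help. It is only a sketch: the class name `HadoopEnvCheck` and its helper methods are made up for illustration, and it assumes Windows-style `;`-separated Path entries.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: checks that HADOOP_HOME is set
// and that its bin directory appears on the Path.
class HadoopEnvCheck {

    /** Lower-cases a path, unifies slashes, and drops trailing backslashes. */
    static String normalize(String p) {
        String s = p.trim().replace('/', '\\').toLowerCase();
        while (s.endsWith("\\")) {
            s = s.substring(0, s.length() - 1);
        }
        return s;
    }

    /** Returns true if pathValue, split on separator, contains dir. */
    static boolean pathContains(String pathValue, String dir, String separator) {
        if (pathValue == null || dir == null) {
            return false;
        }
        String wanted = normalize(dir);
        for (String entry : pathValue.split(Pattern.quote(separator))) {
            if (normalize(entry).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        String bin = home + "\\bin";
        boolean onPath = pathContains(System.getenv("PATH"), bin, File.pathSeparator);
        System.out.println("HADOOP_HOME = " + home);
        System.out.println(bin + " on Path: " + onPath);
    }
}
```

If the program still reports that `HADOOP_HOME` is not set after the variables were added, the IDE was probably started before the change and needs to be restarted.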
    

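3. Running Hadoop on Windows also requires winutils.exe and hadoop.dll. Download winutils.exe and the hadoop.dll that matches your Hadoop version. Copy hadoop.dll to C:\Windows\System32 on the system drive, and also copy both hadoop.dll and winutils.exe into the bin directory of the local Hadoop installation.
The winutils.exe and hadoop.dll for hadoop-2.6.0 are available here: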

    https://pan.baidu.com/s/1VDD8k-9RBl1E5mSZXJO37w
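
Before submitting a job, it can be useful to verify that both files actually landed in the bin directory. The following is a minimal sketch under that assumption; `WinutilsCheck` and `missingFromBin` are hypothetical names, and the check only looks at `%HADOOP_HOME%\bin`, not at C:\Windows\System32.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: reports which of the two
// Windows-specific files are missing from <hadoopHome>/bin.
class WinutilsCheck {
    static final String[] REQUIRED = {"winutils.exe", "hadoop.dll"};

    static List<String> missingFromBin(Path hadoopHome) {
        List<String> missing = new ArrayList<>();
        Path bin = hadoopHome.resolve("bin");
        for (String name : REQUIRED) {
            if (!Files.isRegularFile(bin.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME");
        if (home == null) {
            System.err.println("HADOOP_HOME is not set");
            return;
        }
        List<String> missing = missingFromBin(Paths.get(home));
        if (missing.isEmpty()) {
            System.out.println("winutils.exe and hadoop.dll found in " + home + "\\bin");
        } else {
            System.out.println("Missing from bin: " + missing);
        }
    }
}
```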
    

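4. A reboot may be needed at this point. In my case the settings only took effect after restarting the machine.

MapReduce jobs can now be submitted to the cluster directly from the local machine. The job submission code is configured as follows: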

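    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Load the cluster's config files (copied onto the classpath under hadoop/).
    Configuration conf = new Configuration();
    conf.addResource("hadoop/core-site.xml");
    conf.addResource("hadoop/hdfs-site.xml");
    conf.addResource("hadoop/mapred-site.xml");
    conf.addResource("hadoop/yarn-site.xml");
    // Point the client at the cluster's NameNode and ResourceManager.
    conf.set("fs.defaultFS", "hdfs://192.168.199.100:9000");
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.address", "192.168.199.100:8032");
    conf.set("yarn.resourcemanager.scheduler.address", "192.168.199.100:8030");
    conf.set("yarn.resourcemanager.hostname", "192.168.199.100");
    // Needed when submitting from Windows to a Linux cluster.
    conf.set("mapreduce.app-submission.cross-platform", "true");
    Job job = Job.getInstance(conf, "MRJob_1");
    // Path to the JAR exported from the project; it is shipped to the cluster.
    job.setJar("G:\\idea-workplace\\movie_hadoop.jar");
    job.setJarByClass(MRJob_1.class);
    job.setMapperClass(MRJob_1_Map.class);
    job.setReducerClass(MRJob_1_Reduce.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);
    job.setPartitionerClass(UserIdPartition.class);
    // map is the project's own lookup of input/output paths (defined elsewhere).
    FileInputFormat.addInputPath(job, new Path(map.get("MR1_input")));
    Path outputPath = new Path(map.get("MR1_output"));
    FileOutputFormat.setOutputPath(job, outputPath);
    // Block until the job finishes: 0 on success, 1 on failure.
    int flag = job.waitForCompletion(true) ? 0 : 1;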
    

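The above is only an example. Note that the MR project must be exported as a JAR before submission, because it is not packaged automatically. Then pass the location of that JAR to the job's setJar method.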
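
Because the JAR has to be re-exported by hand, it is easy to submit a stale or missing file. One way to catch this early is to check that the JAR exists and contains the driver class before calling setJar. This is a sketch using only the JDK: `JobJarCheck` is a hypothetical helper, and it assumes `MRJob_1` lives in the default package, which the article does not state.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

// Hypothetical pre-submission check, not part of Hadoop.
class JobJarCheck {

    /** Converts a class name such as a.b.C into the JAR entry a/b/C.class. */
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if jarPath is a readable JAR that contains className. */
    static boolean jarContainsClass(String jarPath, String className) {
        File jar = new File(jarPath);
        if (!jar.isFile()) {
            return false;
        }
        try (JarFile jf = new JarFile(jar)) {
            return jf.getEntry(entryName(className)) != null;
        } catch (IOException e) {
            // Not a valid JAR, or it could not be read.
            return false;
        }
    }

    public static void main(String[] args) {
        // Same path as in the example above; the package of MRJob_1 is assumed.
        String jarPath = "G:\\idea-workplace\\movie_hadoop.jar";
        if (!jarContainsClass(jarPath, "MRJob_1")) {
            System.err.println("Export the project to " + jarPath + " before submitting");
        }
    }
}
```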
