Developing Against Hadoop in a Virtual Machine from Windows with IDEA

Author: kason_zhang | Published: 2017-07-16 21:41


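Goal: set up a Hadoop environment in a virtual machine that can be developed against from Windows.
Step 1: Turn off the firewall
First turn off the firewall in the VM so that Hadoop's ports, such as 50070 (the NameNode web UI), can be reached from outside.
Steps for turning off the firewall on CentOS 6.5:
Stop it now: service iptables stop
Disable it permanently (across reboots): chkconfig iptables off
Run both commands, then check that the firewall is stopped: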
service iptables status
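Once the firewall is off, it can help to confirm from the Windows side that the VM's ports really are reachable. The following is a minimal sketch using only the JDK. It assumes the VM's IP address is 192.168.137.131 (the address used later in this article) and probes ports 50070 and 9000; the class name PortCheck and the 2-second timeout are illustrative choices, not part of the original setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isPortOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable: treat all as "not open".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the VM's IP; pass a different one as the first argument.
        String host = args.length > 0 ? args[0] : "192.168.137.131";
        int[] ports = {50070, 9000}; // NameNode web UI and the HDFS port used in this article
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = " + isPortOpen(host, port, 2000));
        }
    }
}
```

If a port shows as unreachable after Hadoop has been started, the firewall state and the address the daemon is bound to are the first things to recheck.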
Step 2: Set up a pseudo-distributed environment
For the detailed setup procedure, see the official Hadoop website.

Note: for IDEA on Windows to reach Hadoop inside the VM, configuration files such as core-site.xml must use the VM's IP address rather than its hostname; otherwise the Windows side reports a Connection Error.
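To make the IP-versus-hostname point concrete, here is a small sketch that reads a Hadoop-style XML configuration and reports whether the file system address uses an IP literal. It assumes the address is stored under the standard Hadoop 2.x key fs.defaultFS and uses hdfs://192.168.137.131:9000, the address that appears in the job code later in this article; the article does not show its actual core-site.xml, so the sample XML and the class name CoreSiteCheck are illustrative.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteCheck {
    /** Returns the value of the named property in a Hadoop-style XML config, or null if absent. */
    static String property(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element p = (Element) props.item(i);
            String n = p.getElementsByTagName("name").item(0).getTextContent().trim();
            if (n.equals(name)) {
                return p.getElementsByTagName("value").item(0).getTextContent().trim();
            }
        }
        return null;
    }

    /** True if the URI's host is a dotted IPv4 literal such as 192.168.137.131. */
    static boolean usesIpAddress(String uri) {
        String host = URI.create(uri).getHost();
        return host != null && host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample content; replace with the text of your own core-site.xml.
        String xml =
            "<configuration>\n" +
            "  <property>\n" +
            "    <name>fs.defaultFS</name>\n" +
            "    <value>hdfs://192.168.137.131:9000</value>\n" +
            "  </property>\n" +
            "</configuration>";
        String fs = property(xml, "fs.defaultFS");
        System.out.println(fs + " uses IP address: " + usesIpAddress(fs));
    }
}
```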

Run bin/hadoop namenode -format to format the NameNode (only needed once, before the first start)
Run sbin/start-dfs.sh to start HDFS
Run sbin/start-yarn.sh to start YARN
Step 3: Windows-side configuration
1. Configure the Hadoop environment variables on Windows.


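2. For Windows to work with Hadoop, a few extra files (typically winutils.exe and hadoop.dll) need to be placed in the bin folder of the Hadoop directory.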


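3. In the Windows hosts file (C:\Windows\System32\drivers\etc\hosts), add an entry that maps the hostname of the Hadoop VM to its IP address so that Windows can resolve it.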


4. Open the project in IDEA and put the Hadoop configuration files in the resources folder.

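The Windows-side steps above can be sanity-checked with a short preflight program before any job is submitted. This is only a sketch under stated assumptions: it assumes the environment variable is named HADOOP_HOME, that the helper files are winutils.exe and hadoop.dll (the usual ones; adjust to whatever you actually copied), and it uses hadoop-vm as a hypothetical hostname for the VM. The class name WindowsPreflight is likewise illustrative.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WindowsPreflight {
    /** Lists which of the given file names are missing from hadoopHome/bin. */
    static List<String> missingFromBin(Path hadoopHome, List<String> fileNames) {
        List<String> missing = new ArrayList<>();
        for (String name : fileNames) {
            if (!Files.isRegularFile(hadoopHome.resolve("bin").resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** True if the hostname resolves, whether via the hosts file or DNS. */
    static boolean resolves(String hostname) {
        try {
            InetAddress.getByName(hostname);
            return true;
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String home = System.getenv("HADOOP_HOME"); // assumed variable name
        if (home == null) {
            System.out.println("HADOOP_HOME is not set");
            return;
        }
        // Assumption: the usual Windows helper files.
        List<String> expected = Arrays.asList("winutils.exe", "hadoop.dll");
        System.out.println("Missing from bin: " + missingFromBin(Paths.get(home), expected));
        // Hypothetical hostname; pass the VM's real hostname as the first argument.
        String vmHost = args.length > 0 ? args[0] : "hadoop-vm";
        System.out.println(vmHost + " resolves: " + resolves(vmHost));
    }
}
```

An empty "Missing from bin" list and a hostname that resolves suggest that steps 1 to 3 took effect; IDEA may need a restart to pick up a newly set environment variable.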

Step 4: Develop a Hadoop YARN job in IDEA
WordCount is used as the example here:
```java
package ComponentApp;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

import java.io.IOException;

/**
 * Created by IBM on 2017/7/16.
 */
public class WordCount2 extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // Use the Configuration that ToolRunner handed over via setConf().
        Configuration conf = getConf();
        // Jar built by IDEA for this project; backslashes must be escaped in a Java string.
        conf.set("mapreduce.job.jar", "D:\\java\\idea\\ComponentApp\\out\\artifacts\\ComponentApp_jar\\ComponentApp.jar");
        // Submit to YARN on the VM instead of using the local job runner.
        conf.set("mapreduce.framework.name", "yarn");
        conf.set("yarn.resourcemanager.hostname", "192.168.137.131");
        // Needed because the client (Windows) and the cluster (Linux) are different platforms.
        conf.set("mapreduce.app-submission.cross-platform", "true");

        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount2.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        job.setMapperClass(WcMapper.class);
        job.setReducerClass(WcReducer.class);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.setInputPaths(job, "hdfs://192.168.137.131:9000/kason/myid");
        // The output directory must not exist yet, otherwise the job fails.
        FileOutputFormat.setOutputPath(job, new Path("hdfs://192.168.137.131:9000/kason/out4"));

        // Report success or failure through the return code instead of always returning 0.
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static class WcMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Split each input line on single spaces and emit (word, 1) for every token.
            String mVal = value.toString();
            String[] strs = mVal.split(" ");
            for (String s : strs) {
                System.out.println("data:" + s);
                context.write(new Text(s), new LongWritable(1));
            }
        }
    }

    public static class WcReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException, InterruptedException {
            // Sum the 1s emitted for this word.
            long sum = 0;
            for (LongWritable lVal : values) {
                sum += lVal.get();
            }
            context.write(key, new LongWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options (-D, -conf, ...) and calls setConf() before run().
        System.exit(ToolRunner.run(new Configuration(), new WordCount2(), args));
    }
}
```
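To see what the job should produce before submitting it to YARN, the map and reduce steps can be mimicked in plain Java without any Hadoop dependency. The sketch below is a simplified stand-in, not the job itself: it splits each line on single spaces as the mapper does and sums a count of 1 per token as the reducer does. The class name LocalWordCount and the two sample lines are made up for illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

public class LocalWordCount {
    /** Counts space-separated tokens across all lines. */
    static Map<String, Long> count(Iterable<String> lines) {
        // TreeMap keeps keys sorted, similar to the key order of a single reducer's output.
        Map<String, Long> totals = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.split(" ")) {   // map step: one (token, 1) pair per token
                totals.merge(token, 1L, Long::sum);  // reduce step: sum the 1s for each token
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for the lines of the HDFS input file.
        Map<String, Long> result = count(Arrays.asList("hello hadoop", "hello yarn"));
        // Tab-separated key and value, one pair per line.
        result.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

For the two sample lines this yields a count of 2 for hello and 1 each for hadoop and yarn, which is the shape of output to expect in the job's output directory on HDFS.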
Result of running the job in IDEA