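1. Requirements
Upload student.txt to HDFS, create a Hive table from the uploaded file, then write the results of a query on that table to a local file.
2. Prepare the data (id and name must be separated by a tab, matching the '\t' delimiter used in the table definition)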
[alex@hadoop102 module]$ vim data/student.txt
1001 zhangsan
1002 lisi
1003 wangwu
1004 xiaoliu
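Because the Hive table splits fields on '\t', a row typed with spaces would load with a NULL id. A minimal sketch for writing the same four rows with an explicit tab and checking the result is shown here; `data_file` is a placeholder variable introduced for illustration (the article itself edits /opt/module/data/student.txt with vim).

```shell
# Sketch: write the sample rows with an explicit tab between id and name.
# data_file is a hypothetical variable; it falls back to a temp file so the
# snippet can be tried without touching /opt/module/data/student.txt.
data_file="${data_file:-$(mktemp)}"
printf '%s\t%s\n' \
  1001 zhangsan \
  1002 lisi \
  1003 wangwu \
  1004 xiaoliu > "$data_file"

# Count the rows that contain exactly two tab-separated fields.
good_rows=$(awk -F'\t' 'NF == 2' "$data_file" | wc -l | tr -d ' ')
echo "rows with two tab-separated fields: $good_rows"
```

Setting `data_file=/opt/module/data/student.txt` before running it would write the file used by the rest of this walkthrough.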
3. First job: put.job
put.job uploads the data to HDFS
[alex@hadoop102 jobs]$ vim put.job
# put.job: upload student.txt to HDFS (the /azkaban directory must already exist)
type=command
command=/opt/module/hadoop-2.7.2/bin/hadoop fs -put /opt/module/data/student.txt /azkaban
4. Second job
1) Create create.sql
[alex@hadoop102 jobs]$ vim create.sql
use default;
-- Start from a clean table so the flow can be rerun
drop table if exists student;
create table if not exists student(id int, name string)
row format delimited fields terminated by '\t';
-- load data inpath moves the HDFS file into the table's directory
load data inpath '/azkaban/student.txt' into table student;
insert into student values(1005,"alex");
insert into student values(1006,"mk");
2) create.job
create.job depends on put.job
[alex@hadoop102 jobs]$ vim create.job
# create.job
type=command
dependencies=put
command=/opt/module/hive/bin/hive -f /opt/module/azkaban/jobs/create.sql
5. Third job
1) Create export.sql
Export the query results to the local filesystem
[alex@hadoop102 jobs]$ vim export.sql
insert overwrite local directory '/opt/module/data/studentexport'
row format delimited fields terminated by '\t'
select * from student;
2) export.job
export.job depends on create.job
[alex@hadoop102 jobs]$ vim export.job
# export job
type=command
dependencies=create
command=/opt/module/hive/bin/hive -f /opt/module/azkaban/jobs/export.sql
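The three job files form a simple chain through their dependencies= entries: put, then create, then export. As a rough illustration of how such entries translate into an execution order, the standard `tsort` utility can order the same pairs. This only mirrors the idea; it is not how Azkaban schedules jobs internally.

```shell
# Each input line is "upstream downstream", copied from the dependencies=
# entries in create.job and export.job.
order=$(printf '%s\n' 'put create' 'create export' | tsort)

# Print one job name per line, in an order that respects the dependencies.
echo "$order"
```

A job with several upstream jobs would list them comma-separated in dependencies= and contribute one pair per upstream job here.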
6. Package all job files into a single zip archive
[alex@hadoop102 jobs]$ zip jobn.zip put.job create.job export.job
adding: put.job (deflated 20%)
adding: create.job (deflated 23%)
adding: export.job (deflated 21%)
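A mistyped name in a dependencies= line is an easy mistake to make when writing several job files by hand. The following is a small sketch of a check that could be run before uploading the archive; `check_deps` is a helper name made up for this example, not part of Azkaban.

```shell
# Sketch: count dependencies= entries that have no matching .job file.
# check_deps is a hypothetical helper; it takes the jobs directory as $1.
check_deps() {
  jobs_dir="$1"
  missing=0
  for job in "$jobs_dir"/*.job; do
    [ -e "$job" ] || continue
    # Take the value after "dependencies=" and split it on commas.
    deps=$(sed -n 's/^dependencies=//p' "$job" | tr ',' ' ')
    for dep in $deps; do
      if [ ! -f "$jobs_dir/$dep.job" ]; then
        echo "$(basename "$job"): missing dependency $dep.job" >&2
        missing=$((missing + 1))
      fi
    done
  done
  # The number of unresolved dependencies goes to stdout.
  echo "$missing"
}

# Usage: run from the directory that holds the .job files.
check_deps .
```

The count is printed on stdout and the details on stderr, so the result can be captured in a script while the messages stay visible.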
7. Run in Azkaban
The steps are the same as in the single-job example:
create a project in the Azkaban web UI, upload the zip package, and start the job
8. The flow finishes with status Success
9. View the results
[alex@hadoop102 student]$ cat /opt/module/data/studentexport/000000_0
1005 alex
1006 mk
1001 zhangsan
1002 lisi
1003 wangwu
1004 xiaoliu
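The two inserted rows (1005, 1006) appear before the loaded ones because the export query has no ORDER BY, so Hive does not guarantee any row order. If a sorted view is wanted, the exported file can be sorted locally; the sketch below assumes the file is tab-separated as configured in export.sql, and `sort_by_id` is a helper name introduced here.

```shell
# Sketch: sort exported rows numerically by the first (id) column.
# sort_by_id is a hypothetical helper; $1 is the exported file.
sort_by_id() {
  # Use a literal tab as the field separator, then sort field 1 as a number.
  sort -t "$(printf '\t')" -k1,1n "$1"
}

# Usage with the path from this article, if the file exists on this machine.
export_file=/opt/module/data/studentexport/000000_0
if [ -f "$export_file" ]; then
  sort_by_id "$export_file"
fi
```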