Developing a Hive UDF with Maven in IntelliJ IDEA: a detailed walkthrough


Author: 解脱了 | Published 2017-12-07 00:32 | 295 views
    1. Create a Maven project
      File > New > Project





    2. Add the dependencies to pom.xml; the first import may be slow while Maven downloads the JARs


    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
    
        <groupId>scc</groupId>
        <artifactId>UDF</artifactId>
        <version>1.0-SNAPSHOT</version>
        <dependencies>
            <dependency>
                <groupId>org.apache.hadoop</groupId>
                <artifactId>hadoop-client</artifactId>
                <version>2.7.3</version>
            </dependency>
            <dependency>
                <groupId>org.apache.hive</groupId>
                <artifactId>hive-exec</artifactId>
                <version>1.2.1</version>
            </dependency>
    
        </dependencies>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-shade-plugin</artifactId>
                    <version>1.4</version>
                    <executions>
                        <execution>
                            <phase>package</phase>
                            <goals>
                                <goal>shade</goal>
                            </goals>
                            <configuration>
                                <filters>
                                    <filter>
                                        <artifact>*:*</artifact>
                                        <excludes>
                                            <exclude>META-INF/*.SF</exclude>
                                            <exclude>META-INF/*.DSA</exclude>
                                            <exclude>META-INF/*.RSA</exclude>
                                        </excludes>
                                    </filter>
                                </filters>
    
                                <transformers>
                                    <transformer
                                            implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                        <resource>META-INF/spring.handlers</resource>
                                    </transformer>
                                    <transformer
                                            implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                        <mainClass>com.neu.hive.UDF.ToUpperCaseUDF</mainClass>
                                    </transformer>
                                    <transformer
                                            implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                        <resource>META-INF/spring.schemas</resource>
                                    </transformer>
                                </transformers>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
    </project>
    

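    Once the import finishes, many JARs appear in the panel on the left, and nothing on the right is marked with a red underline.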



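    3. Start developing
    Under the java source folder, create a package via New > Package, then add the UDF class to it (com.neu.hive.UDF.ToUpperCaseUDF in this example).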
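
The post does not reproduce the class source, so the following is only a minimal sketch of what an upper-casing UDF such as `com.neu.hive.UDF.ToUpperCaseUDF` might look like. It assumes the classic Hive 1.x pattern in which the class extends `org.apache.hadoop.hive.ql.exec.UDF` (from the hive-exec dependency) and exposes a public `evaluate` method that Hive locates by reflection. So that the sketch compiles without Hive on the classpath, the package declaration, the `public` modifier, and the `extends UDF` clause appear only as comments. The null handling is an assumption, not something the post states.

```java
import java.util.Locale;

// In the real project this file would start with
//   package com.neu.hive.UDF;
//   import org.apache.hadoop.hive.ql.exec.UDF;
// and the class would be declared as
//   public class ToUpperCaseUDF extends UDF
// Those parts are left as comments so the sketch compiles on its own.
class ToUpperCaseUDF {

    // Hive calls evaluate() once per row; a NULL column value arrives as null.
    public String evaluate(String input) {
        if (input == null) {
            return null; // assumption: NULL in, NULL out
        }
        // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i.
        return input.toUpperCase(Locale.ROOT);
    }
}
```

Because `evaluate` is an ordinary method, calling `new ToUpperCaseUDF().evaluate("abc")` from a unit test is a convenient way to check the logic before packaging.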


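    Package the project (the shade plugin above runs in the package phase), upload the JAR to the server, load it in Hive, and create a temporary function: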

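    -- Load the JAR into the current Hive session.
    add jar /usr/local/usrJars/dulm/hiveUDF-0.0.1-SNAPSHOT-all.jar;
    -- Create a temporary function named my_uppercase; the string after AS is the fully qualified class name (package + class).
    create temporary function my_uppercase as 'com.neu.hive.UDF.ToUpperCaseUDF';
    -- Test: convert the values of column datasource to upper case and show the first 10 rows.
    select my_uppercase(datasource) from tenmindata limit 10;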
    
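    -- Second example: the code table is a txt file. The table being queried can be in any
    -- storage format (a plain table and an ORC table are shown); input and output are both strings.
    add jar /root/yl/udf_province4.jar;
    create temporary function split_province_txt as 'hive_udf_province.UDF_province_name_txt';
    select split_province_txt(province_id) from yl.dim_province;
    select split_province_txt(province_id) from yl.dim_province_orc;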
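
The source of `hive_udf_province.UDF_province_name_txt` is not shown in the post. As a rough illustration of the idea described above (a txt code table, string in, string out), the lookup could be sketched as follows. The class name `ProvinceNameLookup`, the tab-separated `province_id<TAB>province_name` line format, and the way the lines are supplied are all hypothetical. In a real UDF, `evaluate` would sit in a class extending `org.apache.hadoop.hive.ql.exec.UDF`, and the lines would be read from the txt file.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the lookup logic of a code-table UDF.
// Assumption: every line of the txt code table is "province_id<TAB>province_name".
class ProvinceNameLookup {

    private final Map<String, String> idToName = new HashMap<>();

    // Build the in-memory map once from the lines of the code table.
    ProvinceNameLookup(List<String> codeTableLines) {
        for (String line : codeTableLines) {
            String[] parts = line.split("\t", 2);
            if (parts.length == 2) { // skip blank or malformed lines
                idToName.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    // String in, string out; unknown or NULL ids yield null.
    public String evaluate(String provinceId) {
        if (provinceId == null) {
            return null;
        }
        return idToName.get(provinceId.trim());
    }
}
```

Loading the code table into a map once keeps each per-row lookup cheap. The lookup also depends only on the string value of `province_id`, not on how the queried table is stored, which is consistent with the same function being applied to both the plain and the ORC table.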
    

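    For production services, the permanent function has to be created in every database that uses it. Unlike a temporary function, a permanent function is recorded in the metastore, so its JAR must stay reachable in later sessions (Hive's CREATE FUNCTION accepts a USING JAR clause for this).

    Create a permanent function:
    CREATE FUNCTION dws.bss_city_code AS 'com.ysten.bigdata.hive.udf.GetCityBYBssCity';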
    


          Article link: https://www.haomeiwen.com/subject/ycyrixtx.html