美文网首页
Hive从入门到精通6:Hive中的查询操作

Hive从入门到精通6:Hive中的查询操作

作者: 金字塔下的小蜗牛 | 来源:发表于2020-04-02 10:51 被阅读0次

    本节使用的测试数据:

    hive (default)> select * from dept_in;
    dept_in.deptno dept_in.deptname dept_in.address
    10 ACCOUNTING NEW YORK
    20 RESEARCH DALLAS
    30 SALES CHICAGO
    40 OPERATIONS BOSTON

    hive (default)> select * from emp_in;
    emp_in.empno emp_in.ename emp_in.job emp_in.mgr emp_in.hiredate emp_in.sal emp_in.comm emp_in.deptno
    7369 SMITH CLERK 7902 1980/12/17 800 NULL 20
    7499 ALLEN SALESMAN 7698 1981/2/20 1600 300 30
    7521 WARD SALESMAN 7698 1981/2/22 1250 500 30
    7566 JONES MANAGER 7839 1981/4/2 2975 NULL 20
    7654 MARTIN SALESMAN 7698 1981/9/28 1250 1400 30
    7698 BLAKE MANAGER 7839 1981/5/1 2850 NULL 30
    7782 CLARK MANAGER 7839 1981/6/9 2450 NULL 10
    7788 SCOTT ANALYST 7566 1987/4/19 3000 NULL 20
    7839 KING PRESIDENT NULL 1981/11/17 5000 NULL 10
    7844 TURNER SALESMAN 7698 1981/9/8 1500 0 30
    7876 ADAMS CLERK 7788 1987/5/23 1100 NULL 20
    7900 JAMES CLERK 7698 1981/12/3 950 NULL 30
    7902 FORD ANALYST 7566 1981/12/3 3000 NULL 20
    7934 MILLER CLERK 7782 1982/1/23 1300 NULL 10

    1.简单查询

    hive (default)> select ename,job from emp_in;
    ename job
    SMITH CLERK
    ALLEN SALESMAN
    WARD SALESMAN
    JONES MANAGER
    MARTIN SALESMAN
    BLAKE MANAGER
    CLARK MANAGER
    SCOTT ANALYST
    KING PRESIDENT
    TURNER SALESMAN
    ADAMS CLERK
    JAMES CLERK
    FORD ANALYST
    MILLER CLERK</pre>

    2.过滤

    hive (default)> select ename,sal from emp_in where sal > 2000;
    ename sal
    JONES 2975
    BLAKE 2850
    CLARK 2450
    SCOTT 3000
    KING 5000
    FORD 3000

    hive (default)> select ename,deptno from emp_in where deptno != 20;
    ename deptno
    ALLEN 30
    WARD 30
    MARTIN 30
    BLAKE 30
    CLARK 10
    KING 10
    TURNER 30
    JAMES 30
    MILLER 10</pre>

    3.排序

    hive (default)> select ename,sal from emp_in order by sal;
    ename sal
    SMITH 800
    JAMES 950
    ADAMS 1100
    WARD 1250
    MARTIN 1250
    MILLER 1300
    TURNER 1500
    ALLEN 1600
    CLARK 2450
    BLAKE 2850
    JONES 2975
    SCOTT 3000
    FORD 3000
    KING 5000

    hive (default)> select ename,sal from emp_in sort by sal desc;
    ename sal
    KING 5000
    FORD 3000
    SCOTT 3000
    JONES 2975
    BLAKE 2850
    CLARK 2450
    ALLEN 1600
    TURNER 1500
    MILLER 1300
    MARTIN 1250
    WARD 1250
    ADAMS 1100
    JAMES 950
    SMITH 800</pre>

    4.多表查询

    4.1隐式JOIN操作

    Hive-0.13开始支持隐式join操作,from后面可以跟多个表,以逗号分隔:

    hive (default)> select e.ename,d.deptname,d.address from dept_in d,emp_in e where d.deptno = e.deptno;
    e.ename d.deptname d.address
    SMITH RESEARCH DALLAS
    ALLEN SALES CHICAGO
    WARD SALES CHICAGO
    JONES RESEARCH DALLAS
    MARTIN SALES CHICAGO
    BLAKE SALES CHICAGO
    CLARK ACCOUNTING NEW YORK
    SCOTT RESEARCH DALLAS
    KING ACCOUNTING NEW YORK
    TURNER SALES CHICAGO
    ADAMS RESEARCH DALLAS
    JAMES SALES CHICAGO
    FORD RESEARCH DALLAS
    MILLER ACCOUNTING NEW YORK</pre>

    4.2显式JOIN操作

    hive (default)> select e.ename,d.deptname,d.address from dept_in d join emp_in e on d.deptno = e.deptno;
    e.ename d.deptname d.address
    SMITH RESEARCH DALLAS
    ALLEN SALES CHICAGO
    WARD SALES CHICAGO
    JONES RESEARCH DALLAS
    MARTIN SALES CHICAGO
    BLAKE SALES CHICAGO
    CLARK ACCOUNTING NEW YORK
    SCOTT RESEARCH DALLAS
    KING ACCOUNTING NEW YORK
    TURNER SALES CHICAGO
    ADAMS RESEARCH DALLAS
    JAMES SALES CHICAGO
    FORD RESEARCH DALLAS
    MILLER ACCOUNTING NEW YORK

    更多详细内容请参考Hive帮助文档-Joins

    5.子查询

    5.1From子句中的子查询

    例1:简单子查询

    hive (default)> select tmp.ename from (select ename,deptno from emp_in ) tmp where tmp.deptno = 10;
    tmp.ename
    CLARK
    KING
    MILLER

    例2:包含Union All的子查询

    hive (default)> select tmp.ename, tmp.deptno from (
    select ename, deptno from emp_in where deptno = 10
    union all
    select ename, deptno from emp_in where deptno = 30
    ) tmp order by tmp.ename;
    tmp.ename tmp.deptno
    ALLEN 30
    BLAKE 30
    CLARK 10
    JAMES 30
    KING 10
    MARTIN 30
    MILLER 10
    TURNER 30
    WARD 30

    5.2 where子句中的子查询

    例1:使用IN或NOT IN约束

    hive (default)> select ename,deptno from emp_in where deptno in (select deptno from dept_in where deptno != 30);
    ename deptno
    SMITH 20
    JONES 20
    CLARK 10
    SCOTT 20
    KING 10
    ADAMS 20
    FORD 20
    MILLER 10

    例2:使用EXISTS或NOT EXISTS约束

    hive (default)> select e.ename,e.job,e.deptno from emp_in e where exists (select d.deptname from dept_in d where e.deptno = d.deptno );
    e.ename e.job e.deptno
    SMITH CLERK 20
    ALLEN SALESMAN 30
    WARD SALESMAN 30
    JONES MANAGER 20
    MARTIN SALESMAN 30
    BLAKE MANAGER 30
    CLARK MANAGER 10
    SCOTT ANALYST 20
    KING PRESIDENT 10
    TURNER SALESMAN 30
    ADAMS CLERK 20
    JAMES CLERK 30
    FORD ANALYST 20
    MILLER CLERK 10</pre>

    关于where子句中的子查询,有下面几点需要注意:

    1. 这些子查询只能用于表达式的右边
    2. IN或NOT IN子查询可以只选择单个列
    3. EXISTS或NOT EXISTS必须有一个或多个相关约束
    4. where子句的子查询不会引用父查询的结果
    5. where子句的子查询的where子句中可以引用父查询的结果

    更多详细内容请参考Hive帮助文档-SubQueries

    相关文章

      网友评论

          本文标题:Hive从入门到精通6:Hive中的查询操作

          本文链接:https://www.haomeiwen.com/subject/mgrkdhtx.html