Installing and Getting Started with Sqoop


Author: 百里香香 | Published 2018-02-24 16:52
    Download Sqoop

    Because the Sqoop project does not recommend Sqoop2 (1.99.7) for production use, this guide uses Sqoop1, version 1.4.7.
    Download it here: http://mirror.bit.edu.cn/apache/sqoop/

    Extract and upload

    I used Xshell as my terminal tool; I extracted and renamed the archive on my local machine, then uploaded it to the Linux server.


    (Figure: sqoop.png — the extracted Sqoop directory on the local machine)

    Upload to the Linux server:

    (Figure: linux下.png — the Sqoop directory on the Linux host)

    Alternatively, upload the archive as-is and extract it on the server:
    cd /home/hadoop
    tar -xvf sqoop-1.4.7.bin__hadoop-2.6.0.tar

    1. Configure environment variables

    Edit /etc/profile, add a SQOOP_HOME variable, and append $SQOOP_HOME/bin to the PATH variable.

    Contents of /etc/profile after editing:
    export JAVA_HOME=/home/hadoop/jdk1.8.0_144
    export HADOOP_HOME=/home/hadoop/hadoop-2.7.2
    export HBASE_HOME=/home/hadoop/hbase-1.2.2
    export HIVE_HOME=/home/hadoop/hive-2.1.0
    export SQOOP_HOME=/home/hadoop/sqoop-1.4.7
    export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin:$HIVE_HOME/bin:$SQOOP_HOME/bin:$PATH
    

    After editing, apply the changes by running: source /etc/profile
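After sourcing the profile, a quick sanity check like the following should confirm the variables took effect (paths as configured above):

```shell
# Reload the profile in the current shell
source /etc/profile

# SQOOP_HOME should now be set, and sqoop should resolve from PATH
echo "$SQOOP_HOME"      # expected: /home/hadoop/sqoop-1.4.7
which sqoop
```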

    2. Modify the Sqoop configuration files

    2.1 sqoop-env.sh
    Change into the /home/hadoop/sqoop-1.4.7/conf directory:
    cd /home/hadoop/sqoop-1.4.7/conf

    Make a copy of sqoop-env-template.sh named sqoop-env.sh:
    cp sqoop-env-template.sh sqoop-env.sh

    Append the following configuration to the end of the file:
    #Set the path for where zookeper config dir is
    #export ZOOCFGDIR=
    export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.7.2
    export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.7.2
    export HIVE_HOME=/home/hadoop/hive-2.1.0
    export HBASE_HOME=/home/hadoop/hbase-1.2.2
    

    Note: change the paths above to match your own installation.

    3. Copy the MySQL JDBC driver into Sqoop's lib directory

    (Figure: 驱动包.png — the MySQL driver jar in Sqoop's lib directory)

    In my testing, this driver version works with this setup.
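The exact jar name depends on which Connector/J release you downloaded; assuming a 5.1.x jar in the current directory, the copy step looks like:

```shell
# mysql-connector-java-5.1.44.jar is an assumed filename; use the jar you actually downloaded
cp mysql-connector-java-5.1.44.jar /home/hadoop/sqoop-1.4.7/lib/
```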

    4. Using Sqoop

    After installation, commands that do not touch Hive or HDFS can be run without starting those services. For example, sqoop help lists the available Sqoop commands:

    [hadoop@master ~]$ sqoop help
    Warning: /home/hadoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
    Please set $HCAT_HOME to the root of your HCatalog installation.
    Warning: /home/hadoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    Warning: /home/hadoop/sqoop-1.4.7/../zookeeper does not exist! Accumulo imports will fail.
    Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
    18/02/24 16:39:16 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
    usage: sqoop COMMAND [ARGS]
    
    Available commands:
      codegen            Generate code to interact with database records
      create-hive-table  Import a table definition into Hive
      eval               Evaluate a SQL statement and display the results
      export             Export an HDFS directory to a database table
      help               List available commands
      import             Import a table from a database to HDFS
      import-all-tables  Import tables from a database to HDFS
      import-mainframe   Import datasets from a mainframe server to HDFS
      job                Work with saved jobs
      list-databases     List available databases on a server
      list-tables        List available tables in a database
      merge              Merge results of incremental imports
      metastore          Run a standalone Sqoop metastore
      version            Display version information
    
    See 'sqoop help COMMAND' for information on a specific command.
    
    5. Use Sqoop to list the databases in MySQL:

    From the $SQOOP_HOME/bin directory, run the following command to connect to MySQL and list its databases:

     ./sqoop list-databases --connect jdbc:mysql://192.168.1.34:3306/test?characterEncoding=UTF-8 --username root --password '123'
    
    (Figure: 命令.png — output of the list-databases command)
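Once list-databases works, the other metadata commands take the same connection arguments; for example, listing the tables in the test database:

```shell
# Same connection options as above, but listing tables instead of databases
./sqoop list-tables \
  --connect "jdbc:mysql://192.168.1.34:3306/test?characterEncoding=UTF-8" \
  --username root --password '123'
```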
    6. Import a MySQL table into HDFS

    Prerequisite: make sure both HDFS and YARN are running. I forgot to start YARN, kept getting connection timeouts, and it took me quite a while to realize why.
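On a standard Hadoop 2.x install, both services can be started with the scripts bundled under $HADOOP_HOME/sbin:

```shell
# Start HDFS and YARN (scripts ship with Hadoop)
$HADOOP_HOME/sbin/start-dfs.sh
$HADOOP_HOME/sbin/start-yarn.sh

# jps should now list NameNode, DataNode, ResourceManager and NodeManager
jps
```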

    sqoop import -m 1 --connect jdbc:mysql://192.168.1.34:3306/test --username root --password 123 --table name --target-dir /user/sqoop/datatest
    
    Output:
    18/02/24 15:52:32 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
    18/02/24 15:52:33 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    18/02/24 15:52:33 INFO client.RMProxy: Connecting to ResourceManager at master.hadoop/192.168.139.128:8032
    18/02/24 15:52:38 INFO db.DBInputFormat: Using read commited transaction isolation
    18/02/24 15:52:38 INFO mapreduce.JobSubmitter: number of splits:1
    18/02/24 15:52:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1519458728102_0001
    18/02/24 15:52:39 INFO impl.YarnClientImpl: Submitted application application_1519458728102_0001
    18/02/24 15:52:39 INFO mapreduce.Job: The url to track the job: http://master.hadoop:8088/proxy/application_1519458728102_0001/
    18/02/24 15:52:39 INFO mapreduce.Job: Running job: job_1519458728102_0001
    18/02/24 15:52:51 INFO mapreduce.Job: Job job_1519458728102_0001 running in uber mode : false
    18/02/24 15:52:51 INFO mapreduce.Job:  map 0% reduce 0%
    18/02/24 15:52:57 INFO mapreduce.Job:  map 100% reduce 0%
    18/02/24 15:52:57 INFO mapreduce.Job: Job job_1519458728102_0001 completed successfully
    18/02/24 15:52:58 INFO mapreduce.Job: Counters: 30
        File System Counters
            FILE: Number of bytes read=0
            FILE: Number of bytes written=136482
            FILE: Number of read operations=0
            FILE: Number of large read operations=0
            FILE: Number of write operations=0
            HDFS: Number of bytes read=87
            HDFS: Number of bytes written=16
            HDFS: Number of read operations=4
            HDFS: Number of large read operations=0
            HDFS: Number of write operations=2
        Job Counters 
            Launched map tasks=1
            Other local map tasks=1
            Total time spent by all maps in occupied slots (ms)=3648
            Total time spent by all reduces in occupied slots (ms)=0
            Total time spent by all map tasks (ms)=3648
            Total vcore-milliseconds taken by all map tasks=3648
            Total megabyte-milliseconds taken by all map tasks=3735552
        Map-Reduce Framework
            Map input records=1
            Map output records=1
            Input split bytes=87
            Spilled Records=0
            Failed Shuffles=0
            Merged Map outputs=0
            GC time elapsed (ms)=88
            CPU time spent (ms)=630
            Physical memory (bytes) snapshot=100278272
            Virtual memory (bytes) snapshot=2060853248
            Total committed heap usage (bytes)=16719872
        File Input Format Counters 
            Bytes Read=0
        File Output Format Counters 
            Bytes Written=16
    18/02/24 15:52:58 INFO mapreduce.ImportJobBase: Transferred 16 bytes in 24.5469 seconds (0.6518 bytes/sec)
    18/02/24 15:52:58 INFO mapreduce.ImportJobBase: Retrieved 1 records.
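To check the result, the imported data can be read back from HDFS; a map-only import with -m 1 writes a single part file:

```shell
# List and print the imported data (target dir from the import command above)
hdfs dfs -ls /user/sqoop/datatest
hdfs dfs -cat /user/sqoop/datatest/part-m-00000
```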
    
    To be continued...

    Original link: https://www.haomeiwen.com/subject/myzqxftx.html