Setting Up a Hadoop, HDFS, Hive, and Spark Environment on a Mac

Author: GYBE | Published 2019-01-18 14:52

Prerequisites

Homebrew

See: Installing and Using Homebrew on a Mac
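
If Homebrew isn't installed yet, the standard bootstrap one-liner from brew.sh can be used. Note this is today's official installer, not a command from the original article, so check brew.sh if it has changed:

    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"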

Installing the JDK

Check whether a JDK is already installed:

    java -version

    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
    

If Java is not installed, Java 8 is recommended:

    brew cask install java # installs the latest version
    # to install Java 8 instead:
    brew tap caskroom/versions
    brew cask install java8
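
To confirm the install and find the JDK path (needed later for hadoop-env.sh):

    java -version
    /usr/libexec/java_home -v 1.8   # prints the Java 8 home directory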
    

Configuring SSH

SSH is set up here for passwordless login, which makes it convenient to manage Hadoop remotely and to share file resources across the Hadoop cluster without entering a login password.

If your machine hasn't been set up for SSH, running ssh localhost in a terminal will prompt for your macOS login password.
Once SSH is configured, no password is needed.

1. Open System Preferences > Sharing and enable Remote Login.

2. In iTerm (or Terminal), run:

    ssh-keygen -t rsa # press Enter to accept the defaults
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    

Now ssh localhost in the terminal works without a password:

    ssh localhost # log in over ssh
    # Last login: Fri Jan 18 14:44:36 2019
    exit # log out
    # Connection to localhost closed.
    

Installing Hadoop

Download and install

• brew install hadoop (recommended; see the command below). The install path is shown when it finishes.
• Download the tarball from the official site, unpack it to a directory of your choice, and set it up by hand (not recommended).
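
For reference, the Homebrew route is a single command; the version path used throughout this article (3.1.1) comes from the prefix it reports:

    brew install hadoop
    # Homebrew prints the install prefix, e.g. /usr/local/Cellar/hadoop/3.1.1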

Configuring Hadoop

Configure hadoop-env.sh

Location of hadoop-env.sh:

    /usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop
    

Add the JAVA_HOME path:

    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_202.jdk/Contents/Home
    # to find the JDK location on a Mac: /usr/libexec/java_home -V
    

Configure core-site.xml

Set the HDFS address and port:

    <configuration>
      <property>
        <name>fs.defaultFS</name>  <!-- fs.default.name is the deprecated older alias -->
        <value>hdfs://localhost:8020</value>
      </property>
      <!-- The following keeps the NameNode from failing to start after a system
           reboot (by default Hadoop writes under /tmp, which macOS clears). -->
      <!-- Any path works in place of /Users/username/data, as long as Hadoop can
           write to it; 'username' is a placeholder for your macOS user name. -->
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/Users/username/data/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
      <!-- Local filesystem paths where the NameNode stores the name table and the
           DataNode stores its blocks. -->
      <property>
        <name>dfs.name.dir</name>
        <value>/Users/username/data/hadoop/filesystem/name</value>
        <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/Users/username/data/hadoop/filesystem/data</value>
        <description>Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
      </property>
    </configuration>
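
The directories referenced above are not created for you; it is safest to create them up front (same placeholder paths as in the config):

    mkdir -p /Users/username/data/hadoop/tmp \
             /Users/username/data/hadoop/filesystem/name \
             /Users/username/data/hadoop/filesystem/data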
    

Configure hdfs-site.xml

Set the HDFS replication factor; on a single node there is nothing to replicate to, so 1 is enough (the NameNode and DataNode paths were already set in core-site.xml above):

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
    

Configure mapred-site.xml

Point MapReduce at YARN as its execution framework (in Hadoop 1 this file configured the JobTracker address and port). Version 3.1.1 ships the file directly, so it can be edited as-is:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>
    

Configure yarn-site.xml

Enable the MapReduce shuffle service and whitelist the environment variables that YARN passes through to containers:

    <configuration> 
      <property> 
        <name>yarn.nodemanager.aux-services</name> 
        <value>mapreduce_shuffle</value> 
      </property> 
      <property> 
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value> 
      </property> 
    </configuration>
    

Format HDFS

    # run from /usr/local/Cellar/hadoop/3.1.1/libexec
    bin/hdfs namenode -format
    

Running Hadoop
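
The original doesn't list the start commands, but given the processes shown below, the standard scripts are what's needed (run from the same libexec directory as above):

    sbin/start-dfs.sh   # starts NameNode, DataNode, SecondaryNameNode
    sbin/start-yarn.sh  # starts ResourceManager, NodeManager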

jps lists the running Java processes:

    jps
    # 34214 NameNode
    # 34313 DataNode
    # 34732 NodeManager
    # 34637 ResourceManager
    # 34446 SecondaryNameNode
    # 34799 Jps
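
If everything is up, the web UIs respond as well: the NameNode at http://localhost:9870 and the ResourceManager at http://localhost:8088 (the Hadoop 3 defaults).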
    

Installing Hive

Download and install

    brew install hive
    

Configuring the Hive metastore database

Hive uses Derby as its metastore database by default; here we switch to the more familiar MySQL to store the metadata.

    # log in to MySQL
    mysql -uroot -p
    # then, inside MySQL, run:
    CREATE DATABASE metastore;
    CREATE USER 'hive'@'localhost' IDENTIFIED BY 'hive';
    # MySQL 8 defaults to the caching_sha2_password plugin, which older JDBC
    # drivers cannot load ("Unable to load authentication plugin
    # 'caching_sha2_password'"), so switch the user to mysql_native_password:
    ALTER USER 'hive'@'localhost' IDENTIFIED WITH mysql_native_password BY 'hive';
    GRANT SELECT,INSERT,UPDATE,DELETE,ALTER,CREATE,INDEX,REFERENCES ON metastore.* TO 'hive'@'localhost';
    FLUSH PRIVILEGES;
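
A quick smoke test that the new account works (a hypothetical check, not in the original):

    mysql -uhive -phive -e 'SHOW DATABASES;'   # should list metastore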
    

Configuring Hive

Install the mysql-connector JAR

Download it from: https://dev.mysql.com/downloads/connector/j/
Unpack the archive and copy the JAR into Hive's lib directory:

    cp mysql-connector-java-5.1.44-bin.jar /usr/local/Cellar/hive/3.1.1/libexec/lib/
    

Configure hive-site.xml
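
Hive does not ship a ready-made hive-site.xml. Judging by the 3000-plus-line file in the error trace below, it was presumably created from the bundled template:

    cd /usr/local/Cellar/hive/3.1.1/libexec/conf
    cp hive-default.xml.template hive-site.xml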

Then modify the following properties:

    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost/metastore</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
    </property>
    <!-- the MySQL user and password created in the metastore step above -->
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hive</value>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>hive</value>
    </property>
    <!-- as before, 'username' is a placeholder for your macOS user name -->
    <property>
      <name>hive.exec.local.scratchdir</name>
      <value>/Users/username/data/hive</value>
    </property>
    <property>
      <name>hive.querylog.location</name>
      <value>/Users/username/data/hive/querylog</value>
    </property>
    <property>
      <name>hive.downloaded.resources.dir</name>
      <value>/Users/username/data/hive/download</value>
    </property>
    <property>
      <name>hive.server2.logging.operation.log.location</name>
      <value>/Users/username/data/hive/log</value>
    </property>
    

Note: around line 3210 of the copied hive-site.xml there is an invalid character entity (code 0x8). Delete it, otherwise initializing the metastore fails with:

    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/3.1.1/libexec/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/3.1.1/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
     at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3003)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2931)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2806)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:1460)
        at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
        at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
        at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
        at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5098)
        at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
        at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:318)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
    Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
     at [row,col,system-id]: [3210,96,"file:/usr/local/Cellar/hive/3.1.1/libexec/conf/hive-site.xml"]
        at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
        at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
        at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
        at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
        at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
        at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
        at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
        at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
        at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3257)
        at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3063)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2986)
        ... 15 more
    

Initialize the metastore

    schematool -initSchema -dbType mysql
    

Now the metastore database contains the schema tables (only a partial listing is shown here):

    mysql> show tables;
    +-------------------------------+
    | Tables_in_metastore           |
    +-------------------------------+
    | AUX_TABLE                     |
    | BUCKETING_COLS                |
    | CDS                           |
    | COLUMNS_V2                    |
    | COMPACTION_QUEUE              |
    | COMPLETED_COMPACTIONS         |
    | COMPLETED_TXN_COMPONENTS      |
    | CTLGS                         |
    | DATABASE_PARAMS               |
    | DB_PRIVS                      |
    

Running Hive
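
The original leaves this section empty; as a minimal smoke test, assuming the steps above succeeded, the Hive CLI should start and reach the MySQL-backed metastore:

    hive
    hive> show databases;  # a fresh install lists the built-in 'default' database
    hive> quit;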

Installing Spark

    brew install apache-spark
    

Installing it directly is usually all it takes; then just run spark-shell.
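
As a quick local check (not from the original article), the shell can run a trivial job:

    spark-shell
    scala> spark.range(10).count()
    # res0: Long = 10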

In practice a few issues can still crop up after all of the above; feel free to raise them in the comments so we can work through them together.
