Handson-3

Author: dynamicheart | Published 2017-12-22 16:19

    handson3

    515030910223
    杨健邦

    • Overall architecture
    Hostname        Zookeeper  HDFS                         HBASE                       Spark
    hadoop-master   -          NameNode, SecondaryNameNode  -                           -
    hadoop-slave    -          DataNode                     -                           -
    hbase-master    YES        -                            HMaster                     -
    hbase-region1   YES        -                            RegionServer, BackupMaster  -
    hbase-region2   YES        -                            RegionServer, BackupMaster  -
    spark-master    -          -                            -                           Master
    spark-worker1   -          -                            -                           Worker
    spark-worker2   -          -                            -                           Worker
    spark-worker3   -          -                            -                           Worker
    spark-worker4   -          -                            -                           Worker

    Question 1:
    After configuring HDFS, please type "jps" in the bash. What is the result for the two containers?

    • Master Node
    root@hadoop-master:~# jps
    516 ResourceManager
    786 Jps
    164 NameNode
    363 SecondaryNameNode
    
    • Slave Node
    root@hadoop-slave:~# jps
    69 DataNode
    310 Jps
    181 NodeManager
    

    Question 2:

    If you use a standalone ZooKeeper service, after setting up, type

    bin/zkServer.sh status
    

    Is there any difference among the outputs from containers? If so, what's the difference?

    root@habase-region1:~# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
    Mode: follower
    
    root@hbase-region2:~# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
    Mode: leader
    
    root@hbase-master:~# zkServer.sh status
    ZooKeeper JMX enabled by default
    Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
    Mode: follower
    

    Question 3:
    After killing the primary master, the backup master should be elected as a new primary. Please read the log in ZooKeeper and HBase and describe what actually happens.

    • HBase log on hbase-region1
    2017-12-22 08:09:15,911 INFO  [hbase-region1:16000.activeMasterManager] master.ActiveMasterManager: Another master is the active master, hbase-master,16000,1513930144374; waiting to become the next active master
    
    2017-12-22 08:10:10,039 INFO  [hbase-region1:16000.activeMasterManager] master.ActiveMasterManager: Another master is the active master, hbase-region2,16000,1513930151343; waiting to become the next active master
    
    • HBase log on hbase-region2
    2017-12-22 08:09:15,963 INFO  [hbase-region2:16000.activeMasterManager] master.ActiveMasterManager: Another master is the active master, hbase-master,16000,1513930144374; waiting to become the next active master
    
    2017-12-22 08:10:10,046 INFO  [hbase-region2:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=hbase-region2,16000,1513930151343
    

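    Initially, hbase-master is the primary (active) master and the two hbase-region nodes act as backup masters. Each of the three nodes runs a ZooKeeper process, and ZooKeeper keeps the cluster's view of who the primary is consistent. The two backups keep watching whether an active master exists; as long as one does, they wait (the 08:09:15 entries on both nodes). Once hbase-master can no longer be reached, the ActiveMasterManager on hbase-region2 registers itself as the new active master through ZooKeeper (08:10:10,046), while hbase-region1 stays a backup and now waits on hbase-region2 (08:10:10,039). The system keeps working normally.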
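
    The behaviour seen in the logs above can be imitated with a toy in-memory model. This is only a sketch: FakeZk, tryCreate, sessionExpired and elect are hypothetical names invented for illustration, not ZooKeeper or HBase APIs. The single slot stands in for the ephemeral znode that the active master holds, and the order in which the backups try to claim it is chosen by hand here, whereas in a real cluster it is a race.

    ```scala
    // Toy model of active-master election. FakeZk is a hypothetical stand-in,
    // not a ZooKeeper API.
    object MasterFailoverSketch {
      // Holds at most one "ephemeral node" naming the active master.
      final class FakeZk {
        private var active: Option[String] = None

        // Atomically claim the active slot; returns true only if it was free.
        def tryCreate(server: String): Boolean = synchronized {
          if (active.isEmpty) { active = Some(server); true } else false
        }

        // Models the ephemeral node vanishing when its owner's session dies.
        def sessionExpired(server: String): Unit = synchronized {
          if (active.contains(server)) active = None
        }

        def current: Option[String] = synchronized(active)
      }

      // Each candidate tries to claim the slot in the given order.
      def elect(zk: FakeZk, candidates: Seq[String]): Seq[String] =
        candidates.map { c =>
          if (zk.tryCreate(c)) s"$c: registered as active master"
          else {
            val holder = zk.current.getOrElse("unknown")
            s"$c: another master is active ($holder); waiting"
          }
        }

      def main(args: Array[String]): Unit = {
        val zk = new FakeZk
        elect(zk, Seq("hbase-master", "hbase-region1", "hbase-region2")).foreach(println)
        zk.sessionExpired("hbase-master") // the primary is killed
        // The backups compete for the free slot; hbase-region2 goes first here.
        elect(zk, Seq("hbase-region2", "hbase-region1")).foreach(println)
      }
    }
    ```

    The point of the model is that only one tryCreate can succeed while the slot is occupied, so there is never more than one active master, and a backup takes over only after the previous holder's entry has disappeared.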

    Question 4:
    Can you find where the data is stored in HDFS? Please answer it in detail by describing the files within the directories related to HBase.

    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hadoop-master:9000/hbase</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>hbase-master,hbase-region1,hbase-region2</value>
      </property>
    </configuration>
    

    The configuration file (hbase.rootdir) places HBase's data in a directory named hbase under the HDFS root.

    root@hadoop-master:~# hadoop fs -ls /
    Found 1 items
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase
    
    root@hadoop-master:~# hadoop fs -ls /hbase
    Found 8 items
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase/.tmp
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase/MasterProcWALs
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase/WALs
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase/corrupt
    drwxr-xr-x   - root supergroup          0 2017-12-22 07:48 /hbase/data
    -rw-r--r--   3 root supergroup         42 2017-12-22 07:48 /hbase/hbase.id
    -rw-r--r--   3 root supergroup          7 2017-12-22 07:48 /hbase/hbase.version
    drwxr-xr-x   - root supergroup          0 2017-12-22 08:26 /hbase/oldWALs
    

    Each table is stored in a directory named after the table under /hbase/data/<namespace>; when no namespace is specified, the table lives under /hbase/data/default.
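
    As a small illustration of that layout, the helper below builds the directory for a table from the root directory, the namespace and the table name. tableDir is a hypothetical helper written for this note, not an HBase API, and the namespace ns1 and table users in the second call are made-up examples.

    ```scala
    // Sketch of the HDFS directory layout for table data.
    // tableDir is a hypothetical helper, not part of HBase.
    object HBaseTablePath {
      // rootDir corresponds to hbase.rootdir; namespace falls back to "default".
      def tableDir(rootDir: String, table: String, namespace: String = "default"): String = {
        val base = rootDir.stripSuffix("/") // tolerate a trailing slash
        s"$base/data/$namespace/$table"
      }

      def main(args: Array[String]): Unit = {
        println(tableDir("/hbase", "test"))         // /hbase/data/default/test
        println(tableDir("/hbase", "users", "ns1")) // /hbase/data/ns1/users
      }
    }
    ```

    The resulting path can be passed to hadoop fs -ls to inspect a table's directory.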

    Question 5:
    What parameters need to be configured to run the code above? Please list them all.

    Load the HBase configuration and select the input table:

    conf.addResource("hbase-site.xml")
    conf.set(TableInputFormat.INPUT_TABLE, "test")
    

    Here test is the table name.

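    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object WorkCount {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf().setAppName("Spark_hbase").setMaster("local[2]")
        val sc = new SparkContext(sparkConf)
        // Load hbase-site.xml from the classpath and select the table to scan.
        val conf = HBaseConfiguration.create()
        conf.addResource("hbase-site.xml")
        conf.set(TableInputFormat.INPUT_TABLE, "test")
        // Each record is a (row key, Result) pair read through TableInputFormat.
        val usersRDD = sc.newAPIHadoopRDD(conf, classOf[TableInputFormat],
          classOf[ImmutableBytesWritable], classOf[Result])
        val count = usersRDD.count()
        println("Temp RDD count:" + count)
        sc.stop()
      }
    }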
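
    If hbase-site.xml is not on the classpath, the same information could be supplied as explicit key/value pairs. The sketch below only collects those pairs in a plain Map so that it runs without HBase installed. HBaseScanParams and params are hypothetical names; the client port 2181 is ZooKeeper's default and an assumption, since the configuration shown earlier does not state a port; "hbase.mapreduce.inputtable" is the string behind TableInputFormat.INPUT_TABLE.

    ```scala
    // Collects the settings the job depends on as plain key/value pairs.
    // HBaseScanParams is a hypothetical helper, not part of HBase or Spark.
    object HBaseScanParams {
      def params(table: String, quorum: Seq[String], clientPort: Int = 2181): Map[String, String] =
        Map(
          "hbase.zookeeper.quorum" -> quorum.mkString(","),
          // Assumption: ZooKeeper listens on its default client port.
          "hbase.zookeeper.property.clientPort" -> clientPort.toString,
          // String value of TableInputFormat.INPUT_TABLE.
          "hbase.mapreduce.inputtable" -> table
        )

      def main(args: Array[String]): Unit = {
        val p = params("test", Seq("hbase-master", "hbase-region1", "hbase-region2"))
        p.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }
      }
    }
    ```

    With HBase on the classpath, each pair would be applied to the configuration object with conf.set(key, value) before calling newAPIHadoopRDD.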
    
    
