1、HDFS System Architecture
2、NameNode
NameNode is the master node in the Apache Hadoop HDFS architecture that maintains and manages the blocks present on the DataNodes (slave nodes). NameNode is a very highly available server that manages the file system namespace and controls clients' access to files. The HDFS architecture is built in such a way that the user data never resides on the NameNode; the data resides on the DataNodes only.
Functions of NameNode:
- It is the master daemon that maintains and manages the DataNodes (slave nodes).
- It records the metadata of all the files stored in the cluster, e.g. the location of the blocks stored, the size of the files, permissions, hierarchy, etc. There are two files associated with the metadata:
  - FsImage: It contains the complete state of the file system namespace since the start of the NameNode.
  - EditLogs: It contains all the recent modifications made to the file system with respect to the most recent FsImage.
- It records each change that takes place to the file system metadata. For example, if a file is deleted in HDFS, the NameNode will immediately record this in the EditLog.
- It regularly receives a Heartbeat and a block report from all the DataNodes in the cluster to ensure that the DataNodes are alive.
- It keeps a record of all the blocks in HDFS and of which nodes these blocks are located on (a short API sketch follows this list).
- The NameNode is also responsible for the replication factor of all the blocks, which we will discuss in detail later in this HDFS tutorial.
- In case of a DataNode failure, the NameNode chooses new DataNodes for new replicas, balances disk usage and manages the communication traffic to the DataNodes.
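As a quick illustration of the block metadata that the NameNode serves, here is a minimal sketch using the standard Hadoop Java client to ask where the blocks of a file live. The NameNode address and the path /user/hadoop/example.txt are placeholders, not values from this article.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationsDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; use your cluster's fs.defaultFS.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");

        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/user/hadoop/example.txt"); // hypothetical file
            FileStatus status = fs.getFileStatus(file);

            // The NameNode answers this query from its in-memory block map.
            BlockLocation[] blocks =
                    fs.getFileBlockLocations(status, 0, status.getLen());

            for (BlockLocation block : blocks) {
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(),
                        String.join(",", block.getHosts()));
            }
        }
    }
}
```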
3、Secondary NameNode
The Secondary NameNode (SNN) is an auxiliary daemon that periodically merges the namespace image (FsImage) with the edit log (EditLogs). Like the NameNode, there is one Secondary NameNode per cluster, and in a large-scale deployment it typically runs on its own dedicated server.
Apart from the NameNode and DataNode daemons, there is this third, helper daemon: the Secondary NameNode works concurrently with the primary NameNode. Do not confuse the Secondary NameNode with a backup NameNode, because it is not one.
Functions of Secondary NameNode:
- The Secondary NameNode constantly reads all the file system state and metadata from the RAM of the NameNode and writes it into the hard disk or the file system.
- It is responsible for combining the EditLogs with the FsImage from the NameNode.
- It downloads the EditLogs from the NameNode at regular intervals and applies them to the FsImage. The new FsImage is copied back to the NameNode and is used the next time the NameNode is started (the checkpoint schedule is configurable; see the sketch below).
Hence, the Secondary NameNode performs regular checkpoints in HDFS, which is why it is also called the CheckpointNode.
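How often the Secondary NameNode checkpoints is controlled by configuration. A minimal sketch, assuming the standard HDFS property names dfs.namenode.checkpoint.period and dfs.namenode.checkpoint.txns (verify them against the hdfs-default.xml of your Hadoop version):

```java
import org.apache.hadoop.conf.Configuration;

public class CheckpointSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Checkpoint every hour by default (value in seconds).
        long periodSeconds = conf.getLong("dfs.namenode.checkpoint.period", 3600);
        // ...or sooner, once this many uncheckpointed transactions accumulate.
        long txnThreshold = conf.getLong("dfs.namenode.checkpoint.txns", 1_000_000);

        System.out.println("Checkpoint period (s): " + periodSeconds);
        System.out.println("Checkpoint txn threshold: " + txnThreshold);
    }
}
```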
4、DataNode
Functions of DataNode:
- These are the slave daemons or processes that run on each slave machine.
- The actual data is stored on the DataNodes.
- The DataNodes serve the low-level read and write requests from the file system's clients.
- They send heartbeats to the NameNode periodically to report the overall health of HDFS; by default, this interval is set to 3 seconds (see the configuration sketch after this list).
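The 3-second heartbeat mentioned above is just a configurable default. A minimal sketch that reads the relevant knobs, assuming the standard property names dfs.heartbeat.interval and dfs.blockreport.intervalMsec (check your version's hdfs-default.xml):

```java
import org.apache.hadoop.conf.Configuration;

public class HeartbeatSettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Interval between DataNode heartbeats, in seconds (default 3).
        long heartbeatSeconds = conf.getLong("dfs.heartbeat.interval", 3);
        // Interval between full block reports, in milliseconds (default 6 hours).
        long blockReportMs = conf.getLong("dfs.blockreport.intervalMsec", 21600000L);

        System.out.println("Heartbeat interval (s): " + heartbeatSeconds);
        System.out.println("Block report interval (ms): " + blockReportMs);
    }
}
```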
By now, you must have realised that the NameNode is pretty important to us. If it fails, we are doomed. But don't worry, we will be discussing how Hadoop solves this single point of failure problem in the next Apache Hadoop HDFS Architecture blog.
5、Block
Blocks are nothing but the smallest contiguous locations on your hard drive where data is stored. In general, in any file system, you store the data as a collection of blocks. Similarly, HDFS stores each file as blocks which are scattered throughout the Apache Hadoop cluster. The default size of each block is 128 MB in Apache Hadoop 2.x (64 MB in Apache Hadoop 1.x), which you can configure as per your requirement.
In HDFS, you don't need to store every file in exact multiples of the configured block size (128 MB, 256 MB, etc.). Let's take an example: I have a file "example.txt" of size 514 MB, as shown in the figure above. Suppose that we are using the default block size configuration of 128 MB. Then how many blocks will be created? Five: the first four blocks will be of 128 MB each, but the last block will be of 2 MB size only.
Now, you must be thinking why we need such a huge block size, i.e. 128 MB?
Whenever we talk about HDFS, we talk about huge data sets, i.e. terabytes and petabytes of data. So, if we had a block size of, say, 4 KB, as in the Linux file system, we would end up with too many blocks and therefore too much metadata. Managing that many blocks and that much metadata would create a huge overhead, which is something we don't want.
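To make these numbers concrete, the sketch below splits the 514 MB example.txt into 128 MB blocks and then compares how many blocks (and hence how many metadata entries on the NameNode) a 1 TB file would need at a 4 KB versus a 128 MB block size. The 1 TB figure is an illustrative assumption, not a value from the article.

```java
public class BlockMath {
    static long blockCount(long fileSize, long blockSize) {
        // Ceiling division: the last block may be smaller than blockSize.
        return (fileSize + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long MB = 1024L * 1024L;
        long fileSize = 514 * MB;
        long blockSize = 128 * MB;

        long blocks = blockCount(fileSize, blockSize);
        long lastBlock = fileSize - (blocks - 1) * blockSize;
        System.out.printf("514 MB file -> %d blocks, last block %d MB%n",
                blocks, lastBlock / MB);          // 5 blocks, last one 2 MB

        long oneTB = 1024L * 1024L * MB;
        System.out.printf("1 TB at 4 KB blocks   -> %,d blocks%n",
                blockCount(oneTB, 4 * 1024L));    // ~268 million metadata entries
        System.out.printf("1 TB at 128 MB blocks -> %,d blocks%n",
                blockCount(oneTB, blockSize));    // 8192 metadata entries
    }
}
```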
6、Replication Management
HDFS provides a reliable way to store huge amounts of data in a distributed environment as data blocks. The blocks are also replicated to provide fault tolerance. The default replication factor is 3, which is again configurable. So, as you can see in the figure below, each block is replicated three times and stored on different DataNodes (considering the default replication factor).
The NameNode collects block reports from the DataNodes periodically to maintain the replication factor. Therefore, whenever a block is over-replicated or under-replicated, the NameNode deletes or adds replicas as needed.
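The replication factor can be set cluster-wide via dfs.replication or per file through the Java client. A minimal sketch, with the NameNode address and file path as placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020"); // placeholder address

        try (FileSystem fs = FileSystem.get(conf)) {
            // Cluster-wide default replication factor (3 unless overridden).
            System.out.println("dfs.replication = " + conf.getInt("dfs.replication", 3));

            // Raise the replication factor of a single (hypothetical) file to 5.
            Path file = new Path("/user/hadoop/example.txt");
            boolean accepted = fs.setReplication(file, (short) 5);
            System.out.println("Replication change accepted: " + accepted);
        }
    }
}
```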
7、Rack
Again, the NameNode also ensures that all the replicas are not stored on the same rack or on a single rack. It follows a built-in Rack Awareness algorithm to reduce latency as well as to provide fault tolerance. Considering a replication factor of 3, the Rack Awareness algorithm says that the first replica of a block will be stored on the local rack and the next two replicas will be stored on a different (remote) rack. This is what an actual Hadoop production cluster looks like: here, you have multiple racks populated with DataNodes.
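Rack awareness does not happen by itself: the NameNode learns which rack a DataNode belongs to from an administrator-supplied topology mapping, typically a script referenced by the net.topology.script.file.name property. A hedged sketch, with a hypothetical script path:

```java
import org.apache.hadoop.conf.Configuration;

public class RackTopologySettings {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Script that maps a DataNode IP/hostname to a rack id such as /rack1.
        // The path below is hypothetical; point it at your own mapping script.
        conf.set("net.topology.script.file.name", "/etc/hadoop/conf/topology.sh");

        System.out.println("Topology script: "
                + conf.get("net.topology.script.file.name"));
    }
}
```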
Advantages of Rack Awareness:
So, now you will be thinking why we need a Rack Awareness algorithm. The reasons are:
- To improve network performance: The communication between nodes residing on different racks is directed via a switch. In general, you will find greater network bandwidth between machines in the same rack than between machines residing in different racks. So, Rack Awareness helps you reduce write traffic between different racks and thus provides better write performance. You will also gain increased read performance because you are using the bandwidth of multiple racks.
- To prevent loss of data: We don't have to worry about the data even if an entire rack fails because of a switch failure or power failure. And if you think about it, it makes sense, as it is said: never put all your eggs in the same basket.
8、References
https://www.edureka.co/blog/apache-hadoop-hdfs-architecture/