Zookeeper

Author: 上山走18398 | Published: 2019-10-27 23:39

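    Introduction

    Before studying a technology, first find out what it is called, how to set up its environment, what it is used for, and how it is implemented; then ask which problems the framework solves for us, what it improves, and how it compares with similar frameworks on the market.
    3W2H analysis
    5 Whys analysis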

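    Architecture / features / design philosophy / characteristics
    Problems in distributed scenarios
    High concurrency, high throughput, low latency, reliability with no single point of failure, replicas
    Distributed transactions
    Distributed data consistency, distributed locks, distributed queues
    Zab protocol (ZooKeeper Atomic Broadcast)
    Paxos protocol
    Leader election: fast leader election, majority rule, zxid
    CAP
    2PC (Two-Phase Commit)
    3PC (Three-Phase Commit)
    Monitoring and performance metrics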

    Zookeeper

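    What is ZooKeeper

    ZooKeeper is ordered
    ZooKeeper is fast, especially in read-dominant workloads where reads outnumber writes by about 10:1

    It provides a distributed coordination service for distributed applications,
    through a shared hierarchical namespace which is organized similarly to a standard file system.
    ZooKeeper is a high-performance distributed coordination service for managing a cluster of hosts.

    Managing and coordinating work in a distributed system is a complicated process. ZooKeeper exists to solve this problem; the questions are how it solves it, what architecture it uses, and what API it exposes.

    Once coordination and management are taken care of, developers can focus on application code instead of the problems that distribution brings. This is what ZooKeeper offers.

    For example, Apache HBase uses ZooKeeper to track the status of distributed data.

    Goal: build a distributed cluster with ZooKeeper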

    Distributed clusters

    Cluster

    Node

    A distributed application has two parts: the server application and the client application.

    Server applications are actually distributed and have **a common interface**

    so that clients can connect to any server in the cluster and get the same result;


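    Benefits of distributed applications:

    1. Reliability --- the failure of a single machine or a few machines does not bring down the whole system

    2. Scalability --- machines can be added seamlessly, with minor configuration changes and no interruption of service

    3. Transparency --- hides the complexity of the system and shows itself as a single entity / application

    Challenges of distributed applications:

    1. Race condition --- two or more machines try to perform a task that only one should perform at a time; for example, a shared resource should be modified by only one machine at any given time

    2. Deadlock --- two or more operations waiting for each other to complete indefinitely

    3. Inconsistency --- partial failure of data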

    Apache Zookeeper

    Apache Zookeeper is a service used by a cluster (groups of nodes) to coordinate between themselves and maintain shared data with robust synchronization techniques.

    In short: coordination, and maintaining shared data.

    ZooKeeper provides the following services:

    1. Naming service -- identifying the nodes in a cluster by name. It is similar to DNS, but for nodes.

    The namespace consists of data registers called znodes.
    ZooKeeper data is kept in memory, which is why it can achieve high throughput and low latency.
    Updates are strictly ordered, so this ordering can be used to implement distributed locks.
    ZooKeeper was designed to store coordination data:
    status information
    configuration
    location information
    So the data stored in each znode is very small, usually in the byte-to-kilobyte range, and at most 1 MB.
    Each znode also maintains a stat structure.

    2. Configuration management (maintenance) -- up-to-date configuration information for a node joining the cluster

    3. Cluster management -- joining/leaving of a node in a cluster and node status in real time

    4. Leader election -- electing a node as leader for coordination purposes

    5. Locking and synchronization service -- locking the data while modifying it

    6. Highly reliable data registry -- data remains available even when one or a few nodes are down

    The Apache ZooKeeper framework provides a complete mechanism for overcoming these challenges. Race conditions and deadlocks are handled with a fail-safe synchronization approach, and data inconsistency is resolved through atomicity.

    Benefits of using ZooKeeper

    1. Simple distributed coordination process

    2. Synchronization

    3. Ordered Messages

    4. Serialization

    5. Reliability

    6. Atomicity

    ZooKeeper architecture


    Client: one of the nodes in our distributed application cluster. It sends heartbeats to the server at regular intervals.

    Server: one of the nodes in our ZooKeeper ensemble. It sends acknowledgements that tell the client it is still alive.

    Ensemble: a group of ZooKeeper servers. The recommended minimum is 3.

    Leader: leaders are elected on service startup

    Follower: a server node that follows the leader's instructions

    Hierarchical Namespace

    A ZooKeeper node is referred to as a znode;

    Every znode is identified by a name, and the components of its path are separated by a slash (/)


    Every znode in the Zookeeper data model maintains a stat structure;

    A stat simply provides the metadata of a znode

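    -- This part covers what a znode is, what it holds, and what each piece is for --

    The metadata consists of:

    1. Version number (version control; it increases every time the znode's data changes)

    2. Access Control List (ACL) -- controls read and write permissions for znode operations

    3. Timestamp -- the time elapsed since the znode was created and last modified, in milliseconds. ZooKeeper identifies every change to a znode by a "Transaction ID" (zxid); each zxid is unique and records the time of its transaction

    4. Data length -- total amount of data stored in the znode (at most 1 MB)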
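
    To make the metadata list concrete, here is a minimal in-memory sketch of a znode with a stat record. It is a toy model, not ZooKeeper's implementation: `TinyStore`, `Znode`, and the snake_case field names are hypothetical, ACLs are left out, and the zxid is just a counter that grows with every change.

```python
import time
from dataclasses import dataclass, field

MAX_DATA_BYTES = 1024 * 1024  # the 1 MB per-znode limit from the notes


@dataclass
class Stat:
    """Hypothetical subset of a znode's metadata."""
    version: int = 0      # incremented on every data change
    czxid: int = 0        # zxid of the transaction that created the znode
    mzxid: int = 0        # zxid of the transaction that last modified it
    ctime_ms: int = 0     # creation time, milliseconds since the epoch
    mtime_ms: int = 0     # last modification time, milliseconds since the epoch
    data_length: int = 0  # size of the stored data in bytes


@dataclass
class Znode:
    path: str
    data: bytes = b""
    stat: Stat = field(default_factory=Stat)


class TinyStore:
    """Toy in-memory store; every change gets a new, larger zxid."""

    def __init__(self):
        self._zxid = 0
        self._nodes = {}

    def _next_zxid(self):
        self._zxid += 1
        return self._zxid

    def create(self, path, data=b""):
        if path in self._nodes:
            raise KeyError(f"{path} already exists")
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("znode data exceeds 1 MB")
        zxid, now = self._next_zxid(), int(time.time() * 1000)
        stat = Stat(czxid=zxid, mzxid=zxid, ctime_ms=now, mtime_ms=now,
                    data_length=len(data))
        self._nodes[path] = Znode(path, data, stat)
        return stat

    def set_data(self, path, data):
        if len(data) > MAX_DATA_BYTES:
            raise ValueError("znode data exceeds 1 MB")
        node = self._nodes[path]
        node.data = data
        node.stat.version += 1                        # version control
        node.stat.mzxid = self._next_zxid()           # new transaction id
        node.stat.mtime_ms = int(time.time() * 1000)  # modification timestamp
        node.stat.data_length = len(data)
        return node.stat
```

    Creating `/app` and then calling `set_data` once leaves the znode at version 1 with an `mzxid` larger than its `czxid`.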

    Types of Znodes

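    Persistent: the default type; the znode stays until it is explicitly deleted

    Sequential: ZooKeeper appends a 10-digit sequence number to the name, e.g. /myapp -> /myapp0000000001, and the next one gets 0000000002. Every znode is guaranteed a different number; used for locking and synchronization

    Ephemeral: deleted automatically when the session that created it ends; used for leader election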
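
    The naming rule for sequential znodes can be sketched in a few lines. This is only an illustration: `SequentialParent` is a hypothetical helper, and the counter starts at 1 to match the /myapp example in these notes (a real ensemble keeps its own counter per parent znode).

```python
class SequentialParent:
    """Hands out unique, zero-padded 10-digit suffixes under one parent."""

    def __init__(self):
        self._counter = 0

    def create_sequential(self, prefix):
        # Each call consumes the next number, so no two names collide.
        self._counter += 1
        return f"{prefix}{self._counter:010d}"


parent = SequentialParent()
paths = [parent.create_sequential("/myapp") for _ in range(2)]
print(paths)
```

    Because the suffix has a fixed width, sorting the names as strings gives the same order as sorting by sequence number, which is what lock and election recipes rely on.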

    Sessions

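    Sessions are very important for the operation of ZooKeeper (comparable to HTTP sessions?)

    Requests within a session are executed in FIFO order;

    1. Once a client connects to a server --> a session is established and a session id is assigned to the client

    2. Keeping the session valid --> the client sends heartbeats at a particular interval; if the ensemble receives none within the period configured at service startup (the session timeout), it treats the client as dead

    3. Session timeouts --> expressed in milliseconds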
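
    The heartbeat and timeout rules above can be sketched as a small session tracker. This is a simplified simulation under assumptions of my own: `SessionTracker` is a hypothetical class, time is passed in explicitly as milliseconds, and expiry is checked on demand rather than by a background thread.

```python
class SessionTracker:
    """Toy server-side bookkeeping of client sessions."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self._last_seen = {}  # session id -> time of the last heartbeat
        self._next_id = 1

    def open_session(self, now_ms):
        session_id = self._next_id
        self._next_id += 1
        self._last_seen[session_id] = now_ms
        return session_id

    def heartbeat(self, session_id, now_ms):
        if session_id not in self._last_seen:
            raise KeyError("session expired or unknown")
        self._last_seen[session_id] = now_ms

    def expire(self, now_ms):
        """Drop and return the sessions silent for longer than the timeout."""
        dead = [sid for sid, seen in self._last_seen.items()
                if now_ms - seen > self.timeout_ms]
        for sid in dead:
            del self._last_seen[sid]
        return dead


tracker = SessionTracker(timeout_ms=3000)
alive = tracker.open_session(now_ms=0)
silent = tracker.open_session(now_ms=0)
tracker.heartbeat(alive, now_ms=2500)   # only one client keeps sending
expired = tracker.expire(now_ms=4000)   # the silent client is dropped
```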

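    Watches

    A simple mechanism (the observer pattern) that lets a client be notified about changes in the ZooKeeper ensemble

    A client registers a watch when it reads a particular znode; when that znode changes, the watch sends a notification to the registered client

    When the client disconnects because its session expired, its watches are removed as well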
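
    A watch is essentially a callback registered on a read. The sketch below models that with a hypothetical `WatchedStore`; it also assumes the standard ZooKeeper behavior that a watch fires once and must be registered again to receive further notifications, which these notes do not state explicitly.

```python
class WatchedStore:
    """Toy key-value store with one-shot watches on paths."""

    def __init__(self):
        self._data = {}
        self._watches = {}  # path -> callbacks waiting for the next change

    def get(self, path, watch=None):
        # Reading with a watch registers the callback for this path.
        if watch is not None:
            self._watches.setdefault(path, []).append(watch)
        return self._data.get(path)

    def set(self, path, value):
        self._data[path] = value
        # Fire and forget: pop() removes the watches as they are triggered.
        for callback in self._watches.pop(path, []):
            callback(path)


events = []
store = WatchedStore()
store.set("/config", "a")
store.get("/config", watch=events.append)
store.set("/config", "b")  # triggers the watch
store.set("/config", "c")  # no watch left, nothing is recorded
```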

    Zookeeper - Workflow

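    1. The servers start: once a ZooKeeper ensemble starts, it waits for the clients to connect.

    2. The client connects to one of the nodes (servers) in the ensemble

    3. The node assigns a session id and sends an acknowledgement to the client

    4. If the client gets no acknowledgement, it tries another node in the ensemble

    5. Once connected, the client sends heartbeats at regular intervals so that the connection is not lost

    To read a particular znode, the client sends a read request with the znode path to the node it is connected to, and the node returns the znode from its own database;

    To store data in the ZooKeeper ensemble, the client:

    sends the znode path and the data to the server
    
    The connected server forwards the request to the leader, and the leader then reissues the write request to all the followers and checks whether the write succeeded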
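
    The write path can be sketched as a proposal followed by a commit. This is a heavily simplified simulation, not the Zab protocol: `Server` and `Ensemble` are hypothetical classes, and the rule that a write succeeds once a majority of servers acknowledge it is an assumption added here, consistent with the majority rule discussed later in these notes.

```python
class Server:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.data = {}

    def propose(self, path, value):
        """Acknowledge the proposal only if this server is reachable."""
        return self.alive

    def commit(self, path, value):
        if self.alive:
            self.data[path] = value


class Ensemble:
    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def write(self, path, value):
        """Called by whichever server received the client's request."""
        servers = [self.leader] + self.followers
        acks = sum(1 for server in servers if server.propose(path, value))
        if acks <= len(servers) // 2:   # no majority: the write fails
            return False
        for server in servers:          # majority reached: everyone commits
            server.commit(path, value)
        return True


leader = Server("s1")
f1, f2 = Server("s2"), Server("s3", alive=False)
ensemble = Ensemble(leader, [f1, f2])
first = ensemble.write("/cfg", "v1")    # 2 of 3 acknowledge
f1.alive = False
second = ensemble.write("/cfg", "v2")   # only 1 of 3 acknowledges
```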
    

    Nodes in a Zookeeper Ensemble

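    A single node: a single point of failure; not recommended

    Two nodes: no majority is left if one fails; not recommended

    Three nodes: the recommended minimum; the ensemble survives the failure of one node

    Four nodes: tolerates no more failures than three nodes, so odd sizes such as 3, 5, 7, ... are recommended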
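
    The reasoning behind these recommendations is plain majority arithmetic, shown in the short sketch below. The function names are hypothetical; the only assumption is that the ensemble needs strictly more than half of its servers to keep working.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that is strictly more than half."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers may fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


for n in range(1, 8):
    print(f"{n} nodes: quorum {quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

    Three and four nodes both tolerate a single failure, and five and six both tolerate two, which is why the extra even-numbered node buys nothing.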


    Leader Election

    How a leader is chosen in ZooKeeper

    1. All nodes create a sequential, ephemeral znode with the same path

    **/app/leader_election/guid_**
    

    2. The ZooKeeper ensemble appends a 10-digit sequence number to the path, e.g. /app/leader_election/guid_0000000001 .......
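
    The notes stop after step 2. In the commonly described version of this recipe, the node that created the znode with the smallest sequence number becomes the leader, and when its ephemeral znode disappears the next smallest takes over. The sketch below illustrates only that selection rule; `sequence_of` and `elect_leader` are hypothetical helpers working on plain child names, not on a real ensemble.

```python
def sequence_of(znode_name):
    """Extract the 10-digit sequence number appended to the name."""
    return int(znode_name[-10:])


def elect_leader(children):
    """The child with the smallest sequence number is the leader."""
    return min(children, key=sequence_of)


children = ["guid_0000000002", "guid_0000000001", "guid_0000000003"]
leader = elect_leader(children)

# The leader's session ends, so its ephemeral znode is removed.
children.remove(leader)
successor = elect_leader(children)
```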

    Zookeeper

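    Installing ZooKeeper

    Configuring ZooKeeper

    ZooKeeper data storage

    ZooKeeper command-line interface (CLI): interacting with the ZooKeeper ensemble

    Once the client starts, it can perform the following operations: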

    1. Create znodes

    2. Get data

    3. Watch znode for changes

    4. Set data

    5. Create children of a znode

    6. List children of a znode

    7. check status

    8. Remove/Delete znode

    Create Znodes

    Create a znode with the given path. By default all znodes are persistent
    
    Ephemeral znodes (flag: -e) are deleted automatically when the session expires
    
    Sequential znodes (flag: -s) guarantee that the znode path is unique
    

    Get Data

    get /path
    

    Watch

    get /path [watch] 1
    (ZooKeeper 3.5 and later: get -w /path)
    

    Set Data

    set /path /data
    

    Create Children /Sub-znode

    create /parent/path/subnode/path /data
    

    List Children

    ls /path
    

    Check Status

    stat /path
    

    Remove a ZNode

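    rmr /path
    (removes the znode together with its children; ZooKeeper 3.5 and later deprecate rmr in favor of deleteall /path, while delete /path removes a znode that has no children)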
    

    Zookeeper API

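    ZooKeeper as a service registry

    Registering a service in ZooKeeper simply means creating a znode in ZooKeeper.
    The znode stores the service's IP, port, and invocation details (protocol, serialization format).
    It is created by the service provider so that service consumers can read the information in it,
    and thereby locate where the provider actually sits in the network and learn how to call it.
    After the first lookup the consumer caches the result locally and keeps tracking providers going online and offline => heartbeat detection

    Detecting services going online and offline:

    1. Heartbeat detection by the registry: ZooKeeper and each service provider keep a long-lived socket connection and exchange heartbeats over it at regular intervals; a provider that does not respond for too long is evicted and its znode path is deleted
    2. Service consumers watch the corresponding path and are notified as soon as its data changes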
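
    The registration and notification flow can be sketched without a real ensemble. Everything here is illustrative: `Registry`, the /services/... path layout, and the JSON payload are assumptions, and a provider's disappearance is modeled by calling `session_expired` directly.

```python
import json


class Registry:
    """Toy registry: one znode-like entry per provider instance."""

    def __init__(self):
        self._providers = {}  # path -> JSON payload
        self._listeners = []  # consumer callbacks

    def register(self, service, host, port, protocol):
        path = f"/services/{service}/{host}:{port}"
        self._providers[path] = json.dumps(
            {"host": host, "port": port, "protocol": protocol})
        self._notify(service)
        return path

    def session_expired(self, path):
        # The provider stopped responding: remove its entry, tell consumers.
        service = path.split("/")[2]
        self._providers.pop(path, None)
        self._notify(service)

    def lookup(self, service):
        prefix = f"/services/{service}/"
        return [json.loads(payload)
                for path, payload in sorted(self._providers.items())
                if path.startswith(prefix)]

    def subscribe(self, listener):
        self._listeners.append(listener)

    def _notify(self, service):
        for listener in self._listeners:
            listener(service, self.lookup(service))


cache = {}  # the consumer's local cache, refreshed on every notification
registry = Registry()
registry.subscribe(lambda service, providers: cache.update({service: providers}))
first = registry.register("order", "10.0.0.1", 8080, "dubbo")
registry.register("order", "10.0.0.2", 8080, "dubbo")
registry.session_expired(first)
```

    After the first provider's session expires, the consumer's cache holds only the second provider, without the consumer having polled the registry.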

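    How synchronization is guaranteed in a distributed setting:

    1. Sequential consistency: updates from a client are applied in the order in which the client sent them
    2. Atomicity: an update either succeeds or fails; there are no partial results
    3. Single system image: a client sees the same view of the service no matter which server it connects to
    4. Reliability: once an update has been applied, it persists until the next update overwrites it
    5. Timeliness: the client's view of the system is guaranteed to be up to date within a certain time bound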

    Commonly used monitoring tools

    [Figure: zkcomponents.jpg]

    Approaches to distributed transactions
    2PC/3PC exist to preserve the ACID properties of a transaction
    CAP

    Zookeeper Internals

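    1. Atomic broadcast: keeps all servers in sync

    2. Guarantees, properties and definitions

    Reliable delivery: if a message is delivered by one server, it is eventually delivered by all servers
    Total order: if message a is delivered before message b by one server, then a is delivered before b by all servers
    Causal order: if message b is sent after message a has been delivered by the sender of b, then a must be ordered before b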

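    Can a network partition leave a ZooKeeper ensemble with two leaders?

    No, this kind of split brain does not occur, because the majority rule protects against it: a leader election only takes effect with strictly more than half of the servers, and exactly half is not enough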
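
    Why a strict majority rules out two leaders can be checked with a few lines of arithmetic. The sketch below is only that check, with hypothetical function names: it splits an ensemble into partitions and counts how many of them hold strictly more than half of all servers.

```python
def has_quorum(partition_size, ensemble_size):
    """Strictly more than half; exactly half does not count."""
    return 2 * partition_size > ensemble_size


def leaders_possible(partition_sizes):
    """Number of partitions that could elect a leader on their own."""
    ensemble_size = sum(partition_sizes)
    return sum(1 for size in partition_sizes
               if has_quorum(size, ensemble_size))


print(leaders_possible([3, 2]))  # 5 servers split 3/2
print(leaders_possible([2, 2]))  # 4 servers split 2/2
```

    Two disjoint partitions cannot both contain more than half of the servers, so at most one side can ever elect a leader; with an even split neither side can.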

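    ZooKeeper leader election mechanism

    Majority rule and fast leader election: the zxid (transaction ID) indicates how recent a server's data is; the larger, the newer
    Election flow: A -> A, B -> A, B, C -> C becomes leader; a candidate wins as soon as it holds a majority (this flow presumes an ensemble large enough that A and B alone are not a majority, such as five servers)
    Comparing zxids means network interaction over sockets, so only the server with the larger server ID may open a connection to the one with the smaller ID; a connection request from a smaller ID to a larger one is rejected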
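
    The comparison rules in this section can be sketched as follows. This is a simplification rather than the actual election code: `Vote`, `make_zxid`, `pick_vote`, and `may_connect` are hypothetical names, the zxid layout (epoch in the high 32 bits, counter in the low 32 bits) is ZooKeeper's documented format, and breaking zxid ties by the larger server ID is an assumption consistent with the election flow described above.

```python
from typing import NamedTuple


class Vote(NamedTuple):
    zxid: int       # how recent the candidate's data is
    server_id: int  # tie-breaker when zxids are equal


def make_zxid(epoch, counter):
    """Pack the epoch into the high 32 bits and the counter into the low 32."""
    return (epoch << 32) | counter


def pick_vote(votes):
    """Newest data wins; tuple comparison falls back to the server ID."""
    return max(votes)


def may_connect(from_id, to_id):
    """Only the larger server ID may open the connection."""
    return from_id > to_id


votes = [Vote(make_zxid(1, 5), 1),
         Vote(make_zxid(1, 7), 2),
         Vote(make_zxid(1, 7), 3)]
winner = pick_vote(votes)
```

    Servers 2 and 3 share the newest zxid here, so the larger server ID decides and server 3 wins; a vote from a later epoch would beat both regardless of its counter.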

    Reference: https://www.cnblogs.com/leeSmall/p/9571514.html
    Reference: https://www.cnblogs.com/zlslch/p/7667636.html
    Distributed ID generation: https://www.jianshu.com/p/9d7ebe37215e
    https://www.cnblogs.com/lanqiu5ge/p/9405601.html
