Zookeeper

Author: 上山走18398 | Published: 2019-10-27 23:39

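Introduction

Before studying something new, first find out what it is called, how to set up its environment, what it is for, and how it is implemented; what problems the framework helps solve and what it improves; and how it differs from similar frameworks on the market.
3W2H analysis
5 Whys analysis

Architecture / features / design philosophy / characteristics
Problems in distributed scenarios
High concurrency, high throughput, low latency, reliability, no single point of failure (replicas)
Distributed transactions
Distributed data consistency, distributed locks, distributed queues
Zab protocol (Zookeeper Atomic Broadcast)
Paxos protocol
Leader election: fast election, majority rule, Zxid
CAP
2PC (Two-Phase Commit)
3PC (Three-Phase Commit)
Monitoring and performance metrics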

Zookeeper

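What is Zookeeper

Zookeeper is ordered.
Zookeeper is fast, especially for read-dominant workloads (a read-to-write ratio of around 10:1).

It provides a distributed coordination service for distributed applications
through a shared hierarchical namespace, which is organized similarly to a standard file system.
Zookeeper is a high-performance distributed coordination service for managing a cluster of hosts.

In a distributed system, management and coordination are very complex. (Zookeeper exists to solve this problem. How does it solve it, what architecture does it adopt, and what API does it expose?)

Once coordination and management in the distributed system are taken care of, developers can focus on application code without worrying about the problems that distribution brings. (This is the help Zookeeper provides.)

For example, Apache HBase uses Zookeeper to track the status of distributed data.

Goal: build a distributed cluster with Zookeeper.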

Distributed clusters

Cluster

Node

A distributed application has two parts: the server application and the client application.

Server applications are actually distributed and have **a common interface**

so that clients can connect to any server in the cluster and get the same result.

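Benefits of distributed applications:

1. Reliability --- the failure of a single machine does not bring down the whole system

2. Scalability --- physical machines can be added seamlessly, with only minor configuration changes and no service interruption

3. Transparency --- hides the complexity of the system, which shows itself as a single entity / application

Challenges of distributed applications:

1. Race condition --- a shared resource should be modified by only one machine at any given time

2. Deadlock --- two or more operations waiting for each other to complete indefinitely

3. Inconsistency --- partial failure of data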

Apache Zookeeper

Apache Zookeeper is a service used by a cluster (a group of nodes) to coordinate among themselves and maintain shared data with robust synchronization techniques.

In short: coordination plus maintenance of shared data.

Zookeeper provides the following services:

Naming service -- identifying the nodes in a cluster by name. It is similar to DNS, but for nodes.

The namespace consists of data registers called znodes.
Zookeeper data is kept in memory, which is what allows high throughput and low latency.
Updates are strictly ordered, so Zookeeper can be used to implement distributed locks.
Zookeeper was designed to store coordination data:
status information
configuration
location information
So each znode stores very little data, at most 1 MB.
Each znode also maintains a stat structure.

Configuration management (maintenance)
Cluster management -- joining/leaving of a node in a cluster and node status in real time
Leader election -- electing a node as leader for coordination purposes
Locking and synchronization service -- locking the data while modifying it
Highly reliable data registry -- data remains available even when one or more nodes go down

The Apache Zookeeper framework provides a complete mechanism to overcome these challenges: race conditions and deadlocks are handled with a fail-safe synchronization approach, and data inconsistency is resolved with atomicity.

Benefits of using Zookeeper

1. Simple distributed coordination process

2. Synchronization

3. Ordered Messages

4. Serialization

5. Reliability

6. Atomicity

Zookeeper architecture

Client: one of the nodes in our distributed application cluster (sends heartbeats to the server)

Server: one of the nodes in our Zookeeper ensemble (sends acknowledgements to tell the client that it is alive)

Ensemble: a group of Zookeeper servers (at least 3 are needed to form one)

Leader: elected on service startup

Follower: a server node that follows the leader's instructions

Hierarchical Namespace

A Zookeeper node is referred to as a znode.

Every znode is identified by a name, and the names along a path are separated by the path separator (/).

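Every znode in the Zookeeper data model maintains a stat structure.

A stat simply provides the metadata of a znode.

-- This part covers what a znode is, what it contains, and what each piece is for --

The metadata consists of:

1. Version number (used for version control; it increases on every change)

2. Access Control List (ACL) -- controls read and write permissions on the znode

3. Timestamp -- the time elapsed since the znode was created and last modified, in milliseconds. Zookeeper identifies every change to the znodes by a transaction ID (zxid); each zxid is unique and keeps the time of its transaction

4. Data length -- the total amount of data stored in a znode (at most 1 MB)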

Types of Znodes

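Persistent: the default type

Sequential: creating /myapp yields /myapp0000000001, and the next sequence number is 0000000002; every znode is guaranteed a different number. Used for locking and synchronization

Ephemeral: deleted automatically when the client session ends. Used for leader election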
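The three znode types can be illustrated with a toy in-memory model. This is only a sketch and not a Zookeeper client: `ZnodeStore`, `create`, and `close_session` are hypothetical names, and the counter starts at 1 merely to mirror the /myapp example above (a real ensemble derives the number from the parent znode's state).

```python
class ZnodeStore:
    """Toy in-memory stand-in for a Zookeeper namespace (not a real client)."""

    def __init__(self):
        self.nodes = {}     # path -> (data, owning session id, or None if persistent)
        self.counters = {}  # parent path -> last sequence number handed out

    def create(self, path, data=b"", ephemeral=False, sequential=False, session=None):
        if ephemeral and session is None:
            raise ValueError("an ephemeral znode needs an owning session")
        if sequential:
            parent = path.rsplit("/", 1)[0]
            seq = self.counters.get(parent, 0) + 1
            self.counters[parent] = seq
            path = f"{path}{seq:010d}"  # append a 10-digit, zero-padded number
        if path in self.nodes:
            raise KeyError(f"znode already exists: {path}")
        self.nodes[path] = (data, session if ephemeral else None)
        return path

    def close_session(self, session):
        # Ephemeral znodes disappear together with the session that created them.
        self.nodes = {p: v for p, v in self.nodes.items() if v[1] != session}
```

Calling `create("/myapp", sequential=True)` twice on a fresh store returns `/myapp0000000001` and then `/myapp0000000002`, while `close_session` removes only the ephemeral znodes owned by that session.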

Sessions

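Sessions are very important for the operation of Zookeeper. (Comparable to HTTP sessions?)

Requests within a session are executed in FIFO order.

1. Once a client connects to a server, a session is established and a session id is assigned to the client

2. To keep the session valid, the client sends heartbeats at a particular interval (if none arrives within the period configured when the service was started, the client is considered dead)

3. Session timeouts are expressed in milliseconds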
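The heartbeat rule above can be sketched as a small bookkeeping class. This is an illustration under assumptions, not Zookeeper's implementation: the `Session` class and its method names are hypothetical, and time is passed in explicitly (in milliseconds) so the behaviour is easy to follow.

```python
class Session:
    """Hypothetical server-side record of one client session."""

    def __init__(self, session_id, timeout_ms, now_ms):
        self.session_id = session_id
        self.timeout_ms = timeout_ms
        self.last_heartbeat_ms = now_ms

    def heartbeat(self, now_ms):
        # Every heartbeat pushes the expiry deadline forward.
        self.last_heartbeat_ms = now_ms

    def is_expired(self, now_ms):
        # Expired once more than timeout_ms has passed without a heartbeat.
        return now_ms - self.last_heartbeat_ms > self.timeout_ms
```

With a 3000 ms timeout and a last heartbeat at t=2000, the session is still valid at t=4000 and expired at t=5001.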

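Watches

A watch is a simple mechanism (the observer pattern) for a client to be notified about changes in the Zookeeper ensemble.

A client can set a watch on a particular znode; when that znode changes, the watch notifies the client that registered it.

When a client disconnects, it stops receiving watch notifications.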
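A minimal in-memory sketch of the watch mechanism follows. It assumes the classic one-time-trigger behaviour, where a watch fires once and must be set again to receive further notifications; `WatchedStore` and its methods are hypothetical names, not the Zookeeper API.

```python
from collections import defaultdict


class WatchedStore:
    """Toy key-value store with one-shot watches (observer pattern)."""

    def __init__(self):
        self.data = {}
        self.watches = defaultdict(list)  # path -> callbacks waiting for the next change

    def get(self, path, watch=None):
        # Reading with a watch registers the callback for the next change only.
        if watch is not None:
            self.watches[path].append(watch)
        return self.data.get(path)

    def set(self, path, value):
        self.data[path] = value
        # Remove the callbacks before firing them, so each watch triggers once.
        for callback in self.watches.pop(path, []):
            callback(path)
```

If a client reads `/cfg` with a watch and the value is then set twice, the callback runs only for the first change; the client has to read with a watch again to hear about the second.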

Zookeeper - Workflow

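1. Once a Zookeeper ensemble starts, it waits for clients to connect.

2. A client connects to one of the nodes in the ensemble.

3. Once connected, the node assigns a session id to the client and sends it an acknowledgement.

4. If the client gets no acknowledgement, it tries to connect to another node in the ensemble.

5. Once connected, the client sends heartbeats at a regular interval to make sure the connection is not lost.

To read a particular znode, the client sends a read request with the znode path to the node it is connected to, and that node returns the znode from its own database.

To store data in the Zookeeper ensemble, the client:

sends the znode path and the data to the server

The connected server forwards the request to the leader, and the leader reissues the write request to all the followers, then checks whether the write succeeded. The write succeeds only if a majority of the nodes acknowledge it.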
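The write path can be sketched as "propose to every server, commit only on a strict majority of acknowledgements". This is a deliberately simplified illustration: the real Zab protocol also involves epochs, transaction logs, and recovery, none of which appear here, and the `Server` class and `write` function are hypothetical.

```python
class Server:
    """Hypothetical ensemble member; `up` says whether it can answer."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def ack(self, path, value):
        # A server that is down never acknowledges the proposal.
        return self.up

    def commit(self, path, value):
        self.data[path] = value


def write(ensemble, path, value):
    """Apply the write only if strictly more than half of the ensemble acks it."""
    acked = [s for s in ensemble if s.ack(path, value)]
    if len(acked) <= len(ensemble) // 2:
        return False
    for server in acked:
        server.commit(path, value)
    return True
```

In a three-server ensemble the write goes through with one server down, but is rejected once two servers are down.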

Nodes in a Zookeeper Ensemble

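A single node: a single point of failure, not recommended

Two nodes: not recommended (if either node fails, no majority is left)

Three nodes: recommended; 3 is the minimum recommended number of nodes

Four nodes: tolerates no more failures than three; odd sizes such as 3, 5, 7, ... are recommended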
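The recommendation for odd ensemble sizes follows from majority arithmetic, which a few lines make concrete. The helper names `quorum_size` and `tolerated_failures` are illustrative only.

```python
def quorum_size(n):
    """Smallest number of servers that is strictly more than half of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many servers may fail while a majority is still reachable."""
    return n - quorum_size(n)


# Print ensemble size, quorum size, and tolerated failures side by side.
for n in range(1, 8):
    print(n, quorum_size(n), tolerated_failures(n))
```

An ensemble of 3 and an ensemble of 4 both tolerate exactly one failure, so the fourth server adds cost without adding fault tolerance; going from 3 to 5 raises the tolerance from one failure to two.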

Leader Election

How a leader is chosen in Zookeeper:

1. All nodes create a sequential, ephemeral znode with the same path:

**/app/leader_election/guid_**

2. The Zookeeper ensemble appends a 10-digit sequence number to the path, giving /app/leader_election/guid_0000000001 .......
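The notes stop after step 2. In the commonly described election recipe, which is assumed here rather than taken from the notes, the node that owns the smallest sequence number becomes the leader, and every other node watches the znode immediately before its own. The sketch below only sorts znode names; `sequence_of` and `elect` are hypothetical helpers, not Zookeeper calls.

```python
def sequence_of(name):
    """Extract the 10-digit sequence number appended by the ensemble."""
    return int(name[-10:])


def elect(children):
    """Return (leader, watch_map) for the znodes under the election path."""
    ordered = sorted(children, key=sequence_of)
    leader = ordered[0]
    # Each non-leader watches its predecessor, so one failure wakes up only
    # one node instead of every participant.
    watch_map = {node: prev for prev, node in zip(ordered, ordered[1:])}
    return leader, watch_map
```

Because the znodes are ephemeral, a crashed leader's znode disappears with its session; running `elect` on the remaining children then yields the next-smallest sequence number as the new leader.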

Zookeeper

Installing Zookeeper

Configuring Zookeeper

Zookeeper data storage

Zookeeper command-line interface (CLI): interacting with the Zookeeper ensemble

Once the client starts, you can perform the following operations:

1. Create znodes

2. Get data

3. Watch znode for changes

4. Set data

5. Create children of a znode

6. List children of a znode

7. Check status

8. Remove/Delete znode

8. Remove/Delete znode

Create Znodes

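Create a znode with the given path. By default, all znodes are persistent.

Ephemeral znodes (flag: -e) are deleted automatically when the session expires.

Sequential znodes (flag: -s) guarantee that the znode path is unique.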

Get Data

get /path

Watch

get /path [watch] 1

Set Data

set /path /data

Create Children /Sub-znode

create /parent/path/subnode/path /data

List Children

ls /path

Check Status

stat /path

Remove a ZNode

rmr /path

Zookeeper API

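Zookeeper as a service registry

With Zookeeper, registering a service simply means creating a znode in Zookeeper.
The znode stores the service's IP, port, and invocation method (protocol, serialization format).
It is created by the service provider so that service consumers can read the information in it,
and thereby locate the provider's actual position in the network topology and learn how to call it.
After the first lookup, the consumer caches the information locally and monitors providers going online and offline => heartbeat detection.

Detecting services going online and offline:

The Zookeeper registry performs heartbeat detection: it periodically sends requests to each service provider (in practice over a long-lived socket connection); a provider that does not respond for a long time is removed and its znode path is deleted.
Service consumers watch the corresponding path; as soon as the data changes, the consumers are notified.

How synchronization is guaranteed in a distributed setting:

Sequential consistency: updates from a client are applied in the order in which they were sent
Atomicity: an update either succeeds or fails; there are no partial results
Single system image: every server presents the same view of the service
Reliability: once an update has been applied, it persists until the next update overwrites it
Timeliness: the view of the system is up to date within a certain time bound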

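Commonly used monitoring tools

[Image: zkcomponents.jpg]

Solutions for distributed transactions
2PC/3PC exist to guarantee the ACID properties of transactions
CAP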

Zookeeper Internals

Atomic Broadcast: sync
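Guarantees, properties, and definitions

Reliable delivery: atomic broadcast; a message delivered by one server is eventually delivered by all servers
Total order: if message a is delivered before message b by one server, then every server delivers a before b
Causal order: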

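In a distributed Zookeeper deployment, can a network partition lead to two leaders?

No, this split-brain situation does not occur. The majority rule protects against it: a leader election takes effect only with strictly more than half of the votes; exactly half is not enough.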
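The majority argument can be checked mechanically: however an ensemble is split by a network partition, at most one side can hold strictly more than half of the servers. The function names below are illustrative only.

```python
def can_elect_leader(partition_size, ensemble_size):
    """A partition may elect a leader only with strictly more than half the servers."""
    return partition_size > ensemble_size / 2


def leaders_possible(partition_sizes):
    """Count how many partitions of a split ensemble could elect a leader."""
    total = sum(partition_sizes)
    return sum(can_elect_leader(size, total) for size in partition_sizes)
```

A 5-node ensemble split 3/2 can elect a leader only on the 3-node side, and a 4-node ensemble split 2/2 cannot elect a leader on either side, which is why exactly half is not enough.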

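Zookeeper leader election mechanism

Majority rule with fast election: the zxid (transaction ID) indicates how fresh a server's data is; the larger the zxid, the newer the data.
Election flow: A -> A B -> A B C -> C becomes leader; whoever gets more than half of the votes wins.
Servers have to talk over the network to find out whose zxid is larger, which requires socket connections. To avoid duplicate connections between a pair of servers, only the server with the larger server ID may connect to the one with the smaller ID; a connection attempt from a smaller ID to a larger ID is rejected.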
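The "larger zxid wins" rule can be sketched as a comparison of (zxid, server ID) pairs. Using the larger server ID to break a zxid tie is an assumption added here, and Zookeeper's fast leader election also compares an election epoch, which this sketch leaves out. `pick_leader` is a hypothetical helper.

```python
def pick_leader(zxids):
    """zxids maps server id -> last zxid seen by that server.

    Prefer the largest zxid (the freshest data); break ties with the larger server id.
    """
    return max(zxids, key=lambda server_id: (zxids[server_id], server_id))
```

For `{1: 0x100, 2: 0x105, 3: 0x105}` servers 2 and 3 tie on zxid, so server 3 is chosen; if server 1 held `0x200` instead, it would be chosen despite having the smallest ID.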

Reference: https://www.cnblogs.com/leeSmall/p/9571514.html
Reference: https://www.cnblogs.com/zlslch/p/7667636.html
Distributed ID generation: https://www.jianshu.com/p/9d7ebe37215e
https://www.cnblogs.com/lanqiu5ge/p/9405601.html

Article link: https://www.haomeiwen.com/subject/zdvvtctx.html