美文网首页
[Zookeeper] 服务端之单机版服务器启动

[Zookeeper] 服务端之单机版服务器启动

作者: LZhan | 来源:发表于2020-02-20 15:37 被阅读0次

    1 服务器端整体概览图

    概览图
    • ServerCnxnFactory:负责与client之间的网络交互,支持NIO(默认)以及Netty
    • SessionTrackerImpl:会话管理器
    • DatadirCleanupManager:定期清理存在磁盘上的log文件和snapshot文件
    • PreRequestProcessor,SyncRequestProcessor,FinalRequestProcessor:请求处理流程,责任链模式
    • LearnerHandler:Leader与Learner之间的交互
    • FileTxnSnapLog:存储在磁盘上的日志文件
    • DataTree:体现在内存中的存储结构
    • Sessions:Session的相关信息存储

    2 单机版服务器启动流程

    1. 执行QuorumPeerMain的main方法,其中先创建一个QuorumPeerMain对象
    2. 调用initializeAndRun方法
        protected void initializeAndRun(String[] args)
            throws ConfigException, IOException
        {
            QuorumPeerConfig config = new QuorumPeerConfig();
            if (args.length == 1) {
                config.parse(args[0]);
            }
    
            // Start and schedule the the purge task
            DatadirCleanupManager purgeMgr = new DatadirCleanupManager(config
                    .getDataDir(), config.getDataLogDir(), config
                    .getSnapRetainCount(), config.getPurgeInterval());
            purgeMgr.start();
    
            if (args.length == 1 && config.servers.size() > 0) {
                runFromConfig(config);
            } else {
                LOG.warn("Either no config or no quorum defined in config, running "
                        + " in standalone mode");
                // there is only server in the quorum -- run as standalone
                ZooKeeperServerMain.main(args);
            }
        }
    

    2.1

    • args实际上就是zoo.cfg中的配置,如下

    2.2

    • 创建DatadirCleanupManager实例,参数有snapDir,dataLogDir,snapRetainCount(要保存snapshot文件的个数),purgeInterval(定期清理的频率,单位为小时),snapRetainCountpurgeInterval在zoo.cfg中均可以配置
    • 调用DatadirCleanupManagerstart方法,里面主要依赖PurgeTask,这也是一个线程,其run方法

    PurgeTxnLogpurge方法:

        public static void purge(File dataDir, File snapDir, int num) throws IOException {
            // snapshot文件保存的数量小于3,抛异常
            if (num < 3) {
                throw new IllegalArgumentException(COUNT_ERR_MSG);
            }
            // 根据dataDir和snapDir创建FileTxnSnapLog
            FileTxnSnapLog txnLog = new FileTxnSnapLog(dataDir, snapDir);
            // 根据给定数量获取最近的文件
            List<File> snaps = txnLog.findNRecentSnapshots(num);
            // 获取数目
            int numSnaps = snaps.size();
            if (numSnaps > 0) {
                // 第二个参数是最近的snapshot文件
                purgeOlderSnapshots(txnLog, snaps.get(numSnaps - 1));
            }
        }
    

    找最近n个snapshot文件:

        public List<File> findNRecentSnapshots(int n) throws IOException {
            List<File> files = Util.sortDataDir(snapDir.listFiles(), SNAPSHOT_FILE_PREFIX, false);
            int count = 0;
            List<File> list = new ArrayList<File>();
            for (File f: files) {
                if (count == n)
                    break;
                if (Util.getZxidFromName(f.getName(), SNAPSHOT_FILE_PREFIX) != -1) {
                    count++;
                    list.add(f);
                }
            }
            return list;
        }
    

    purgeOlderSnapshots:

        static void purgeOlderSnapshots(FileTxnSnapLog txnLog, File snapShot) {
            // 从snapshot文件名中获取zxid
            final long leastZxidToBeRetain = Util.getZxidFromName(
                    snapShot.getName(), PREFIX_SNAPSHOT);
    
            /**
             * 我们删除名称中带有zxid且小于leastZxidToBeRetain的所有文件。
             * 该规则适用于快照文件和日志文件,
             * zxid小于X的日志文件可能包含zxid大于X的事务。
             * 更准确的说,命名为log.(X-a)的log文件可能包含比snapshot.X文件更新的事务,
             * 如果在该间隔中没有其他以zxid开头即(X-a,X]的日志文件
     
             */
            final Set<File> retainedTxnLogs = new HashSet<File>();
            //获取快照日志,其中可能包含比给定zxid更新的事务,这些日志需要保留下来
            retainedTxnLogs.addAll(Arrays.asList(txnLog.getSnapshotLogs(leastZxidToBeRetain)));
    
            /**
             * Finds all candidates for deletion, which are files with a zxid in their name that is less
             * than leastZxidToBeRetain.  There's an exception to this rule, as noted above.
             */
            class MyFileFilter implements FileFilter{
                private final String prefix;
                MyFileFilter(String prefix){
                    this.prefix=prefix;
                }
                public boolean accept(File f){
                    if(!f.getName().startsWith(prefix + "."))
                        return false;
                    if (retainedTxnLogs.contains(f)) {
                        return false;
                    }
                    long fZxid = Util.getZxidFromName(f.getName(), prefix);
                    if (fZxid >= leastZxidToBeRetain) {
                        return false;
                    }
                    return true;
                }
            }
            // add all non-excluded log files
            List<File> files = new ArrayList<File>();
            File[] fileArray = txnLog.getDataDir().listFiles(new MyFileFilter(PREFIX_LOG));
            if (fileArray != null) {
                files.addAll(Arrays.asList(fileArray));
            }
    
            // add all non-excluded snapshot files to the deletion list
            fileArray = txnLog.getSnapDir().listFiles(new MyFileFilter(PREFIX_SNAPSHOT));
            if (fileArray != null) {
                files.addAll(Arrays.asList(fileArray));
            }
    
            // remove the old files
            for(File f: files)
            {
                final String msg = "Removing file: "+
                    DateFormat.getDateTimeInstance().format(f.lastModified())+
                    "\t"+f.getPath();
                LOG.info(msg);
                System.out.println(msg);
                if(!f.delete()){
                    System.err.println("Failed to remove "+f.getPath());
                }
            }
    
        }
    

    先获取到那些需要保留的文件,之后再去删除这些不在保留文件之内的文件。

    2.3
    判断集群是单机启动还是集群启动,集群走runFromConfig(config),单机走ZooKeeperServerMain.main(args)(其实单机版最终走的是ZooKeeperServerMainrunFromConfig,)

    runFromConfig方法:

        public void runFromConfig(ServerConfig config) throws IOException {
            LOG.info("Starting server");
            FileTxnSnapLog txnLog = null;
            try {
                final ZooKeeperServer zkServer = new ZooKeeperServer();
                // Registers shutdown handler which will be used to know the
                // server error or shutdown state changes.
                final CountDownLatch shutdownLatch = new CountDownLatch(1);
                zkServer.registerServerShutdownHandler(
                        new ZooKeeperServerShutdownHandler(shutdownLatch));
                // 构建FileTxnSnapLog对象
                txnLog = new FileTxnSnapLog(new File(config.dataLogDir), new File(
                        config.dataDir));
                zkServer.setTxnLogFactory(txnLog);
                zkServer.setTickTime(config.tickTime);
                zkServer.setMinSessionTimeout(config.minSessionTimeout);
                zkServer.setMaxSessionTimeout(config.maxSessionTimeout);
                // 构建与client之间的网络通信服务组件
                // 这里可以通过zookeeper.serverCnxnFactory配置NIO还是Netty
                cnxnFactory = ServerCnxnFactory.createFactory();
                cnxnFactory.configure(config.getClientPortAddress(),
                        config.getMaxClientCnxns());
                cnxnFactory.startup(zkServer);
                // Watch status of ZooKeeper server. It will do a graceful shutdown
                // if the server is not running or hits an internal error.
                shutdownLatch.await();
                shutdown();
    
                cnxnFactory.join();
                if (zkServer.canShutdown()) {
                    zkServer.shutdown(true);
                }
            } catch (InterruptedException e) {
                // warn, but generally this is ok
                LOG.warn("Server interrupted", e);
            } finally {
                if (txnLog != null) {
                    txnLog.close();
                }
            }
        }
    

    初始化网络服务组件后,cnxnFactory.startup(zkServer);
    这里以默认网络服务组件为例NIOServerCnxnFactory

        @Override
        public void start() {
            // ensure thread is started once and only once
            if (thread.getState() == Thread.State.NEW) {
                thread.start();
            }
        }
    
        @Override
        public void startup(ZooKeeperServer zks) throws IOException,
                InterruptedException {
            //调用上面的start方法,实际调用thread的start
            //也就调用了该类的run方法,启动网络服务
            start();
            //这是ZooKeeperServer
            setZooKeeperServer(zks);
            zks.startdata();
            zks.startup();
        }
    

    2.4
    zks.startdata()方法:

        public void startdata() 
        throws IOException, InterruptedException {
            //check to see if zkDb is not null
            if (zkDb == null) {
                zkDb = new ZKDatabase(this.txnLogFactory);
            }  
            if (!zkDb.isInitialized()) {
                //loadData进行初始化
                loadData();
            }
        }
    

    ZKDatabase在内存中维护了zookeeper的sessions, datatree和commit logs集合。 当zookeeper server启动的时候会将txnlogs和snapshots从磁盘读取到内存中

    ZKDatabase的loadData()方法:

            //如果zkDb已经初始化,设置zxid(内存当中DataTree最新的zxid)
            if(zkDb.isInitialized()){
                setZxid(zkDb.getDataTreeLastProcessedZxid());
            }
            else {
                // 没有初始化,就loadDataBase
                setZxid(zkDb.loadDataBase());
            }
    
          public long loadDataBase() throws IOException {
              long zxid = snapLog.restore(dataTree, sessionsWithTimeouts, commitProposalPlaybackListener);
              initialized = true;
              return zxid;
          }
    

    loadDataBase()内部调用的是FileTxnSnapLogrestore方法

    2.5
    zks.startup()方法:

        public synchronized void startup() {
            if (sessionTracker == null) {
                // 创建会话管理器
                createSessionTracker();
            }
            // 启动会话管理器
            startSessionTracker();
            // 设置请求处理器
            setupRequestProcessors();
            // 注册jmx
            registerJMX();
    
            setState(State.RUNNING);
            notifyAll();
        }
    

    相关文章

      网友评论

          本文标题:[Zookeeper] 服务端之单机版服务器启动

          本文链接:https://www.haomeiwen.com/subject/asmtqhtx.html