美文网首页
9. 二叉搜索树

9. 二叉搜索树

作者: 邢昱 | 来源:发表于2018-08-18 18:28 被阅读0次

    Tree insert: searchs for that element A[i]
    Tree walk:

    Theorem:

    E[height of a randomly built binary search tree] = O(lg n)

    Proof outline

    1. Prove Jensen's inequality:
      f(E[X]) \leq E[f(X)] for convex function f
    2. Instead of analyzing X_n = the random variable of the height of a randomly constructed BST on n nodes, we analize Y_n = 2^{X_n}
    3. Prove that E[Y_n] = O(n^3)
    4. Conclude that
      2^{E[X_n]} \leq E[2^{X_n}] = E[Y_n] = O(n^3)
      ⇒ E[X_n] \leq lg(O(n^3)) = 3lg(n) + O(1)

    E[X_n] \approx 2.9882lg\,n (by Luke Devroy, 1986)

    Proof 1.Jensen's inequality

    f: \mathbb{R} \rightarrow \mathbb{R} is convex if for all x,y \in \mathbb{R} and all \alpha, \beta \geq 0 with \alpha + \beta = 1, f(\alpha x + \beta y) \leq \alpha f(x) + \beta f(y)

    vector and convex function
    Lemma

    if \mathbb{R} \rightarrow \mathbb{R} is convex & x_1, x_2, ... , x_n \in \mathbb{R} & \alpha_1, \alpha_2, ... , \alpha_n \geq 0 & \sum^n_{k=1}{x_k} = 1,
    then f \left( \sum_{k=1}^n{\alpha_kx_k} \right) \leq \sum_{k=1}^n{\alpha_kf(x_k)}

    灰色区域某一点意味着不等号右边
    Induction
    Base Case: When n = 1, \alpha_1 = 1 , 不等式 f \left( \sum_{k=1}^n{\alpha_kx_k} \right) \leq \sum_{k=1}^n{\alpha_kf(x_k)} 成立。
    Induction Step: f \left( \sum_{k=1}^n{\alpha_kx_k} \right)
    = f \left( \alpha_nx_n +(1 - \alpha_n) \sum^{n-1}_{k=1}{\frac{\alpha_k}{1-\alpha_n}x_k} \right)
    \leq \alpha_nf(x_n) + (1 - \alpha_n)f\left( \sum^{n-1}_{k=1}{\frac{\alpha_k}{1-\alpha_n}x_k} \right)
    = \alpha_nf(x_n) + f \left( \sum^{n-1}_{k=1}{\alpha_kx_k} \right) = \sum_{k=1}^n{\alpha_kf(x_k)}
    Jensen's inequality

    f(E[X]) \leq E[f(X)] (f is a convex and X is a integer random variable)
    proof:
    f(E[X]) = f \left(\sum^{∞}_{-∞}{x * Pr\{X = x\}} \right)
    \leq \sum^{∞}_{-∞}{Pro\{X = x\} * f(x)} = E[f(X)]

    Proof 2. Analysis Yn

    Xn = the random variable of the height of a randomly constructed BST on n nodes. Yn = 2 Xn

    If root r has rank k, then Xn = 1 + max{Xk-1, Xn-k}, Yn = 2 * max{Yk-1, Yn-k}

    define indicator r.v.'s
    Z_{nk}=\begin{cases} 1, rank(root) = k\\ 0, otherwise \end{cases}
    Pr{Znk = 1} = E[Znk] = 1/n
    Y_n = \sum^n_{k = 1}{Z_{nk}(2max\{Y_{k-1}, Y_{n-k}\})}

    Proof 3. Analysis E[Yn]

    E[Y_n] = E \left[ \sum^n_{k = 1}{Z_{nk}(2max\{Y_{k-1}, Y_{n-k}\})} \right]
    = \sum^n_{k = 1}{E[Z_{nk}(2max\{Y_{k-1}, Y_{n-k}\})]}
    = 2\sum^n_{k=1}{E[Z_{nk}]E[max\{Y_{k-1}, Y_{n-k}\}]}
    = \frac{2}{n} \sum^n_{k=1}{E[max\{Y_{k-1}, Y_{n-k}\}]}
    \leq \frac{2}{n} \sum^n_{k=1}{E[Y_{k-1} + Y_{n-k}]}
    = \frac{4}{n} \sum^{n-1}_{k=0}{E[Y_k]}
    Then, we are going to use substitution.
    Claim: E[X_n] \leq cn^3
    Proof:
    The base case must be true when c is sufficiently large.
    Induction Step: E[Y_n] \leq \frac{4}{n} \sum^{n-1}_{k=0}{E[Y_k]} \leq \frac{4}{n} \sum^{n-1}_{k=0}{ck^3} \leq \frac{4c}{n} \int_0^n {x^3}\,{\rm d}x = cn^3

    Proof 4.

    2^{E[X_n]} \leq E[2^{X_n}] = E[Y_n] = O(n^3) (简单地使用了Jensen's inequality)
    ⇒ E[X_n] \leq lg(O(n^3)) = 3lg(n) + O(1)( lg(O(n3)) = lg(cn3) = 3lg(n) + lg(c) )

    相关文章

      网友评论

          本文标题:9. 二叉搜索树

          本文链接:https://www.haomeiwen.com/subject/kilobftx.html