美文网首页
COMP9311 Database Systems WEEK10

COMP9311 Database Systems WEEK10

作者: Sisyphus235 | 来源:发表于2017-10-04 17:46 被阅读0次

    1.补充视频

    1.1 Computing minimal cover

    Steps to produce Functions:
    1.ReduceRight: ensure all functional dependencies have one attribute on right hand side;
    2.ReduceLeft: remove redundant attributes on left hand side;
    3.RemoveRedundant: eliminate redundant functional dependencies

    例题:
    R = {ABCDEG}
    F = {A—>BCD, B—>CDE, AC—>E}

    After step 1:
    A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, AC—>E

    After step 2:
    consider AC—>E, if remove A, it will be C—>E, while in step 1, C cannot produce anything; if remove C, A—>E, based on step 1, A—>B, B—>E, so A—>E, it works.
    A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, A—>E

    After step 3:
    consider A—>B, if we delete it, based on the rest, A—>C, A—>D, A—>E, but C/D/E cannot produce B, so it cannot be removed.
    consider A—>C, if we delete it, based on the rest, A—>B, B—>C, so it can be removed.
    A—>B, A—>D, B—>C, B—>D, B—>E, A—>E
    consider A—>D, if we delete it, based on the rest, A—>B, B—>D, so it can be removed.
    A—>B, B—>C, B—>D, B—>E, A—>E
    consider B—>C, it cannot be removed.
    consider B—>D, it cannot be removed.
    consider B—>E, it cannot be removed.
    consider A—>E, if we delete it, based on the rest, A—>B, B—>E, so it can be removed.
    A—>B, B—>C, B—>D, B—>E

    This is the minimal cover, and the primary key should be AB

    1.2 BCNF normalisation

    Inputs: a schema R, a set Functional Dependencies(FDs)
    Output: a set Res of BCNF schemas

    BCNF requirement:
    for all FDs X —> Y
    either
    X —> Y is trivial
    or
    X is a (super)key

    method:
    Res = {R}
    while (any schema S in Res is not in BCNF) {
    choose an fd X —> Y on S that violates BCNF
    Res = (Res - S) U (S - Y) U XY
    }
    Res - S含义是出去S关系的剩余schema,S - Y 同理

    例题:
    R = ABCDEFGH
    FDs = ABH—>C, A—>DE, BGH—>F, F—>ADH, BH—>GE

    解法:
    (1)找key
    关系中F—>ADH是影响最广的,而推出F的是BGH,在BGH三者间,BH—>GE,所以BH更重要,而BH二者间没有推出关系,所以尝试从BH作为key推导。
    BH—>GE(BEGH), BGH—>F(BEFGH), F—>ADH(ABDEFGH), ABH—>C(ABCDEFGH),推导出完整的R,所以key是BH。
    (2)根据FDs寻找违背BCNF的关系
    A—>DE最明显,因为A不是key,BCNF不允许A和其他element建立关系。根据上述方法
    Res = {ABCDEFGH}

    Fix A—>DE
    XY = ADE, S - Y = ABCFGH

    Res = {ADE, ABCFGH}
    ADE commits with BCNF
    (3)根据FDs寻找违背BCNF的关系
    R1 = ABCFGH
    FDs = ABH—>C, BGH—>F, F—>AH, BH—>G(因为DE被放置在上一步的符合关系中,所有右侧涉及DE的都删除,剩余关系成立)
    再次选择key,尝试继续从BH推导:
    BH—>G(BGH), BGH—>F(BFGH), F—>AH(ABFGH), ABH—>C(ABCFGH),成立
    Key = BH

    这次发现F—>AH违背BCNF关系
    Fix F—>AH
    XY = FAH, S - Y = BCFG

    Res = {ADE, FAH, BCFG}
    ADE, FAH commit with BCNF
    对于剩下的BCFG而言,剩下的所有关系都不成立,ABH—>C中H不在,BGH—>F中H不在,F—>AH中AH都不在,BH—>G中H不在
    所以BCFG commits with BCNF

    最后结果是{ADE, FAH, BCFG}

    1.3 3NF normalisation

    Inputs: a schema R, a set F of Functional Dependencies(FDs)
    Output: a set Res of 3NF schemas

    3NF requirement:
    for all FDs X—>Y
    either
    X—>Y is trivial
    or
    X is a (super)key
    or
    Y is a single attribute and Y is in the key

    methods:
    F’ = condensed minimal cover of F
    Res = {}
    for each fd X—>Y in F’ {
    if (no schema S in Res contains XY)
    Res = Res U XY
    }
    if (no schema S in Res contains key(R))
    Res = Res U key(R)

    例题:
    R = ABCDEFGH
    F = ABH—>C, A—>D, C—>E, BGH—>F, F—>AD, E—>F, BH—>E
    (1)minimal cover:
    ABH—>C, A—>D, C—>E, BGH—>F, F—>A, F—>D, E—>F, BH—>E
    BH—>C, A—>D, C—>E, BH—>F, F—>A, F—>D, E—>F, BH—>E
    BH—>C, BH—>E, F—>A, F—>D, E—>F
    (2)key:
    key = BHG
    (3)F’:
    F’ = BH—>CE, E—>F, F—>AD

    BHCE(key = BH), EF(key = E), FAD(key = F), BHG(key = BHG)

    2. Week9 Continue

    继续上周的normal forms,重点是BCNF和3NF。简述如下:

    2.1 BCNF

    对于所有的FDs X-->Y
    (1)要么X是superkey
    (2)要么Y是X的子集
    做BCNF decomposition的时候,从全集R开始,对于任意一个不符合BCNF的FD关系,拆解出来,更新FDs,直到全部符合BCNF。

    2.2 3NF

    对于所有的FDs X-->Y
    (1)要么X是superkey
    (2)要么Y是X的子集
    (3)或者Y是candidate key的一个attribute
    做3NF decomposition的时候,从空集开始。先把所有的FDs做minimal cover,过程是先拆解右侧FDs,再拆解左侧FDs,最后检查有没有任何FD是冗余的。将所有minimal cover组合成table,如果其中未包括candidate key,要加入candidate key形成一个table。

    3. DBMS Architecture

    DBMS应用中常见的technique有两类:query processing(QP)用来提升查询效率,transaction processsing(TxP)用来保障数据库安全。

    在数据库效率考量中:
    join比子查询效率高,如果subquery中包括外层查询的信息,那么效率更低;
    要尽可能先筛选filter后输出返回;
    减少在where和group by语句中的函数使用;
    如果已有的数据很少做更新,那么可以建立indexes提升查询效率,代价是每次update都要同时update真实关系和index。

    例如:

    select name from Employee
    where  salary > 50000 and
           empid in (select empid from WorksIn
                     where  dept = 'Sales')
    

    使用了subquery效率低,可以更新为:

    SalesEmps = (select empid from WorksIn where dept='Sales')
    foreach e in Employee {
        if (e.empid in SalesEmps && e.salary > 50000)
            add e to result set
    }
    

    或者写成:

    select name
    from   Employee join WorksIn using (empid)
    where  Employee.salary > 5000 and
           WorksIn.dept = 'Sales'
    

    还可以更新为:

    SalesEmps = (select * from WorksIn where dept='Sales')
    foreach e in (Employee join SalesEmps) {
        if (e.salary > 50000)
            add e to result set
    }
    

    DBMS的构成如下:
    --File manager: manages allocation of disk space and data structures
    --Buffer manager: manages data transfer between disk and main memory
    --Query optimiser: translates queries into efficient sequence of relational operations
    --Recovery manager: ensures consistent database state after system failures
    --Concurrency manager: controls concurrent access to database
    --Integrity manager: verifies integrity constraints and user privileges

    DBMS Architecture

    4. Relational Algebra

    Relational Algebra (RA)通常包括:
    --operands: relations, or variables representing relations
    --operators that map relations to relations
    --rules for combining operands/operators into expressions
    --rules for evaluating such expressions

    4.1 Selection

    可以理解为筛选tuples,减少rows,课上使用的表达法是Sel[expr](Rel),标准表达法是σexpr(Rel)。
    例如:SelC,从关系r(R)中选择一个条件为C的子集tuples。

    4.2 Projection

    可以理解为筛选attributes,减少columns,课上使用的表达法是Proj[A, B, C](Rel),标准表达法是ΠA,B,C(Rel)。
    例如:Proj[X](r),从关系r(R)中选择attributes X。

    4.3 Product

    课上使用的表达法是Rel1 Join[expr] Rel2。
    例如:r ✖️ s,要注意r和s可能有attribute命名相同,为了避免名字的冲突,要对一些attribute重命名。

    4.4 Join

    (1)Natural Join: r Join s,选取r和s中attribute一致的所有tuples。
    r Join s = Proj[R U S](Sel[match](r ✖️ s))
    (2)Theta Join: r JoinC s,在条件C下join,如果条件C是True,则与笛卡尔积一致(r ✖️ s)

    4.5 Union

    r1 U r2,与集合关系一致,但是和sql不同,RA不允许重复element。

    4.6 Intersection

    r1 ∩ r2,与集合关系一致,但是和sql不同,RA不允许重复element。

    4.7 Difference

    r1 - r2,代表只属于r1,不属于r2的所有elements。

    4.8 Division

    r/s,显示的column是r - s的attributes,显示的tuples是所有在r中包含全部s的tuples value的value。

    RA Division

    4.9 Rename

    课上使用的表达法是Rename[schema](Rel)。
    例如:如果expression E returns a relation R(A1, A2...An),那么Rename[S(B1, B2...Bn)](E)。tuples还是E,但是Ai改名为Bi

    相关文章

      网友评论

          本文标题:COMP9311 Database Systems WEEK10

          本文链接:https://www.haomeiwen.com/subject/psozextx.html