COMP9311 Database Systems WEEK10

作者: Sisyphus235 | 来源:发表于2017-10-04 17:46 被阅读0次

COMP9311 Database Systems 传送门
COMP9311 Database Systems WEEK10
COMP9311 Database Systems WEEK4
COMP9311 Database Systems WEEK2
COMP9311 Database Systems WEEK3
COMP9311 Database Systems WEEK1
COMP9311 Database Systems Lab1
COMP9311 Database Systems WEEK6
COMP9311 Database Systems WEEK12
COMP9311 Database Systems Lab6

1.补充视频

1.1 Computing minimal cover

Steps to produce Functions:
1.ReduceRight: ensure all functional dependencies have one attribute on right hand side;
2.ReduceLeft: remove redundant attributes on left hand side;
3.RemoveRedundant: eliminate redundant functional dependencies

例题：
R = {ABCDEG}
F = {A—>BCD, B—>CDE, AC—>E}

After step 1:
A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, AC—>E

After step 2:
consider AC—>E, if remove A, it will be C—>E, while in step 1, C cannot produce anything; if remove C, A—>E, based on step 1, A—>B, B—>E, so A—>E, it works.
A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, A—>E

After step 3:
consider A—>B, if we delete it, based on the rest, A—>C, A—>D, A—>E, but C/D/E cannot produce B, so it cannot be removed.
consider A—>C, if we delete it, based on the rest, A—>B, B—>C, so it can be removed.
A—>B, A—>D, B—>C, B—>D, B—>E, A—>E
consider A—>D, if we delete it, based on the rest, A—>B, B—>D, so it can be removed.
A—>B, B—>C, B—>D, B—>E, A—>E
consider B—>C, it cannot be removed.
consider B—>D, it cannot be removed.
consider B—>E, it cannot be removed.
consider A—>E, if we delete it, based on the rest, A—>B, B—>E, so it can be removed.
A—>B, B—>C, B—>D, B—>E

This is the minimal cover, and the primary key should be AB

1.2 BCNF normalisation

Inputs: a schema R, a set Functional Dependencies(FDs)
Output: a set Res of BCNF schemas

BCNF requirement:
for all FDs X —> Y
either
X —> Y is trivial
or
X is a (super)key

method:
Res = {R}
while (any schema S in Res is not in BCNF) {
choose an fd X —> Y on S that violates BCNF
Res = (Res - S) U (S - Y) U XY
}
Res - S含义是出去S关系的剩余schema，S - Y 同理

例题：
R = ABCDEFGH
FDs = ABH—>C, A—>DE, BGH—>F, F—>ADH, BH—>GE

解法：
（1）找key
关系中F—>ADH是影响最广的，而推出F的是BGH，在BGH三者间，BH—>GE，所以BH更重要，而BH二者间没有推出关系，所以尝试从BH作为key推导。
BH—>GE(BEGH), BGH—>F(BEFGH), F—>ADH(ABDEFGH), ABH—>C(ABCDEFGH)，推导出完整的R，所以key是BH。
（2）根据FDs寻找违背BCNF的关系
A—>DE最明显，因为A不是key，BCNF不允许A和其他element建立关系。根据上述方法
Res = {ABCDEFGH}

Fix A—>DE
XY = ADE, S - Y = ABCFGH

Res = {ADE, ABCFGH}
ADE commits with BCNF
（3）根据FDs寻找违背BCNF的关系
R1 = ABCFGH
FDs = ABH—>C, BGH—>F, F—>AH, BH—>G（因为DE被放置在上一步的符合关系中，所有右侧涉及DE的都删除，剩余关系成立）
再次选择key，尝试继续从BH推导：
BH—>G(BGH), BGH—>F(BFGH), F—>AH(ABFGH), ABH—>C(ABCFGH)，成立
Key = BH

这次发现F—>AH违背BCNF关系
Fix F—>AH
XY = FAH, S - Y = BCFG

Res = {ADE, FAH, BCFG}
ADE, FAH commit with BCNF
对于剩下的BCFG而言，剩下的所有关系都不成立，ABH—>C中H不在，BGH—>F中H不在，F—>AH中AH都不在，BH—>G中H不在
所以BCFG commits with BCNF

最后结果是{ADE, FAH, BCFG}

1.3 3NF normalisation

Inputs: a schema R, a set F of Functional Dependencies(FDs)
Output: a set Res of 3NF schemas

3NF requirement:
for all FDs X—>Y
either
X—>Y is trivial
or
X is a (super)key
or
Y is a single attribute and Y is in the key

methods:
F’ = condensed minimal cover of F
Res = {}
for each fd X—>Y in F’ {
if (no schema S in Res contains XY)
Res = Res U XY
}
if (no schema S in Res contains key(R))
Res = Res U key(R)

例题：
R = ABCDEFGH
F = ABH—>C, A—>D, C—>E, BGH—>F, F—>AD, E—>F, BH—>E
(1)minimal cover:
ABH—>C, A—>D, C—>E, BGH—>F, F—>A, F—>D, E—>F, BH—>E
BH—>C, A—>D, C—>E, BH—>F, F—>A, F—>D, E—>F, BH—>E
BH—>C, BH—>E, F—>A, F—>D, E—>F
(2)key:
key = BHG
(3)F’:
F’ = BH—>CE, E—>F, F—>AD

BHCE(key = BH), EF(key = E), FAD(key = F), BHG(key = BHG)

2. Week9 Continue

继续上周的normal forms，重点是BCNF和3NF。简述如下：

2.1 BCNF

对于所有的FDs X-->Y
（1）要么X是superkey
（2）要么Y是X的子集
做BCNF decomposition的时候，从全集R开始，对于任意一个不符合BCNF的FD关系，拆解出来，更新FDs，直到全部符合BCNF。

2.2 3NF

对于所有的FDs X-->Y
（1）要么X是superkey
（2）要么Y是X的子集
（3）或者Y是candidate key的一个attribute
做3NF decomposition的时候，从空集开始。先把所有的FDs做minimal cover，过程是先拆解右侧FDs，再拆解左侧FDs，最后检查有没有任何FD是冗余的。将所有minimal cover组合成table，如果其中未包括candidate key，要加入candidate key形成一个table。

3. DBMS Architecture

DBMS应用中常见的technique有两类：query processing（QP）用来提升查询效率，transaction processsing（TxP）用来保障数据库安全。

在数据库效率考量中：
join比子查询效率高，如果subquery中包括外层查询的信息，那么效率更低；
要尽可能先筛选filter后输出返回；
减少在where和group by语句中的函数使用；
如果已有的数据很少做更新，那么可以建立indexes提升查询效率，代价是每次update都要同时update真实关系和index。

例如：

select name from Employee
where  salary > 50000 and
       empid in (select empid from WorksIn
                 where  dept = 'Sales')

使用了subquery效率低，可以更新为：

SalesEmps = (select empid from WorksIn where dept='Sales')
foreach e in Employee {
    if (e.empid in SalesEmps && e.salary > 50000)
        add e to result set
}

或者写成：

select name
from   Employee join WorksIn using (empid)
where  Employee.salary > 5000 and
       WorksIn.dept = 'Sales'

还可以更新为：

SalesEmps = (select * from WorksIn where dept='Sales')
foreach e in (Employee join SalesEmps) {
    if (e.salary > 50000)
        add e to result set
}

DBMS的构成如下：
--File manager: manages allocation of disk space and data structures
--Buffer manager: manages data transfer between disk and main memory
--Query optimiser: translates queries into efficient sequence of relational operations
--Recovery manager: ensures consistent database state after system failures
--Concurrency manager: controls concurrent access to database
--Integrity manager: verifies integrity constraints and user privileges

DBMS Architecture

4. Relational Algebra

Relational Algebra (RA)通常包括：
--operands: relations, or variables representing relations
--operators that map relations to relations
--rules for combining operands/operators into expressions
--rules for evaluating such expressions

4.1 Selection

可以理解为筛选tuples，减少rows，课上使用的表达法是Sel[expr](Rel)，标准表达法是σexpr(Rel)。
例如：SelC，从关系r(R)中选择一个条件为C的子集tuples。

4.2 Projection

可以理解为筛选attributes，减少columns，课上使用的表达法是Proj[A, B, C](Rel)，标准表达法是ΠA,B,C(Rel)。
例如：Proj[X](r)，从关系r(R)中选择attributes X。

4.3 Product

课上使用的表达法是Rel1 Join[expr] Rel2。
例如：r ✖️ s，要注意r和s可能有attribute命名相同，为了避免名字的冲突，要对一些attribute重命名。

4.4 Join

（1）Natural Join: r Join s，选取r和s中attribute一致的所有tuples。
r Join s = Proj[R U S](Sel[match](r ✖️ s))
（2）Theta Join: r JoinC s，在条件C下join，如果条件C是True，则与笛卡尔积一致(r ✖️ s)

4.5 Union

r1 U r2，与集合关系一致，但是和sql不同，RA不允许重复element。

4.6 Intersection

r1 ∩ r2，与集合关系一致，但是和sql不同，RA不允许重复element。

4.7 Difference

r1 - r2，代表只属于r1，不属于r2的所有elements。

4.8 Division

r/s，显示的column是r - s的attributes，显示的tuples是所有在r中包含全部s的tuples value的value。

RA Division

4.9 Rename

课上使用的表达法是Rename[schema](Rel)。
例如：如果expression E returns a relation R(A1, A2...An)，那么Rename[S(B1, B2...Bn)](E)。tuples还是E，但是Ai改名为Bi

COMP9311 Database Systems 传送门
COMP9311 Database Systems WEEK1：什么是database，database中的重要概...
COMP9311 Database Systems WEEK10
1.补充视频 1.1 Computing minimal cover Steps to produce Funct...
COMP9311 Database Systems WEEK4
SQL和Regular Expression专题 1. SQL 1.1简介 SQL包括多种sub-language...
COMP9311 Database Systems WEEK2
1.Assignment 1 分为两个部分：先做ER图，建立conceptual model，这部分不算分，Wee...
COMP9311 Database Systems WEEK3
1. ER-to-relational Mapping Review ER的entity对应relational的...
COMP9311 Database Systems WEEK1
1.课程简介（1）教师是Rachid Hamadi，名字发音是Ra’shid。（2）课程主要是理论，少量编程，使...
COMP9311 Database Systems Lab1
1. 主要内容配置PostgreSQL环境 2. 配置步骤 2.1 远程连接 TaggiMac在terminal...
COMP9311 Database Systems WEEK6
1. Additional video from last week 2. Extending SQL 通常可以创...
COMP9311 Database Systems WEEK12
1.Future of Database 希望能够进一步解决大数据、复杂语言逻辑情形的数据处理问题，提升查询的速度...
COMP9311 Database Systems Lab6
练习使用plpgsql建立trigger规则。使用的材料与lab5一致： schema文件如下： What is ...

COMP9311 Database Systems WEEK10

1.补充视频

1.1 Computing minimal cover

1.2 BCNF normalisation

1.3 3NF normalisation

2. Week9 Continue

2.1 BCNF

2.2 3NF

3. DBMS Architecture

4. Relational Algebra

4.1 Selection

4.2 Projection

4.3 Product

4.4 Join

4.5 Union

4.6 Intersection

4.7 Difference

4.8 Division

4.9 Rename

相关文章

COMP9311 Database Systems 传送门

COMP9311 Database Systems WEEK10

COMP9311 Database Systems WEEK4

COMP9311 Database Systems WEEK2

COMP9311 Database Systems WEEK3

COMP9311 Database Systems WEEK1

COMP9311 Database Systems Lab1

COMP9311 Database Systems WEEK6

COMP9311 Database Systems WEEK12

COMP9311 Database Systems Lab6

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读