Database

MySQL:

Design scenarios

Foreign key:

A FOREIGN KEY is a key used to link two tables together.
A FOREIGN KEY is a field (or collection of fields) in one table that refers to the PRIMARY KEY in another table.
For example, say I have a parent table that holds information about parents, and a children table that holds information about children.
The parent table has a field called children id. This id is the primary key of the children table, so it is also a foreign key in the parent table, because it refers to the primary key of another table.

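A fk prevents invalid data from being inserted into the database, because a fk value must match an existing pk value in the referenced table (or be NULL, if the column allows it).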
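
A minimal sketch of that check, using Python's built-in sqlite3 module as a stand-in for MySQL. The column names (children_id, name) and the sample rows are assumptions made up for illustration. Note that SQLite only enforces foreign keys after the PRAGMA below is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite ignores foreign key constraints unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE children (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE parent (
        id INTEGER PRIMARY KEY,
        name TEXT,
        children_id INTEGER,
        FOREIGN KEY (children_id) REFERENCES children (id)
    )
    """
)

conn.execute("INSERT INTO children (id, name) VALUES (1, 'Alice')")
# Valid: a child with id 1 exists.
conn.execute("INSERT INTO parent (id, name, children_id) VALUES (10, 'Bob', 1)")

# Invalid: no child has id 99, so the database rejects the row.
try:
    conn.execute("INSERT INTO parent (id, name, children_id) VALUES (11, 'Carol', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

parent_count = conn.execute("SELECT COUNT(*) FROM parent").fetchone()[0]
print(rejected, parent_count)  # True 1
```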

Left Join

The LEFT JOIN keyword returns all records from the left table (table1), and the matched records from the right table (table2). The result is NULL from the right side, if there is no match.

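A left join combines the rows of two tables according to a join condition (typically a fk in one table matching the pk of the other).
If a row in the left table and a row in the right table match on the join key, they are joined into one new record.
If a key in the left table does not exist in the right table, the new record keeps only the left table's data, and the right-side columns are NULL.
If a key in the right table does not exist in the left table, that row is ignored.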
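
The three cases above can be sketched with Python's sqlite3 module. The customers and orders tables and their rows are hypothetical, chosen so that one customer has no order and one order points at a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
# Order 101 refers to customer 3, which is not in the left table.
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(100, 1, "book"), (101, 3, "pen")])

left_rows = conn.execute(
    """
    SELECT customers.name, orders.item
    FROM customers
    LEFT JOIN orders ON orders.customer_id = customers.id
    ORDER BY customers.id
    """
).fetchall()

# Ann matches an order; Ben has none, so item is NULL (None); order 101 is dropped.
print(left_rows)  # [('Ann', 'book'), ('Ben', None)]
```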

Inner Join

The INNER JOIN keyword selects records that have matching values in both tables.
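An inner join only joins the records whose join key has a matching value in both tables; rows without a match on either side are dropped.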
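
For comparison, a small sqlite3 sketch of an inner join. The customers and orders tables are hypothetical sample data: one customer has no order, and one order has no customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Ben")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(100, 1, "book"), (101, 3, "pen")])

inner_rows = conn.execute(
    """
    SELECT customers.name, orders.item
    FROM customers
    INNER JOIN orders ON orders.customer_id = customers.id
    ORDER BY customers.id
    """
).fetchall()

# Only the pair that matches on both sides survives.
print(inner_rows)  # [('Ann', 'book')]
```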

ETL (Extract, Transform, and Load) process

How to design a many-to-many relationship (e.g. friendships)

Differences between SQL and NoSQL

What is a transaction

What is ACID

What is an index

A database index is an auxiliary data structure that allows faster retrieval of data stored in the database. An index is keyed on a specific column, so that queries like "give me all people with a last name of 'Obama'" are fast.

Postgres will automatically generate the indices for the primary key.
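
A rough way to see an index being used, sketched with Python's sqlite3 module rather than Postgres or MySQL. The people table, its rows, and the index name idx_people_last_name are assumptions for illustration. The exact wording of the query plan differs between SQLite versions, so it is printed here without an expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people (first_name, last_name) VALUES (?, ?)",
    [("Barack", "Obama"), ("Michelle", "Obama"), ("Ada", "Lovelace")],
)

query = "SELECT * FROM people WHERE last_name = ?"


def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable step description.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("Obama",)).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(query)  # no index on last_name yet: a full table scan
conn.execute("CREATE INDEX idx_people_last_name ON people (last_name)")
plan_after = plan(query)  # the planner can now search via the index

print(plan_before)
print(plan_after)
```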

hash index
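Take the name lookup above: the key is the name field and the values are pointers to database rows. The drawback of a hash index is that it can only handle equality lookups; it cannot help with a range query such as "give me everyone younger than 45".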
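
A toy sketch of the idea, with a Python dict playing the role of the hash index. The rows and field names are made up, and list positions stand in for row pointers.

```python
from collections import defaultdict

# Hypothetical table: the position in this list stands in for a row pointer.
rows = [
    {"name": "Obama", "age": 47},
    {"name": "Smith", "age": 30},
    {"name": "Obama", "age": 52},
]

# Hash index on name: key -> list of row positions.
name_index = defaultdict(list)
for position, row in enumerate(rows):
    name_index[row["name"]].append(position)

# Equality lookup: a single hash probe, no scan.
obama_positions = name_index.get("Obama", [])

# Range predicate on age: hashing keeps no ordering, so every row must be checked.
younger_than_45 = [pos for pos, row in enumerate(rows) if row["age"] < 45]

print(obama_positions, younger_than_45)  # [0, 2] [1]
```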

B-tree index or B+ tree
In a B-tree, search keys and data are stored in both internal and leaf nodes. In a B+ tree, data is stored only in the leaf nodes.
B-trees are the most commonly used data structure for indexes; they support selections, insertions, and deletions in logarithmic time.
For example, if we have an index on an age column, the value in the B-tree might be something like (34, 0x875900). 34 is the age and 0x875900 is a reference to the location of the data, rather than the data itself.
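
To illustrate why an ordered index handles range queries, here is a sketch that uses a sorted Python list with bisect as a stand-in for the B-tree's ordered entries. It is not a real B-tree: lookups are logarithmic, but inserting into a list is linear, whereas a B-tree keeps inserts logarithmic too. The extra entries and the helper names are assumptions.

```python
import bisect

# (age, row reference) pairs kept sorted by age, like the entries of an ordered index.
age_index = []


def index_insert(age, row_ref):
    bisect.insort(age_index, (age, row_ref))


for age, ref in [(34, 0x875900), (52, 0x875A00), (29, 0x875B00), (41, 0x875C00)]:
    index_insert(age, ref)


def refs_younger_than(limit):
    # Binary search for the first entry with age >= limit; everything before it matches.
    cut = bisect.bisect_left(age_index, (limit,))
    return [ref for _, ref in age_index[:cut]]


def refs_with_age(age):
    # Equality is just a narrow range: [age, age + 1).
    lo = bisect.bisect_left(age_index, (age,))
    hi = bisect.bisect_left(age_index, (age + 1,))
    return [ref for _, ref in age_index[lo:hi]]


print([hex(r) for r in refs_younger_than(45)])  # ['0x875b00', '0x875900', '0x875c00']
print([hex(r) for r in refs_with_age(34)])  # ['0x875900']
```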

How and why are database indexes good and/or bad?

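An index speeds up reads but slows down writes, because every time new data is inserted the index structure has to be updated as well. An index also consumes extra space, both on disk and in memory when it is cached.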

database normalization

a process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity.
In other words, normalization means reorganizing the database's tables so that duplicated data is reduced and consistency improves.
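
A small sqlite3 sketch of the difference, with hypothetical table and column names. In the flat table the customer's email is repeated on every order row; after normalization it is stored once and referenced by id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: customer details are copied onto every order row.
conn.execute(
    "CREATE TABLE orders_flat "
    "(order_id INTEGER PRIMARY KEY, customer_name TEXT, customer_email TEXT, item TEXT)"
)
conn.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [(1, "Ann", "ann@example.com", "book"), (2, "Ann", "ann@example.com", "pen")],
)

# Normalized: customer facts live once in customers; orders refer to them by id.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customers (id), item TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ann', 'ann@example.com')")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, 1, "book"), (2, 1, "pen")])

# Changing the email: the flat table needs every copy updated, the normalized one a single row.
flat_rows_updated = conn.execute(
    "UPDATE orders_flat SET customer_email = 'ann@new.example' WHERE customer_name = 'Ann'"
).rowcount
rows_updated = conn.execute(
    "UPDATE customers SET email = 'ann@new.example' WHERE id = 1"
).rowcount

emails = conn.execute(
    "SELECT DISTINCT customers.email FROM orders "
    "JOIN customers ON customers.id = orders.customer_id"
).fetchall()

print(flat_rows_updated, rows_updated, emails)  # 2 1 [('ann@new.example',)]
```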