1. Outline
what is a distributed system
- parallelism
- fault tolerance
- physical
- security
challenges
- concurrency
- partial failure
- performance
Infrastructure: abstractions
- storage (should look like a non-distributed system)
- communication
- computation
Implementation
remote procedure call (RPC), threads, concurrency control (locks)
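
A minimal sketch of the threads-plus-locks idea, not taken from the lecture: several threads update one shared counter, and a lock makes each update atomic. The helper name `run_workers` and its parameters are hypothetical.

```python
import threading

def run_workers(num_threads=4, increments=10_000):
    """Hypothetical helper: several threads bump one shared counter."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The lock makes the read-modify-write of `counter` atomic.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

total = run_workers()
```

Without the lock, two threads could read the same old value and one increment could be lost; with it, `total` always equals `num_threads * increments`.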
performance
- scalability -> 2x computers == 2x throughput
fault tolerance
- availability
- recoverability
- non-volatile storage
- replication
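
One way to picture recoverability with non-volatile storage is a store that appends every write to a log file and replays that log on restart. This is only a sketch under assumptions: the class `DurableKV` and the one-JSON-record-per-line format are made up for illustration, and replication is not shown.

```python
import json
import os
import tempfile

class DurableKV:
    """Hypothetical key/value store that logs every put to disk."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # Recovery: replay the log, if one survived a previous run.
        if os.path.exists(path):
            with open(path) as f:
                for line in f:
                    k, v = json.loads(line)
                    self.data[k] = v

    def put(self, k, v):
        # Write to non-volatile storage before updating memory.
        with open(self.path, "a") as f:
            f.write(json.dumps([k, v]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.data[k] = v

    def get(self, k):
        return self.data.get(k)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "kv.log")
    store = DurableKV(path)
    store.put("x", 1)
    store.put("x", 2)
    del store                      # simulate a crash: memory is lost
    recovered = DurableKV(path)    # restart: state is rebuilt from the log
    value_after_restart = recovered.get("x")
```

The `fsync` on every put is what makes the write survive a crash, and it is also why this kind of durability is expensive.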
consistency
- put(k,v)
- get(k) -> v
- strong
- weak
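
The difference between strong and weak consistency for `put(k,v)` / `get(k) -> v` can be sketched with two copies of the data, one of which is updated lazily. Everything here is an illustrative assumption (the class `LazyReplicatedKV`, the `sync` step, the two read methods), not a real system's API.

```python
class LazyReplicatedKV:
    """Hypothetical store with a primary copy and a lazily updated replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []   # writes not yet applied to the replica

    def put(self, k, v):
        self.primary[k] = v
        self.pending.append((k, v))

    def sync(self):
        # Apply outstanding writes to the replica, in order.
        for k, v in self.pending:
            self.replica[k] = v
        self.pending.clear()

    def get_strong(self, k):
        # Strong: always sees the most recent completed put.
        return self.primary.get(k)

    def get_weak(self, k):
        # Weak: may return a stale value until the replica catches up.
        return self.replica.get(k)

kv = LazyReplicatedKV()
kv.put("x", 1)
kv.sync()
kv.put("x", 2)
strong_read = kv.get_strong("x")   # latest value
weak_read = kv.get_weak("x")       # stale value from before the second put
```

The weak read is cheaper because it never has to coordinate with the primary, which is the usual reason to accept stale data.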
2. MapReduce
![](https://img.haomeiwen.com/i17624987/aea1452c5127648e.png)
![](https://img.haomeiwen.com/i17624987/1c9b6b00c58f12ac.png)
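
A single-process sketch of the MapReduce programming model, using word count as the example. The names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical; a real deployment would run the map and reduce tasks on many machines and move the intermediate data between them.

```python
from collections import defaultdict

def map_fn(filename, contents):
    # Emit one (word, 1) pair per word in the input.
    return [(word, 1) for word in contents.split()]

def reduce_fn(key, values):
    # Combine all values emitted for one key.
    return sum(values)

def run_mapreduce(inputs, map_fn, reduce_fn):
    # Map phase, plus the grouping by key that happens between the phases.
    intermediate = defaultdict(list)
    for filename, contents in inputs.items():
        for key, value in map_fn(filename, contents):
            intermediate[key].append(value)
    # Reduce phase: one call per distinct key.
    return {key: reduce_fn(key, values)
            for key, values in sorted(intermediate.items())}

counts = run_mapreduce({"a.txt": "a b a", "b.txt": "b c"}, map_fn, reduce_fn)
```

The programmer writes only `map_fn` and `reduce_fn`; the framework handles distribution, grouping, and failures.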
3. Discussion
- iteration
- stream data
- GFS
- avoid network traffic