Distributed Systems
- A distributed system is a software system in which components located on networked computers communicate and coordinate their actions by passing messages.
Scalability
- Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.
- Two particularly relevant aspects:
• Performance
• Availability
Performance aspects
• Short response time/low latency for a given piece of work
• High throughput (rate of processing work)
• Low utilization of computing resource(s)
Availability
- Availability = uptime / (uptime + downtime)
- Availability %             Downtime per year
- 90% ("one nine")           More than one month
- 99.9% ("three nines")      Less than 9 hours
- 99.9999% ("six nines")     31 seconds
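The downtime figures above follow directly from the availability formula. The following is a minimal sketch, assuming a 365-day year; the function name `downtime_per_year` is hypothetical and chosen only for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def downtime_per_year(availability_percent: float) -> float:
    """Return the allowed downtime, in seconds per year, for a given availability."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * SECONDS_PER_YEAR


# Print the downtime budget for the three availability levels listed above.
for label, pct in [("one nine", 90.0), ("three nines", 99.9), ("six nines", 99.9999)]:
    seconds = downtime_per_year(pct)
    print(f"{pct}% ({label}): {seconds / 3600:.2f} hours = {seconds:.1f} seconds")
```

Under that assumption, 90% leaves 36.5 days of downtime per year, 99.9% leaves about 8.76 hours, and 99.9999% leaves about 31.5 seconds.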
Fault tolerance
- Fault tolerance is the ability of a system to behave in a well-defined manner once faults occur.
- Failures are the norm.
Replication
- Replication is making copies of the same data on multiple machines.
Consistency
Why strong consistency is hard to achieve
Nodes
• each node executes a program concurrently
• knowledge is local
• global state is potentially out of date
• nodes can fail and recover from failure independently
• messages can be delayed or lost
• clocks are not synchronized across nodes
Links
• Asynchronous system model.
• No timing assumptions.
• No bound on message transmission delay
• Useful clocks do not exist.
ACID
• Atomic
• Consistent
• Isolated
• Durable
BASE
• Basically Available
• Soft state
• Eventual consistency
CAP theorem
- It is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:
• Consistency (all nodes see the same data at the same time)
• Availability (every request received by a non-failing [database] node in the system must result in a [non-error] response)
• Partition tolerance (the system continues to operate despite arbitrary partitioning due to network failures)
CP systems
- Protocols:
• Strict quorum protocols (Paxos, Raft, ZAB)
• Two-phase commit (2PC)
- Storage systems:
• MongoDB
• HBase
• ZooKeeper
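Strict quorum protocols rely on the fact that any two quorums share at least one node, so a later operation always reaches a node that saw the earlier one. A common way to get this is a majority quorum. The sketch below checks that property by brute force for small clusters; the function names `majority` and `quorums_always_overlap` are hypothetical, and this is an illustration of the overlap property, not an implementation of any of the protocols listed above.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of an n-node cluster."""
    return n // 2 + 1


def quorums_always_overlap(n: int, quorum_size: int) -> bool:
    """Check by brute force that every pair of quorums shares at least one node."""
    quorums = [set(q) for q in combinations(range(n), quorum_size)]
    # An empty intersection is falsy, so all() fails on any disjoint pair.
    return all(a & b for a in quorums for b in quorums)


# With 5 nodes, majorities of 3 always intersect; groups of 2 may not.
print(majority(5), quorums_always_overlap(5, 3), quorums_always_overlap(5, 2))
```

The cost of this guarantee is availability: during a partition, only the side that still holds a majority can make progress.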
AP systems
- Protocols:
• Partial quorum protocols
- Storage systems:
• CouchDB
• Cassandra
• Amazon Dynamo
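Partial quorum protocols are often described with three parameters: N replicas, W acknowledgements required for a write, and R replicas consulted for a read. When R + W <= N, a read may miss the replicas that hold the latest write. The sketch below models the worst case with plain version numbers; the function `worst_case_read` and the single-write scenario are assumptions made for illustration, not the behaviour of any specific store.

```python
def worst_case_read(n: int, w: int, r: int) -> int:
    """Return the newest version a reader sees in the worst case.

    All replicas start at version 0. One write of version 1 reaches only
    the first w replicas, then a read consults the last r replicas, which
    is the choice least likely to overlap with the write.
    """
    replicas = [0] * n
    for i in range(w):
        replicas[i] = 1  # the write is acknowledged by w replicas only
    return max(replicas[n - r:])  # the reader keeps the newest version it sees


print(worst_case_read(3, 1, 1))  # R + W <= N: the read can be stale
print(worst_case_read(3, 2, 2))  # R + W > N: the read overlaps the write
```

With N = 3, W = 1, R = 1 the function returns 0 (a stale read), while W = 2, R = 2 returns 1. Accepting the stale case is what lets such a system keep answering requests on both sides of a partition.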
CAP fifteen years later
• Because partitions are rare, there is little reason to forfeit C or A when the system is not partitioned.
• The choice between C and A can occur many times within the same system at very fine granularity.
• All three properties are more continuous than binary.
• Most software doesn't neatly fit the CP/AP definition.