Tight Latency and Availability requirements
The effects of latency
In this section, we summarise evidence of the negative effects of users perceiving high latencies. Shopzilla, a shopping website, reported that a five-second speed-up (from around 7 to 2 seconds) to execute a purchase resulted in a 25% increase in page views, a 7–12% increase in revenue, a 50% reduction in hardware costs, and a 120% increase in traffic from Google. Amazon finds that every 100 ms of latency costs them 1% in sales. Google finds that adding 500 milliseconds to search-page generation time decreases traffic by 20%.
Wall Street traders place small orders and cancel them after less than 500 ms in order to observe how slower traders react, thereby gaining a competitive advantage. « High-frequency traders generated about $21 billion in profits using this method in 2009 ».
Psychological studies show that slow responses increase user frustration. Moreover, users tend to perceive slow websites as untrustworthy and of poor quality. Conversely, fast websites are perceived to be more interesting and attractive.
The effects of downtimes
In this section, we summarise evidence of the negative effects of service downtimes. A 2004 study found that the tolerable wait time on non-working links without feedback peaks at between 5 and 8 seconds. Amazon has reported that « even the slightest service outage has significant financial consequences and impacts customer trust ». When the Amazon.com site went down for approximately 49 minutes in January 2013, it cost the company an estimated $4 million in lost sales. Another outage in August of the same year lasted 30 minutes and cost an estimated almost $2 million. Google’s five-minute outage in 2013 caused an estimated loss of $500,000. Because of the effects of high latencies and downtimes, system designs must foster low latency and high availability. As we will see in the following section, this comes at the cost of ease of programming.
Geo-distribution and the CAP theorem
Cloud services rely on geo-distribution to minimise user-perceived latencies and to maximise availability: a user connects to their closest data centre, thus avoiding wide-area network delays, and, in case of a full-data-centre failure, can be redirected to a healthy data centre. Nevertheless, this architecture imposes on these services a design choice known as the CAP theorem. Long-distance inter-data-centre network links exhibit latencies on the order of tens to hundreds of milliseconds, orders of magnitude higher than their sub-millisecond intra-data-centre counterparts. Moreover, as network partitions between data centres do occur in production systems and are more complex to handle than those between co-located servers, geo-replicated systems must be designed with network partitions (P) in mind. This forces a choice between (low-latency, weakly-consistent) highly-available (AP) designs and (high-latency, unavailable-under-partition) strongly-consistent (CP) designs: ensuring both strong consistency and high availability under network partitions is impossible.
A strongly-consistent design simplifies the task of developing the application logic. It provides the abstraction of a single sequential system as it hides the complexity of replication by keeping replicas synchronised at all times. Nevertheless, it exposes users to the high latencies and downtimes of the network links between replicas as operations have to be synchronised across data centres before finishing execution.
A highly-available design provides an « always-on » experience and excellent responsiveness by synchronising replicas lazily, out of the critical path of an operation. Users can execute operations entirely at a single data centre, without waiting for data centres to synchronise, which happens in the background, after replying to the client. Nevertheless, it exposes concurrency issues that render application-logic development hard and error-prone. We introduce such issues in Section 3.3.3.
Motivated by the tight latency and availability requirements of these systems, in this work, we focus mainly on AP designs.
In the following sections, we introduce the guarantees provided by many AP and CP isolation models. We explain how these guarantees affect performance and ease of programming.
Moving away from, back to ACID
Transactional semantics simplify the task of programming the application logic: the ACID properties eliminate the concerns of handling concurrent access to data consistently. Nevertheless, the advent of distributed and replicated architectures has pushed designs, called NoSQL, to eschew query languages and transactional isolation altogether [33, 37, 39, 43, 73]. The reason behind this trend is availability and latency: isolation mechanisms were originally designed for single-machine architectures, and porting them as-is to a cloud environment results in storage that is slow and unavailable under certain kinds of failures not present in the single-machine setting (e.g., network partitions between data centres). The price, however, is that developing the application logic on top of these stores is a hard task.
Recently, the complexity of developing applications over stores with no transactional isolation has motivated the development of distributed transactional isolation. This has resulted in production systems and research prototypes providing a wide variety of guarantees for cloud environments [7, 40, 60, 70, 71, 75, 81]. In the following sections, we focus on the inherent trade-offs of implementing atomicity and isolation in cloud deployments, and on recently proposed isolation models.
Atomicity of Updates (or All-or-Nothing)
All-or-Nothing ensures that, at any point in time, either all of a transaction’s writes are in the store, or none of them is. Atomic updates are instrumental in keeping state transitions consistent with respect to certain invariants. Examples include foreign-key constraints representing relationships between data records (e.g., the symmetry of the friendship and like relationships in a social-network application: if Bob belongs to Alice’s friends, then Alice belongs to Bob’s, and if Bob likes a photo, the photo is liked by Bob), secondary indexing, and materialised-view maintenance (e.g., a list of comments on a certain post and a comment count can be maintained by updating the count with each comment creation, instead of computing it on reads). Moreover, all-or-nothing updates simplify rolling back the inconsistent intermediate state of failed transactions. Consider, for instance, a bank application where a transfer withdraws money from an account and deposits it into another. If the withdrawal succeeds but the deposit does not, programmers using storage lacking atomic updates need to develop mechanisms to detect and roll back the withdrawal, such as a compensating deposit.
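The bank-transfer example can be sketched as follows. This is a minimal, illustrative sketch, not taken from any real system: the `Account` class and `transfer` function are hypothetical, and the compensating deposit stands in for the detection-and-rollback machinery a programmer would have to build without all-or-nothing updates.

```python
class Account:
    """A hypothetical bank account; balances are plain integers."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if self.balance < amount:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def deposit(self, amount):
        self.balance += amount


def transfer(src, dst, amount):
    """Transfer without all-or-nothing support: if the deposit fails
    after the withdrawal succeeded, roll the withdrawal back by hand
    with a compensating deposit."""
    src.withdraw(amount)
    try:
        dst.deposit(amount)
    except Exception:
        src.deposit(amount)  # compensating action restores src
        raise
```

With atomic updates, the store itself would guarantee that both writes are applied or neither is, and the `except` branch would disappear.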
In transactional storage, atomically updating data in parallel (or distributed) environments is achieved by an atomic commitment protocol such as Two-Phase Commit, where all parties holding updated entities must agree to apply their individual updates before the transaction is effectively committed.
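The shape of Two-Phase Commit can be sketched in a few lines. This is a toy, in-process model under simplifying assumptions (no durable logging, no coordinator or participant failures, no timeouts); the `Partition` and `two_phase_commit` names are illustrative, not a real API.

```python
class Partition:
    """A toy participant that stages updates before committing them."""

    def __init__(self):
        self.committed = {}  # key -> value, visible after commit
        self.staged = None   # updates staged during the prepare phase

    def prepare(self, updates):
        # Vote yes and stage the updates; a real participant would also
        # force a log record to durable storage before voting.
        self.staged = updates
        return True

    def commit(self):
        self.committed.update(self.staged)
        self.staged = None

    def abort(self):
        self.staged = None


def two_phase_commit(writes_per_partition):
    """Phase 1: ask every partition to prepare and collect votes.
    Phase 2: commit everywhere iff all voted yes, abort otherwise."""
    votes = [(p, p.prepare(u)) for p, u in writes_per_partition.items()]
    if all(vote for _, vote in votes):
        for p, _ in votes:
            p.commit()
        return True
    for p, _ in votes:
        p.abort()
    return False
```

The point of the sketch is the structure: no partition makes its updates visible until every partition has agreed, which is exactly the all-or-nothing guarantee.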
In the following section, we present existing read isolation properties. Unless stated otherwise, they assume update atomicity.
Transactional Isolation Levels (Consistency Criteria)
The stronger the level of isolation a system implements, the more its behaviour resembles that of a sequential system. However, implementing strong isolation in a highly-parallel setting requires concurrency control, i.e., techniques that limit the possible interleavings of a transaction’s operations, which introduces an overhead on a transaction’s execution. Weakening isolation exposes anomalous effects of concurrency, called anomalies. The presence of anomalies increases the complexity of developing correct applications. On the positive side, reducing concurrency control boosts performance by imposing fewer restrictions on how operations interleave, which enables a variety of optimisations.
A large body of research, both from the parallel and the distributed programming communities, has focused on proposing isolation levels (also called consistency criteria) along the design space created by the isolation-performance trade-off. In this section, we introduce isolation levels and concurrency control techniques. In Chapter 12, we propose new isolation levels and, in Chapters 5 and 13, new implementations thereof. In Chapter 16, we elaborate on how existing systems and research prototypes implement isolation. A list of many commercial database systems and their default and maximum offered isolation levels can be found elsewhere.
Multi-version concurrency control (MVCC)
MVCC is a form of optimistic concurrency control (OCC) that relies on keeping multiple versions of each object. Transactions execute optimistically, under the assumption that no concurrency issues will arise, and are checked for such issues only at the end.
Optimistic Execution. Under MVCC, transactions read object versions from a database snapshot. A snapshot represents a view of the state of the store, composed of one version of each object. Each isolation property defines the rules that object versions must satisfy to belong to a snapshot. For instance, strong isolation models (e.g., Serialisability and Snapshot Isolation) require that either all atomically-created updates are in a snapshot, or none is. This is not a requirement of weak isolation models (e.g., Read Committed and Read Uncommitted).
Concurrency checks. At commit time, if necessary, the read and/or update operations of a transaction undergo checks to verify that the transaction has not interleaved with concurrent transactions in ways forbidden by the target isolation property. If the certification passes, the transaction commits; otherwise, it aborts. For instance, under Snapshot Isolation, a transaction reading and updating a certain object must certify that no other transaction has modified the same object since it was read. If a concurrent modification is detected, the transaction aborts (Section 3.3.3).
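Both phases can be illustrated with a toy MVCC store. This is a single-process sketch under simplifying assumptions (a global logical clock, no garbage collection, write-write certification only, as in Snapshot Isolation); the `MVCCStore` class and its method names are hypothetical.

```python
import itertools


class MVCCStore:
    """Toy multi-version store: each committed write creates a new
    version tagged with a commit timestamp; transactions read from the
    snapshot at their start time and are certified at commit."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value)
        self.clock = itertools.count(1)  # global logical clock

    def begin(self):
        return {"start": next(self.clock), "writes": {}}

    def read(self, txn, key):
        # Latest version committed no later than the snapshot time.
        candidates = [(ts, v) for ts, v in self.versions.get(key, [])
                      if ts <= txn["start"]]
        return max(candidates)[1] if candidates else None

    def write(self, txn, key, value):
        txn["writes"][key] = value  # buffered until commit

    def commit(self, txn):
        # Certification: abort if any written key has a version newer
        # than the snapshot (first committer wins, as under SI).
        for key in txn["writes"]:
            versions = self.versions.get(key, [])
            if versions and max(ts for ts, _ in versions) > txn["start"]:
                return False  # abort
        ts = next(self.clock)
        for key, value in txn["writes"].items():
            self.versions.setdefault(key, []).append((ts, value))
        return True
```

In the sketch, two transactions that concurrently update the same key illustrate the check: the first to commit succeeds, the second fails certification and aborts.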
Choosing a technique.
MVCC does not incur the overheads of acquiring and releasing locks. It is useful to implement strong isolation when levels of contention are low: under high contention, interleaving transactions are likely to access the same objects, leading to transactions being aborted and retried frequently. MVCC is also the only technique used to implement weak isolation, where transactions do not require exclusive access to objects and, therefore, never abort due to concurrency issues.
Lock-based concurrency control is suited for strong isolation, which requires exclusive access to objects. In particular, this technique fits workloads that exhibit high levels of contention, as it never causes a transaction to abort.
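The contrast with MVCC can be made concrete with a sketch of strict two-phase locking, the classical lock-based technique: a transaction acquires a lock on each object before accessing it and releases all its locks only at commit, so a conflicting transaction blocks instead of aborting. The `LockManager` and `Transaction` names below are illustrative; this single-process sketch ignores deadlock handling and shared (read) locks.

```python
import threading


class LockManager:
    """Hands out one exclusive lock per object key."""

    def __init__(self):
        self.locks = {}
        self.guard = threading.Lock()  # protects the lock table itself

    def lock_for(self, key):
        with self.guard:
            return self.locks.setdefault(key, threading.Lock())


class Transaction:
    """Strict two-phase locking: acquire before access, release at
    commit. A conflicting transaction blocks; it never aborts."""

    def __init__(self, manager, store):
        self.manager = manager
        self.store = store      # shared dict: key -> value
        self.held = []

    def write(self, key, value):
        lock = self.manager.lock_for(key)
        lock.acquire()          # blocks while another txn holds the lock
        self.held.append(lock)
        self.store[key] = value

    def commit(self):
        for lock in self.held:  # shrinking phase happens only here
            lock.release()
        self.held = []
```

Holding every lock until commit is what makes the scheme "strict": no other transaction can observe an object between a transaction's first write to it and its commit.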
Table of contents:
List of Tables
List of Figures
1.1.1 Part I – Cure: Strong semantics meets high availability and low-latency
1.1.2 Part II – The three-way trade-off for transactional reads
2 System Model
2.1 Cloud Service Architecture
2.2 Tight Latency and Availability requirements
2.2.1 The effects of latency
2.2.2 The effects of downtimes
2.3 Geo-distribution and the CAP theorem
2.3.1 CP designs
2.3.2 AP designs
3 Storage Semantics
3.1 Transactions – ACID properties
3.1.1 Moving away from, back to ACID
3.2 Atomicity of Updates (or All-or-Nothing)
3.3 Transactional Isolation Levels (Consistency Criteria)
3.3.2 Concurrency control
3.3.2.1 Lock-based concurrency control
3.3.2.2 Multi-version concurrency control (MVCC)
3.3.2.3 Choosing a technique.
3.3.2.4 Mixing them.
3.3.3.1 Dirty read.
3.3.3.2 Non-repeatable read.
3.3.3.3 Lost update.
3.3.3.4 Write skew.
3.3.3.5 Non-monotonic snapshots.
3.3.3.6 Read Skew.
3.3.3.7 Real-time violation.
3.3.3.8 Order Violation.
3.3.4 CP (Strong) Isolation
3.3.4.1 Strict Serialisability (SS) – no anomalies
3.3.4.2 Serialisability (S) – relaxing real-time ordering
3.3.4.3 Snapshot Isolation (SI) – removing serialisability checks from read operations
3.3.4.4 Update Serialisability (US) – non-monotonic snapshots
3.3.4.5 Parallel Snapshot Isolation (PSI)
3.3.5 AP (Weak) Isolation
3.3.5.1 Transactional Causal Consistency (TCC)
3.3.5.2 Read Atomic (RA)
3.3.5.3 Read Committed (RC and RC+)
3.3.5.4 No Isolation (NI)
3.3.6 Summary of anomalies allowed/disallowed by Isolation levels
3.4 Session guarantees
3.5 Single-object Consistency and Isolation
3.5.1 CP Consistency
3.5.2 AP Consistency
3.5.2.1 Causal consistency
3.5.2.2 Eventual Consistency
3.5.2.3 Ensuring Convergence
I Cure: strong semantics meets high availability and low latency
4 Introduction to Part I
5 Overview of Cure
5.1 Transactional Programming Model
5.2 Programming interface
5.3 Design – causal consistency
5.3.1 Updates applied in causal order for high availability.
5.3.2 Dependency stabilisation for scalability.
5.3.3 Vector clocks for serving fresh data.
6 Protocol description
6.1 Notation and definitions
6.2 Transaction Execution
6.3 Replication and stable snapshot computation
6.5.1 Session Guarantees
6.5.2 Efficient SS computation
6.5.3 Garbage Collection
6.5.4 Support for CRDTs
7 Evaluation of Cure
7.2 Cure’s scalability
7.3 Comparison to other systems
8 Conclusion of Part I
II The three-way trade-off: Read Isolation, Latency and Freshness
9 Introduction to Part II
10.2 Snapshot guarantees
10.2.1 Committed Visibility
10.2.2 Order-Preserving Visibility
10.2.3 Atomic Visibility
10.3.1 Minimal Delay
10.3.2 Bounded delay
10.3.3 Mutex reads/writes (or unbounded delay)
10.4.1 Latest Freshness
10.4.2 Stable Freshness
10.4.3 Concurrent Freshness
10.5 Optimal reads
11 The three-way trade-off
11.1 Notation and Definitions
11.2 Impossibility of optimal reads under ordered visibility
11.3 What freshness is compatible with minimal delay?
11.3.1 Optimal reads under Committed Visibility
11.3.2 Order-Preserving visibility and concurrent freshness
11.3.3 Minimal-delay Atomic Visibility requires stable freshness
11.4 What is possible under latest freshness?
11.5 Isolated reads with bounded delays and concurrent freshness.
12 Unexplored Isolation Levels
12.0.1 CV-US and OP-US
13 Minimal-delay protocol design
13.1 Cure recapitulation
13.2 Changes to the base protocol
13.3 Transaction execution overview
13.3.1 Transaction Coordinator (Algorithm 13.1)
13.3.2 Partitions (Algorithm 13.2)
13.4 Protocol particularities and correctness
13.5 Stabilisation protocol
13.6 Causal consistency: session guarantees
13.6.1 Read your writes
13.6.2 Monotonic Reads
14.3.1 Single-shot read-only transactions
14.3.2 Facebook-like read-only transactions.
15 Conclusion of Part II
16 Causal Consistency for Cloud Deployments
16.1 Causally-Consistent Systems
16.1.1 Single-machine Causal Consistency
16.2 Strongly-Consistent Systems Enforcing Causal Consistency
17 Comparison of Transactional Systems
17.0.1 Weakly-consistent systems
17.0.2 Strongly-consistent systems