Chapter 6: Message Ordering and Group Communication
Ajay Kshemkalyani and Mukesh Singhal
Distributed Computing: Principles, Algorithms, and Systems
Cambridge University Press
Outline and Notations
Outline
- Message orders: non-FIFO, FIFO, causal order, synchronous order
- Group communication with multicast: causal order, total order
- Expected behaviour semantics when failures occur
- Multicasts: application layer on overlays; also at network layer
Notations
- Network (N, L); event set (E, ≺)
- Message m^i: send and receive events s^i and r^i
- Send and receive events in general: s and r
- When the message is to be stressed: M, send(M), and receive(M)
- Same-process events: a ∼ b denotes that a and b occur at the same process
- Corresponding events: send-receive pairs T = {(s, r) ∈ E_i × E_j | s corresponds to r}
Causal Order: Definition
Causal order (CO)
A CO execution is an A-execution in which, for all (s, r) and (s′, r′) ∈ T,
(r ∼ r′ and s ≺ s′) ⇒ r ≺ r′
If send events s and s′ are related by causality ordering (not physical time
ordering), their corresponding receive events r and r′ occur in the same order
at all common destinations.
If s and s′ are not related by causality, then CO is vacuously satisfied.
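The definition can be checked mechanically on a small recorded execution. The following is a minimal sketch, assuming the execution is given as per-process event lists plus the set T of send-receive pairs; the function names and the three-process example are hypothetical and are not taken from the slides.

```python
def transitive_closure(events, pairs):
    """Return the transitive closure of `pairs` over `events` (Warshall style)."""
    hb = set(pairs)
    for k in events:
        for a in events:
            if (a, k) in hb:
                for b in events:
                    if (k, b) in hb:
                        hb.add((a, b))
    return hb


def is_causally_ordered(process_events, messages):
    """Check (r ~ r' and s < s') => r < r' for all pairs in T.

    process_events: list of per-process event lists, each in local order.
    messages: list of (send_event, receive_event) pairs, i.e. the set T.
    """
    events = [e for proc in process_events for e in proc]
    where = {e: i for i, proc in enumerate(process_events) for e in proc}
    # Base relation: local program order plus send -> receive.
    base = {(a, b) for proc in process_events for a, b in zip(proc, proc[1:])}
    base |= set(messages)
    hb = transitive_closure(events, base)
    for s, r in messages:
        for s2, r2 in messages:
            same_dest = where[r] == where[r2]
            if same_dest and (s, s2) in hb and (r, r2) not in hb:
                return False
    return True


# Hypothetical execution: P1 sends m1 to P3, then m2 to P2;
# P2 receives m2 and sends m3 to P3; P3 receives m3 before m1.
T = [("s1", "r1"), ("s2", "r2"), ("s3", "r3")]
violating = [["s1", "s2"], ["r2", "s3"], ["r3", "r1"]]
repaired = [["s1", "s2"], ["r2", "s3"], ["r1", "r3"]]
```

Here `is_causally_ordered(violating, T)` is false because s1 ≺ s3 while r3 precedes r1 at the common destination; swapping the two receives at P3 gives a CO execution.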
[Figure: space-time diagrams (a)-(d) on processes P1, P2, P3]
Figure 6.2: (a) Violates CO as s^1 ≺ s^3 but r^3 ≺ r^1. (b) Satisfies CO. (c) Satisfies CO; no send
events are related by causality. (d) Satisfies CO.
Causal Order: Definition from Implementation Perspective
CO alternate definition
If send(m^1) ≺ send(m^2), then for each common destination d of messages m^1 and
m^2, deliver_d(m^1) ≺ deliver_d(m^2) must be satisfied.
Message arrival vs. delivery:
- A message m that arrives in the OS buffer at P_i may have to be delayed until the
messages that were sent to P_i causally before m was sent (the “overtaken”
messages) have arrived.
- The event of an application processing an arrived message is referred to as a
delivery event (instead of as a receive event).
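The arrival/delivery distinction can be illustrated with a small sketch. It assumes a setting these slides do not fix at this point: every message is a broadcast to all n processes and carries a vector timestamp (the sender's count of its own broadcasts, plus the number of broadcasts from each other process it had delivered when it sent). This is the well-known vector-clock scheme for causal broadcast, used here only as an illustration; the class and method names are hypothetical.

```python
class CausalReceiver:
    """Delivery side of a vector-clock causal broadcast (illustrative sketch)."""

    def __init__(self, n):
        self.delivered = [0] * n  # delivered[j]: broadcasts from j delivered so far
        self.buffer = []          # messages that arrived but cannot be delivered yet
        self.log = []             # payloads in delivery order

    def _deliverable(self, sender, stamp):
        # Next broadcast from `sender`, and nothing it depends on is missing.
        return (stamp[sender] == self.delivered[sender] + 1 and
                all(stamp[k] <= self.delivered[k]
                    for k in range(len(stamp)) if k != sender))

    def on_arrival(self, sender, stamp, payload):
        """Buffer the arrived message, then deliver everything that became ready."""
        self.buffer.append((sender, stamp, payload))
        progress = True
        while progress:
            progress = False
            for item in list(self.buffer):
                s, st, p = item
                if self._deliverable(s, st):
                    self.buffer.remove(item)
                    self.delivered[s] += 1
                    self.log.append(p)  # this is the delivery event
                    progress = True


# P0 broadcasts m1; P1 delivers m1 and then broadcasts m2.
# At P2, m2 arrives first and is held back until m1 has arrived.
p2 = CausalReceiver(3)
p2.on_arrival(1, [1, 1, 0], "m2")
held_back = list(p2.log)          # nothing delivered yet
p2.on_arrival(0, [1, 0, 0], "m1")
```

After the second arrival, `p2.log` holds m1 followed by m2, so the delivery order respects causality even though the arrival order did not.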
No message is overtaken by a chain of messages between the same (sender,
receiver) pair. In Fig. 6.1(a), m^1 is overtaken by the chain 〈m^2, m^3〉.
CO degenerates to FIFO when m^1 and m^2 are sent by the same process.
Uses: updates to shared data, implementing distributed shared memory, fair
resource allocation; collaborative applications, event notification systems,
distributed virtual environments
Causal Order: Other Characterizations (2)
Empty-Interval (EI) property
(E, ≺) is an EI execution if for each (s, r) ∈ T, the open interval set
{x ∈ E | s ≺ x ≺ r} in the partial order is empty.
Fig 6.2(b): consider M^2. There is no event x such that s^2 ≺ x ≺ r^2. This holds for all messages ⇒ EI.
For an empty interval 〈s, r〉, there exists some linear extension (1) < such that the corresponding interval {x ∈ E | s < x < r} is also empty.
An empty 〈s, r〉 interval in a linear extension implies that s and r may be arbitrarily close; this is shown by a vertical arrow in a timing diagram.
An execution E is CO iff for each M, there exists some space-time diagram in which that message can be drawn as a vertical arrow.
(1) A linear extension of a partial order (E, ≺) is any total order (E, <) such that each ordering relation of the partial order is preserved.
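The EI property translates directly into a check on the happens-before relation. The sketch below assumes the same hypothetical input format as a list of per-process event sequences and a list of send-receive pairs; the function name and the example executions are illustrative only.

```python
def is_empty_interval(process_events, messages):
    """True iff, for every (s, r) in T, no event x satisfies s < x < r."""
    events = [e for proc in process_events for e in proc]
    # Happens-before: local order plus send -> receive, closed under transitivity.
    hb = {(a, b) for proc in process_events for a, b in zip(proc, proc[1:])}
    hb |= set(messages)
    for k in events:
        for a in events:
            if (a, k) in hb:
                for b in events:
                    if (k, b) in hb:
                        hb.add((a, b))
    return all(
        not any((s, x) in hb and (x, r) in hb for x in events)
        for s, r in messages
    )


# Hypothetical executions with three messages among P1, P2, P3.
T = [("s1", "r1"), ("s2", "r2"), ("s3", "r3")]
not_ei = [["s1", "s2"], ["r2", "s3"], ["r3", "r1"]]  # s1 < s2 < r1, so (s1, r1) is not empty
ei = [["s1", "s2"], ["r2", "s3"], ["r1", "r3"]]
```

In the first execution the interval between s1 and r1 contains the events of the chain m2, m3, so the check fails; in the second every interval is empty.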
Causal Order: Other Characterizations (3)
CO ⇏ all messages can be drawn as vertical arrows in the same space-time
diagram (otherwise all 〈s, r〉 intervals would be empty in the same linear extension,
which gives a synchronous execution).
Common Past and Future
An execution (E, ≺) is CO iff for each pair (s, r) ∈ T and each event e ∈ E,
Weak common past: e ≺ r ⇒ ¬(s ≺ e)
Weak common future: s ≺ e ⇒ ¬(e ≺ r)
If the pasts of s and r are identical (and analogously their futures), viz.,
e ≺ r ⇒ e ≺ s and s ≺ e ⇒ r ≺ e, we get a subclass of CO executions
called synchronous executions.
Synchronous Executions: Definition
Causality in a synchronous execution.
The synchronous causality relation ≪ on E is the smallest transitive relation that
satisfies the following.
S1. If x occurs before y at the same process, then x ≪ y.
S2. If (s, r) ∈ T, then for all x ∈ E, [(x ≪ s ⇐⇒ x ≪ r) and
(s ≪ x ⇐⇒ r ≪ x)].
S3. If x ≪ y and y ≪ z, then x ≪ z.
Synchronous execution (or S-execution).
An execution (E, ≪) for which the causality relation ≪ is a partial order.
Timestamping a synchronous execution.
An execution (E, ≺) is synchronous iff there exists a mapping from E to T (scalar
timestamps) such that
for any message M, T(s(M)) = T(r(M))
for each process P_i, if e_i ≺ e_i′ then T(e_i) < T(e_i′)
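The timestamp characterization suggests a constructive test: treat each send and its receive as one node, keep the local order as edges, and try to number the nodes in topological order. The sketch below is one possible way to do this, not a method taken from the slides; the input format (per-process event lists and a message dictionary) and all names are assumptions.

```python
from collections import defaultdict, deque


def sync_timestamps(process_events, messages):
    """Return scalar timestamps with T(send) == T(receive), or None if impossible.

    process_events: list of per-process event lists, each in local order.
    messages: dict mapping a message name to (send_event, receive_event).
    """
    # A send and its receive share one node; other events are their own node.
    node = {}
    for name, (s, r) in messages.items():
        node[s] = node[r] = name
    for proc in process_events:
        for e in proc:
            node.setdefault(e, e)

    succ = defaultdict(set)
    indeg = {n: 0 for n in set(node.values())}
    for proc in process_events:
        for a, b in zip(proc, proc[1:]):
            u, v = node[a], node[b]
            if v not in succ[u]:
                succ[u].add(v)
                indeg[v] += 1

    # Kahn's algorithm; timestamp = longest path length from a source.
    ts = {n: 1 for n in indeg}
    queue = deque(n for n, d in indeg.items() if d == 0)
    done = 0
    while queue:
        u = queue.popleft()
        done += 1
        for v in succ[u]:
            ts[v] = max(ts[v], ts[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if done < len(indeg):
        return None  # a cycle: no such timestamping exists
    return {e: ts[n] for e, n in node.items()}
```

For two messages sent in sequence from one process to another, the function returns equal timestamps for each send-receive pair; for two processes that each send before receiving the other's message, it returns `None`.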
Asynchronous Execution with Synchronous Communication
Will a program written for an asynchronous system (A-execution) run correctly if
run with synchronous primitives?
Process i        Process j
...              ...
Send(j)          Send(i)
Receive(j)       Receive(i)
...              ...
Figure 6.4: A-execution deadlocks when using synchronous primitives.
An A-execution that is realizable under synchronous communication is called a
realizable with synchronous communication (RSC) execution.
[Figure: space-time diagrams (a), (b), (c)]
Figure 6.5: Illustration of non-RSC A-executions.
Crown: Definition
Crown
Let E be an execution. A crown of size k in E is a sequence 〈(s^i, r^i), i ∈ {0, ..., k−1}〉
of pairs of corresponding send and receive events such that:
s^0 ≺ r^1, s^1 ≺ r^2, ..., s^(k−2) ≺ r^(k−1), s^(k−1) ≺ r^0.
Fig 6.5(a): crown is 〈(s^1, r^1), (s^2, r^2)〉, as we have s^1 ≺ r^2 and s^2 ≺ r^1.
Fig 6.5(b): crown is 〈(s^1, r^1), (s^2, r^2)〉, as we have s^1 ≺ r^2 and s^2 ≺ r^1.
Fig 6.5(c): crown is 〈(s^1, r^1), (s^3, r^3), (s^2, r^2)〉, as we have s^1 ≺ r^3, s^3 ≺ r^2, and s^2 ≺ r^1.
Fig 6.2(a): crown is 〈(s^1, r^1), (s^2, r^2), (s^3, r^3)〉, as we have s^1 ≺ r^2, s^2 ≺ r^3, and s^3 ≺ r^1.
Crown: Characterization of RSC Executions
Some observations
In a crown, s^i and r^(i+1) may or may not be on the same process
Non-CO execution must have a crown
CO executions (that are not synchronous) have a crown (see Fig 6.2(b))
Cyclic dependencies of crown ⇒ cannot schedule messages serially ⇒ not RSC
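The cyclic dependency behind a crown can be searched for directly. A minimal sketch, assuming a recorded execution is available as per-process event lists and a message dictionary: build a graph on messages with an edge from m to m′ whenever send(m) ≺ receive(m′), and look for a cycle, which by the definition of a crown corresponds to one. The names and input format are hypothetical.

```python
def has_crown(process_events, messages):
    """True iff the execution contains a crown.

    process_events: list of per-process event lists, each in local order.
    messages: dict mapping a message name to (send_event, receive_event).
    """
    events = [e for proc in process_events for e in proc]
    # Happens-before: local order plus send -> receive, transitively closed.
    hb = {(a, b) for proc in process_events for a, b in zip(proc, proc[1:])}
    hb |= set(messages.values())
    for k in events:
        for a in events:
            if (a, k) in hb:
                for b in events:
                    if (k, b) in hb:
                        hb.add((a, b))

    names = list(messages)
    # Edge m -> n iff send(m) happens before receive(n), for distinct messages.
    succ = {
        m: [n for n in names
            if n != m and (messages[m][0], messages[n][1]) in hb]
        for m in names
    }

    state = {m: 0 for m in names}  # 0 = unvisited, 1 = on stack, 2 = finished

    def visit(m):
        state[m] = 1
        for n in succ[m]:
            if state[n] == 1 or (state[n] == 0 and visit(n)):
                return True  # back edge: cycle, hence a crown
        state[m] = 2
        return False

    return any(state[m] == 0 and visit(m) for m in names)


# Two processes that each send before receiving the other's message.
crossed = [["s1", "r2"], ["s2", "r1"]]
msgs = {"m1": ("s1", "r1"), "m2": ("s2", "r2")}
```

For `crossed`, s1 ≺ r2 and s2 ≺ r1 give a crown of size 2, so `has_crown(crossed, msgs)` is true; two messages sent back to back in one direction yield no cycle.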
Hierarchy of Message Ordering Paradigms
[Figure: (a) Venn diagram of the nested classes SYNC, CO, FIFO, A; (b) example executions for A, FIFO, CO, SYNC]
Figure 6.7: Hierarchy of message ordering paradigms. (a) Venn diagram. (b) Example
executions.
An A-execution is RSC iff A is an S-execution.
RSC ⊂ CO ⊂ FIFO ⊂ A.
More restrictions on the possible message orderings in the smaller classes.
The degree of concurrency is highest in A and lowest in SYNC.
A program using synchronous communication is the easiest to develop and verify. A
program using non-FIFO communication, resulting in an A-execution, is the hardest
to design and verify.
Simulations: Async Programs on Sync Systems
RSC execution: schedule events as per a non-separated linear extension
  adjacent (s, r) events sequentially
  partial order of original A-execution unchanged
If A-execution is not RSC:
  partial order has to be changed; or
  model each channel C_{i,j} by a control process P_{i,j} and use sync communication (see Figure 6.8)
Enables decoupling of sender from receiver.
This implementation is expensive.
[Figure: processes P_i and P_j, control processes P_{i,j} and P_{j,i}, messages m and m′]
Figure 6.8: Modeling channels as processes to
simulate an execution using asynchronous primitives on a synchronous system.
Sync Program Order on Async Systems
Deterministic program: repeated runs produce same partial order
Deterministic receive ⇒ deterministic execution ⇒ (E , ≺) is fixed
Nondeterminism (besides due to unpredictable message delays):
Receive call does not specify sender
Multiple sends and receives enabled at a process; can be executed in
interchangeable order
*[G_1 → CL_1 || G_2 → CL_2 || · · · || G_k → CL_k]
Deadlock example of Fig 6.
If event order at a process is permuted, no deadlock!
How to schedule (nondeterministic) sync communication calls over async system?
Match send or receive with corresponding event
Binary rendezvous (implementation using tokens)
Token for each enabled interaction
Schedule online, atomically, in a distributed manner
Crown-free scheduling (safety); also progress to be guaranteed
Fairness and efficiency in scheduling
Bagrodia’s Algorithm for Binary Rendezvous (1)
Assumptions
Receives are always enabled
Send, once enabled, remains enabled
To break deadlock, PIDs used to introduce asymmetry
Each process schedules one send at a time
Message types: M, ack(M), request(M), permission(M)
Process blocks when it knows it can successfully synchronize the current message
[Figure: panels (a) and (b) showing processes P_i and P_j, labelled "higher priority" and "lower priority", exchanging M, ack(M), request(M), and permission(M)]
Fig 6.: Rules to prevent message cycles. (a) High priority process blocks. (b) Low
priority process does not block.