

Distributed Software Development

Consensus and Agreement

Chris Brooks

Department of Computer Science

University of San Francisco

Department of Computer Science — University of San Francisco

Previously on CS 682

Time is a big challenge in systems that don't share a clock.
Insight: we often don't need to know the exact time that events occur.
Instead, we need to know the order in which they happened.

Cause and Effect

Cause and effect can be used to produce a partial ordering.
Local events are ordered by identifier.
Send and receive events are ordered.
If p1 sends a message m1 to p2, send(m1) must occur before receive(m1).
Assume that messages are uniquely identified.
If two events do not influence each other, even indirectly, we won't worry about their order.

Happens before

The happens before relation is denoted →.
Happens before is defined:
If e_i^k and e_i^l are events in the same process and k < l, then e_i^k → e_i^l (sequentially ordered events in the same process).
If e_i = send(m) and e_j = receive(m), then e_i → e_j (send must come before receive).
If e → e' and e' → e'', then e → e'' (transitivity).
If e ↛ e' and e' ↛ e, then we say that e and e' are concurrent (e || e').
These events are unrelated, and could occur in either order.

Happens before

Happens before provides a partial ordering over the global history: (H, →).
We call this a distributed computation.
A distributed computation can be represented with a space-time diagram.

Space-time diagram

[Figure: a space-time diagram with horizontal timelines for p1, p2, and p3; each process p_i has events e_i^1, e_i^2, e_i^3, e_i^4 along its timeline.]

Space-time diagram

Arrows indicate messages sent between processes.
The causal relation between events is easy to detect:
Is there a directed path between events?
e_1^1 → e_3^4
e_1^2 || e_3^1
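The directed-path test above can be written directly as a graph search. The sketch below is my own illustration, not course code; event names like "e1@p1" are assumed for readability.

```python
# Detect the causal relation in a space-time diagram by searching for a
# directed path. Edges are program order within a process plus one edge
# per message, from its send event to its receive event.
from collections import defaultdict

def happens_before(edges, a, b):
    """Return True if there is a directed path from event a to event b."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    stack, seen = [a], set()
    while stack:
        node = stack.pop()
        if node == b:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return False

def concurrent(edges, a, b):
    # Concurrent: neither event can reach the other.
    return not happens_before(edges, a, b) and not happens_before(edges, b, a)
```

For example, with program-order edges on p1 and a message from e2@p1 to e1@p2, `happens_before` finds the path and `concurrent` reports unrelated events.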

Monitoring a distributed computation

Recall that we want to know what the global state of the system is at some point in time.
Active monitoring won't work:
Updates from different processes may arrive out of order.
We need to restrict our monitor to looking at consistent cuts.
A cut C is consistent if, for all events e and e':
e ∈ C and e' → e ⇒ e' ∈ C
In other words, we retain causal ordering and preserve the 'happens before' relation.
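The consistency condition above transcribes directly into a checker (the function name is mine). It is enough to check the immediate happens-before edges: a cut closed under direct predecessors is closed under all causal predecessors by induction.

```python
# Check the consistent-cut condition: e in C and e' -> e implies e' in C.
def is_consistent_cut(cut, edges):
    """cut: set of events; edges: iterable of (e_prev, e) with e_prev -> e."""
    return all(e_prev in cut for (e_prev, e) in edges if e in cut)
```

A cut containing a receive but not the matching send fails the check, since the send happens-before the receive.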

Synchronous communication

How could we solve this problem with synchronous communication and a global clock?
Assume FIFO delivery, with delays bounded by δ:
send(i) → send(j) ⇒ deliver(i) → deliver(j)
The receiver must buffer out-of-order messages.
Each event e is stamped with the global clock: RC(e).
When a process notifies p0 of event e, it includes RC(e) as a timestamp.
At time t, p0 can process all messages with timestamps up to t − δ in increasing order.
No earlier message can arrive after this point.
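The t − δ rule above can be sketched as a small buffering monitor. This is a toy illustration with names of my own choosing, not course code.

```python
# The monitor p0 buffers timestamped notifications and, at time t,
# delivers everything stamped up to t - delta, in increasing order.
import heapq

class Monitor:
    def __init__(self, delta):
        self.delta = delta
        self.buffer = []                  # min-heap of (RC(e), event)

    def notify(self, rc, event):
        heapq.heappush(self.buffer, (rc, event))

    def deliverable(self, t):
        """Pop and return all events with RC(e) <= t - delta, in order."""
        out = []
        while self.buffer and self.buffer[0][0] <= t - self.delta:
            out.append(heapq.heappop(self.buffer)[1])
        return out
```

Because delays are bounded by δ, nothing stamped at or before t − δ can still be in flight at time t, so this delivery order is safe.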

Why does this work?

If we assume a delay of δ, then at time t, all messages sent before t − δ have arrived.
By processing them in increasing order, causality is preserved:
e → e' ⇒ RC(e) < RC(e')
But we don't have a global clock!

Logical clocks

Each process maintains a logical clock (LC).
It maps events to natural numbers (0, 1, 2, 3, ...).
In the initial state, all LCs are 0.
Each message m contains a timestamp TS(m) indicating the logical clock of the sending process.
After each event, the logical clock for a process is updated as follows:
LC(e) = LC + 1 if e is a local or send event.
LC(e) = max(LC, TS(m)) + 1 if e = receive(m).
The LC is updated to be greater than both the previous clock and the timestamp.
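The update rules above fit in a few lines of code. This is a minimal sketch; the class and method names are my own, not from the course.

```python
# A minimal Lamport logical clock matching the slide's update rules.
class LamportClock:
    def __init__(self):
        self.time = 0  # in the initial state, all LCs are 0

    def local_event(self):
        # LC(e) = LC + 1 for a local event
        self.time += 1
        return self.time

    def send_event(self):
        # LC(e) = LC + 1 for a send; the new value is stamped on the message
        self.time += 1
        return self.time

    def receive_event(self, ts):
        # LC(e) = max(LC, TS(m)) + 1 for e = receive(m)
        self.time = max(self.time, ts) + 1
        return self.time
```

For example, if p1 sends at LC = 1 and p2 is still at LC = 0, p2's receive is stamped max(0, 1) + 1 = 2, so every receive is numbered after its send.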

Logical clock example

[Figure: a space-time diagram for p1, p2, p3 with each event labeled by its logical clock value; on every receive the clock jumps to exceed both the local clock and the incoming timestamp.]

Applying causal delivery

Causal delivery gives us almost all of the functionality that we need from a global clock.
We can build on top of this to solve more complex coordination problems.
Coordination often requires not only that all processes agree on state, but that all processes can ensure that every other process sees the same state.

Consensus and agreement

A fundamental problem in distributed systems is getting a set of processes or nodes to agree on one or more values.
Is a transaction continuing or aborted?
What value is stored in a distributed database?
Which process is serving as coordinator?
Has a node failed?
There is a set of related problems that require a set of processes to coordinate their states or actions.

Coordination via email

An example:
Two people (A and B) want to meet at dusk tomorrow evening at a local hangout.
Each wants to show up only if the other one will be there.
They can send email to each other, but email may not arrive.
Can either one guarantee that the other will be there?

Failure models

We'll want to distinguish what sorts of failures these algorithms can tolerate.
No failure: some of the algorithms we'll see can't tolerate any failure.
Crash failure: a node stops working and fails to respond to all messages.
Byzantine failure: a node can exhibit arbitrary behavior.
This makes things pretty hard for us ...

Failure detection

How can we detect whether a failure has happened?
A simple method:
Every t seconds, each process sends an "I am alive" message to all other processes.
Process p knows that process q is either unsuspected, suspected, or failed.
If p sees q's message, it knows q is alive, and sets its status to unsuspected.
What if it doesn't receive a message?
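The heartbeat scheme above can be sketched as follows. The class and status strings are my own illustration; a real detector would also run the periodic send loop.

```python
# Each process records the last time it heard from every peer; a peer is
# "unsuspected" while heartbeats are fresh and "suspected" once the
# timeout has passed without one.
import time

class HeartbeatDetector:
    def __init__(self, timeout):
        self.timeout = timeout          # max tolerated silence, in seconds
        self.last_heard = {}            # peer -> time of last "I am alive"

    def heartbeat(self, peer, now=None):
        # Record a received "I am alive" message from peer.
        self.last_heard[peer] = time.time() if now is None else now

    def status(self, peer, now=None):
        now = time.time() if now is None else now
        if peer not in self.last_heard:
            return "suspected"
        if now - self.last_heard[peer] <= self.timeout:
            return "unsuspected"
        return "suspected"
```

Note that this detector can only ever report "suspected", never "failed": as the next slide says, without a bound on message delay a silent peer might just be slow.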

Failure detection

This depends on our communication model.
Synchronous communication: if after d seconds (where d is the maximum delay in message delivery) we haven't received a message from p, then p has failed.
Asynchronous or unreliable communication: if the message is not received, we can only say that p is suspected of failure.

Failure detection

Other problems:
What if d is fairly large?
We can think processes are still running that have in fact crashed.
This is what's called an unreliable failure detector.
It will make mistakes, but, given enough information, it may still be of use.
It can provide hints and partial information.
As we look at different algorithms, we'll need to think about whether we can detect that a process has failed.

Multicast: a brief digression

The Coulouris chapter talks quite a bit about how to achieve different properties with multicast communication:
Reliable multicast
Ordered multicast: FIFO ordering, total ordering, causal ordering
The punchline: totally ordered multicast is equivalent to the consensus problem.

What is multicast?

Consider that a process needs to send a message to a group of other processes.
It could:
Send a point-to-point message to every other process. Inefficient, and it needs to know all other processes in the group.
Broadcast to all processes in the subnet. Wasteful, and it won't work in a wide-area network.
Multicast allows the process to do a single send. The packet is delivered to all members of the group.

Multicast groups

Notice that multicast is a packet-oriented communication.
Same send/receive semantics as UDP.
A process joins a multicast group (designated by an IP address).
It then receives all messages sent to that IP address.
Groups can be closed or open.
Multicast can be effectively used for shared whiteboards, video or audio conferencing, or to broadcast speeches or presentations.
Middleware is needed to provide ordering.

Mutual exclusion

Mutual exclusion is a familiar problem from operating systems.
There is some resource that is shared by several processes.
Only one process can use the resource at a time.
Examples: a shared file, database, or communications medium.
Processes request to enter their critical section, then enter, then exit.
In a centralized system, this can be negotiated with shared objects (locks or mutexes).
Distributed systems rely only on message passing!

Mutual exclusion

Our goals for mutual exclusion:
Safety: only one process uses the resource at a time.
Liveness: everyone eventually gets a turn. This implies no deadlock or starvation.
Ordering: if process i's request to enter its CS happens-before (in the causal sense) process j's, then process i should enter first.

Mutual exclusion: multicast

Example: consider p1, p2, p3, with T(p2) < T(p1).
p1 and p2 request the CS; p3 doesn't need the CS.
p3 replies immediately to both.
When p2 gets p1's request, it queues it (p2's own request carries the earlier timestamp).
p1 replies to p2 immediately.
Once p2 exits, it replies to p1.
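The scenario above matches a Ricart-Agrawala-style protocol. The following toy replay is my own illustration; the class and method names are assumptions, not from the slides.

```python
# Each process replies to a request immediately unless it holds an
# earlier outstanding request of its own, in which case it defers the
# reply until it exits the critical section.
class Process:
    def __init__(self, pid):
        self.pid = pid
        self.requesting = False
        self.timestamp = None
        self.deferred = []      # requests queued until we exit the CS
        self.replies = set()    # pids we have received replies from

    def request_cs(self, ts):
        self.requesting = True
        self.timestamp = ts

    def on_request(self, sender, ts):
        # Defer if our own request is earlier; ties break on pid.
        if self.requesting and (self.timestamp, self.pid) < (ts, sender.pid):
            self.deferred.append(sender)
        else:
            sender.replies.add(self.pid)

    def exit_cs(self):
        self.requesting = False
        for other in self.deferred:
            other.replies.add(self.pid)
        self.deferred = []

# Replay of the slide's scenario, with T(p2) < T(p1):
p1, p2, p3 = Process(1), Process(2), Process(3)
p1.request_cs(ts=5)
p2.request_cs(ts=3)
p3.on_request(p1, 5); p3.on_request(p2, 3)  # p3 replies to both at once
p2.on_request(p1, 5)    # p2 queues p1's later request
p1.on_request(p2, 3)    # p1 replies to p2 immediately
# p2 now has replies from p1 and p3 and may enter; p1 still waits
p2.exit_cs()            # on exit, p2 replies to p1
```

After `exit_cs`, p1 holds replies from both other processes and may enter in turn, which is exactly the ordering the logical clocks were introduced to provide.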

Mutual exclusion: multicast

Provides liveness and safety.
Also provides ordering.
That's the reason for logical clocks.
Still can't deal with failure.
Also has scaling problems.
Optimization: a process can enter the CS when a majority of replies are received.

Dealing with failure

If a failure occurs, it must first be detected.
As we've seen, this can be difficult.
Once a failure is detected, a new group can be formed and the protocol restarted.
Group formation involves a two-phase protocol:
The coordinator broadcasts a group change to all members.
Once all reply, a commit is broadcast to all members.
Once all members reply to the commit, a new group is formed.

Election algorithms

How can we decide which process should play the role of server or coordinator?
We need all processes to agree.
We can do this by means of an election.
Any process can start an election, for example if it notices that the coordinator has failed.
We would still like safety (only one process is chosen) and liveness (the election process is guaranteed to find a winner).
This should hold even when more than one election is started simultaneously.

Choosing a leader

Assume each process has an identifying value.
The largest value will be the new leader.
We could use load, or uptime, or a random number.

Ring-based election algorithms

Assume processes are arranged in a logical ring.
A process starts an election by placing its identifier and value in a message and sending it to its neighbor.

Ring-based election algorithms

When a message is received:
If the value is greater than its own, the process saves the identifier and forwards the value to its neighbor.
Else if the receiver's value is greater and the receiver has not participated in an election already, it replaces the identifier and value with its own and forwards the message.
Else if the receiver has already participated in an election, it discards the message.
If a process receives its own identifier and value, it knows it is elected. It then sends an elected message to its neighbor.
When an elected message is received, it is forwarded to the next neighbor.
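The rules above describe a Chang-Roberts-style election. The single-election sketch below is my own illustration (the discard rule only matters with concurrent elections, which it omits).

```python
# Each process forwards the largest (value, identifier) pair seen so far;
# the process whose own value makes it all the way around is elected.
def ring_election(values, starter=0):
    """values[i] is process i's value; returns the index of the winner."""
    n = len(values)
    msg = (starter, values[starter])          # (identifier, value)
    pos = (starter + 1) % n
    while True:
        ident, val = msg
        if pos == ident:
            return ident                      # own value came back: elected
        if values[pos] > val:
            msg = (pos, values[pos])          # replace with own id/value
        # else: forward the message unchanged
        pos = (pos + 1) % n
```

The winner is always the process holding the largest value, since only that value survives a full trip around the ring.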

Ring-based election algorithms

Safety is guaranteed: only one value can be largest and make it all the way through the ring.
Liveness is guaranteed if there are no failures.
Inability to handle failure once again ...

Bully algorithm

The bully algorithm can deal with crash failures.
Assumption: synchronous, reliable communication.
When a process notices that the coordinator has failed, it sends an election message to all higher-numbered processes.
If no one replies, it declares itself coordinator and sends a new-coordinator message to all processes.
If someone replies, its job is done.
When process q receives an election message from a lower-numbered process, it:
Returns a reply.
Starts an election.
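The hand-up structure of the bully algorithm can be sketched as follows. This is a deliberately simplified illustration of my own: a real implementation sends messages and uses timeouts where this version just inspects the set of live processes.

```python
# A process that hears no reply from any live higher-numbered process
# declares itself coordinator; otherwise the election is handed upward.
def bully_election(alive, starter):
    """alive: set of live process ids. Returns the elected coordinator."""
    candidate = starter
    while True:
        higher = [p for p in alive if p > candidate]
        if not higher:
            # no higher-numbered process replies: declare self coordinator
            return candidate
        # a higher process replies and takes over; ultimately the highest
        # live process is the one that hears no replies
        candidate = max(higher)
```

Whatever process starts the election, the highest-numbered live process wins, which is what makes the algorithm safe under crash failures.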

Bully algorithm

Guarantees safety and liveness.
Can deal with crash failures.
Assumes that there is a bounded message delay.
Otherwise, how can we distinguish between a crash and a long delay?

Consensus

All of these algorithms are examples of the consensus problem:
All processes must agree on a state.
Let's take a step back and think about when the consensus problem can be solved.

Consensus

We'll start with a set of processes p1, p2, ..., pn.
All processes can propose a value, and everyone must agree at the end.
We'll assume that communication is reliable.
Processes can fail: both Byzantine and crash failures.
We'll also specify whether processes can digitally sign messages.
This limits the damage Byzantine failures can do.
We'll specify whether communication is synchronous or asynchronous.

Summarizing

We can survive fewer than 1/3 of processes suffering Byzantine failures in a synchronous system with reliable delivery.
In an asynchronous system, we can't guarantee consensus after even a single crash failure.
Without reliable communication, consensus is impossible to guarantee.
In general, we can trade off process reliability for network reliability.

Summary

Consensus can take a number of forms:
Mutual exclusion
Leader election
Consensus
Many special-purpose algorithms exist.
General results about what is possible can help in designing a system or deciding how (or whether) to tackle a problem.