
CS 258

Parallel Computer Architecture

Lecture 1

Introduction to Parallel Architecture

January 23, 2008

Prof John D. Kubiatowicz


Computer Architecture Is …

the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation.

Amdahl, Blaauw, and Brooks, 1964


The Instruction Set: a Critical Interface

[Figure: the instruction set as the interface between software (above) and hardware (below)]

Properties of a good abstraction

Lasts through many generations (portability)

Used in many different ways (generality)

Provides convenient functionality to higher levels

Permits an efficient implementation at lower levels

Changes very slowly! (Although this is increasing)

Is there a solid interface for multiprocessors?

No standard hardware interface


What is Parallel Architecture?

A parallel computer is a collection of processing elements that cooperate to solve large problems

Some broad issues:

Models of computation: PRAM? BSP? Sequential Consistency?

Resource Allocation:

how large a collection?

how powerful are the elements?

how much memory?

Data access, Communication and Synchronization

how do the elements cooperate and communicate?

how are data transmitted between processors?

what are the abstractions and primitives for cooperation?

Performance and Scalability

how does it all translate into performance?

how does it scale?


Topologies of Parallel Machines

Symmetric Multiprocessor

Multiple processors in a box with shared-memory communication

Current multicore chips are like this

Every processor runs a copy of the OS

Non-uniform shared memory with separate I/O through host

Multiple processors

Each with local memory

General scalable network

Extremely lightweight "OS" on each node provides simple services

Scheduling/synchronization

Network-accessible host for I/O

Cluster

Many independent machines connected with a general network

Communication through messages

[Figures: bus-based SMP (four processors sharing a bus and memory); grid of processor/memory (P/M) nodes on a scalable network with a host for I/O]
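Not from the slides: a minimal Python sketch of the two cooperation styles above, with a hypothetical counting workload. Threads in one address space update shared state under a lock (shared-memory style), while separate processes exchange explicit messages on a queue (message-passing/cluster style).

    import threading
    import multiprocessing as mp

    # Shared memory (SMP-style): threads in one address space coordinate through a lock.
    def shared_memory_demo(n_threads=4, increments=10_000):
        total = [0]                                  # shared mutable state
        lock = threading.Lock()                      # synchronization primitive
        def worker():
            for _ in range(increments):
                with lock:                           # cooperate via shared memory
                    total[0] += 1
        threads = [threading.Thread(target=worker) for _ in range(n_threads)]
        for t in threads: t.start()
        for t in threads: t.join()
        return total[0]

    # Message passing (cluster-style): separate address spaces, explicit messages on a queue.
    def mp_worker(increments, queue):
        queue.put(increments)                        # send a partial count as a message

    def message_passing_demo(n_procs=4, increments=10_000):
        queue = mp.Queue()
        procs = [mp.Process(target=mp_worker, args=(increments, queue)) for _ in range(n_procs)]
        for p in procs: p.start()
        partials = [queue.get() for _ in procs]      # receive one message per process
        for p in procs: p.join()
        return sum(partials)

    if __name__ == "__main__":
        print(shared_memory_demo(), message_passing_demo())   # both print 40000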


Conventional Wisdom (CW) in Computer Architecture

Old CW: Power is free, but transistors expensive

New CW is the "Power wall": Power is expensive, but transistors are "free"

Can put more transistors on a chip than we have the power to turn on

Old CW: Only concern is dynamic power

New CW: For desktops and servers, static power due to leakage is 40% of total power

Old CW: Monolithic uniprocessors are reliable internally, with errors occurring only at pins

New CW: As chips drop below 65 nm feature sizes, they will have high soft and hard error rates


Conventional Wisdom (CW) in Computer Architecture

Old CW: By building upon prior successes, continue raising level of abstraction and size of HW designs

New CW: Wire delay, noise, cross coupling, reliability, clock jitter, design validation, … stretch development time and cost of large designs at ≤ 65 nm

Old CW: Researchers demonstrate new architectures by building chips

New CW: Cost of 65 nm masks, cost of ECAD, and design time for GHz clocks ⇒ researchers no longer build believable chips

Old CW: Performance improves latency & bandwidth

New CW: BW improves > (latency improvement)²


Conventional Wisdom (CW) in Computer Architecture

Old CW: Multiplies slow, but loads and stores fast

New CW is the "Memory wall": Loads and stores are slow, but multiplies fast

200 clocks to DRAM, but even FP multiplies take only 4 clocks

Old CW: We can reveal more ILP via compilers and architecture innovation

Branch prediction, OOO execution, speculation, VLIW, …

New CW is the "ILP wall": Diminishing returns on finding more ILP

Old CW: 2X CPU performance every 18 months

New CW is: Power Wall + Memory Wall + ILP Wall = Brick Wall


Déjà vu all over again?

Multiprocessors imminent in 1970s, '80s, '90s, …

"… today's processors … are nearing an impasse as technologies approach the speed of light …"

David Mitchell, The Transputer: The Time Is Now (1989)

Transputer was premature ⇒ Custom multiprocessors strove to lead uniprocessors

Procrastination rewarded: 2X seq. perf. / 1.5 years

"We are dedicating all of our future product development to multicore designs. … This is a sea change in computing"

Paul Otellini, President, Intel (2004)

Difference is all microprocessor companies switch to multiprocessors (AMD, Intel, IBM, Sun; all new Apples 2 CPUs) ⇒ Procrastination penalized: 2X sequential perf. / 5 yrs

Biggest programming challenge: 1 to 2 CPUs


CS258: Information

Instructor: Prof John D. Kubiatowicz

Office: 673 Soda Hall

Phone: 643-

Email: kubitron@cs.berkeley.edu

Office Hours: Wed 1:00 - 2:00 or by appt.

Class: Mon, Wed 2:30-4:00pm, 310 Soda Hall

Web page: http://www.cs/~kubitron/courses/cs258-S08/

Lectures available online before noon on the day of lecture

Email: cs258@kubi.cs.berkeley.edu

Clip signup link on web page (as soon as it is up)


Computer Architecture Topics (252+)

[Figure: course topic map]

Pipelining and Instruction Level Parallelism: instruction set architecture; pipelining, hazard resolution, superscalar, reordering, prediction, speculation, vector, dynamic compilation; addressing, protection, exception handling

Memory Hierarchy: L1 cache, L2 cache, DRAM; coherence, bandwidth, latency; emerging technologies, interleaving, bus protocols; VLSI

Input/Output and Storage: disks, WORM, tape; RAID

Network: communication with other processors


Computer Architecture Topics (258)

[Figure: processors (P) and memories (M) connected by an interconnection network, with switches (S)]

Processor-Memory-Switch: multiprocessors; networks and interconnections; programming models / communication styles; reliability / fault tolerance; everything in the previous slide, but more so!

Interconnection Network: shared memory, message passing, data parallelism, transactional memory, checkpoint/restart, network interfaces, topologies, routing, bandwidth, latency, reliability


What will you get out of CS258?

In-depth understanding of the design and engineering of modern parallel computers

technology forces

Programming models

fundamental architectural issues

naming, replication, communication, synchronization

basic design techniques

cache coherence, protocols, networks, pipelining, …

methods of evaluation

from moderate to very large scale

across the hardware/software boundary

Study of REAL parallel processors

Research papers, white papers

Natural consequences??

Massive Parallelism

Reconfigurable computing?

Message Passing Machines

NOW

Peer-to-peer systems?


Will it be worthwhile?

Absolutely!

Now, more than ever, industry is trying to figure out how to build these new multicore chips….

The fundamental issues and solutions translate across a wide spectrum of systems.

Crisp solutions in the context of parallel machines.

Pioneered at the thin end of the platform pyramid on the most-demanding applications

migrate downward with time

Understand implications for software

Network-attached storage, MEMS, etc?

[Figure: platform pyramid — SuperServers, Departmental Servers, Workstations, Personal Computers]


Why Study Parallel Architecture?

Role of a computer architect:

To design and engineer the various levels of a computer system to maximize performance and programmability within limits of technology and cost.

Parallelism:

Provides alternative to faster clock for performance

Applies at all levels of system design

Is a fascinating perspective from which to view architecture

Is increasingly central in information processing

How is instruction-level parallelism related to coarse-grained parallelism??


Is Parallel Computing Inevitable?

This was certainly not clear just a few years ago. Today, however: YES!

Industry is desperate for solutions!

Application demands: Our insatiable need for computing cycles

Technology Trends: Easier to build

Architecture Trends: Better abstractions

Current trends:

Today's microprocessors are multiprocessors and/or have multiprocessor support

Servers and workstations becoming MP: Sun, SGI, DEC, COMPAQ!...


Textbook: Two leaders in the field

Text: Parallel Computer Architecture: A Hardware/Software Approach

By: David Culler & Jaswinder Singh

Covers a range of topics; we will not necessarily cover them in order.


How will grading work?

No TA this term!

Rough breakdown:

20% Paper Summaries/Presentations

30% One Midterm

40% Research Project (work in pairs)

meet 3 times with me to see progress

give oral presentation

give poster session

written report like conference paper

6 weeks of full-time work for 2 people

Opportunity to do "research in the small" to help make the transition from good student to research colleague

10% Class Participation


Application Trends

Application demand for performance fuels advances in hardware, which enables new applications, which...

Cycle drives exponential increase in microprocessor performance

Drives parallel architecture harder

most demanding applications

Programmers willing to work really hard to improve high-end applications

Need incremental scalability:

Need range of system performance with progressively increasing cost

[Figure: cycle — more performance enables new applications, which demand more performance]


Metrics of Performance

[Figure: levels of the system — application, programming language, compiler, ISA, datapath, control, function units, transistors/wires/pins — with the performance metric used at each level]

Metrics, from top to bottom: answers per month; operations per second; (millions) of instructions per second: MIPS; (millions) of (FP) operations per second: MFLOP/s; megabytes per second; cycles per second (clock rate)

And What about: Programmability, Reliability, Energy?
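A hedged illustration (not part of the lecture) of how two of the rate metrics above are computed; the instruction count, FP-operation count, and run time below are made-up numbers.

    def mips(instruction_count, seconds):
        # (millions) of instructions per second
        return instruction_count / seconds / 1e6

    def mflops(fp_op_count, seconds):
        # (millions) of floating-point operations per second
        return fp_op_count / seconds / 1e6

    print(mips(3e9, 2.5), mflops(8e8, 2.5))   # 1200.0 MIPS, 320.0 MFLOP/s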


Speedup

Speedup(p processors) = Time(1 processor) / Time(p processors)

Common mistake:

Compare parallel program on 1 processor to parallel program on p processors

Wrong!:

Should compare uniprocessor program on 1 processor to parallel program on p processors

Why? Keeps you honest

It is easy to parallelize overhead.
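A small sketch, not from the slides, of the honest measurement the slide asks for: the baseline is the uniprocessor program, not the parallel program run on one processor. The kernel, names, and problem sizes are hypothetical.

    import time
    from multiprocessing import Pool

    def work(n):                                    # hypothetical kernel
        return sum(i * i for i in range(n))

    def sequential(chunks):                         # best uniprocessor program: no parallel overhead
        return [work(n) for n in chunks]

    def parallel(chunks, p):                        # parallel program on p processors
        with Pool(p) as pool:
            return pool.map(work, chunks)

    def timed(fn, *args):
        t0 = time.perf_counter()
        fn(*args)
        return time.perf_counter() - t0

    if __name__ == "__main__":
        chunks = [200_000] * 16
        t1 = timed(sequential, chunks)              # Time(1 processor): uniprocessor baseline
        tp = timed(parallel, chunks, 4)             # Time(p processors)
        print("speedup =", t1 / tp)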


Amdahl's Law

Speedup due to enhancement E:

Speedup(E) = ExTime(without E) / ExTime(with E) = Performance(with E) / Performance(without E)

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected
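For this setup the standard closed form is Speedup(E) = 1 / ((1 − F) + F/S); a tiny check with illustrative numbers (not from the slides):

    def amdahl_speedup(F, S):
        # Speedup(E) = 1 / ((1 - F) + F/S): fraction F of the task sped up by factor S
        return 1.0 / ((1.0 - F) + F / S)

    print(amdahl_speedup(0.8, 10))    # enhance 80% of the task by 10x -> ~3.57x overall, not 10x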


Amdahl's Law for parallel programs?

ExTime_parallel(p, stuff) = ExTime_ser × [ (1 − Fraction_par) + Fraction_par / p ] + Overhead_par(p, stuff)

Speedup_parallel(p, stuff) = ExTime_ser / ExTime_parallel(p, stuff)

Best you could ever hope to do:

Speedup_maximum = 1 / (1 − Fraction_par)

Worse: Overhead may kill your performance!
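A sketch of the reconstructed formula above as code; the parallel fraction, processor count, and overhead ratio are illustrative numbers, not from the slides.

    def parallel_speedup(fraction_par, p, overhead_ratio=0.0):
        # Denominator is the parallel execution time as a fraction of the serial time:
        # serial part + parallel part spread over p processors + overhead.
        new_time = (1.0 - fraction_par) + fraction_par / p + overhead_ratio
        return 1.0 / new_time

    print(parallel_speedup(0.95, 64))          # ~15.4x, far below 64x
    print(parallel_speedup(0.95, 64, 0.05))    # 5% overhead (of serial time) drops it to ~8.7x
    print(1.0 / (1.0 - 0.95))                  # Speedup_maximum = 20x as p -> infinity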


Where is Parallel Arch Going?

Old view: Divergent architectures, no predictable pattern of growth.

[Figure: application software and system software sitting atop divergent architectures — SIMD, message passing, shared memory, dataflow, systolic arrays]

Uncertainty of direction paralyzed parallel software development!


Granularity:

Is communication fine or coarse grained?

Small messages vs big messages

Is parallelism fine or coarse grained?

Small tasks (frequent synchronization) vs big tasks

If hardware handles fine-grained parallelism, then it is easier to get incremental scalability

Fine-grained communication and parallelism are harder than coarse-grained:

Harder to build with low overhead

Custom communication architectures often needed

Ultimate coarse-grained communication:

GIMPS (Great Internet Mersenne Prime Search)

Communication once a month
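An illustrative sketch (not from the slides) of the granularity trade-off: the same hypothetical work split into many tiny tasks or a few large chunks, where the tiny tasks pay the scheduling/communication overhead far more often.

    from multiprocessing import Pool

    def work(n):                               # hypothetical unit of work
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        items = [5_000] * 4_000
        with Pool(4) as pool:
            fine = pool.map(work, items, chunksize=1)      # fine-grained: one task per item
            coarse = pool.map(work, items, chunksize=500)  # coarse-grained: 500 items per task
        assert fine == coarse                  # same answer; the coarse version pays far less overhead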


Current Commercial Computing targets

Relies on parallelism for high end

Computational power determines scale of business that can be handled

Databases, online transaction processing, decision support, data mining, data warehousing ...

Google, Yahoo, ….

TPC benchmarks (TPC-C order entry, TPC-D decision support)

Explicit scaling criteria provided

Size of enterprise scales with size of system

Problem size not fixed as p increases.

Throughput is performance measure (transactions per minute or tpm)


Scientific Computing Demand


Engineering Computing Demand

Large parallel machines are a mainstay in many industries

Petroleum (reservoir analysis)

Automotive (crash simulation, drag analysis, combustion efficiency),

Aeronautics (airflow analysis, engine efficiency, structural mechanics, electromagnetism),

Computer-aided design

Pharmaceuticals (molecular modeling)

Visualization

in all of the above

entertainment (films like Toy Story)

architecture (walk-throughs and rendering)

Financial modeling (yield and derivative analysis)

etc.


Can anyone afford high-end MPPs???

ASCI (Accelerated Strategic Computing Initiative)

ASCI White: Built by IBM

12.3 TeraOps, 8192 processors (RS/6000)

6 TB of RAM, 160 TB of disk

2 basketball courts in size

Program it??? Message passing


Need new class of applications

Handheld devices with ManyCore processors!

Great potential, right?

Human interface applications very important: "The Laptop/Handheld is the Computer"

'07: HP number of laptops > desktops

1B+ cell phones/yr, increasing in function

Otellini demoed "Universal Communicator" (combination cell phone, PC, and video device)

Apple iPhone

User wants increasing performance, weeks or months of battery power


Applications: Speech and Image Processing

[Figure: processing requirements from 1 MIPS to 10 GIPS, 1980–1995 — 200-word sub-band speech coding, telephone number recognition, CELP speech coding, isolated speech recognition, speaker verification, ISDN-CD stereo receiver, 1,000-word and 5,000-word continuous speech recognition, CIF video, HDTV receiver]

Also CAD, Databases,...


Compelling Laptop/Handheld Apps

Meeting Diarist

Laptops/handhelds at a meeting coordinate to create a speaker-identified, partially transcribed text diary of the meeting

Teleconference speaker identifier, speech helper

L/Hs used for teleconference, identifies who is speaking, "closed caption" hint of what is being said


Why Target 100+ Cores?

5-year research program aims 8+ years out

Multicore: 2X / 2 yrs ⇒ 64 cores in 8 years

Manycore: 8X to 16X multicore

[Figure: projected core counts, 1 to 512, from 2003 to 2015; automatic parallelization and thread-level speculation noted on the multicore curve]
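The arithmetic behind the projection, as a hedged sketch; the 4-core starting point below is an assumption, not stated on the slide.

    def projected_cores(base_cores, years, doubling_period_years=2):
        # Core count if it doubles every doubling_period_years.
        return base_cores * 2 ** (years // doubling_period_years)

    print(projected_cores(4, 8))        # hypothetical 4-core baseline -> 64 cores in 8 years
    print(projected_cores(4, 8) * 8)    # a manycore design at 8x that multicore count -> 512 cores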


4 Themes of View 2.0/ Par Lab

Applications

Compelling apps drive top-down research agenda

Identify Common Computational Patterns

Breaking through disciplinary boundaries

Developing Parallel Software with Productivity, Efficiency, and Correctness

2 Layers + Coordination & Composition Language + Autotuning

OS and Architecture

Composable primitives, not packaged solutions

Deconstruction, Fast barrier synchronization, Partitions


Par Lab Research Overview

Easy to write correct programs that run efficiently on manycore

[Figure: Par Lab research stack — Applications (Personal Health, Image Retrieval, Hearing/Music, Speech, Parallel Browser) and Motifs/Dwarfs at the top; Productivity Layer: Composition & Coordination Language (C&CL), C&CL Compiler/Interpreter, Parallel Libraries, Parallel Frameworks, Sketching; Efficiency Layer: Efficiency Languages, Efficiency Language Compilers, Legacy Code, Schedulers, Communication & Synch. Primitives, Autotuners; Correctness: Static Verification, Type Systems, Directed Testing, Dynamic Checking, Debugging with Replay; OS: OS Libraries & Services, Legacy OS, Hypervisor; Arch.: Multicore/GPGPU, RAMP Manycore]


How do compelling apps relate to 13 motifs/dwarfs?

[Figure: "Motifs" popularity heat map (red = hot, blue = cool) across application areas — Embed, SPEC, DB, Games, ML, HPC, Health, Image, Speech, Music, Browser]

The 13 motifs/dwarfs:

1 Finite State Mach.
2 Combinational
3 Graph Traversal
4 Structured Grid
5 Dense Matrix
6 Sparse Matrix
7 Spectral (FFT)
8 Dynamic Prog
9 N-Body
10 MapReduce
11 Backtrack/B&B
12 Graphical Models
13 Unstructured Grid


Developing Parallel Software

2 types of programmers

2 layers

Efficiency Layer

(10% of today’s programmers)

Expert programmers build Frameworks & Libraries, Hypervisors, …

"Bare metal" efficiency possible at Efficiency Layer

Productivity Layer

(90% of today's programmers)

Domain experts / naïve programmers productively build parallel apps using frameworks & libraries

Frameworks & libraries composed to form app frameworks

Effective composition techniques allow the efficiency programmers to be highly leveraged

Create language for Composition and Coordination (C&C)


Architectural Trends

Greatest trend in VLSI generation is increase in parallelism

Up to 1985: bit-level parallelism: 4-bit -> 8-bit -> 16-bit

slows after 32 bit

adoption of 64-bit now under way, 128-bit far off (not a performance issue)

great inflection point when 32-bit micro and cache fit on a chip

Mid 80s to mid 90s: instruction-level parallelism

pipelining and simple instruction sets, + compiler advances (RISC)

on-chip caches and functional units => superscalar execution

greater sophistication: out of order execution, speculation, prediction

to deal with control transfer and latency problems

Next step: ManyCore.

Also: Thread level parallelism? Bit-level parallelism?


Can ILP go any farther?

[Figures: fraction of total cycles (%) vs. number of instructions issued; speedup vs. instructions issued per cycle]

Infinite resources and fetch bandwidth, perfect branch prediction and renaming

real caches and non-zero miss latencies


Thread-Level Parallelism "on board"

[Figure: number of processors in fully configured commercial shared-memory systems]

Micro on a chip makes it natural to connect many to shared memory

dominates server and enterprise market, moving down to desktop

Alternative: many PCs sharing one complicated pipe

Faster processors began to saturate bus, then bus technology advanced

today, range of sizes for bus-based systems, desktop to large servers

[Figure: bus-based multiprocessor — four processors (Proc) sharing one memory (MEM)]