C952 Computer Architecture Study guide Review
1. Active matrix display: display using a transistor to control the transmission of light
at each individual pixel.
2. Address: value used to delineate the location of a specific data element within a memory array
3. Address Interleaving: Instead of just a faster row buffer, the DRAM can be internally
organized to read or write from multiple banks, with each having its own row buffer. Sending an address to several banks permits them all to read or write simultaneously. For example, with four banks, there is just one access time and then accesses rotate between the four banks to supply four times the bandwidth.
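As a rough illustration of the four-bank example above, the sketch below assumes low-order interleaving, where consecutive addresses rotate across banks. The function name and the mapping are illustrative assumptions, not a description of any particular DRAM.

```python
def bank_and_index(address, num_banks=4):
    """Map a word address to (bank, index within that bank).

    Assumes low-order interleaving: consecutive addresses fall in
    consecutive banks, so sequential accesses rotate between banks.
    """
    return address % num_banks, address // num_banks


if __name__ == "__main__":
    for addr in range(8):
        bank, index = bank_and_index(addr)
        print(f"address {addr} -> bank {bank}, index {index}")
```

Addresses 0 through 3 land in four different banks, so all four can be accessed at the same time.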
4. Address mapping: Another term for address translation
5. Addressing mode: one of several addressing regimes delimited by their varied use of
operands and/or addresses
6. Address translation: The process by which a virtual address is mapped to an
address used to access memory
7. Arithmetic Logic Unit (ALU): Hardware that performs addition, subtraction, and
usually logical operations such as AND and OR
8. Aliasing: A situation in which two addresses access the same object; it can occur in
virtual memory when there are two virtual addresses for the same physical page
9. Amdahl's Law: A formula used to find the maximum improvement possible by
improving a particular part of a system. In parallel computing, it is mainly used to predict the theoretical maximum speedup for program processing using multiple processors
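A minimal sketch of Amdahl's Law for the parallel-computing case, assuming a fraction of the work is perfectly parallelizable and the rest stays serial. The function and parameter names are illustrative.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Theoretical speedup when only `parallel_fraction` of the work
    (a value between 0 and 1) is sped up by `processors`."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # With 90% parallel work, the speedup stays below 10 however many
    # processors are added, because the 10% serial part dominates.
    for n in (2, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```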
10. AND: A logical bit-by-bit operation with two operands that calculates a 1 only if
there is a 1 in both operands
11. anti dependence/name dependence: An ordering forced by the reuse of a name,
typically a register, rather than by a true dependence that carries a value between two instructions.
12. Application Binary Interface (ABI): The user portion of the instruction set plus the
operating system interfaces used by application programmers. It defines a standard for binary portability across computers.
13. architectural registers: The instruction set of visible registers of a processor; for
example, in LEGv8, these are the 32 integer and 32 floating-point registers.
14. Assembly Language: symbolic language that can be translated into binary
machine language
19. Base/Displacement Addressing: operand is at the memory location whose
address is the sum of a register and a constant in the instruction
20. Basic Block: a sequence of instructions without branches (except possibly at the end) and without
branch targets or branch labels (except possibly at the beginning)
21. Biased notation: A notation that represents the most negative value by 00...000 (base 2)
and the most positive value by 11...11 (base 2), with 0 typically having the value 10...00 (base 2), thereby biasing the number such that the number plus the bias has a non-negative representation
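The encoding described above can be sketched as follows, assuming an 8-bit field and a bias of 2^(bits-1) so that 0 maps to 10...00. The helper names are hypothetical, and real formats may use a different bias.

```python
def to_biased(value, bits=8):
    """Encode a signed integer by adding the bias 2**(bits - 1)."""
    bias = 1 << (bits - 1)
    encoded = value + bias
    if not 0 <= encoded < (1 << bits):
        raise ValueError("value does not fit in the biased field")
    return encoded


def from_biased(encoded, bits=8):
    """Decode by subtracting the same bias."""
    return encoded - (1 << (bits - 1))


if __name__ == "__main__":
    for v in (-128, 0, 127):
        print(v, format(to_biased(v), "08b"))
```

Because every encoded value is non-negative, biased numbers can be compared as plain unsigned integers.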
22. Big Endian: A CPU or memory architecture in which the most significant byte is
stored at the lowest memory address.
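To make the byte ordering concrete, the sketch below lays out one 32-bit value in both orders using Python's built-in `int.to_bytes`. The example value is arbitrary.

```python
value = 0x12345678

# Big endian: most significant byte (0x12) comes first, i.e. at the
# lowest address.
big = value.to_bytes(4, byteorder="big")

# Little endian, for contrast: least significant byte (0x78) first.
little = value.to_bytes(4, byteorder="little")

if __name__ == "__main__":
    print(big.hex())     # 12345678
    print(little.hex())  # 78563412
```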
23. Binary Digit: one of the two numbers in base 2, 1 and 0
24. Block (or line): The minimum unit of information that can be either present or not
present in a cache
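25. Blocking: An optimization that restructures a computation to work on cache-sized sub-blocks of data so that data is reused while it is still in the cache; can help reduce cache miss rate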
26. Branch Address Table: table of addresses of alternative instruction sequences
27. Branch history table: Also called branch prediction buffer
28. Branch not taken/untaken branch: A branch where the branch condition is false
and the program counter (PC) becomes the address of the instruction that sequentially follows the branch
29. Branch Prediction: A method of resolving a branch hazard that assumes a given
outcome for the conditional branch and proceeds from that assumption rather than waiting to ascertain the actual outcome.
30. Branch prediction buffer: A small memory that is indexed by the lower portion of
the address of the branch instruction and that contains one or more bits indicating whether the branch was recently taken or not
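A minimal sketch of such a buffer, assuming one prediction bit per entry, 4-byte instructions, and 16 entries. The class name and sizes are illustrative, and because there are no tags, two branches that share the same low-order address bits share one entry.

```python
class BranchPredictionBuffer:
    """One-bit predictor indexed by the low-order bits of the branch address."""

    def __init__(self, entries=16):
        self.entries = entries
        self.taken_bits = [False] * entries  # initially predict "not taken"

    def _index(self, branch_address):
        # Drop the 2 byte-offset bits (assumes 4-byte instructions),
        # then keep only the low-order bits that select an entry.
        return (branch_address >> 2) % self.entries

    def predict(self, branch_address):
        return self.taken_bits[self._index(branch_address)]

    def update(self, branch_address, taken):
        # Remember the most recent outcome for this entry.
        self.taken_bits[self._index(branch_address)] = taken


if __name__ == "__main__":
    bpb = BranchPredictionBuffer()
    print(bpb.predict(0x40))  # no history yet
    bpb.update(0x40, True)
    print(bpb.predict(0x40))  # now predicts taken
```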
31. Branch taken: A branch where the branch condition is satisfied and the program
counter (PC) becomes the branch target. All unconditional branches are taken branches
32. Branch target address: The address specified in a branch, which becomes the new
program counter (PC) if the branch is taken. In the LEGv8 architecture, this is given by the sum of the offset field of the instruction and the address of the branch
33. Branch target buffer: A structure that caches the destination PC or destination
instruction for a branch. It is usually organized as a cache with tags, making it more costly than a simple prediction buffer
34. Branch-and-link instruction: An instruction that branches to an address and
37. Cache miss: A request for data from the cache that cannot be filled because the
data are not present in the cache
38. cache ready signal: set in the Compare Tag state if requested read or write is a hit
39. Callee: A procedure that executes a series of stored instructions based on
parameters provided by the caller and then returns control to the caller.
40. Caller: The program that instigates a procedure and provides the necessary
parameter values.
41. Capacity miss: A cache miss that occurs because the cache, even with full
associativity, cannot contain all the blocks needed to satisfy the request
42. CDC 6600: This system is widely considered to have been the first supercomputer.
It was also the first load-store architecture
43. Clocking methodology: The approach used to determine when data is valid and
stable relative to the clock
44. Cluster: A set of computers connected over a local area network that function as a
single large multiprocessor
45. Coarse-grained multithreading: implies switching between threads only after
significant events, such as a last-level cache miss
46. Coherence: ensures that a read of a data item returns the most recently written
value of that data item. It defines what values can be returned by a read
47. Combinational element: An operational element, such as an AND gate or an ALU
48. commit unit: The unit in a dynamic or out-of-order execution pipeline that decides
when it is safe to release the result of an operation to programmer-visible registers and memory.
49. complementary metal-oxide semiconductor (CMOS): Dominant technology for
integrated circuits
50. Compulsory miss: Also called cold-start miss. A cache miss caused by the first
access to a block that has never been in the cache
51. Conditional Branch: An instruction that tests a value and allows for a subsequent transfer of
control to a new address in the program based on the outcome of the test
52. Conflict miss: Also called collision miss. A cache miss that occurs in a set-associative
or direct-mapped cache when multiple blocks compete for the same set; such misses are eliminated in a fully associative cache of the same size
53. context switch: A changing of the internal state of the processor to allow a different
process to use the processor that includes saving the state needed to return to the
55. Control Hazard/Branch Hazard: When the proper instruction cannot execute in the
proper pipeline clock cycle because the instruction that was fetched is not the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected
56. Control signal: Used for multiplexor selection or for directing the operation of a
functional unit
57. Coprocessor: an additional chip that accelerates a portion of the work of a
processor; in this case, it accelerated floating-point computation
58. Correlating predictor: Combines local behavior of a particular branch and global
information about the behavior of some recent number of executed branches
59. CPU Time Formula: (Instructions) x (CPI) x (Clock Cycle Time)
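The formula above translates directly into code. The sketch below uses made-up numbers (one billion instructions, a CPI of 2.0, and a 0.25 ns clock cycle) purely for illustration.

```python
def cpu_time_seconds(instruction_count, cpi, clock_cycle_time_s):
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    return instruction_count * cpi * clock_cycle_time_s


if __name__ == "__main__":
    # 1e9 instructions x 2.0 cycles each x 0.25 ns per cycle: about 0.5 s
    print(cpu_time_seconds(1_000_000_000, 2.0, 0.25e-9))
```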
60. Data hazard: When a planned instruction cannot execute in the proper clock cycle
because data that is needed to execute the instruction are not yet available
61. data race: two memory accesses form a data race if they are from different threads to
the same location, at least one is a write, and they occur one after another
62. Data signal: contains information that is operated on by a functional unit
63. Data transfer instruction: A command that moves data between memory and
registers.
64. Data-level parallelism: Parallelism achieved by performing the same operation on
independent data
65. Datapath: The component of the processor that performs arithmetic operations
66. Datapath element: A unit used to operate on or hold data within a processor
67. Deasserted: The signal is logically low or false
68. dependent variable: symbol representing an output
69. Design Principle 1: Simplicity favors regularity
70. Design principle 2: Smaller is faster
71. Design Principle 3: Good design demands good compromises
72. destination register: register that receives the result of an operation
73. Die: The individual rectangular sections that are cut from a wafer, more informally
known as chips.
74. Digital Equipment Corporation (DEC): A major American company in the
computer industry from the 1950s to the 1990s
75. Direct-mapped cache: Each memory location is mapped to exactly one location in the
cache. Only one choice of what to replace.
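A minimal sketch of direct mapping, assuming the common scheme in which the cache index is the block address modulo the number of blocks and the remaining upper bits serve as the tag. The class name and the 8-block size are illustrative.

```python
class DirectMappedCache:
    """Tracks only tags, to show where each block goes and what it evicts."""

    def __init__(self, num_blocks=8):
        self.num_blocks = num_blocks
        self.tags = [None] * num_blocks  # None marks an empty entry

    def access(self, block_address):
        """Return True on a hit; on a miss, replace the single candidate entry."""
        index = block_address % self.num_blocks
        tag = block_address // self.num_blocks
        hit = self.tags[index] == tag
        self.tags[index] = tag  # only one possible location, so no choice
        return hit


if __name__ == "__main__":
    cache = DirectMappedCache()
    for block in (5, 5, 13, 5):
        print(block, "hit" if cache.access(block) else "miss")
```

Blocks 5 and 13 both map to index 5 in an 8-block cache, so each one evicts the other.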
79. Don't Care Term: An element of a logical function in which the output does not
depend on the values of all the inputs. Don't-care terms may be specified in different ways
80. double precision: A floating-point value represented in a 64-bit doubleword.
81. Doubleword: Another natural unit of access in a computer, usually a group of 64
bits; corresponds to the size of a register in the LEGv8 architecture
82. Dynamic branch prediction: Prediction of branches at runtime using runtime
information
83. dynamic multiple issue: An approach to implementing a multiple-issue processor
where many decisions are made during execution by the processor.
84. dynamic pipeline scheduling: Hardware support for reordering the order of
instruction execution to avoid stalls.
85. Dynamically Linked Libraries (DLL): Library routines that are linked to a
program during execution
86. Edge-triggered clocking: A clocking scheme in which all state changes occur on a
clock edge
87. EOR: exclusive or, 1 if the two values are different from one another
88. Error detection code: A code that enables the detection of an error in data, but not
the precise location and, hence, correction of the error
89. Exception: Also called interrupt. An unscheduled event that disrupts program
execution; used to detect overflow
90. Exception enable: A signal or action that controls whether the process responds to
an exception or not; necessary for preventing the occurrence of exceptions during intervals before the processor has safely saved the state needed to restart
91. Exception Link Register (ELR): 64-bit. Used to hold the address of the affected
instruction. (needed even when exceptions are vectored.)
92. Exception Syndrome Register or ESR: Used to record the cause of the exception. In
the LEGv8 architecture, this is 32 bits, although some bits are currently unused. Assume there is a field that encodes the three possible exception sources mentioned above, with 8 representing an undefined instruction, 10 representing arithmetic overflow or underflow, and 12 representing hardware malfunction
93. Executable file: A functional program in the format of an object file that contains no
unresolved references. It can contain symbol tables and debugging information. A "stripped executable" does not contain that information.
97. FADDS, FSUBS: Single-precision arithmetic
98. false sharing: When two unrelated shared variables are located in the same cache
block and the full block is exchanged between processors even though the processors are accessing different variables.
99. FCMPS, FCMPD: Single- and double-precision comparison
100. fields: a machine instruction is composed of fields, each having several bits and
representing some part of the instruction
101. Fine-grained multithreading: implies switching between threads after every
instruction
102. Finite-state machine: A sequential logic function consisting of a set of inputs and
outputs, a next-state function that maps the current state and the inputs to a new state, and an output function that maps the current state and possibly the inputs to a set of asserted outputs.
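As an illustration of the next-state and output functions described above, the sketch below models a hypothetical two-state parity checker. The state names, tables, and function are assumptions made for this example; here the output depends on the current state only.

```python
# Next-state function: (current state, input bit) -> new state
NEXT_STATE = {
    ("even", 0): "even",
    ("even", 1): "odd",
    ("odd", 0): "odd",
    ("odd", 1): "even",
}

# Output function: asserted (1) while an odd number of 1s has been seen
OUTPUT = {"even": 0, "odd": 1}


def run_fsm(bits, state="even"):
    """Feed a sequence of input bits through the machine, collecting outputs."""
    outputs = []
    for bit in bits:
        state = NEXT_STATE[(state, bit)]
        outputs.append(OUTPUT[state])
    return outputs


if __name__ == "__main__":
    print(run_fsm([1, 0, 1, 1]))  # [1, 1, 0, 1]
```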
103. Flash memory: A nonvolatile semiconductor memory. It is cheaper and slower than
DRAM but more expensive per bit and faster than magnetic disks. Access times are about 5 to 50 microseconds and cost per gigabyte in 2012 was $0.75 to $1.00.
104. Floating point: Computer arithmetic that represents numbers in which the
binary point is not fixed
105. Flush: To discard instructions in a pipeline, usually due to an unexpected event
106. forwarding (bypassing): A method of resolving a data hazard by retrieving the
missing data element from internal buffers rather than waiting for it to arrive from programmer-visible registers or memory
107. fraction: the value, generally between 0 and 1, placed in the fraction field; also known as the
mantissa
108. frame buffering: A portion of RAM containing a bitmap that drives a video
display. It is a memory buffer containing a complete frame of data.
109. Frame Pointer: value denoting the location of the saved registers and local variables
for a given procedure
110. Fully associative cache: structure in which a block can be placed in any
location in the cache
111. fused multiply add: A floating-point instruction that performs both a multiply and
an add, but rounds only once after the add.
112. Guard: the first of two extra bits kept on the right during intermediate floating-point
calculations; used to improve rounding accuracy
116. Hardware multithreading: Increasing utilization of a processor by switching to
another thread when one thread is stalled
117. Hexadecimal: Numbers in base 16 (0-9 then A-F)
118. High-level-language computer architecture: Proposed in the 1960s, this failed to
make much of a commercial impact. Better compilers and programming languages, and growing memory sizes led to its demise.
119. Hit rate: The fraction of memory accesses found in a level of the memory
hierarchy
120. Hit time: The time required to access a level of the memory hierarchy, including the
time needed to determine whether the access is a hit or a miss
121. Hot-swapping: Replacing a hardware component while the system is running
122. IBM 360/91: Introduced many new concepts, including dynamic detection of
memory hazards, generalized forwarding, and reservation stations.
123. IBM 7030: Produced with the goal of being 100 times faster than the previous IBM
124. immediate addressing: operand is a constant within the instruction itself
125. Implementation: Hardware that obeys the architecture abstraction
126. imprecise interrupt: Also called imprecise exception. Interrupts or exceptions in
pipelined computers that are not associated with the exact instruction that was the cause of the interrupt or exception. The unpopularity of these led to the standard of commit units in dynamically scheduled pipelined processors
127. in order commit: A commit in which the results of pipelined execution are
written to the programmer visible state in the same order that instructions are fetched.
128. independent variable: symbol representing an input
129. Ingot: A rod composed of a silicon crystal that is between 8 and 12 inches in
diameter and about 12 to 24 inches long.
130. Instruction decode (ID): Pull apart the instruction, set up the operation in the ALU,
and compute the source and destination operand addresses
131. Instruction Fetch (IF): Move instruction from memory to the control unit
132. Instruction format: A form of representation of an instruction composed of
fields of binary numbers
133. Instruction level parallelism: The parallelism among instructions.
134. Instruction set: The vocabulary of commands understood by a given architecture.