
















Cao notes for student docent useful to preparation exams
MIPS is an implementation of a RISC architecture (MIPS R2000 ISA).
Designed for use with high-level programming languages
 o small set of instructions and addressing modes, easy for compilers
Minimize/balance the amount of work (computation and data flow) per instruction
 o allows for parallel execution
Load-store machine
 o large register set, minimizes main memory access
Fixed instruction width (32 bits), small set of uniform instruction encodings
 o minimizes control complexity, allows for more registers
MIPS Instructions
MIPS instructions fall into the following classes:
 o Arithmetic/logical/shift/comparison
 o Control instructions (branch and jump)
 o Load/store
 o Other (exception, register movement to/from GP registers, etc.)
Three instruction encoding formats:
 o R-type (6-bit opcode, 5-bit rs, 5-bit rt, 5-bit rd, 5-bit shamt, 6-bit function code)
 o I-type (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate)
 o J-type (6-bit opcode, 26-bit pseudo-direct address)
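The field widths listed above can be checked with a small decoder. This is only a sketch: the function name `decode_fields` and the dictionary layout are made up for illustration, and the example word is the standard encoding of `add $s0, $t0, $t1` (registers 16, 8 and 9, function code 0x20).

```python
def decode_fields(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into the fields of the three formats.

    All fields are extracted unconditionally; which ones are meaningful
    depends on the format (R, I or J) selected by the opcode.
    """
    word &= 0xFFFFFFFF                    # keep exactly 32 bits
    return {
        "opcode": (word >> 26) & 0x3F,    # bits 31..26, common to all formats
        "rs": (word >> 21) & 0x1F,        # bits 25..21 (R and I)
        "rt": (word >> 16) & 0x1F,        # bits 20..16 (R and I)
        "rd": (word >> 11) & 0x1F,        # bits 15..11 (R only)
        "shamt": (word >> 6) & 0x1F,      # bits 10..6  (R only)
        "funct": word & 0x3F,             # bits 5..0   (R only)
        "immediate": word & 0xFFFF,       # bits 15..0  (I only)
        "address": word & 0x3FFFFFF,      # bits 25..0  (J only)
    }


# R-type example: add $s0, $t0, $t1
fields = decode_fields(0x01098020)
print(fields["opcode"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```

The widths add up to 32 bits in every format: 6+5+5+5+5+6 for R-type, 6+5+5+16 for I-type and 6+26 for J-type.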
Basic MIPS Implementation
For all instructions, the first two steps are common:
 o Send the program counter (PC) to the memory that contains the code and fetch the instruction from that memory.
 o Read one or two registers, using the fields of the instruction.
Fig: An abstract view of the implementation of the MIPS subset showing the major functional units and the major connections between them.
After these two steps, the actions required to complete the instruction depend on the instruction class. Three commonly used instruction classes are
The result from the ALU or memory is written back into the register file. Branches require the use of the ALU output to determine the next instruction address. The thick lines interconnecting the functional units represent buses, which consist of multiple signals.
Figure B shows the datapath of Figure A with the three required multiplexors added, as well as control lines for the major functional units. A control unit, which takes the instruction as an input, determines how to set the control lines for the functional units and two of the multiplexors. The regularity and simplicity of the MIPS instruction set mean that a simple decoding process can be used to determine how to set the control lines.
 o The top multiplexor ("Mux") controls what value replaces the PC (PC + 4 or the branch destination address). It is controlled by the gate that ANDs together the Zero output of the ALU and a control signal that indicates whether the instruction is a branch.
 o The middle multiplexor, whose output returns to the register file, steers either the output of the ALU (for an arithmetic-logical instruction) or the output of the data memory (for a load) to the register file for writing.
 o The bottommost multiplexor determines whether the second ALU input comes from the registers (for an arithmetic-logical instruction or a branch) or from the offset field of the instruction.
The added control lines are straightforward: they determine the operation performed by the ALU, whether the data memory should read or write, and whether the registers should perform a write operation.
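The three multiplexors described above can be modelled as plain selection functions. This is a minimal sketch, not a hardware description: the helper names `mux` and `datapath_selects` and their parameter names are assumptions chosen for illustration.

```python
def mux(select: int, in0: int, in1: int) -> int:
    """Two-input multiplexor: returns in1 when select is 1, otherwise in0."""
    return in1 if select else in0


def datapath_selects(pc, branch_target, alu_zero, branch,
                     alu_src, mem_to_reg,
                     read_data2, sign_ext_imm, alu_result, mem_data):
    """Return (next_pc, second_alu_input, register_write_data)."""
    # Top mux: the AND gate of Branch and the ALU Zero output picks the next PC.
    pc_src = branch & alu_zero
    next_pc = mux(pc_src, pc + 4, branch_target)
    # Bottom mux: second ALU input is a register value or the offset field.
    alu_in2 = mux(alu_src, read_data2, sign_ext_imm)
    # Middle mux: the register file receives the ALU result or the loaded data.
    write_data = mux(mem_to_reg, alu_result, mem_data)
    return next_pc, alu_in2, write_data


# A branch whose operands compare equal (Zero = 1) redirects the PC.
print(datapath_selects(pc=100, branch_target=200, alu_zero=1, branch=1,
                       alu_src=0, mem_to_reg=0,
                       read_data2=7, sign_ext_imm=25, alu_result=0, mem_data=0))
```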
The data path is separated into five pieces, with each piece named corresponding to a stage of instruction execution:
The following figure shows the read and write operations done during one stage (here, the instruction decode stage). It shows that the stage reads its input from the IF/ID register and, once the stage has finished executing, writes its results into the ID/EX register.
Note: The read and write operations are shaded differently.
The other stages follow the same read/write pattern with respect to their own execution and pipeline registers.
In this section we add control to the pipelined datapath. The PC is written on each clock cycle, so there is no separate write signal for the PC. There are no separate write signals for the pipeline registers (IF/ID, ID/EX, EX/MEM, and MEM/WB) either, since they are also written during each clock cycle. Each control line is associated with a component that is active in only a single pipeline stage, so we can divide the control lines into five groups according to the pipeline stage.
1. Instruction fetch: The control signals to read instruction memory and to write the PC are always asserted, so there is nothing special to control in this pipeline stage.
2. Instruction decode/register file read: As in the previous stage, the same thing happens at every clock cycle, so there are no optional control lines to set.
3. Execution/address calculation: The signals to be set are RegDst, ALUOp, and ALUSrc. They select the Result register, the ALU operation, and either Read data 2 or a sign-extended immediate for the ALU.
4. Memory access: The control lines set in this stage are Branch, MemRead, and MemWrite. The branch equal, load, and store instructions set these signals, respectively. Recall that PCSrc in the figure selects the next sequential address unless control asserts Branch and the ALU result was 0.
5. Write-back: The two control lines are MemtoReg, which decides between sending the ALU result or the memory value to the register file, and RegWrite, which writes the chosen value.
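The grouping of control lines by stage can be written down as a lookup table. The signal names come from the list above; the per-instruction values are the usual textbook settings for the R-type, lw, sw and beq subset and are included here as an assumption, with `None` standing for a don't-care. The names `CONTROL` and `stage_controls` are made up for this sketch.

```python
# Control values grouped by the pipeline stage that consumes them.
# IF and ID need no optional control lines, so only EX, MEM and WB appear.
CONTROL = {
    "R-type": {
        "EX":  {"RegDst": 1, "ALUOp": 0b10, "ALUSrc": 0},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 0},
        "WB":  {"RegWrite": 1, "MemtoReg": 0},
    },
    "lw": {
        "EX":  {"RegDst": 0, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 1, "MemWrite": 0},
        "WB":  {"RegWrite": 1, "MemtoReg": 1},
    },
    "sw": {
        "EX":  {"RegDst": None, "ALUOp": 0b00, "ALUSrc": 1},
        "MEM": {"Branch": 0, "MemRead": 0, "MemWrite": 1},
        "WB":  {"RegWrite": 0, "MemtoReg": None},
    },
    "beq": {
        "EX":  {"RegDst": None, "ALUOp": 0b01, "ALUSrc": 0},
        "MEM": {"Branch": 1, "MemRead": 0, "MemWrite": 0},
        "WB":  {"RegWrite": 0, "MemtoReg": None},
    },
}


def stage_controls(instruction: str, stage: str) -> dict:
    """Return the control lines one instruction class needs in one stage."""
    return CONTROL[instruction][stage]


print(stage_controls("lw", "MEM"))
```

In a pipelined design these values are generated once during decode and carried along in the pipeline registers, each group being dropped after the stage that uses it.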
"A pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion. Pipelining is an implementation technique in which multiple instructions are overlapped in execution."
Example: laundry
The non-pipelined approach to laundry would be as follows.
Stages:
1. Place one dirty load of clothes in the washer.
The total execution time is 1400 ps with pipelining, against 2400 ps with the non-pipelined approach.
According to the ideal formula:
Time taken (pipelined) = time taken (non-pipelined) / number of stages = 2400 ps / 5 = 480 ps
In practice the result is 1400 ps, because the pipeline takes time to fill and only a few instructions are executed. The theoretical speedup is reached, or nearly reached, only when the number of instructions in pipelined execution is large enough.
Pipelining improves performance by increasing instruction throughput.
Designing instruction sets for pipelining
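The 1400 ps and 2400 ps totals can be reproduced with a short calculation. The text does not state the instruction count or the stage times, so the values below (three instructions, 800 ps per non-pipelined instruction, a 200 ps pipeline clock) are assumptions chosen because they match those totals; the function names are also illustrative.

```python
def nonpipelined_time(n_instructions: int, instr_time_ps: int) -> int:
    """Each instruction runs to completion before the next one starts."""
    return n_instructions * instr_time_ps


def pipelined_time(n_instructions: int, n_stages: int, stage_time_ps: int) -> int:
    """The first instruction needs n_stages cycles; each later one adds one cycle."""
    return (n_stages + n_instructions - 1) * stage_time_ps


# Assumed values: 3 instructions, 800 ps per instruction, 5 stages of 200 ps.
print(nonpipelined_time(3, 800))      # 2400
print(pipelined_time(3, 5, 200))      # 1400

# With many instructions the fill time becomes negligible.
n = 1_000_000
speedup = nonpipelined_time(n, 800) / pipelined_time(n, 5, 200)
print(round(speedup, 3))
```

Under these assumed numbers the speedup approaches 800 / 200, the ratio of the non-pipelined instruction time to the pipeline clock. That ratio equals the number of stages only when the stages are perfectly balanced.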
There are situations in pipelining when the next instruction cannot execute in the following clock cycle. These events are called hazards. There are 3 types of hazards:
 o Structural hazard: A planned instruction cannot execute in the proper clock cycle because the hardware does not support the combination of instructions that are set to execute.
 o Data hazard / pipeline data hazard: A planned instruction cannot execute in the proper clock cycle because data that is needed to execute the instruction is not yet available.
 o Control hazard / branch hazard: The instruction cannot execute in the proper pipeline clock cycle because the instruction that was fetched is not the one that is needed; that is, the flow of instruction addresses is not what the pipeline expected.
Structural hazard
The hardware cannot support the combination of instructions that we want to execute in the same clock cycle. Assume that we had a single memory instead of two memories. If the pipeline had a fourth instruction, then in the same clock cycle the first instruction would be accessing data from memory while the fourth instruction is fetching an instruction from that same memory. Without two memories, our pipeline could have a structural hazard.
Data hazards
Data hazards occur when the pipeline must be stalled because one step must wait for another to complete. They arise when the execution of an instruction depends on data provided by another instruction that is still in the pipeline. For example, suppose we have an add instruction followed immediately by a subtract instruction:
    add $s0, $t0, $t1
    sub $t2, $s0, $t
The add instruction doesn't write its result until the fifth stage, meaning that we would have to waste three clock cycles in the pipeline. To resolve the data hazard, as soon as the ALU creates the sum for the add instruction, it can be supplied as an input for the subtract, before it is written into $s0. This is called forwarding or bypassing.
The figure below shows the forwarding path from the output of the EX stage of add to the input of the EX stage of sub, replacing the value of register $s0 read in the second stage of sub.
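The forwarding path can be sketched as a selection between the value read from the register file and the ALU result held in the EX/MEM pipeline register. The condition used here (the earlier instruction writes a register, that register is not register 0, and it matches the source being read) is the usual textbook test and is an assumption, as are the function and parameter names.

```python
def forward_operand(src_reg: int, regfile_value: int,
                    ex_mem_reg_write: bool, ex_mem_rd: int,
                    ex_mem_alu_result: int) -> int:
    """Pick the ALU operand for the instruction currently in EX.

    If the previous instruction is about to write the register we are
    reading, take its ALU result straight from the EX/MEM register
    instead of the stale value read from the register file in ID.
    """
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return ex_mem_alu_result
    return regfile_value


# add $s0, $t0, $t1 with $t0 = 5 and $t1 = 7 produces 12 in EX.
# The following sub read a stale $s0 (assumed 0 here) during its ID stage.
S0 = 16  # register number of $s0
operand = forward_operand(src_reg=S0, regfile_value=0,
                          ex_mem_reg_write=True, ex_mem_rd=S0,
                          ex_mem_alu_result=12)
print(operand)  # 12
```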
Fig: Pipeline showing stalling on every conditional branch as a solution to control hazards. This example assumes the conditional branch is taken, and the instruction at the destination of the branch is the OR instruction. There is a one-stage pipeline stall, or bubble, after the branch. In reality, the process of creating a stall is slightly more complicated.
The cost of this option is too high for most computers, so a second solution to the control hazard is to predict. This option does not slow down the pipeline when the prediction is correct. One simple approach is to always predict that branches will not be taken. If the prediction is right, the pipeline proceeds at full speed. The pipeline stalls only when a branch is taken (the prediction was wrong).
In loops, branches are predicted to be taken because they jump back to the top of the loop. In conditional statements, branches are predicted to be not taken because they jump forward with respect to the program flow.
There are two types of branch prediction.
Static branch prediction
 o Based on the typical behavior of branching statements.
 o Example: loop and if-statement branches
   loop: predict taken
   if: predict not taken
Dynamic branch prediction
 o Hardware measures the actual behavior of branches, e.g., it records the recent history of each branch.
 o It assumes future behavior will continue the trend and refers to the stored history.
 o When the history gives a wrong prediction, the pipeline stalls while re-fetching from the correct path, and the history is updated accordingly.
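Both prediction styles can be sketched in a few lines. The static rule below predicts backward branches (loops) taken and forward branches (if-statements) not taken. The dynamic predictor keeps one bit of history per branch address, which is only the simplest possible choice and not necessarily what any real processor uses. All names are illustrative.

```python
def static_predict(branch_pc: int, target_pc: int) -> bool:
    """Static rule: a backward branch (loop) is predicted taken."""
    return target_pc < branch_pc


class OneBitPredictor:
    """Dynamic predictor that remembers the last outcome of each branch."""

    def __init__(self):
        self.history = {}                 # branch address -> last outcome

    def predict(self, pc: int) -> bool:
        # With no recorded history, fall back to "not taken".
        return self.history.get(pc, False)

    def update(self, pc: int, taken: bool) -> None:
        self.history[pc] = taken


def count_mispredictions(predictor, pc: int, outcomes) -> int:
    """Replay a branch's actual outcomes and count the wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(pc) != taken:
            misses += 1                   # the pipeline would stall and re-fetch
        predictor.update(pc, taken)
    return misses


# A loop branch taken four times, then falling through on exit.
loop_outcomes = [True, True, True, True, False]
print(static_predict(branch_pc=0x40, target_pc=0x20))                  # True
print(count_mispredictions(OneBitPredictor(), 0x40, loop_outcomes))    # 2
```

The one-bit predictor misses on the first iteration (no history yet) and on the loop exit, which is why longer loops benefit most from recorded history.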
Consider this sequence:
    sub $2, $1, $3      # Register $2 written by sub
    and $12, $2, $5     # 1st operand ($2) depends on sub
    or  $13, $6, $2     # 2nd operand ($2) depends on sub
    add $14, $2, $2     # 1st ($2) & 2nd ($2) depend on sub
    sw  $15, 100($2)    # Base ($2) depends on sub
The last four instructions all depend on the result that the first instruction writes to register $2. Register $2 had the value 10 before the subtract instruction and -20 after it. The programmer intends that -20 will be used in the following instructions that refer to register $2.
Fig: Pipelined dependences in a five-instruction sequence, using simplified datapaths to show the dependences.
The diagram shows how each instruction depends on the first instruction (sub) and the result it stores in register $2. The and and or instructions would get the incorrect value 10 (the assumed value of $2 before sub executes). The instructions that would get the correct value of -20 are add and sw, since both read $2 in or after clock cycle 5 (CC5).
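Which instructions see the stale value can be worked out by comparing the cycle in which each one reads its registers (ID) with the cycle in which sub writes $2 (WB). The sketch below assumes one instruction enters the pipeline per cycle with no stalls, and that the register file is written in the first half of a cycle and read in the second half, so a read in the same cycle as the write gets the new value. The helper names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_cycle(instr_index: int, stage: str) -> int:
    """Clock cycle (1-based) in which instruction number instr_index
    (0-based, issued one per cycle) occupies the given stage."""
    return instr_index + STAGES.index(stage) + 1


program = ["sub", "and", "or", "add", "sw"]
write_cycle = stage_cycle(0, "WB")        # sub writes $2 in CC5


def reads_new_value(instr_index: int) -> bool:
    # Assumption: write in first half of the cycle, read in second half.
    return stage_cycle(instr_index, "ID") >= write_cycle


stale = [name for i, name in enumerate(program)
         if i > 0 and not reads_new_value(i)]
fresh = [name for i, name in enumerate(program)
         if i > 0 and reads_new_value(i)]
print(stale)  # ['and', 'or']
print(fresh)  # ['add', 'sw']
```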
Introducing a stall when forwarding fails
Consider an example where the first instruction is a load, and the second instruction depends on it and needs the loaded value in its EX stage. Forwarding is then not possible, because data cannot be forwarded backward in time. The following diagram shows the need for a stall.
As was done before to detect the data hazard, the source registers in the IF/ID register are compared with the destination register in the ID/EX register; if the register numbers match, a stall is introduced.
Steps to introduce the stall:
 o Force the control values in the ID/EX register to 0. This inserts a NOP (no-operation): the EX, MEM and WB stages do nothing. In this case the stall is introduced at the second instruction.
 o Prevent the update of the PC and the IF/ID register (in this case the 3rd instruction will not be loaded immediately).
 o The same instruction is decoded again (the second instruction in the given example) and the following instruction is fetched again (the 3rd instruction in the given example).
 o The first instruction has now moved on to the MEM stage, so its result can be forwarded to the second instruction.
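The detection and the stall steps above can be sketched as follows. The exact condition (a load in ID/EX whose destination register rt matches either source register of the instruction in IF/ID) is the usual textbook form and is an assumption here, as are all function and field names.

```python
def load_use_hazard(id_ex_mem_read: bool, id_ex_rt: int,
                    if_id_rs: int, if_id_rt: int) -> bool:
    """True when the instruction in ID needs the result of a load in EX."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def stall_actions(hazard: bool, decoded_controls: dict) -> dict:
    """Return what the pipeline does this cycle.

    On a hazard: zero the control values going into ID/EX (a NOP bubble)
    and hold the PC and IF/ID register so the same instructions are
    decoded and fetched again in the next cycle.
    """
    if hazard:
        return {
            "id_ex_controls": {name: 0 for name in decoded_controls},
            "pc_write": False,
            "if_id_write": False,
        }
    return {
        "id_ex_controls": dict(decoded_controls),
        "pc_write": True,
        "if_id_write": True,
    }


# A load into register 2 followed by an instruction that reads register 2.
hazard = load_use_hazard(id_ex_mem_read=True, id_ex_rt=2, if_id_rs=2, if_id_rt=5)
print(hazard)  # True
print(stall_actions(hazard, {"RegWrite": 1, "MemRead": 0, "MemWrite": 0}))
```

After the one-cycle bubble the load is in its MEM stage, so the loaded value can be forwarded to the dependent instruction when it reaches EX.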