Assume a classic five-stage, single-pipeline microarchitecture (fetch, decode, execute, memory, write back). Also assume (1) one branch delay slot (originally holding a NOP instruction), (2) adequate hardware resources, and (3) that the branch calculation always takes place in the Decode stage. The MULTI instruction is fully pipelined and takes two execution cycles. Consider the following loop:

Loop:   LW      R3, 0(R6)      ; load the word at memory address [R6] + 0 into R3
        LW      R1, 0(R3)      ; load the word at memory address [R3] + 0 into R1
        MULTI   R1, R1, #6     ; multiply [R1] by 6 and put the result in R1
        SW      R1, 0(R3)      ; store [R1] at memory address [R3] + 0
        ADDI    R6, R6, #4
        BNEQ    R6, R4, Loop

a. How many pipeline stalls are there in one loop iteration if no forwarding scheme is implemented? Show the details.
b. Assuming a perfect pipeline, unroll the loop two times and reschedule the instructions to reduce pipeline stalls. Show the details.
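One way to check a hand count for part (a) is a small stall counter. The sketch below is only a model under stated assumptions, not a definitive solution: it assumes operands are read in Decode, that the register file is written in the first half of a cycle and read in the second half (so a consumer may decode in the same cycle its producer writes back), and that only read-after-write hazards cause stalls. The names `LOOP`, `EX_CYCLES`, and `count_stalls` are hypothetical, and the delay-slot NOP is not counted as a stall.

```python
# Each entry: (opcode, destination register or None, source registers).
LOOP = [
    ("LW",    "R3", ["R6"]),
    ("LW",    "R1", ["R3"]),
    ("MULTI", "R1", ["R1"]),
    ("SW",    None, ["R1", "R3"]),
    ("ADDI",  "R6", ["R6"]),
    ("BNEQ",  None, ["R6", "R4"]),
]

# Execute-stage latency per opcode; anything not listed takes one cycle.
EX_CYCLES = {"MULTI": 2}


def count_stalls(program, ex_cycles):
    """Return (opcode, stall cycles) pairs for an in-order pipeline
    without forwarding (assumed split-cycle register file)."""
    ready = {}        # register -> cycle in which its new value is written back
    prev_decode = 0   # Decode cycle of the previous instruction
    stalls = []
    for op, dest, srcs in program:
        earliest = prev_decode + 1  # Decode cycle if nothing stalls
        # Decode must wait until every source register has been written back.
        decode = max([earliest] + [ready.get(r, 0) for r in srcs])
        stalls.append((op, decode - earliest))
        if dest is not None:
            # Decode -> EX (one or more cycles) -> MEM -> WB
            ready[dest] = decode + ex_cycles.get(op, 1) + 2
        prev_decode = decode
    return stalls


per_instruction = count_stalls(LOOP, EX_CYCLES)
total_stalls = sum(n for _, n in per_instruction)
print(per_instruction)
print(total_stalls)
```

Under these assumptions the model reports 2 stalls before the second LW, 2 before MULTI, 3 before SW, and 2 before BNEQ, for 9 in total. If the register file cannot be written and read in the same cycle, each dependent instruction waits one cycle longer, so the assumption should be stated alongside the answer.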

You need to run two applications, A and B, on a Pentium processor with two identical cores. The CPU time of the first application (A) is 3 times that of the second application (B). 50% of A's code is parallelizable (i.e., those instructions can run on both cores simultaneously), and 80% of B's code is parallelizable. B can start only after A completes. How much overall system speedup can you achieve if you parallelize both applications?
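The arithmetic can be sketched with Amdahl's law. This is a minimal sketch under stated assumptions, not a definitive answer: it assumes the parallelizable fraction of the code equals the parallelizable fraction of execution time, that the parallel portion speeds up by exactly the number of cores, and that B's original CPU time is one arbitrary time unit. The function name `amdahl_time` and the variable names are hypothetical.

```python
def amdahl_time(serial_time, parallel_fraction, cores):
    """Execution time after parallelizing, per Amdahl's law."""
    return serial_time * ((1 - parallel_fraction) + parallel_fraction / cores)


time_b = 1.0            # assumed unit: B's original CPU time
time_a = 3.0 * time_b   # A takes 3 times as long as B

# B starts only after A completes, so the times add.
before = time_a + time_b
after = amdahl_time(time_a, 0.5, 2) + amdahl_time(time_b, 0.8, 2)

speedup = before / after
print(round(speedup, 3))
```

With these inputs A drops from 3 to 2.25 units and B from 1 to 0.6 units, so the overall speedup is 4 / 2.85, roughly 1.40.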

7. In class we discussed several strategies for joint replacement. Select one discussed in class (spine arthroplasty, hip arthroplasty, or knee arthroplasty) and briefly describe the clinical problem and the implant design, including the location and role of the polymer component. How would you improve upon the current design?