Dr. Pardach Singly

6/12/2012

## Department of Computer Science and Engineering, National Institute of Technology Hamirpur

Architecture of Large Systems (CS-613) End Semester Exam

Max Marks: 50 Max Time: 3.00 Hrs.

Note: All questions are compulsory. A non-programmable calculator is allowed. Assume any necessary data if something is missing.

| Q. No. | Question | Marks |
|--------|----------|-------|
|        | SECTION A | |
| I.     | <ul><li>a) Why are cache hit ratios not an objective measure of cache effectiveness?</li><li>b) What is the difference between miss rate and miss penalty?</li></ul> | 5 |
| II.    | <ul><li>a) What do you mean by superlinear speedup? Is it achievable?</li><li>b) Why do we use loop fusion?</li></ul> | 5 |
| III.   | <ul><li>a) What is the significance of the "Sign Extend" register in the MIPS architecture?</li><li>b) What is the value of the R0 register? Why is it so?</li></ul> | 5 |
| IV.    | A lower CPI is better, and a higher MIPS rating is better. Since the CPI of the optimized program increases and its MIPS decreases, was this really an "optimization"? Explain your answer. | 5 |
| V.     | You are given a non-pipelined processor design which has a cycle time of 10ns and an average CPI of 1.4. Calculate the latency speedup in the following questions.<br>(Note: The solutions given assume the base $CPI = 1.4$ throughout. Since the question is ambiguous, you could assume pipelining changes the $CPI$ to 1. The method for computing the answers still applies.)<ul><li>a) If each pipeline stage added also adds 20ps due to register setup delay, what is the best speedup you can get compared to the original processor?</li><li>b) The pipeline from <i>a)</i> stalls 20% of the time for 1 cycle and 5% of the time for 2 cycles (these occurrences are disjoint). What is the new CPI? What is the speedup compared to the original processor?</li></ul> | 5 |
| VI.    | <ul><li>In one case, there are more data elements per block and fewer blocks.</li><li>In another case, there are fewer elements per block but more blocks.</li></ul>However, in both cases (i.e. larger blocks but fewer of them, OR shorter blocks but more of them) the cache's total capacity (amount of data storage) remains the same.<br>What are the pros and cons of each organization? Support your answer with a short example, assuming that the cache is direct mapped. | 5 |
| VII.   | Assuming that N instructions are executed, and all N instructions are add instructions, what is the speedup of a pipelined implementation when compared to a multi-cycle implementation? <i>Your answer should be an expression that is a function of N.</i> | 5 |
|        | SECTION B | |
| VIII.  | <ul><li>a) Bisection bandwidth</li><li>b) Network diameter</li><li>c) Pipeline efficiency</li><li>d) Hardwired vs micro-coded control</li><li>e) Hit ratio</li></ul> | 3 × 5 = 15 |
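The arithmetic behind Question V can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the examiner's solution: it assumes the 10ns of logic divides evenly across the pipeline stages, that every stage adds a fixed 20ps of register overhead, and that stall cycles simply add to the base CPI. The function names and the 5-stage example are hypothetical and do not come from the paper.

```python
def pipelined_cycle_ps(stages, base_cycle_ps=10_000.0, overhead_ps=20.0):
    """Cycle time in picoseconds for an evenly divided pipeline (assumption)."""
    return base_cycle_ps / stages + overhead_ps


def stalled_cpi(base_cpi, stalls=((0.20, 1), (0.05, 2))):
    """Base CPI plus the expected stall cycles per instruction.

    Each (fraction, cycles) pair is a disjoint stall case, as in part b).
    """
    return base_cpi + sum(fraction * cycles for fraction, cycles in stalls)


def speedup(old_cpi, old_cycle_ps, new_cpi, new_cycle_ps):
    """Ratio of time per instruction: old design over new design."""
    return (old_cpi * old_cycle_ps) / (new_cpi * new_cycle_ps)


if __name__ == "__main__":
    old_cpi, old_cycle = 1.4, 10_000.0   # 10ns expressed in picoseconds
    stages = 5                           # hypothetical stage count
    new_cycle = pipelined_cycle_ps(stages)

    # Part a), keeping CPI = 1.4 as the note in the question suggests.
    print(speedup(old_cpi, old_cycle, old_cpi, new_cycle))

    # Part b), with the stall penalties added on top of the base CPI.
    print(stalled_cpi(old_cpi))
    print(speedup(old_cpi, old_cycle, stalled_cpi(old_cpi), new_cycle))
```

Under these assumptions the cycle time approaches the 20ps overhead as the stage count grows, so the speedup in part a) is bounded by 10ns / 20ps = 500. Passing `1.0` instead of `1.4` as the new CPI gives the alternative reading mentioned in the question's note.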