## Advanced and parallel architectures

Prof. A. Massini

June 13, 2017

Part A

Student's Name

Matricola (student ID) number

| Exercise 1 (5 points) |  |
|-----------------------|--|
| Exercise 2 (4 points) |  |
| Exercise 3 (4 points) |  |
| Exercise 4 (5 points) |  |
| Exercise 5 (7 points) |  |
| Exercise 6 (3 points) |  |
| Exercise 7 (4 points) |  |
| Total (32 points)     |  |

#### Exercise 1 (3 + 2 points) – Instruction pipeline

a) Consider the following loop expressed in a high-level language:

The program has been written in MIPS assembly code, assuming that registers \$t6 and \$t7 have been initialized with the values 0 and 4N respectively. The symbols VECTA, VECTB and VECTC are 16-bit constants.

Let us consider the loop executed by a 5-stage pipelined MIPS processor WITHOUT any optimisation in the pipeline.

1. In the last column, identify the hazard type (data hazard or control hazard).
2. In the first column, give the number of stalls to be inserted before each instruction (or between the IF and ID stages of each instruction) to resolve the hazards.
3. For each hazard, add an arrow to indicate the pipeline stages involved in the hazard.

| Num. Stalls | INSTRUCTION            | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | C11 | C12 | C13 | C14 | Hazard Type |
|-------------|------------------------|----|----|----|----|----|----|----|----|----|-----|-----|-----|-----|-----|-------------|
|        | FOR: beq \$t6,\$t7,END | IF        | ID | EX | ME | WB        |           |    |    |           |     |     |     |     |     |        |
|        | lw \$t2,VECTA(\$t6)    |           | IF | ID | EX | ME        | WB        |    |    |           |     |     |     |     |     |        |
|        | addi \$t2,\$t2,4       |           |    | IF | ID | EX        | ME        | WB |    |           |     |     |     |     |     |        |
|        | sw \$t2,VECTA(\$t6)    |           |    |    | IF | ID        | EX        | ME | WB |           |     |     |     |     |     |        |
|        | lw \$t3,VECTB(\$t6)    |           |    |    |    | IF        | ID        | EX | ME | WB        |     |     |     |     |     |        |
|        | addi \$t3,\$t3,2       |           |    |    |    |           | IF        | ID | EX | ME        | WB  |     |     |     |     |        |
|        | sw \$t3,VECTB(\$t6)    |           |    |    |    |           |           | IF | ID | EX        | ME  | WB  |     |     |     |        |
|        | addi \$t6,\$t6,4       |           |    |    |    |           |           |    | IF | ID        | EX  | ME  | WB  |     |     |        |
|        | blt \$t6,\$t7,FOR      |           |    |    |    |           |           |    |    | IF        | ID  | EX  | ME  | WB  |     |        |

b) Specify the number of stalls actually inserted, taking into account that resolving some hazards can also resolve those that follow.
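As a rough aid for locating the data hazards in the listing above, the following sketch scans the nine instructions for read-after-write dependences. It is not the expected exam answer. The tuple encoding of each instruction, the helper names `raw_hazards` and `isolated_stalls`, and the timing rule are all assumptions: the rule supposes that a register written in WB can be read in ID during the same cycle, so a consumer at distance d from its producer needs 3 - d stalls when the hazard is considered in isolation. Control hazards are not modelled.

```python
# Each entry: (instruction text, destination register or None, source registers).
PROGRAM = [
    ("beq $t6,$t7,END",   None,  ("$t6", "$t7")),
    ("lw $t2,VECTA($t6)", "$t2", ("$t6",)),
    ("addi $t2,$t2,4",    "$t2", ("$t2",)),
    ("sw $t2,VECTA($t6)", None,  ("$t2", "$t6")),
    ("lw $t3,VECTB($t6)", "$t3", ("$t6",)),
    ("addi $t3,$t3,2",    "$t3", ("$t3",)),
    ("sw $t3,VECTB($t6)", None,  ("$t3", "$t6")),
    ("addi $t6,$t6,4",    "$t6", ("$t6",)),
    ("blt $t6,$t7,FOR",   None,  ("$t6", "$t7")),
]


def raw_hazards(program, window=2):
    """Return (producer, consumer, register) index triples for RAW hazards.

    ASSUMPTION: only producers at most `window` instructions earlier can
    still be in flight when the consumer reaches ID.
    """
    hazards = []
    for j, (_, _, sources) in enumerate(program):
        for src in sources:
            # Walk backwards to the most recent writer of this source.
            for i in range(j - 1, max(j - window, 0) - 1, -1):
                if program[i][1] == src:
                    hazards.append((i, j, src))
                    break
    return hazards


def isolated_stalls(producer, consumer, window=2):
    """Stalls for one hazard considered on its own (assumed timing rule)."""
    return window + 1 - (consumer - producer)


for producer, consumer, reg in raw_hazards(PROGRAM):
    print(f"{PROGRAM[producer][0]:<20} -> {PROGRAM[consumer][0]:<20} "
          f"on {reg}: {isolated_stalls(producer, consumer)} stall(s)")
```

The per-hazard counts correspond to part a); for part b) the stalls already inserted for earlier instructions shift the later ones, so the counts have to be re-evaluated in program order.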

#### Exercise 2 (4 points) – Number representation

Represent the natural number range [0; 359] using the residue number system, considering different choices of the moduli set (at least three different choices). Compare the different choices with respect to the number of bits necessary for the representation. Consider also the number of bits needed for representing the range [0; 359] with the conventional binary system.

Represent A = 45 and B = 67 using each of the moduli sets considered, and show how to compute the sum A+B.
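The mechanics of the exercise can be sketched as follows. The three moduli sets below are only example choices (any pairwise coprime set whose product is at least 360 would do), and the helper names are hypothetical. The bit count assumes each residue modulo m is stored in the minimum number of bits needed for the values 0 to m - 1.

```python
from math import gcd, prod


def is_valid(moduli, size):
    """Moduli must be pairwise coprime and cover at least `size` values."""
    coprime = all(gcd(a, b) == 1
                  for i, a in enumerate(moduli) for b in moduli[i + 1:])
    return coprime and prod(moduli) >= size


def bits_needed(moduli):
    """Total bits when each residue 0..m-1 uses the fewest bits possible."""
    return sum((m - 1).bit_length() for m in moduli)


def to_rns(x, moduli):
    """Residue representation of x: one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli):
    """Add residue by residue; no carry crosses between positions."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


MODULI_SETS = [(5, 8, 9), (8, 45), (9, 40)]  # ASSUMED example choices

print("conventional binary:", (359).bit_length(), "bits")
for moduli in MODULI_SETS:
    assert is_valid(moduli, 360)
    a, b = to_rns(45, moduli), to_rns(67, moduli)
    print(moduli, bits_needed(moduli), "bits:",
          a, "+", b, "=", rns_add(a, b, moduli))
```

Each component of the sum is computed independently modulo its own modulus, which is why the addition needs no carry propagation between residues.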

#### Exercise 3 (4 points) – Circuit time and area

Compute the time (propagation delay) and area required by the 4-bit carry-save adder, that is, an adder for three values A, B and C, shown below.

Compute the speedup of the 4-bit carry-save adder with respect to the standard binary ripple-carry adder.
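The circuit figure is not reproduced here, so the sketch below only illustrates what a carry-save adder computes, not its delay or area. It assumes the usual structure: one row of independent full adders that reduces three operands to a sum word and a carry word, followed by a final carry-propagate addition. The function names are hypothetical.

```python
def carry_save(a, b, c):
    """One carry-save stage: a full adder per bit, no carry propagation."""
    partial_sum = a ^ b ^ c                  # sum output of each full adder
    carries = (a & b) | (a & c) | (b & c)    # carry output of each full adder
    return partial_sum, carries << 1         # each carry feeds the next position


def add_three(a, b, c):
    """Carry-save stage followed by a single carry-propagate addition."""
    partial_sum, shifted_carries = carry_save(a, b, c)
    return partial_sum + shifted_carries


print(carry_save(0b1011, 0b0110, 0b1101))
print(add_three(0b1011, 0b0110, 0b1101) == 0b1011 + 0b0110 + 0b1101)
```

Because every full adder in the first row works on its own bit position, that row contributes a single full-adder delay regardless of the word length; only the final addition propagates carries.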



#### Exercise 4 (5 points) – Pipelined operations

Given the values A = 6 and B = 5, show the scheme and the execution of the pipelined multiplication A×B. Verify the result.

#### Exercise 5 (4 + 3 points) – Number representation

a) Given the values A = 00 00 00 11 10 and B = 01 00 10 11 01 in the signed RB (Redundant Binary) representation, convert A and B to decimal. Show the execution of the operation A+B.

b) Show the procedure to verify if the result is 0.
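The following sketch walks through both parts under an explicitly assumed digit encoding, since the exercise text does not state which 2-bit code the course uses. Here each digit is read as a pair (n, p) with value p - n; with a different encoding the decimal values change, but the addition and zero-test procedures stay the same. The addition uses the standard carry-free scheme for radix-2 signed digits, in which each position inspects the digits one place below it to choose its interim sum and transfer. All function names are hypothetical.

```python
# ASSUMPTION: each digit is a bit pair (n, p) with value p - n, i.e.
# 00 -> 0, 01 -> +1, 10 -> -1, 11 -> 0. The course may use another mapping.
ASSUMED_DIGITS = {"00": 0, "01": 1, "10": -1, "11": 0}


def decode(text, mapping=ASSUMED_DIGITS):
    """Turn '01 00 10' into a list of signed digits, most significant first."""
    return [mapping[pair] for pair in text.split()]


def value(digits):
    """Decimal value of a signed-digit list (most significant digit first)."""
    total = 0
    for d in digits:
        total = 2 * total + d
    return total


def rb_add(x, y):
    """Carry-free addition of two equal-length digit lists (MSD first)."""
    xs, ys = x[::-1], y[::-1]            # work least significant digit first
    n = len(xs)
    interim, transfer = [0] * n, [0] * (n + 1)
    for i in range(n):
        p = xs[i] + ys[i]
        # The incoming transfer is >= 0 when both lower digits are
        # non-negative, and <= 0 otherwise (position 0 receives none).
        lower_nonneg = i == 0 or (xs[i - 1] >= 0 and ys[i - 1] >= 0)
        if p == 2:
            interim[i], transfer[i + 1] = 0, 1
        elif p == 1:
            interim[i], transfer[i + 1] = (-1, 1) if lower_nonneg else (1, 0)
        elif p == -1:
            interim[i], transfer[i + 1] = (-1, 0) if lower_nonneg else (1, -1)
        elif p == -2:
            interim[i], transfer[i + 1] = 0, -1
    digits = [interim[i] + transfer[i] for i in range(n)] + [transfer[n]]
    return digits[::-1]


def is_zero(digits):
    """A radix-2 signed-digit number is zero only if every digit is zero."""
    return all(d == 0 for d in digits)


a = decode("00 00 00 11 10")
b = decode("01 00 10 11 01")
s = rb_add(a, b)
print(value(a), value(b), s, value(s), is_zero(s))
```

The zero test works digit by digit because the most significant non-zero digit always outweighs everything below it, so a non-zero digit can never be cancelled by lower positions.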

#### Exercise 6 (3 points) – Loop dependences

In the following loop, find all the true dependences, output dependences and antidependences. Eliminate the output dependences and antidependences by renaming.

#### Exercise 7 (4 points) – Arithmetic operations

Show the execution of the multiplication A×B, where A = 10110001 and B = 011101110, both in the standard way and using Booth recoding. Explain what gain is obtained when Booth recoding is applied.
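A minimal sketch of radix-2 Booth recoding applied to the multiplier B is given below. Digit i of the recoded multiplier is b(i-1) - b(i), with b(-1) = 0, and the bit string is interpreted as two's complement (for B, whose leading bit is 0, this equals the unsigned value). Reading A as an unsigned value is an assumption, since the exercise does not say whether the operands are signed. The function names are hypothetical.

```python
def booth_recode(bits):
    """Radix-2 Booth digits (MSD first) for a two's-complement bit string."""
    digits, previous = [], 0             # previous plays the role of b(-1) = 0
    for ch in reversed(bits):            # scan from the least significant bit
        bit = int(ch)
        digits.append(previous - bit)    # digit i = b(i-1) - b(i)
        previous = bit
    return digits[::-1]


def multiply_with_digits(a, digits):
    """Add or subtract a shifted copy of a for each non-zero digit."""
    n = len(digits)
    return sum(d * a * 2 ** (n - 1 - i) for i, d in enumerate(digits) if d)


B = "011101110"
digits = booth_recode(B)
a = int("10110001", 2)                   # ASSUMPTION: A read as unsigned

print(digits)
print(B.count("1"), "additions in the standard way,",
      sum(1 for d in digits if d), "additions/subtractions after recoding")
print(multiply_with_digits(a, digits) == a * int(B, 2))
```

The gain comes from runs of consecutive 1s in the multiplier: each run is replaced by one subtraction at its start and one addition just past its end, so fewer partial products have to be accumulated.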