Computer Architecture: Problem Set 4

Attempt by: Thursday, December 13, before class
Submit by: Tuesday, December 18, at start of class


Instructions

Your solution should be typed and printed with plenty of spacing within and between problems in black ink on single-sided, otherwise blank paper that is stapled (assuming it is more than one page).

Read the entire problem set thoroughly before beginning to write your solutions.

Show your work.


  1. Consider the following MIPS fragment:

          addi $v0, $zero, 0
          addi $t0, $zero, 0
    Loop: add  $t1, $a0, $t0
          lb   $t1, 0($t1)
          bne  $t1, $a2, Skip
          or   $zero, $zero, $zero
          addi $v0, $v0, 1
    Skip: addi $t0, $t0, 1
          bne  $t0, $a1, Loop
          or   $zero, $zero, $zero
    
    Assume the fragment corresponds to the body of a C function whose prototype declaration is:
      f(char s[], int n, char c);
    
    where s is a character string of length n (assumed to be greater than zero). Assume that $a0, $a1, $a2 correspond to s, n, c, respectively.

    1. In a clear and concise sentence, what does $v0 contain (in terms of s and c) after the code has executed?

    2. Assume the code is executed on a MIPS machine with a pipeline like described in Section 6.1 of Patterson and Hennessy. Between which instructions will there be pipeline hazards and of which kind? Explain.

    3. What is the purpose of the or instructions in this fragment?

    4. (Again, assume the code is to be executed on a MIPS machine with a pipeline like described in Section 6.1 of Patterson and Hennessy.) Rewrite the code without using any additional registers to make more efficient use of the pipeline and delayed branches.


  2. Do exercise 7.5 in PH, p. 556.

  3. For the following C code fragment, indicate which forms, if any, of locality (temporal or spatial, for instruction or for data) you think the corresponding machine code would exhibit upon execution:

      void trim(char *s) {
        while (*s != 0 && *s != '\n') {
          s++;
        }
        *s = 0;
      }
    
    Explain your reasoning.

  4. Consider an architecture with an address space and word size that are each eight bits. Suppose the architecture makes use of a direct-mapped cache that stores sixteen bytes of data where each block contains two words.

    1. What is the total number of bits required to implement one block of the cache? If that is rounded up to the nearest byte, how many bytes are required to implement the entire cache? Explain.
    2. Assume a freshly initialized cache. Suppose during the execution of a program the following sequence of address references are made:

      40, 80, 44, 24, 48, 24, 52, 44, 56, 80, 24, 80, 64, 80
      
      Indicate which references are hits and which are misses. Show the contents of the cache upon completion of this sequence of references.
  5. To what extent are cache misses handled by the operating system and to what extent by the hardware? What about for page faults? Explain.

  6. The year is 2013. The coolest, most awesome computer game (known as "CMAC") has just been released. Your best friend has just given you a copy of CMAC (legally, of course), but you are finding that your 2012 computer is not quite up to the task - you can play, but the game appears sluggish. You have saved up for this day. Unfortunately due to inflation, you can only afford to purchase one upgrade to your computer: a faster processor, a larger cache, more main memory, or a larger hard drive. (Assuming we still use hard drives.) On what do you spend your life savings and why?