## Homework 3: Memory Management

1. (10 pts) If the virtual address space supported is 2^64 bits (note: bits, not bytes), the page size is 1 Kbyte, the size of the physical memory is 64 Kbytes, the size of a PTE is two bytes, and addressing is at the byte level, calculate the size of the page table required for both standard and inverted page tables.

Standard page table:

number of pages = 2^61 bytes / 1 Kbyte = 2^51 (a 2^64-bit address space is 2^61 bytes, since addressing is at the byte level);
Page Table Size = 2^51 * (PTE size) = 2^51 * 2 bytes = 2^52 bytes

Inverted page table:

Total frames = 64 Kbytes / 1 Kbyte = 64;
Page Table Size = 64 * (PTE size) = 64 * 2 bytes = 128 bytes
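The two size calculations above can be checked with a short sketch (all numbers are the ones given in the problem):

```python
# Page-table sizes for the problem's parameters.
VA_BYTES = 2**61          # 2^64-bit virtual address space = 2^61 bytes
PAGE = 1024               # 1 Kbyte pages
PHYS = 64 * 1024          # 64 Kbytes of physical memory
PTE = 2                   # 2-byte page-table entry

std_entries = VA_BYTES // PAGE   # standard: one entry per virtual page
std_size = std_entries * PTE     # 2^51 * 2 = 2^52 bytes

inv_entries = PHYS // PAGE       # inverted: one entry per physical frame
inv_size = inv_entries * PTE     # 64 * 2 = 128 bytes

print(std_size == 2**52, inv_size)  # True 128
```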

1. (10 pts) What is the difference between inverted and hashed page tables?

Inverted: contains one entry per physical frame, as opposed to one per virtual page as in a standard page table. This makes it much smaller than a standard page table. However, lookup time can be very high, because the whole table typically has to be searched linearly.

Hashed: maintains a hash table of virtual-to-physical translations. It keeps the size advantage of inverted tables (much smaller than standard tables) while making lookup much faster than in inverted tables: typically 1 or 2 accesses.
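A hypothetical sketch of the hashed lookup (the class and method names are illustrative, not from any real OS): each bucket holds a short chain of (VPN, PFN) pairs, so a lookup probes only one chain rather than scanning the whole table.

```python
# Illustrative hashed page table: buckets of (virtual page number,
# physical frame number) pairs, indexed by a hash of the VPN.
class HashedPageTable:
    def __init__(self, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]

    def insert(self, vpn, pfn):
        self.buckets[hash(vpn) % len(self.buckets)].append((vpn, pfn))

    def lookup(self, vpn):
        # Only one short chain is searched (typically 1 or 2 probes),
        # unlike the full linear scan of a plain inverted table.
        for v, pfn in self.buckets[hash(vpn) % len(self.buckets)]:
            if v == vpn:
                return pfn
        return None  # miss: fall back to the OS fault handler

pt = HashedPageTable()
pt.insert(0x1234, 7)
print(pt.lookup(0x1234))  # 7
```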

1. (10 pts) Consider a paging system with the page table stored in memory.
• (4pts) If a memory access takes 200 nanoseconds, how long does a paged memory reference take?

2 memory accesses: one to fetch the page-table entry, then the actual access => 2 * 200 ns = 400 ns

• (4 pts.) If we add associative registers (a TLB), and 75 percent of all page table references are found in associative registers, what is the effective memory reference time? Assume that finding a page-table entry in the TLB takes zero time.

75% * TLB-hit time + 25% * TLB-miss time = 75% * 200 ns + 25% * 400 ns = 250 ns
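Both timing results above can be reproduced with the problem's numbers (200 ns per memory access, 75% TLB hit rate, zero-cost TLB hits):

```python
# Paged reference time and effective access time with a TLB.
MEM = 200                   # ns per physical memory access
paged = 2 * MEM             # PTE fetch + actual access (no TLB)

hit = 0.75                  # fraction of references found in the TLB
eat = hit * MEM + (1 - hit) * (2 * MEM)  # hit: 1 access, miss: 2

print(paged, eat)           # 400 250.0
```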

• (2 pts.) A typical program has 20% memory instructions. Assume 5% of data accesses miss in the data TLB, each miss requiring 100 cycles to handle.
Assume each instruction requires 1 cycle to execute, each memory operation that hits in the cache takes 1 cycle, 10% of data accesses are cache misses, and each cache miss takes 15 cycles. How long would it take to execute 1000 instructions?

# cache misses = 1000 * 20% * 10% = 20

# TLB misses = 1000 * 20% * 5% = 10

Total cycles = 1000 cycles (base execution; the 1-cycle cache-hit memory accesses are included here) + (overhead due to cache misses) + (overhead due to TLB misses)

= 1000 + (20 * 14) + (10 * 99) = 1000 + 280 + 990 = 2270 cycles

(each cache miss adds 15 - 1 = 14 extra cycles beyond the already-counted 1-cycle access; each TLB miss adds 100 - 1 = 99)
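The cycle count works out the same in a short sketch, using integer arithmetic so the miss counts are exact:

```python
# Cycle count for 1000 instructions under the problem's assumptions:
# the base 1 cycle/instruction already covers a cache-hit access, so a
# cache miss adds 15 - 1 = 14 extra cycles and a TLB miss 100 - 1 = 99.
N = 1000
mem_ops = N * 20 // 100              # 200 memory instructions
cache_misses = mem_ops * 10 // 100   # 20
tlb_misses = mem_ops * 5 // 100      # 10

total = N + cache_misses * 14 + tlb_misses * 99
print(total)  # 2270
```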

1. Extra Credit (20 pts) Compare the address translation mechanisms of the Alpha 21164 with the PowerPC 604. Read sections from the paper I distributed.
• Page table type, show if you can an entry from the page table (i.e., the PTE)

Alpha: a standard page table with a software-managed TLB.

PowerPC: uses a hashed, inverted page table with a hardware-managed TLB.

• How is the TLB content updated?

Alpha: The TLB is managed by the OS through PALcode: on a miss, PALcode loads the PTE into the TLB through a hardware register. A not-most-recently-used replacement policy is used.

PowerPC: A paged segmentation scheme is used. On a miss, the hardware fetches an entire PTE group, searches it, and loads the matching PTE into the TLB.

• Size of pages and address space supported

Alpha: The OS supports an 8-Kbyte page size and 64-bit addressing.

PowerPC: A 52-bit segmented virtual address space on 32-bit implementations of the architecture, or an 80-bit segmented virtual address space on 64-bit implementations.

• How is protection provided?

Alpha: address-space identifiers (ASIDs)

PowerPC: via segmentation

• Can you have a cache miss and a TLB miss at the same time? Why?

Yes: if an address has never been accessed, or if its page is not in physical memory, the translation cannot be in the TLB and the data cannot be in the cache.

• Can you have a cache hit and a TLB miss? Why?

Yes: if the cache is larger than the TLB's reach (the total size of the pages whose translations are currently in the TLB), a line can still be cached after its page's translation has been evicted from the TLB.

• Is there a combination of TLB (hit/miss) and cache access (hit/miss) that cannot happen?

A simultaneous TLB miss and cache hit would be one candidate, but it is typically avoided: with a physically tagged cache, the translation must complete before the cache lookup can hit.

• What happens during a TLB miss?

Alpha: PALcode loads the missing PTE into the TLB through a hardware register.

PowerPC: The hardware loads an entire PTE group and performs a linear search for a matching virtual address. If no match is found in the first PTE group, the hardware performs a second lookup in a different PTE group, based on a secondary hash.