Department of Computer Science
Course: CS 3725

Next: Cache memory Up: The Memory Architecture Previous: The Memory Architecture

The memory hierarchy

In principle, for a simple single-processor machine, the memory architecture is quite simple: the memory is connected to the memory address lines and the memory data lines (and to a set of control lines, e.g. memory read and memory write), so that whenever an address is presented to the memory the data corresponding to that address appears on the data lines. This is adequate for processors which can address a relatively small address space (like the PDP-8 with 4 K words, or the many 8-bit microprocessors with 64 K bytes of memory). However, for larger systems with 32-64 address lines, which can address from 2^32 (about 4000 M) bytes to 2^64 (about 1.8 x 10^19) bytes of memory, it is not practical to provide all the memory that the machine can address.
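The relationship between address lines and addressable space is a simple power of two; a minimal sketch:

```python
# Sketch: addressable memory for a given number of address lines.
def addressable_bytes(address_lines):
    """Each additional address line doubles the addressable space."""
    return 2 ** address_lines

print(addressable_bytes(16))  # 8-bit micros: 65536 (64 K) bytes
print(addressable_bytes(32))  # 4294967296 (about 4000 M) bytes
print(addressable_bytes(64))  # about 1.8 x 10^19 bytes
```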

Logically, memory is structured as a linear array of locations, with addresses from 0 to the maximum memory size for the processor. The figure below shows the organization of a 16 Mbyte block of memory for a processor with a 32-bit word length, like the MIPS R2000/R3000:

[Figure: Memory organization]

Note that the position of each word in the linear array is indicated by an address. Each byte can be addressed individually (although a whole word is read from memory at one time). Consequently, word addresses increase by 4, starting from 0.
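The byte-to-word address relationship can be sketched in a few lines; on a machine with 4-byte words, the word address is the byte address with the low two bits cleared:

```python
# Sketch: relating byte addresses to 32-bit word addresses on a
# byte-addressable machine such as the MIPS R2000/R3000.
def word_address(byte_address):
    """Address of the word containing this byte (clear the low 2 bits)."""
    return byte_address & ~0x3

def word_index(byte_address):
    """Position of that word in the linear array of words."""
    return byte_address >> 2

print(hex(word_address(0x1007)))  # -> 0x1004
print(word_index(8))              # byte 8 lies in word number 2
```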

In general, the faster a memory is, the more expensive it is per bit of storage. On systems which have a large amount of memory, there is usually a hierarchy of memories, each with different access speeds and storage capacities. Typically, a large system has a small amount of very high speed memory, called a cache, where data from frequently used memory locations may be temporarily stored. This cache is connected to a much larger ``main memory'', a medium speed memory, currently likely to be ``dynamic memory'', with access times from 100 to 1000 ns. Cache memory access times are typically 10 to 20 times faster than main memory access times, typically from 5 to 40 ns. (In some very large computer systems, the main memory is organized into two or more ``banks'', each of which contains adjacent memory words which can be addressed individually and simultaneously. A memory organized in this way is called an ``interleaved'' memory.)

The largest block of ``memory'' in a modern computer system is usually one or more large magnetic disks, on which data is stored in fixed size blocks of from 256 to 8192 bytes. This disk memory is usually connected directly to the main memory, and has a variable access time depending on how far the disk head must move to reach the appropriate track, and how much the disk must rotate to reach the appropriate sector for the data. (Some very large systems have multiple-head disks which can read from several tracks at once.) A modern Winchester disk has a track-to-track latency of about 1-2 ms, and the disk rotates at a speed of 3600 RPM. The disk therefore makes one revolution in 1/60th of a second, or 16.7 ms. Since the required sector is, on average, half a revolution away, the average rotational latency is about 8.4 ms. Faster disks (using smaller diameter disk platters) can rotate even faster.
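The rotational figures above follow from a little arithmetic on the rotation speed:

```python
# Sketch: rotational latency from the disk's rotation speed.
def ms_per_revolution(rpm):
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm

def avg_rotational_latency_ms(rpm):
    """On average the required sector is half a revolution away."""
    return ms_per_revolution(rpm) / 2.0

print(round(ms_per_revolution(3600), 1))          # -> 16.7 ms
print(round(avg_rotational_latency_ms(3600), 1))  # -> 8.3 ms (the notes round 16.7/2 to 8.4)
```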

A typical memory system, connected to a medium-to-large size computer (a desktop or server configuration) might consist of the following:

128K-2M bytes of cache memory (5-50ns)

64-256 M bytes of main memory (400-2000ns)

2000-20,000 M bytes of disk storage
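Putting the three levels side by side makes the capacity and speed ratios visible. A minimal sketch using representative values from the ranges above (the 10 ms disk access time is an assumption combining seek and rotational delay, not a figure from the list):

```python
# Sketch: the three memory hierarchy levels listed above, with
# representative capacities (bytes) and access times (ns) chosen from
# those ranges.  The disk access time is an assumed ~10 ms.
hierarchy = [
    ("cache",       2 * 1024**2,           20),          # 2 M bytes, 20 ns
    ("main memory", 256 * 1024**2,         1_000),       # 256 M bytes, 1000 ns
    ("disk",        20_000 * 1024**2,      10_000_000),  # 20,000 M bytes, ~10 ms
]

for name, capacity_bytes, access_ns in hierarchy:
    print(f"{name:12s} {capacity_bytes:>15,d} bytes  {access_ns:>10,d} ns")
```

Each step down the hierarchy gains two to four orders of magnitude in capacity at the cost of one to four orders of magnitude in access time.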

A typical memory configuration might be as shown in the figure below.

[Figure: A typical memory configuration]



Paul Gillard
Mon Nov 24 20:44:06 NST 1997