© 2003 by Charles C. Lin. All rights reserved.
Registers
- A basic CPU contains registers, an ALU (to be explained later), busses, and some combinational logic devices like multiplexers (to be explained later).
Registers are basically very fast memory. However, a CPU can only hold a limited number of registers. There are several reasons for this, ranging from how much a large number of registers would slow the CPU down, to the more important issue of how to specify which register you want to work with.
In the CPU we're building, we'll assume each register can store 32 bits. However, in our examples, we'll use 4 bits, only because it's easier to draw with 4 bits than 32.
Unlike programming languages, registers don't really have a type. Registers can store ASCII characters, memory addresses, signed integers, unsigned integers, etc. A register stores a 32 bit bitstring, and what that bitstring represents depends on the assembly language instructions that manipulate it.
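To make the "no type" idea concrete, here is a small Python sketch that reads the same 32-bit patterns in three different ways. The function names, the two sample bit patterns, and the choice to read characters high byte first are all assumptions made for illustration. They are not part of the CPU we're building.

```python
def as_unsigned(bits):
    """Read a 32-bit pattern as an unsigned integer."""
    return bits & 0xFFFFFFFF

def as_signed(bits):
    """Read the same pattern as a two's complement signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & (1 << 31) else bits

def as_ascii(bits):
    """Read the same pattern as four 8-bit ASCII characters (high byte first, an assumption)."""
    return (bits & 0xFFFFFFFF).to_bytes(4, "big").decode("ascii")

# Two hypothetical register contents.
pattern_a = 0xFFFFFFFE
pattern_b = 0x48692121

print(as_unsigned(pattern_a))  # 4294967294
print(as_signed(pattern_a))    # -2
print(as_ascii(pattern_b))     # Hi!!
```

The register holds the same bits in every case. Only the instruction that uses those bits decides whether they are treated as a large unsigned number, a small negative number, or text.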
One reason we talked about a clock is because a register is a clocked device. We'll explain more in the next section.
Parallel Load Register
- A 4-bit parallel load register is a device with 4 bits of data inputs, one control bit, and 4 bits of output.
Here's a diagram of a parallel load register.
The register is the outer box. It has four bits of data inputs labelled b3..0. It has four bits of output labelled z3..0. It has a single control bit telling the parallel load register what to do. It has a clock input that tells it when to do it.
Inside the parallel load register, we see what appears to be an array. However, unlike the arrays you see in a programming language, the array is numbered from right to left. The rightmost element is indexed 0, while the leftmost box is indexed 3.
In reality, registers don't store "arrays", but to make the explanation easier to understand, just think of it as an array.
The following is a chart indicating how to use the control bit (labeled as c) to tell the parallel load register what to do.
Clock | Control | Operation |
Pos. Edge | c = 1 | Parallel Load (i.e., z3..0 = b3..0) |
Pos. Edge | c = 0 | Hold (i.e., z3..0 unchanged) |
NOT Pos. Edge | c = don't care | Hold (i.e., z3..0 unchanged) |
A parallel load register has two operations: it can hold, and it can parallel load.
- hold: The register keeps the same value in the array. It ignores the input values b3..0.
- parallel load: The register reads in the input values, b3..0, and overwrites the values in the array. Thus, if b3..0 = 0000 and the current value in the register is z3..0 = 0110, then parallel loading the input would cause the register's content to become 0000, and the old value, 0110, would disappear.
The reason it is called a parallel load is because the bits are loaded in parallel. The other way to load bits is one at a time, i.e., sequentially. This is what happens when you copy an array. It is copied one element at a time.
The control bit, c, is used to tell the register whether to hold or to parallel load. If c = 0, the register holds. If c = 1, the register parallel loads. However, this is only done when the clock signal is at a positive edge.
If you look at the third row of the chart above, the register also holds its value. When the clock signal is not at a positive edge, the register's value remains the same. It is holding its value.
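The chart can be modeled in a few lines of Python. This is only a behavioral sketch, not how the hardware is built: the class name, the `update` method, and the way a positive edge is detected (the clock was 0 on the previous call and is 1 now) are assumptions made for illustration.

```python
class ParallelLoadRegister:
    """Behavioral sketch of a 4-bit parallel load register."""

    def __init__(self, width=4):
        self.width = width
        self.z = 0            # stored value, which is also the output z3..0
        self._prev_clock = 0  # last clock level seen, used to spot a positive edge

    def update(self, clock, c, b):
        """Present the current clock level, control bit c, and data inputs b3..0."""
        positive_edge = self._prev_clock == 0 and clock == 1
        self._prev_clock = clock
        if positive_edge and c == 1:
            # Parallel load: z3..0 = b3..0 (kept to `width` bits).
            self.z = b & ((1 << self.width) - 1)
        # In every other case the register holds, so z is left alone.
        return self.z


reg = ParallelLoadRegister()
print(format(reg.update(clock=1, c=1, b=0b0110), "04b"))  # 0110 (pos. edge, c = 1: load)
print(format(reg.update(clock=0, c=1, b=0b0000), "04b"))  # 0110 (not a pos. edge: hold)
print(format(reg.update(clock=1, c=0, b=0b0000), "04b"))  # 0110 (pos. edge, c = 0: hold)
print(format(reg.update(clock=0, c=1, b=0b0000), "04b"))  # 0110 (not a pos. edge: hold)
print(format(reg.update(clock=1, c=1, b=0b0000), "04b"))  # 0000 (pos. edge, c = 1: load)
```

Each call corresponds to one row of the chart. The value only changes on the first and last calls, where a positive edge and c = 1 occur together.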
Remember, It's Continuous
- If you think like a programmer, you think discretely. You think of arguments being passed.
However, when you deal with hardware, you need to think continuously (and discretely too). In general, the register is always receiving some value for c. Again, think of a pipe being sent into the register.
There's always some red or green soda flowing in. That is, there's always a 0 or 1 being sent to the register using the c control input.
Most of the time, this control input is being ignored. However, when a positive edge occurs, the control input is "read" by the register (to determine which operation to perform), and either a parallel load or hold occurs.
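One way to picture this is to write down the signals as lists of samples, one per small time step, and see which samples the register actually pays attention to. The sketch below is an assumption-laden illustration: the function name `simulate` and the sample values are made up, and real signals are not neatly chopped into steps like this.

```python
def simulate(clock, control, data, initial=0b0000):
    """Return the register's value after each time step.

    clock, control and data are equal-length lists with one sample per step.
    """
    value = initial
    prev_clock = 0
    history = []
    for clk, c, b in zip(clock, control, data):
        # c and b are present at every step, but they only matter
        # at a step where the clock goes from 0 to 1.
        if prev_clock == 0 and clk == 1 and c == 1:
            value = b
        prev_clock = clk
        history.append(value)
    return history


clock   = [0, 1, 1, 0, 0, 1, 1, 0, 0, 1]
control = [1, 1, 0, 0, 1, 0, 1, 1, 1, 1]
data    = [0b0011, 0b0101, 0b1111, 0b1111, 0b1001,
           0b1001, 0b1001, 0b0110, 0b0110, 0b0110]

history = simulate(clock, control, data)
print(history)  # [0, 5, 5, 5, 5, 5, 5, 5, 5, 6]
```

The control input is 1 at many steps, but it only has an effect at steps 1 and 9, where a positive edge occurs and c = 1. At step 5 there is a positive edge with c = 0, so the register holds.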
Why Hold?
- You might wonder why registers have two operations. In particular, you might wonder why it's necessary to have a "hold" operation. After all, most of the time, the register is holding. Why wouldn't we want to parallel load at each clock edge?
Think of a register as a variable in a programming language. Do we assign to a variable all the time? No. We just assign to the variable when it needs to be assigned. Otherwise, variables would be changing all the time.
We don't always want a register to read the value all the time for the same reason we don't always want a variable to update all the time. Sometimes, we just want to keep the value of the register unchanged.
Here's another analogy. Many years ago, there were milkmen who would deliver bottles (yes!) of milk to your house. They would come to your house once a day. However, depending on how fast you drink milk, or whether you're on vacation, you might not want to have milk delivered to you each day.
A system can be worked out between you and the milkman. If you've drunk all the milk, then you can put the empty bottle outside your door. The milkman sees an empty bottle, picks it up, and puts down a new bottle of milk.
On the other hand, if you haven't finished the milk, you don't leave the bottle outside. When the milkman drops by, he doesn't see an empty bottle outside. So, he doesn't drop off a new bottle of milk.
The milkman comes by once a day, which is similar to a positive edge occurring once a period. The milkman has two choices. Drop a new bottle of milk, or don't. If you didn't have this policy, then the milkman would keep dropping off more and more bottles of milk, and it would all go bad. So, there's some justification for not wanting milk to be dropped off each day.
What if you wanted milk to be delivered more often? Like twice a day, instead of once a day? This is analogous to speeding up the clock, and usually we don't have too much control over how fast the clock runs. Generally, you have the clock set at one rate, and that's that.
It's Always Outputting a Value!
- Whatever value is in the register, it's continuously being output to the outside world. Thus, once you load a parallel load register, z3..0 is set to the value just loaded, and this is sent continuously.
You just have to get used to the idea that a register is always outputting a value. The reason this doesn't cause a problem is because the rest of the CPU doesn't have to actually read the value from the register. Other devices can selectively ignore or read the values of the registers at selected times.
Why Use a Clock?
- Why does a register need a clock? Why don't we just update the register whenever we feel like it? The main reason has to do with what registers are used for.
A register outputs a value. This value is needed by other devices in the CPU. For example, we may want to add the value of this register to the value of another register. A device called the ALU can perform this addition.
However, this addition isn't infinitely fast. Even though it's quick, it's not instantaneous. We must wait for the computation to complete. At that point, the result of the computation can be stored. This result is typically stored in a register, possibly the same register that produced the value!
So, the clock is primarily used to give the devices that perform computations the time they need to do so, before the registers are updated.
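Here is a rough Python sketch of that loop, where a register feeds an adder and the result is stored back into the same register. The function `alu_add`, the 4-bit width, and the increment value are assumptions for illustration, and the loop body simply stands in for "wait one clock period".

```python
def alu_add(x, y, width=4):
    """Stand-in for the ALU's add operation, wrapped to `width` bits."""
    return (x + y) & ((1 << width) - 1)


register = 0b0000    # current register contents, always visible to the ALU
increment = 0b0011   # hypothetical second operand
trace = []

for cycle in range(4):
    # Between clock edges: the register's output flows into the ALU,
    # and the ALU is given time to finish computing its result.
    alu_result = alu_add(register, increment)
    # At the positive edge: the finished result is parallel loaded
    # back into the same register.
    register = alu_result
    trace.append(register)

print(trace)  # [3, 6, 9, 12]
```

If the register were updated before the addition had finished, it would store a half-computed result. Updating only at the clock edge avoids that.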
How Does a Register Compare to Memory?
- Modern day PCs have clock rates around 3 GHz. This is the rate of the clock on the CPU. In principle, the CPU can execute one instruction per clock cycle. In reality, this happens only if the data is in the registers already.
If you need to access memory (and you'll need to do this), you'll find memory is much, much, much slower to access than registers. Memory is perhaps 400 times as slow to access as registers.
To give you an idea of how much slower that is, suppose someone tells you they are going to come back in a minute. If they come back 400 times as slow, it will take them nearly 7 hours to return.
Clearly, you want to use registers when you can, because they are much faster than memory.
However, there is a problem. CPUs can only hold a limited number of registers. For the CPU we're going to consider, we have 32 32-bit registers. That's 128 bytes.
A typical PC has somewhere between 128 MB and 1 GB of RAM. That is roughly a million times more memory than the registers hold, or more. However, all that memory comes at a price. It's slow to access.
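The numbers in this section are easy to check. The following Python sketch redoes the arithmetic, assuming 1 MB means 2**20 bytes and 1 GB means 2**30 bytes. The variable names are just for illustration.

```python
# Size of the register file: 32 registers of 32 bits each.
registers = 32
bits_per_register = 32
register_bytes = registers * bits_per_register // 8
print(register_bytes)  # 128

# RAM sizes, assuming 1 MB = 2**20 bytes and 1 GB = 2**30 bytes.
ram_small = 128 * 2**20
ram_large = 1 * 2**30
ratio_small = ram_small // register_bytes
ratio_large = ram_large // register_bytes
print(ratio_small)  # 1048576
print(ratio_large)  # 8388608

# The "back in a minute" comparison: one minute, 400 times as slow.
slowdown = 400
hours, mins = divmod(1 * slowdown, 60)
print(hours, mins)  # 6 40
```

So even the smaller RAM holds about a million times as many bytes as the registers, and a one minute wait stretched by a factor of 400 becomes 6 hours and 40 minutes.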
This is one of the facts about memory. The faster you want memory, the more it costs, and the less of it you can have. To think about why this might occur, let's use an analogy. Think of a city. Suppose it has 100 streets (with traffic lights) going north-south, and 100 streets going east-west. Imagine how long it might take to travel from one corner to the other.
Now imagine that you only have two streets going north-south, and two streets going east-west. It's much faster to travel in this small town, because there are far fewer streets, which you can place quite close together.
Similarly, the more registers you have, the more space they take up on the CPU, and the more time it takes to access them merely because of the physical space they take up. When you're trying to run a CPU as fast as possible, these things become important.
Summary
- A parallel load register is a sequential device. This means it uses a clock. Specifically, a parallel load register can only change its value when the clock signal is at a positive edge. When the clock signal is not at a positive edge, the register holds its value.
You can tell a register whether to parallel load or to hold a value by setting the value of c, the control bit, to 1 or 0, respectively.
However, this bit is only read at the positive clock edge. If you do not have the value of c set to the value you want when the positive edge occurs, the register will not perform the operation you intended.
(Actually, you need to have the value set a short time before the edge occurs, and this value must persist until a short time after the edge occurs. The actual amount of time needed depends on the register itself, and is usually part of the technical specifications for the register.)