Copyright © Department of Computer Science Rochester Institute of Technology
All Rights Reserved
Our earlier discussion of control unit (CU) design focused on hardwired control units - i.e., ones where the control functions for the various components were synthesized using combinational logic. In this document, we will consider an alternate design technique: the microprogrammed CU.
Recall that a combinational network generates a set of Boolean functions whose values are used to control a set of micro-ops: one Boolean value means ``don't do the operation''; the other means ``do the operation''. We can get the same effect by connecting the control inputs to the bits of a register, and putting a bit pattern in the register which has the right values to operate the necessary control functions.
A word which contains this type of control function pattern is called a control word (CW). A CU whose control information comes from such an arrangement is called a microprogrammed control unit (MCU).
Clearly, performing sequences of micro-ops will require that we move a sequence of CWs into the control register. In essence, each CW is seen by the MCU as an ``instruction''; these are known as microinstructions. A series of microinstructions is called a microprogram. Often, the microprogram is burned into a read-only memory (ROM), resulting in a static microprogram.
Sometimes, though, an MCU is designed in such a way that its microprogram can be extended or modified by the customer. In this case, some or all of the microprogram will be kept in RAM. This is known as a dynamic microprogram.
It is important to realize that a microprogrammed computer is really two computers: the one seen by the high-level-language or machine-language programmer, consisting of the program-accessible registers, the PC, main memory, etc.; and the MCU itself, consisting of the control memory (CM), the control address register (CAR), etc. The MCU executes the microprogram found in the CM, which in turn causes the processing of information found in the ``machine-level'' computer.
The relationship between the micro-level and machine-level components can be illustrated this way:
Data paths are shown as solid lines; control paths, as dashed lines. The machine-level computer consists of everything outside the box in the lower lefthand corner; everything inside that box is the micro-level computer.
Because the CAR and the control data register (CDR) are registers, they can be used and modified in parallel. Thus, the CDR can be causing the execution of a collection of micro-ops at the same time that it's being used to generate the next address (via the sequencer) for the CAR.
Each machine instruction is executed through the application of a sequence of microinstructions. Clearly, we must be able to sequence these; the collection of microinstructions which implements a particular machine instruction is called a routine.
The MCU typically determines the address of the first microinstruction which implements a machine instruction based on that instruction's opcode. Upon machine power-up, the CAR should contain the address of the first microinstruction to be executed.
The MCU must be able to execute microinstructions sequentially (e.g., within routines), but must also be able to ``branch'' to other microinstructions as required; hence, the need for a sequencer.
The microinstructions executed in sequence can be found sequentially in the CM, or can be found by branching to another location within the CM. Sequential retrieval of microinstructions can be done by simply incrementing the current CAR contents; branching requires determining the desired CW address, and loading that into the CAR.
Conditional branches are necessary in the microprogram. We must be able to perform some sequences of micro-ops only when certain situations or conditions exist (e.g., for conditional branching at the machine instruction level); to implement these, we need to be able to conditionally execute or avoid certain microinstructions within routines.
Subroutine branches are helpful to have at the microprogram level. Many routines contain identical sequences of microinstructions; putting them into subroutines allows those routines to be shorter, thus saving memory.
Mapping of opcodes to microinstruction addresses can be done very simply. When the CM is designed, a ``required'' length is determined for the machine instruction routines (i.e., the length of the longest one). This is rounded up to the next power of 2, yielding a value k such that 2^k microinstructions will be sufficient to implement any routine.
The first instruction of each routine will be located in the CM at multiples of this ``required'' length. Say this is N. The first routine is at 0; the next, at N; the next, at 2*N; etc. This can be accomplished very easily. For instance, with a four-bit opcode and routine length of four microinstructions, k is two; generate the microinstruction address by appending two zero bits to the opcode:
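A minimal Python sketch of this mapping, using the four-bit opcode and k = 2 from the example. The function name `map_opcode` and its parameters are illustrative and do not come from the original notes.

```python
def map_opcode(opcode: int, opcode_bits: int = 4, k: int = 2) -> int:
    """Return the CM address of the routine for `opcode`.

    Appending k zero bits is the same as multiplying by N = 2**k,
    so routine i starts at address i * N.
    """
    if not 0 <= opcode < (1 << opcode_bits):
        raise ValueError("opcode does not fit in the opcode field")
    return opcode << k  # shift left by k = append k zero bits


# Opcode 0011 maps to address 001100 (decimal 12).
print(format(map_opcode(0b0011), "06b"))
```

With these defaults, the routines start at 0, 4, 8, and so on, up to 60 for opcode 1111.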
Alternately, the n-bit opcode value can be used as the ``address'' input of a 2^n x M ROM; the contents of the selected ``word'' in the ROM will be the desired M-bit CAR address for the beginning of the routine implementing that instruction. (This technique allows for variable-length routines in the CM.)
We choose between all the possible ways of generating CAR values by feeding them all into a multiplexor bank, and implementing special branch logic which will determine how the muxes will pass on the next address to the CAR. As there are four possible ways of determining the next address, the multiplexor bank is made up of N 4x1 muxes, where N is the number of bits in the address of a CW. The branch logic is used to determine which of the four possible ``next address'' values is to be passed on to the CAR; its two output lines are the select inputs for the muxes.
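The multiplexor bank can be modeled bit by bit in a short Python sketch. The names `mux_4x1` and `next_address`, the 7-bit default width, and the candidate values in the usage example are assumptions made for illustration; the sketch only shows that N one-bit 4x1 muxes sharing two select lines pass one of four N-bit candidates to the CAR.

```python
def mux_4x1(inputs, s1, s0):
    """One 4x1 multiplexer: pass the input bit chosen by select lines (s1, s0)."""
    return inputs[(s1 << 1) | s0]


def next_address(candidates, s1, s0, n_bits=7):
    """Model a bank of n_bits 4x1 muxes that share the same select lines.

    `candidates` holds the four possible next-address values.
    Mux i receives bit i of every candidate and outputs bit i of the result.
    """
    if len(candidates) != 4:
        raise ValueError("a 4x1 mux bank needs exactly four candidates")
    address = 0
    for bit in range(n_bits):
        bit_inputs = [(c >> bit) & 1 for c in candidates]
        address |= mux_4x1(bit_inputs, s1, s0) << bit
    return address


# Four arbitrary candidate addresses; select lines 1,0 pick the third one.
candidates = [0b0000101, 0b1000011, 0b0110000, 0b0001111]
print(format(next_address(candidates, 1, 0), "07b"))
```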
As with hardwired CU design, the design of the MCU must be done hand-in-hand with the design of the rest of the CPU.
Consider the simple machine architecture introduced earlier. In order to implement this design, we need to specify more completely the collection of registers to be used at each level of the machine.
The functions of the AR, PC, DR, and AC are as you would expect.
Words are 16 bits each, but there are only 2K of them in memory.
Data transfers are done through mux switching rather than a common bus.
Inputs to the
DR can come from the
PC, memory, the ALU, or the
AR can be loaded from the
DR or the
can only be loaded
We will define only a few instructions, as all instructions will be
EA is the final effective address of the memory operand, however it is calculated (i.e., direct or indirect addressing).
Microinstructions are somewhat different:
Most microinstructions will only specify a few micro-ops; to save memory, then, micro-ops will be grouped into collections of mutually exclusive operations (i.e., ones which won't be done by the same microinstruction). Each collection will be encoded with a three-bit field, of which there are three in the microinstruction. Reserving one code per field to indicate that no operation is to be done, a total of 21 micro-ops can be encoded, as follows:
The CD and BR fields determine when and how branches take place. BR controls the type of branch; CD, the condition under which the branch occurs. Note that branching at the microinstruction level is done to support the execution of machine-level instructions; thus, the conditions under which branches take place are based on those needs.
The microinstruction has access to three types of status information (equivalent to the VAX ``condition code'' bits): DR(15) (the sign bit of the data register); AC(15) (the sign bit of the accumulator); and the ORing together of all bits in the AC.
These are represented in microinstruction source code with the characters
(DR(15), the indirect bit of the machine instruction),
(AC(15), the sign bit of the accumulator), and
(AC=0, for checking zero results).
is used to indicate unconditional transfers.
Note that the CD and BR field contents, while they are interpreted together, are independent; i.e., any CD contents can be used with any BR contents. The condition selected by the CD field determines whether the transfer requested by the BR field takes place. The appropriate condition is tested, yielding a 0 or 1 result; this, in turn, is used by the MCU hardware to determine how the CAR is loaded.
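The field structure described above can be sketched in Python as a pack/unpack pair. The field widths follow from the text (three 3-bit micro-op fields, a CD field selecting among four conditions, a BR field with two bits, and a 7-bit address for a 128-word CM). The left-to-right ordering F1, F2, F3, CD, BR, AD and the names `pack` and `unpack` are assumptions, since the original format diagram is not reproduced here.

```python
# (name, width in bits), most significant field first. The ordering is assumed.
FIELDS = [("F1", 3), ("F2", 3), ("F3", 3), ("CD", 2), ("BR", 2), ("AD", 7)]
WORD_BITS = sum(width for _, width in FIELDS)

# Three fields, each with 2**3 codes, one of which is reserved for "no operation".
MICRO_OPS_ENCODABLE = 3 * (2**3 - 1)


def pack(**values):
    """Build a microinstruction word from named field values (missing fields are 0)."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a microinstruction word back into its named fields."""
    fields = {}
    shift = WORD_BITS
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


word = pack(F1=5, CD=2, BR=1, AD=67)
print(WORD_BITS, MICRO_OPS_ENCODABLE, unpack(word))
```

Because CD and BR occupy separate bit positions, any CD value can be combined with any BR value, which is the independence noted above.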
As it is structured, there is nothing to guarantee that an ``increment'' adjustment to the CAR will be done. Thus, sequential microprogram execution must be done by embedding a ``transfer to the next location in sequence'' unconditional jump into the previous microinstruction!
Now that we know what microinstructions look like at the hardware level, let's talk about how we specify them at the software level. Microprograms are sequences of microinstructions; these, in turn, are written in a more human-readable form in a microassembly language, which is translated from source form into machine form by a microassembler.
Microassemblers are very much like assemblers. Source statements consist of five fields:
[label:] micro-ops CD-spec BR-spec address
There is also one directive,
which serves the same purpose as in assembly language.
Unlike the hardwired design, the MCU is programmed, so all cycles
(fetch, decode, etc.) must exist as microprograms.
Once the micro-op sequence is determined, a microprogram is constructed
by writing microassembly statements which specify the desired operations.
The fetch cycle has the following micro-op sequence:
AR <- PC
DR <- M[AR], PC <- PC + 1
AR <- DR(10-0), CAR(5-2) <- DR(14-11), CAR(6,1,0) <- 0
This is coded as follows:
Why does this cycle begin at location 64?
We need space for instruction routines, which must begin at location 0
in the CM (mapping of opcode
With 16 instructions, and four microinstructions per routine, we will need 64 locations
at addresses 0..63; thus, the cycles must begin at location 64.
The resulting microcode looks like this:
After the fetch cycle, the CAR will contain an address of the form 0xxxx00, where the x bits are the bits of the opcode just retrieved. At that location will be the first microinstruction of the routine implementing the instruction.
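The fetch micro-op sequence can be traced with a small Python sketch. The dictionary-based register file, the function name `fetch`, and the sample instruction word are illustrative assumptions; the bit positions (I in bit 15, opcode in bits 14-11, address in bits 10-0) are taken from the micro-ops listed earlier.

```python
ADDR_MASK = 0x7FF  # 11-bit addresses: 2K words of memory


def fetch(state, memory):
    """Apply the three fetch-cycle micro-op steps to the register state."""
    # Step 1: AR <- PC
    state["AR"] = state["PC"]
    # Step 2: DR <- M[AR], PC <- PC + 1
    state["DR"] = memory[state["AR"]]
    state["PC"] = (state["PC"] + 1) & ADDR_MASK
    # Step 3: AR <- DR(10-0), CAR(5-2) <- DR(14-11), CAR(6,1,0) <- 0
    state["AR"] = state["DR"] & ADDR_MASK
    opcode = (state["DR"] >> 11) & 0xF
    state["CAR"] = opcode << 2
    return state


# A sample instruction: I = 1, opcode = 0011, address field = 0x123.
memory = {0: (1 << 15) | (0b0011 << 11) | 0x123}
state = fetch({"PC": 0, "AR": 0, "DR": 0, "CAR": 0}, memory)
print(format(state["CAR"], "07b"), hex(state["AR"]), state["PC"])
```

The CAR ends up holding the opcode in bits 5-2 with zeros elsewhere, so it points at the first word of that instruction's four-word slot.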
For memory-reference instructions, the AD field in the machine
instruction may actually be the address of an indirect word.
We need to be able to translate that into a final EA for the operand;
however, each machine instruction routine is only four microinstructions in length.
Fortunately, we have the ability to do subroutines.
The solution, then, is to have each machine instruction routine call a subroutine which performs the indirection.
We'll do a conditional call to the indirect routine from the machine instruction routine based on the value in the instruction's I bit.
The indirect routine can be placed anywhere in CM except in the first 64 locations; for simplicity, we can place it immediately after the fetch cycle.
The indirect routine has the following micro-op sequence:
DR <- M[AR]
AR <- DR
This is coded as follows:
We begin by reading a machine word from memory. If we are here, we know that the machine instruction specified indirect addressing, so the AR contains the address of the indirect word (put there at the end of the fetch cycle). We retrieve that into the DR, and unconditionally jump to the next location.
Next, we copy the DR into the AR (which puts the final EA into the AR), and return to wherever we were called from.
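The two steps of the indirect routine can be sketched in Python as follows. The function name `indirect` and the dictionary register model are illustrative. Truncating the DR to 11 bits when it is copied into the AR is an assumption based on the 2K-word memory; the notes write this step simply as AR <- DR.

```python
ADDR_MASK = 0x7FF  # assumed: the AR holds 11-bit addresses (2K words)


def indirect(state, memory):
    """Replace the address of the indirect word in AR with the final EA."""
    # Step 1: DR <- M[AR]   (read the indirect word)
    state["DR"] = memory[state["AR"]]
    # Step 2: AR <- DR      (its contents become the effective address)
    state["AR"] = state["DR"] & ADDR_MASK
    return state


# The indirect word at 0x123 holds the operand address 0x456.
memory = {0x123: 0x0456}
state = indirect({"AR": 0x123, "DR": 0}, memory)
print(hex(state["AR"]))
```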
Each machine instruction routine for memory-reference instructions begins with a conditional call to the indirection routine. These are coded according to the required micro-op sequences, as described earlier.
instruction has the following micro-op sequence:
If ( I = 1 ) Then (AR <- M[AR])
DR <- M[AR]
AC <- AC + DR
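This micro-op sequence can be traced in a short Python sketch. The function name `execute_add` and the register dictionary are illustrative. Two assumptions are made: the I bit is read from DR(15), which still holds the instruction after the fetch cycle, and the sum wraps around to 16 bits, since the notes do not describe carry handling.

```python
ADDR_MASK = 0x7FF   # 11-bit addresses
WORD_MASK = 0xFFFF  # 16-bit words; wraparound on overflow is assumed


def execute_add(state, memory):
    """Trace the micro-op sequence: optional indirection, operand read, add."""
    # If (I = 1) Then (AR <- M[AR]); I is DR(15) of the fetched instruction.
    if (state["DR"] >> 15) & 1:
        state["DR"] = memory[state["AR"]]
        state["AR"] = state["DR"] & ADDR_MASK
    # DR <- M[AR]
    state["DR"] = memory[state["AR"]]
    # AC <- AC + DR
    state["AC"] = (state["AC"] + state["DR"]) & WORD_MASK
    return state


# Direct addressing: the operand 5 is stored at address 0x010.
state = {"AC": 7, "AR": 0x010, "DR": 0x0010}
print(execute_add(state, {0x010: 5})["AC"])
```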
The first operation is already available as the indirect routine, so we begin by calling that.
The second operation is a
micro-op; the third, an
After these, we want to fetch the next machine instruction, which
involves going to the fetch cycle; as we got here via the
transfer, we must
Here is the implementation of the
Here are implementations of the other selected instructions:
Note that each begins with an
As these routines vary in length, some will occupy all four words in
the ``slot'' allocated to the machine instruction, while others won't.
To ensure that each begins at the appropriate place, we preface each one
to the appropriate point in the CM.
The machine instruction routines occupy locations 0..63, the fetch cycle occupies 64..66, and the indirect routine 67..68; thus, locations 69..127 are still empty. These can be used for other common routines required by additional machine instructions, or to hold subroutines for machine instruction routines which require more than four microinstructions to implement.
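The layout arithmetic can be checked with a few lines of Python. The region names and the helper `free_locations` are illustrative; the sizes (16 instructions, four words per routine, a three-word fetch cycle, a two-word indirect routine, 128 words of CM) come from the text.

```python
CM_SIZE = 128            # 7-bit CAR
ROUTINE_SLOTS = 16 * 4   # 16 opcodes, four microinstructions per slot

REGIONS = {
    "instruction routines": range(0, ROUTINE_SLOTS),  # 0..63
    "fetch cycle": range(64, 67),                     # 64..66
    "indirect routine": range(67, 69),                # 67..68
}


def free_locations():
    """Return the sorted CM addresses not claimed by any region."""
    used = set()
    for region in REGIONS.values():
        used.update(region)
    return sorted(set(range(CM_SIZE)) - used)


free = free_locations()
print(free[0], free[-1], len(free))
```

The first free address is 69 and the last is 127, leaving 59 words for additional routines.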
Here is the full microprogram in binary:
As with the hardwired CU implementation, we must implement the MCU by interpreting (actually, decoding) the fields within the microinstruction and using that information to control the operation of the various registers in the CPU.
The ALU and machine-level registers are designed as in the hardwired version. Translation of the various fields of the microinstruction into control signals can be done easily with a collection of decoders. Each Fi field can be translated with a 3x8 decoder; the outputs will specifically indicate which micro-ops are required, and can be used as control inputs to the rest of the implementation logic. Translation of the CD and BR fields could be done with decoders; however, as they are actually used to select from a number of possible choices, multiplexors are more commonly used.
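A 3x8 decoder is simple to model. In the Python sketch below the function name `decode_3x8` is illustrative, and no particular assignment of codes to micro-ops is implied; which code means ``no operation'' is left open, as in the text above.

```python
def decode_3x8(code):
    """Model a 3x8 decoder: exactly one of the eight outputs is 1."""
    if not 0 <= code < 8:
        raise ValueError("a 3-bit field holds values 0..7")
    return [1 if line == code else 0 for line in range(8)]


# Field value 101 raises output line 5 and no other line.
print(decode_3x8(0b101))
```

Since only one output line is active per field, micro-ops that share a field can never be requested by the same microinstruction, which is why mutually exclusive operations are grouped together.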
The sequencer is the other major component of the MCU. Its purpose is to present an address to the CM so that the next microinstruction can be retrieved; as such, it must have input from the CD, BR, and AD fields of the microinstruction.
Diagram of microinstruction micro-op decoding:
Diagram of sequencer implementation:
The incrementer circuit can be constructed with two 4-bit adders. Input A7 is 0; A6 through A0 come from CAR. The B7 through B0 inputs are 00000001. Carry in to position 0 is 0; carry out from position 3 becomes carry-in to position 4. Output 7 is ignored; outputs 6 through 0 are sent to MUX 1.
Alternatively, a special-purpose ``add 1'' circuit can be designed from half-adders. Inputs to the low-order half-adder are CAR(0) and 1. For the higher-order half-adders, inputs are CAR(i) and the carry-out from the next lower half-adder.
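The half-adder version can be sketched in Python as a ripple of one-bit stages. The names `half_adder` and `increment` are illustrative; the 7-bit width matches the CAR, and the final carry-out is discarded.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b


def increment(car, width=7):
    """Add 1 to `car` using a chain of half-adders, one per bit."""
    carry = 1  # the low-order half-adder adds the constant 1
    result = 0
    for i in range(width):
        bit = (car >> i) & 1
        s, carry = half_adder(bit, carry)  # carry feeds the next stage
        result |= s << i
    return result  # the carry out of the top stage is dropped


print(increment(66), increment(127))
```

Incrementing 66 gives 67, and incrementing 127 wraps around to 0 because the carry out of the highest stage is ignored.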
The input logic circuit for MUX 1 has the following (simplified) truth table:
where I1 and I0 come from the BR field of the microinstruction, T is the result of the condition test selected by CD, and the output L becomes the LOAD input of the SBR.