Computer Architecture - Common Project Questions

Last updated 2012/05/22 17:19:03



Contents

Notes:

General comments.

  1. Clock cycle count differences
  2. Mixing C++ stream i/o and stdio
  3. Obtaining the sources for the library

Cautions:

Things to be careful about when using the ArchLib, and possible bugs in the library.

  1. Possible bug: StorageObject ( ) operators
  2. Short StorageObjects and value()
  3. Use of ?: and the library
  4. op_not bug in 2.2g and earlier
  5. Dynamic allocation of components

Questions:

Specific questions about using the ArchLib.

  1. Hidden tests
  2. Understanding 'try' difference listings
  3. Uninitialized Memory locations
  4. CPUObject::debug flags
  5. Can the MAR be incremented?
  6. Why is the PC latching directly from memory?
  7. Using library include files vs. copying them
  8. Why can't Constants connect to BusALU?
  9. Can I pull from and update a register simultaneously?
  10. Why doesn't the compiler understand incr2 and op_sub?
  11. Creating an array of StorageObjects
  12. Grouping StorageObjects into a struct
  13. Grouping StorageObjects into a class
  14. When can I use C++ operators, and when must I use library components?
  15. How does the BusALU::op_extendSign operator work?
  16. How can I connect a register file to a bus?

Notes

  1. It is normal for your solution to produce a "clock cycle" count which is slightly different from my solution. The clock cycle count is just that - a count of the number of times your solution executed a Clock::tick() call. If your design involves more data transfers than my design, your cycle count will be larger.

  2. Generally, mixing C-style stdio and C++-style iostream i/o operations in your programs is a Bad Idea. When output is sent to the screen, mixed i/o may appear to work; however, when output is redirected to a file (which is what happens when you submit your solutions for grading), stdio and iostream output are buffered separately, and are flushed to the file separately. This causes different parts of your output to appear at the wrong time. If you want to use the stdio printf() function to do output formatting, use sprintf() into a buffer, and then send that buffer to cout.
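
    Here is a minimal sketch of the "format into a buffer, then send the buffer to cout" approach. The function names and the output layout are made up for illustration, and snprintf() is used in place of sprintf() only because it cannot overrun the buffer.

```cpp
#include <cassert>
#include <cstdio>
#include <iostream>
#include <string>

// Hypothetical helper: format an address/contents pair with printf-style
// conversions, but return the text instead of printing it through stdio.
std::string formatTraceLine( unsigned addr, unsigned word ) {
   char buf[64];
   // snprintf never writes more than sizeof(buf) bytes, including the NUL
   int n = std::snprintf( buf, sizeof( buf ), "%04x: %06x", addr, word );
   assert( n >= 0 && n < static_cast<int>( sizeof( buf ) ) );
   (void) n;   // silence "unused" warnings when assertions are disabled
   return std::string( buf );
}

// All output goes through one iostream, so only one buffer is involved.
void printTraceLine( std::ostream &out, unsigned addr, unsigned word ) {
   out << formatTraceLine( addr, word ) << '\n';
}
```

    With these, printTraceLine( std::cout, 0x100, 0x3ffec0 ) writes the line "0100: 3ffec0", and it appears in the right place even when the output is redirected to a file.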

  3. For those who wish to experiment with the arch library on other systems, the sources are available in the directory /home/course/vcsg720/pub/misc as gzipped tar files, with names such as arch2-5a.tar.gz.


Cautions

  1. We have found what may be a bug in the arch library implementation of the StorageObject ( ) operators when used to manipulate a 32-bit register in versions 2.2g and earlier. Specifically, with a 32-bit register containing the value 0x234, accessing the entire contents with reg(31,0) sometimes gives a result of 0 instead of 0x234. To get around this situation, don't use ( ) to get the entire contents of the register. Use the value() or uvalue() accessor function instead.

    This problem was corrected in version 2.4a of the library. If you are using 2.4a (or later) and believe this problem still exists, please contact me as soon as possible.

  2. When using StorageObjects which are smaller than 32 bits, the value() method doesn't return a signed result. The method is typed as returning a long; however, in several versions of the library, the result is computed by ANDing the contents with a bit mask, retaining only the relevant bits; if the register length is shorter than 32 bits, the high-order bits become zeroes, and the value coming back will always appear to be non-negative. To check for a negative value in a StorageObject, use the ( ) operator to retrieve the most significant bit.
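
    The effect can be seen with plain C++ arithmetic. This is only a sketch of the masking described above, not the library's code, and the function names are invented for the example.

```cpp
#include <cassert>

// Keep only the low 'bits' bits of 'raw', as a masking accessor would.
// Assumes 1 <= bits <= 32.
unsigned long maskToWidth( unsigned long raw, int bits ) {
   assert( bits >= 1 && bits <= 32 );
   unsigned long mask = ( bits == 32 ) ? 0xFFFFFFFFUL
                                       : ( ( 1UL << bits ) - 1UL );
   return raw & mask;
}

// True if the most significant bit of a 'bits'-wide value is set,
// i.e., the value is negative when read as two's complement.
bool isNegative( unsigned long raw, int bits ) {
   return ( ( maskToWidth( raw, bits ) >> ( bits - 1 ) ) & 1UL ) != 0;
}
```

    A 16-bit register holding 0xFFFF (which is -1 in two's complement) comes back from the masking step as 65535, a non-negative number; testing bit 15 is what reveals that it is negative.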

  3. There appears to be a bug in some C++ compilers which causes them to mishandle "?:" expressions sometimes. The construct

    ( (reg) ? r1 : r2 ).latchFrom( bus.OUT() );
    

    is a legal expression, and should latch either r1 or r2 according to the value of reg. If r1 and r2 are StorageObjects, this works fine. However, if they are instances of a subclass of StorageObject (such as Clearable or Counter), this selection fails, and no latch takes place. This is visible if you have turned on the CPUObject::trace flag in CPUObject::debug; no transfer takes place at the clock tick.

    Fortunately, there is a workaround: don't use the "?:" operator; instead, use an if() statement:

    if( reg )
       r1.latchFrom( bus.OUT() );
    else
       r2.latchFrom( bus.OUT() );
    

    This works correctly.

    We're still trying to figure out exactly what triggers this apparent bug, but for now, avoid this use of the "?:" construct.

  4. There is a bug in the BusALU class implementation in versions 2.2g and earlier for the op_not operation which causes it to require two operands. The op_not operation complements the OP1 input, and doesn't use the OP2 input; however, in library versions 2.2g and earlier, the implementation attempts to retrieve the OP2 input even though it isn't used.

    The workaround is to connect OP2 to something even though it won't be used.

    This problem was corrected in version 2.4a of the library.

  5. If you choose to implement components by dynamically allocating them at runtime, be sure that you deallocate them before your program exits, to ensure that the object's destructor and the virtual destructor in the top-level CPUObject class are invoked.

    In theory, the destructor for a class will be invoked whenever an instance of that class is destroyed. For global variables, this happens at program termination; for local variables and by-value parameters, it happens when the variable goes out of scope (i.e., when you return from the call to that function). This is because the destructor invocation is tied to the "destruction" of the variable which is the instance of the class.

    However, for dynamically allocated objects, there is no "variable" associated with the object - there is only a pointer to it, and that "variable" isn't an instance of the class (it's a pointer). This means that the destructors for such objects will only be invoked when the objects are explicitly destroyed (i.e., when the dynamically allocated space is deallocated via delete.)

    A clue that you may have forgotten to do this is that when your program exits, you do not get the typical end-of-program message from the library,

    Simulated time NN cycles
    
    LAST CPUObject DESTROYED; END OF SIMULATION
    

    The first message is printed when the Clock object is destroyed; the second is printed by the CPUObject virtual destructor when the last CPUObject is destroyed (which is what triggers the destruction of the Clock). Dynamically allocated objects are counted by the CPUObject constructor, which increments a counter; unless their destructors are invoked, the counter is never decremented and never reaches zero (the condition that triggers the end-of-program output).
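
    The counting behaviour is easy to reproduce outside the library. The class below is a hypothetical stand-in (it is not CPUObject, and the names are invented); it only shows that a local object's destructor runs by itself while a dynamically allocated object's destructor waits for delete.

```cpp
#include <cassert>

// Hypothetical stand-in for a library component: it counts live
// instances, in the way the text above describes.
class Component {
public:
   Component() { ++live; }
   virtual ~Component() { assert( live > 0 ); --live; }
   static int live;      // number of instances not yet destroyed
private:
   Component( const Component & );              // not copyable
   Component &operator=( const Component & );
};

int Component::live = 0;

Component *makeOnHeap() {
   Component local;          // destructor runs automatically at return
   return new Component;     // destructor runs only on an explicit delete
}
```

    After Component *p = makeOnHeap(); the count is 1, not 0: the local object is gone, but the heap object is still alive. Only delete p; brings the count back to zero.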


Questions

  1. > Are there hidden tests?

    It's never a bad idea to assume that there are hidden tests. Sometimes these are run by 'try'; other times, they're run afterwards by the grader.

  2. > I have submitted project 1 several times, but each time it said that
    > my output didn't match the correct output. I think that my output
    > satisfied the requirements. I don't understand what the differences
    > are between my output and the correct output.

    The information you want is in the "differences" section:

    Differences (OUT.1 is yours; answer.1 is correct):
    *** OUT.1 Fri Jan 14 13:55:17 2000
    --- answer.1 Wed Jan 5 12:16:00 2000
    ***************
    *** 12,17 ****
    --- 12,18 ----

    MACHINE HALTED due to halt instruction

    +
    Simulated time NNN cycles

    LAST CPUObject DESTROYED; END OF SIMULATION

    The '+' character tells you that this line is in one of the output files but not the other. It occurs in the section following the --- 12,18 ---- line, which means it's part of the answer.1 file, which is my output; in other words, my output had an extra blank line which yours didn't have. This is a minor difference, and won't affect your grade.

    To find out more about these difference listings, you can read the online manual page for the diff command, with man diff.

    > In addition, I have send it for times because I modified the output, I
    > guess that the last one have covered these before it,right?

    Yes, only the last one is kept.

  3. > Let's say I have an OBJ file that contains:
    >
    > 100 1 9fff
    > 100
    >
    > Is it correct to assume (for project 1) that it is the operating
    > system's responsibility to handle attempts to execute instructions
    > that are beyond the program's address space? In other words, is the
    > following output correct (ignore spacing):
    >
    > mem sets starting address to 100
    > 0100: 3ffec0 = JMP 1fff AC=000000
    > 1fff: 000000 = HALT AC=000000
    > MACHINE HALTED due to halt instruction

    Yes, this is fine. You're executing the uninitialized word at 1FFF, which is interpreted as a HALT instruction. Note that 1FFF isn't beyond the address space, though; it's the last word, but it's in the address space.

  4. > [In 'dummest.cpp'] I just noticed the
    >
    > CPUObject::debug = CPUObject::trace | CPUObject::memload;
    >
    > line...if debug is the only public attribute listed, are we supposed
    > to access trace and memload? What does memload do?

    CPUObject::debug is a static data member of the CPUObject class; thus, it acts as a global variable, and all CPUObject instances use the same one.

    CPUObject::trace and CPUObject::memload are constants defined in the CPUObject class. They represent bits in the CPUObject::debug variable, and are used to indicate which debugging options you want to set.

    CPUObject::trace is used to turn on "signal tracing" - that is, for each clock pulse (Clock::tick()), all data movements and ALU/register actions are printed out. This allows you to see exactly what's going on in the various components.

    CPUObject::memload causes memory to dump out its contents after the load() function is called. In the case of 'dummest', using prog1.obj from that directory, you get:

    m[20] = 1234
    m[10] = 0
    m[11] = 1020
    m[12] = 2000

    The 'dummest' example sets both of these in its main() with the statement you described. It doesn't look at any command-line arguments as my sample program does; 'dummest' will always have these debugging options turned on. The sample program (sample.cpp) checks the command-line arguments to determine which, if any, debugging options to turn on, so it's more flexible than 'dummest'. Options are specified as single-character arguments on the command line following the object file; e.g., "dummest prog1.obj t l".

    There is one other debugging option, CPUObject::create. This option causes each CPUObject to announce its creation; the CPUObject constructor checks to see if this is set, and if so it prints a message saying that the object has been created. The destructor also checks this flag, and prints a destruction message when it's invoked.
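
    If the "bits in a static variable" idea is unfamiliar, here is the pattern in isolation. This is a hypothetical stand-in class; the flag names mirror the ones above, but the bit values and the isSet() helper are invented for the example.

```cpp
#include <cassert>

// Hypothetical stand-in showing the pattern only; the bit values are
// made up and need not match the library's.
class DebugDemo {
public:
   enum Flag { trace = 0x1, memload = 0x2, create = 0x4 };
   static int debug;     // one copy, shared by every instance
   static bool isSet( Flag f ) { return ( debug & f ) != 0; }
};

int DebugDemo::debug = 0;   // a static data member is defined exactly once
```

    Setting DebugDemo::debug = DebugDemo::trace | DebugDemo::memload; turns on two options at once; isSet( DebugDemo::create ) is still false because that bit was never ORed in.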

  5. > Is there any way to increment the MAR directly, without loading the
    > value into the ALU and adding one to it there?

    Not the MAR; it's a StorageObject, which has no increment capability.

    > That's what I thought, but I am confused as to how you would get a
    > trace like this one:
    >
    >          _________
    > _____/0000005 \_____
    > Memory@100-->ffff-->MDR
    >          _________
    > _____/0000006 \_____
    > MDR incremented to 0 (overflow)

    This uses a Counter, not a StorageObject, to implement the MDR, specifically so that it would have an 'increment' capability. Counter is a subclass of StorageObject, so it can be used anywhere a StorageObject can be used.

    The MAR is part of (and defined in) Memory, and is defined as a StorageObject, so you can't increment it.

  6. > It looks like the PC is latching
    > directly from memory. I was thinking that the MAR would be a necessary
    > intermediary. If I am misguided, when would a data read or write be
    > required to go through the MAR?

    If you're referring to the first clock tick, which puts the initial PC value into the PC, that's from code of the form

    m.load(objectfilename);
    pc.latchFrom( m.READ() );
    Clock::tick();
    

    where 'm' is a Memory object and 'pc' is, of course, a StorageObject (or similar) which holds the PC. When Memory loads an object file, the entry point is immediately available on the READ() OutFlow without the use of the MAR, specifically so that the PC can be initialized right away. This is the only time Memory can be "read" without the use of the MAR.

  7. > One thing that is troublng me is that do we have to make our own
    > arch library files like Bus.h and BusALU.h and all of them or just include
    > them which are already been written earlier an just copy or inlude them in
    > our directory

    Just include them. Don't create your own, or make copies; just include them with (e.g.)

    #include <Memory.h>
    

    and use the header.mak file and 'makemake' to create your Makefile.

  8. > Constant zero( "ZERO", DATA_BITS, 0x0 ); // zero constant register
    > Constant one( "one", DATA_BITS, 0x1 ); // zero constant register
    >
    > How do I connect them to OP2 in the alu?
    >
    > connectTo does not Compile.

    You can't connect them to a BusALU, because Constants aren't StorageObjects, and thus can't be used that way. A Constant is actually a one-ended Bus; you can feed one into a StorageObject, but that's about all you can do with them. To do what you want to do, you need to use a StorageObject instead of a Constant.

  9. > Can I pullFrom the PS register in the same clock tick that I am writing to it
    > as follows:
    >
    > alu.OP2().pullFrom(ps);
    > alu.OP1().pullFrom(one);
    > ps.latchFrom(alu.OUT());
    > alu.perform(BusALU::op_sub);
    > Clock::tick();

    Yes. Remember that the PS, being a sequential circuit, will examine its inputs during the first half of the cycle, but won't change its outputs until the second half of the cycle, by which time the ALU has already acted.
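
    A toy model may make the timing clearer. This is not how the library is implemented, and every name in it is invented; it only shows a register whose readers see the old value until the tick, with "subtract 1" standing in for whatever the ALU computes.

```cpp
#include <cassert>

// Toy model of a register that samples its input before the tick and
// changes its visible contents only at the tick.
class TwoPhaseRegister {
public:
   explicit TwoPhaseRegister( long initial )
      : current( initial ), next( initial ), pending( false ) {}
   long value() const { return current; }               // what readers see
   void latch( long v ) { next = v; pending = true; }   // sample the input
   void tick() {                                        // update the output
      if( pending ) { current = next; pending = false; }
   }
private:
   long current, next;
   bool pending;
};

// One cycle of "read the register, compute, write the result back".
// Returns the value that was still visible just before the tick.
long updateInOneCycle( TwoPhaseRegister &ps ) {
   long aluOut = ps.value() - 1;     // combinational logic sees the OLD value
   ps.latch( aluOut );               // the register samples the ALU output
   long seenBeforeTick = ps.value();
   assert( seenBeforeTick == aluOut + 1 );   // contents have not changed yet
   ps.tick();                        // only now does the stored value change
   return seenBeforeTick;
}
```

    Starting with the register at 5, updateInOneCycle() returns 5 (the value everyone saw during the cycle) and leaves the register holding 4.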

  10. > I am able to get the counter to increment a single value by
    >
    > Ctr.incr();
    >
    > But when I try to increment by the following, I get a compiler error.
    >
    > Ctr.perform(incr2);

    You need a scope resolution operator here:

    Ctr.perform( Counter::incr2 );
    

    The increment will happen on the next clock pulse, so you can set up other actions before the Clock::tick().

    This must be done with any "nested" declarations - e.g., Operation in BusALU, and the debugging modes in CPUObject.
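
    The same rule can be seen with a small class of your own. ToyCounter below is a hypothetical stand-in (not the library's Counter) with a nested enumeration; inside the class the short names work, but outside they must be qualified.

```cpp
#include <cassert>

// Hypothetical stand-in with a nested enumeration, similar in spirit to
// the nested operation names in the library classes.
class ToyCounter {
public:
   enum Operation { none, incr1, incr2 };
   ToyCounter() : count( 0 ), pending( none ) {}
   void perform( Operation op ) { pending = op; }   // takes effect at tick()
   void tick() {
      if( pending == incr1 ) count += 1;   // inside the class: no prefix needed
      if( pending == incr2 ) count += 2;
      pending = none;
   }
   int value() const { return count; }
private:
   int count;
   Operation pending;
};

int demoScopedName() {
   ToyCounter ctr;
   // ctr.perform( incr2 );            // would not compile: incr2 is unknown here
   ctr.perform( ToyCounter::incr2 );   // OK: qualified with the class name
   ctr.tick();
   return ctr.value();
}
```

    Calling demoScopedName() returns 2; uncommenting the unqualified call gives an "undeclared identifier" style of error, which is what the question above ran into.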

  11. > Is it possible to create an array of StorageObjects to simulate a
    > register file, and use the register number as an index?

    Yes, in most C++ implementations (see below regarding g++). You must provide constructor initialization expressions for each element of the array, because StorageObject doesn't have a default constructor.

    In most C++ compilers, something like this works well:

    StorageObject reg_file[32] = {
       StorageObject( "R0", WIDTH ),
       StorageObject( "R1", WIDTH ),
       ...
       StorageObject( "R30", WIDTH ),
       StorageObject( "R31", WIDTH )
    };
    

    You can also do this with classes which are derived from StorageObject; for example, if you want to make the registers Counters, do something like this:

    Counter reg_file[32] = {
       Counter( "R0", WIDTH ),
       Counter( "R1", WIDTH ),
       ...
       Counter( "R30", WIDTH ),
       Counter( "R31", WIDTH )
    };
    

    If you want to mix-and-match them, you need to get fancier with your declarations. Essentially, you'll need to make the array an array of pointers (so that you can use the polymorphic capabilities of C++), do the allocations dynamically, and may need to work a bit with casts to make the compiler happy.

    Unfortunately, though, when using g++, that method doesn't work, because g++ requires that the base class have a copy constructor, which the arch library classes don't have. (Technically, only one component, CPUObject, has a copy constructor; however, in order to make it impossible to pass components to functions as non-reference parameters, the copy constructor is tagged as private.) This generates a long list of typical (i.e., almost unreadable) C++ class-related error messages when the compiler attempts to synthesize a copy constructor for the base class and can't.

    To do this in g++, you must create the array as an array of pointers to the component type. (The other option, an array of references to components, is illegal in C++.) The simplest approach is to create the components statically and initialize the array to contain their addresses:

    Counter reg_0( "R0", WIDTH );
    Counter reg_1( "R1", WIDTH );
    ...
    Counter reg_30( "R30", WIDTH );
    Counter reg_31( "R31", WIDTH );
    Counter *reg_file[32] = {
       &reg_0, &reg_1, ... &reg_30, &reg_31
    };
    

    Be sure to initialize the elements correctly (i.e., so that reg_file[n] really holds the address of reg_n instead of one of the other register variables). Alternatively, you could create the array without initializing it and then assign the addresses of the components to the array elements during execution.

    For a more "C++"-ish approach, you can avoid using arrays at all by creating an undimensioned std::vector of pointers to components and pushing the addresses of the individual components into it:

    #include <vector>
    
    std::vector<Counter*> reg_file;
    ...
    
    reg_file.push_back( &reg_0 );
    reg_file.push_back( &reg_1 );
    ...
    reg_file.push_back( &reg_30 );
    reg_file.push_back( &reg_31 );
    

    Again, you'll need to be careful to push them in the correct order to ensure that reg_file[n] is really reg_n, because you're not specifying vector positions for the components.

    Regardless of whether you're using an array of pointers or a vector of pointers, once you have the data structure set up, you select the element you want to use with subscript notation. However, you must either prefix the subscript expression with the star (dereference) operator or use the arrow operator (dereference and select) instead of the dot operator. To connect register 1 and register 15 to a data bus, for instance, here are examples of both methods:

    (*reg_file[1]).connectsTo( dbus.IN() );
    (*reg_file[1]).connectsTo( dbus.OUT() );
    
    reg_file[15]->connectsTo( dbus.IN() );
    reg_file[15]->connectsTo( dbus.OUT() );
    

    Beware: if you use the first form (star operator to dereference the pointer, then use the dot operator to select the member function), you need to use parentheses as shown above because the dot operator (like the subscript operator) has higher precedence than the star operator, and so *x[n].y would be parsed as *(x[n].y), which would be an error (because x[n] is a pointer and doesn't have a member named y).

    If you need to pass an element as a parameter to a function you must use the star operator, because the value of (e.g.) reg_file[1] is a pointer to a Counter, not an actual Counter (which is what is required to match the reference parameter specification). To copy register 1 into register 15 via the data bus, for instance, you would do this:

    dbus.IN().pullFrom( *reg_file[1] );
    (*reg_file[15]).latchFrom( dbus.OUT() );
    // alternate: reg_file[15]->latchFrom( dbus.OUT() );
    

    Alternatively, you can allocate the components using new; for example, using the array approach:

    Counter *reg_file[32];
    ...
    
    reg_file[0] = new Counter( "R0", WIDTH );
    reg_file[1] = new Counter( "R1", WIDTH );
    ...
    reg_file[30] = new Counter( "R30", WIDTH );
    reg_file[31] = new Counter( "R31", WIDTH );
    

    However, if you do this, you must remember to destroy each of the components explicitly, or else their destructors won't be invoked; see Caution #5 above for an explanation.
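
    A pair of loops keeps the allocation and the cleanup in step. The sketch below uses a hypothetical stand-in class, ToyComponent, taking a (name, width) pair like the examples above; the helper names and the WIDTH value are invented. Note that the stand-in copies its name into a std::string, so handing it a temporary string is safe here; that is not an assumption you should make about the library's components without checking.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a library component taking (name, width).
class ToyComponent {
public:
   ToyComponent( const char *name, int width )
      : name_( name ), width_( width ) { ++live; }
   ~ToyComponent() { --live; }
   const std::string &name() const { return name_; }
   int width() const { return width_; }
   static int live;            // instances not yet destroyed
private:
   std::string name_;          // the stand-in keeps its own copy of the name
   int width_;
   ToyComponent( const ToyComponent & );             // not copyable
   ToyComponent &operator=( const ToyComponent & );
};
int ToyComponent::live = 0;

const int N_REGS = 32;
const int WIDTH = 16;          // assumed width, for illustration only

void createAll( ToyComponent *reg_file[] ) {
   for( int i = 0; i < N_REGS; ++i ) {
      std::ostringstream name;
      name << "R" << i;        // "R0", "R1", ..., "R31"
      reg_file[i] = new ToyComponent( name.str().c_str(), WIDTH );
   }
}

void destroyAll( ToyComponent *reg_file[] ) {
   for( int i = 0; i < N_REGS; ++i ) {
      delete reg_file[i];      // runs the destructor
      reg_file[i] = 0;         // don't leave a dangling pointer behind
   }
}
```

    Because the index drives both the name and the array position, reg_file[n] can't end up pointing at the wrong register, and calling destroyAll() before the program exits takes care of every destructor.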

  12. > Is it possible to create a struct that contains a set of StorageObjects?

    Again, in most C++ implementations, yes, although use of classes instead of structs is preferred in C++ (see question 13 for how to do that). This is very much like creating an array of StorageObjects. When declaring the struct, you don't specify any constructor information, because you're really just creating a pattern to show the compiler what these things look like.

    In most C++ implementations, this works:

    typedef struct MyStructure {
       StorageObject f1;
       Clearable f2;
       ...
       StorageObject fN;
    } MyTypeName;
    

    You can put the struct declaration in one of your include files - say, types.h. Next, when you declare an instance of this, you must provide constructor calls for all the fields which are library components:

    #include <types.h>
    MyTypeName the_data = {
       StorageObject( "F1", WIDTH ),
       Clearable( "F2", WIDTH ),
       ...
       StorageObject( "FN", WIDTH ),
    };
    

    To use this, just access the members the same way you would for a normal struct:

    #include <types.h>
    ...
    extern MyTypeName the_data;
    ...
    the_data.f1.latchFrom( the_bus.OUT() );
    the_data.f2.set();
    the_bus.IN().pullFrom( the_data.fN );
    

    As you might suspect, this does not work in g++, because of the issue with copy constructors mentioned in Question #11 above. Fortunately, there is a similar solution. Declare the structure type using pointers, declare the actual components as static variables, and initialize the structure variables using the addresses of the components:

    typedef struct {
       StorageObject *f1;
       Clearable *f2;
       ...
       StorageObject *fN;
    } MyTypeName;
    
    ...
    
    StorageObject var_f1( "F1", WIDTH );
    Clearable var_f2( "F2", WIDTH );
    ...
    StorageObject var_fN( "FN", WIDTH );
    
    MyTypeName the_data = {
       &var_f1, &var_f2, ... &var_fN
    };
    

    As with the array of components, accessing the elements now requires using one of the dereference operators. For example, given a structure variable ifid of this kind with pointer members npc and opc:

    ifid.npc->connectsTo( dbus.IN() );
    ...
    dbus.IN().pullFrom( *ifid.npc );
    ifid.opc->latchFrom( dbus.OUT() );
    
  13. > Is it possible to create a class that contains a set of components?

    Yes. The only things to remember are that you must use subconstructors to provide constructor parameters to the components in your class, and you must either make the components public or provide accessor functions (which is what the arch library does).

    Using the same example we used in the previous question, we would declare the class this way:

    class MyClass {
    public:
       MyClass();
       StorageObject f1;
       Clearable f2;
       ...
       StorageObject fN;
    };
    

    This can go in one of your include files - say, types.h. In the implementation for the constructor, you simply use subconstructors to provide the constructor parameters for the components:

    #include <types.h>
    ...
    MyClass::MyClass() :
       f1( "F1", WIDTH ),
       f2( "F2", WIDTH ),
       ...
       fN( "FN", WIDTH )
    {
       // anything else you need to do in the constructor
    }
    

    To use this, you declare the variable and access the components as you would normally expect:

    #include <types.h>
    ...
    MyClass the_data;
    ...
    the_data.f1.latchFrom( the_bus.OUT() );
    the_data.f2.set();
    the_bus.IN().pullFrom( the_data.fN );
    

    See the arch library source code for examples of how to do this using accessor functions instead of making the components public.
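
    If you'd rather not dig through the library source, here is the general shape of the accessor version. ToyRegister is a hypothetical stand-in for a library component, and the member and accessor names are invented; this is a sketch of the idea, not a copy of what the library does.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a library component taking (name, width).
class ToyRegister {
public:
   ToyRegister( const char *name, int width )
      : name_( name ), width_( width ) {}
   const std::string &name() const { return name_; }
   int width() const { return width_; }
private:
   std::string name_;
   int width_;
   ToyRegister( const ToyRegister & );              // not copyable
   ToyRegister &operator=( const ToyRegister & );
};

const int WIDTH = 16;   // assumed width, for illustration only

// The components are private; callers reach them through accessors that
// return references, so no component is ever copied.
class MyClass {
public:
   MyClass() : f1_( "F1", WIDTH ), f2_( "F2", WIDTH ) {}
   ToyRegister &f1() { return f1_; }
   ToyRegister &f2() { return f2_; }
private:
   ToyRegister f1_;
   ToyRegister f2_;
};
```

    The only change at the point of use is a pair of parentheses: the_data.f1() instead of the_data.f1. Returning a reference matters; returning by value would need the copy constructor that components don't provide.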

  14. > In what cases are we permitted to use regular C++ operators. For
    > example, to simulate tying the High order 12 bits of the IR to the
    > low order 12 bits of the AC while preserving the sign, I was thinking
    > of using a line like:
    >
    > int i = j >> 12;
    > if ( i & 0x800 )
    > i |= 0xFFF000;
    >
    > But I don't know if that would be cheating because I'm not using the ALU.
    > Should we just not use the operators:
    > + - / * >> <<
    > at all in the program? Thanks.

    The best rule of thumb for this is that anything which needs to be kept in a component (a StorageObject, a Clearable, Memory, etc.) must be done within the library, and anything which is used for "decision making" can be done outside the library. Here are a few examples:

    inside:   generating addresses, moving addresses or data from one place to another, modifying contents of components
    outside:   decoding opcodes, input multiplexing (e.g., deciding which of several inputs to a bus will be used)

    Another way to think about this is that the library implements the sequential components of the architecture and a few combinational components (bus, ALU), and you use C++ to implement the other combinational components, with the restriction that all data movement must be done through the library.

    So, to move data from the upper half of IR into the lower half of AC, you must use library components; the easiest way would be to do a right arithmetic shift through the ALU.

  15. > Can you explain the op_extendSign function of the ALU?
    > I couldn't understand how it completely works..for 16 bit
    > extensions and 8 bit extensions

    Connect the original data to OP1. Connect a bit mask to OP2; this mask has a single '1' bit in it, at the position which you want to use as the sign bit. Perform the BusALU::op_extendSign operation.

    If you want to sign-extend an 8-bit value, the mask should have bit 7 set (e.g., 0x00000080); for a 16-bit value, it should be bit 15 (0x00008000); for a 5-bit value, it should be bit 4 (0x00000010); etc.
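
    To see what result to expect, here is the arithmetic in plain C++. This illustrates the effect only; it is not the library's code, the function name is invented, and clearing the bits above the sign position when the sign bit is 0 is an assumption about the intended result.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend 'value' using a mask with exactly one bit set, at the
// position to be treated as the sign bit.
std::uint32_t extendSign( std::uint32_t value, std::uint32_t mask ) {
   assert( mask != 0 && ( mask & ( mask - 1 ) ) == 0 );   // single bit only
   std::uint32_t low = mask | ( mask - 1 );   // sign bit and everything below it
   std::uint32_t result = value & low;        // drop anything above the sign bit
   if( value & mask )
      result |= ~low;                         // sign bit set: fill upper bits with 1s
   return result;
}
```

    For example, extendSign( 0x80, 0x00000080 ) gives 0xFFFFFF80 (the 8-bit value -128 widened to 32 bits), while extendSign( 0x7F, 0x00000080 ) stays 0x0000007F because bit 7 is clear.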

  16. > How do I connect a register file to an address bus or data bus?

    Whenever you want to connect registers to a bus, you must call the connectsTo() routine for each register. For an array of registers, the easiest way to do this is to use a loop to iterate across the array; in the body of the loop, you can connect the selected register to whatever it must be connected to. For example, to connect a register file to two buses and all parts of an ALU:

    // 'register' is a reserved word in C++, so the array is named reg_file
    for( int i = 0; i < N_REGISTERS; ++i ) {
       reg_file[i].connectsTo( bus1.IN() );
       reg_file[i].connectsTo( bus1.OUT() );
       reg_file[i].connectsTo( bus2.IN() );
       reg_file[i].connectsTo( bus2.OUT() );
       reg_file[i].connectsTo( alu.OP1() );
       reg_file[i].connectsTo( alu.OP2() );
       reg_file[i].connectsTo( alu.OUT() );
    }
    

    If using the "array of pointers" version, the code would look like this:

    // 'register' is a reserved word in C++, so the array is named reg_file
    for( int i = 0; i < N_REGISTERS; ++i ) {
       reg_file[i]->connectsTo( bus1.IN() );
       reg_file[i]->connectsTo( bus1.OUT() );
       reg_file[i]->connectsTo( bus2.IN() );
       reg_file[i]->connectsTo( bus2.OUT() );
       reg_file[i]->connectsTo( alu.OP1() );
       reg_file[i]->connectsTo( alu.OP2() );
       reg_file[i]->connectsTo( alu.OUT() );
    }