Computer Organization - Final Exam Topics
Exam date:
Wednesday, May 15, 2013, 12:30 - 2:30pm, GOL 3455
Last updated:
2013/05/08 12:32:54
Important Notes
I will provide you with a handout containing the relevant material from
the MIPS reference card found at the front of your textbook to use during
this exam.
You do not need to bring your own copy of the card.
You are NOT allowed to use a calculator on any questions that
specifically ask you to convert numbers from one base to another, unless
the calculator is a simple four-function one.
As this is a comprehensive exam, you may also wish to review the
Exam 1 and
Exam 2
topic lists.
TEXT CHAPTERS
Chapters containing material that was new since the last exam:
- Chapter 4: The Processor
- Chapter 5: Large and Fast: Exploiting Memory Hierarchy
- Chapter 6: Storage and Other I/O Topics
(minimal coverage - very high-level concepts only)
- Appendix C: The Basics of Logic Design (on CD-ROM)
- Appendix D: Mapping Control to Hardware (on CD-ROM)
- Appendix E: A Survey of RISC Architectures (on CD-ROM)
Earlier material:
- Chapter 1: Computer Abstractions and Technology
- Chapter 2: Instructions: Language of the Computer
- Chapter 3: Arithmetic for Computers
- Appendix B: Assemblers, Linkers, and the SPIM Simulator
LECTURE NOTES
Material that is new since the last exam:
- Topic 9: Larger Digital Circuits
- Topic 10: The Arithmetic/Logic Unit
- Topic 11: CPU Design I: Datapath Design
- Topic 12: CPU Design II: Implementing the Control
- Topic 13: Pipelining
- Topic 14: The Memory Hierarchy
- Topic 15: Input and Output
Earlier material:
- Topic 1: Overview
- Topic 2: Information Representation I: Characters and Integers
- Topic 3: Instructions I: Arithmetic and Data Movement
- Topic 4: Instructions II: Control Flow
- Topic 5: Information Representation II: Instructions
- Topic 6: Information Representation III: Floating Point
- Topic 7: Program Translation
- Topic 8: Introduction to Digital Design
TOPICS NEW SINCE EXAM 2
Advanced Digital Design (Part 2)
- common sequential circuit types
- register
- collection of flip-flops to hold individual bits
- typically can load value into all flip-flops in parallel (e.g.,
N-bit parallel-load register)
- counter
- special type of register
- modifies its own contents under control of a clock
- designed to produce a sequence of values - a binary count, or some
other arbitrary sequence
- can have an enable input to turn counting on or off
- register file
- like an array of registers
- typically, one set of data inputs, one or two sets of outputs, and
control inputs to direct operation
- random-access memory (RAM)
- collection of bit cells
- select a set of bit cells by providing an address
- organized as a matrix of bit cells - rows are "words"
- address input is decoded to provide select controls for individual
bits/words
- dynamic RAM (DRAM)
- holds data using capacitors (implemented by transistors)
- one transistor per bit of data
- inexpensive, but loses contents over time due to voltage drain
- must periodically refresh contents to avoid losing information
- commonly used for main memory
- static RAM (SRAM)
- implemented using four to six transistors per bit
- significantly more expensive, but is "self-refreshing"
- holds data as long as it is powered - doesn't require separate refresh cycle
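The register and counter behavior above can be sketched behaviorally in Python; a minimal illustration, with class and method names that are my own rather than from the course:

```python
# Minimal behavioral models of two common sequential circuits.
# Class and method names are illustrative, not from the course notes.

class Register:
    """N-bit parallel-load register: captures its data input on a
    clock tick when the load control is asserted."""
    def __init__(self, width):
        self.width = width
        self.value = 0

    def tick(self, load, data):
        if load:
            self.value = data & ((1 << self.width) - 1)

class Counter(Register):
    """A counter is a special type of register: it modifies its own
    contents under clock control; an enable input turns counting on/off."""
    def tick(self, enable):
        if enable:
            self.value = (self.value + 1) & ((1 << self.width) - 1)

r = Register(8)
r.tick(load=True, data=0xAB)
c = Counter(4)
for _ in range(18):               # 4-bit counter wraps modulo 16
    c.tick(enable=True)
print(hex(r.value), c.value)      # -> 0xab 2
```

Note how the counter's wrap-around falls out of masking to the register width, just as a hardware counter wraps when its high-order carry is discarded.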
The ALU
- need to support all necessary operations
- arithmetic
- logical
- shift/rotate
- sign extension
- use of multiplexors in design
- direct implementation of truth tables without additional gates
- multiplexor folding (e.g., implement 3-var expr with 4x1 mux)
- use mux to select from multiple separate arithmetic/logic circuits
- design philosophies:
- rule #1:
- for simplicity, build 32 one-bit ALUs, then figure out how to connect them
- rule #2:
- build separate hardware blocks for each task
- perform all possible operations in parallel
- use a mux to choose which operation is actually desired
- important concept: use multiplexors whenever we need to select between alternative inputs or results
- simple one-bit ALU
- N different simple operation circuits, one per supported operation
- use Nx1 mux to select desired operation
- operations:
- AND, OR, XOR, etc. with individual gates
- addition with full adder
- subtraction with full adder and two's complement trick
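The one-bit ALU idea above can be sketched in Python: N operation circuits run in parallel, a mux (here, the `op` select) picks the result, and subtraction reuses the adder via the two's complement trick (invert b, carry-in 1). Function names are illustrative:

```python
# Behavioral sketch of a one-bit ALU slice and a 32-bit ripple chain.

def full_adder(a, b, cin):
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def alu_slice(op, a, b, cin):
    """op: 0=AND, 1=OR, 2=ADD, 3=SUB. Returns (result_bit, carry_out)."""
    if op == 3:                   # subtraction: invert b before the adder
        b = b ^ 1
    if op in (2, 3):
        return full_adder(a, b, cin)
    return (a & b, 0) if op == 0 else (a | b, 0)

def alu32(op, a, b):
    """Connect 32 one-bit ALUs by rippling the carry (design rule #1)."""
    cin = 1 if op == 3 else 0     # carry-in 1 completes two's complement
    result = 0
    for i in range(32):
        bit, cin = alu_slice(op, (a >> i) & 1, (b >> i) & 1, cin)
        result |= bit << i
    return result

print(alu32(2, 7, 5))   # -> 12
print(alu32(3, 7, 5))   # -> 2
```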
- tailor ALU to needs of ISA
- e.g., for MIPS, need set-on-less-than, zero
- important details
- ALU is combinational - always active
- speed depends on size
- more complex ALU is larger, therefore slower
- can improve speed with carry-lookahead adders
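The carry-lookahead idea can be sketched with generate/propagate signals; a minimal illustration (names are mine), showing how every carry is computable from the inputs without waiting for a ripple:

```python
# 4-bit carry-lookahead sketch: compute carries from generate (g = a&b)
# and propagate (p = a^b) signals. Bit lists are LSB-first.

def lookahead_carries(a_bits, b_bits, c0):
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    c = [c0]
    for i in range(len(a_bits)):
        # c[i+1] = g_i + p_i*c_i; hardware expands this recurrence
        # combinationally so all carries are produced in parallel
        c.append(g[i] | (p[i] & c[i]))
    return c

# 0b0111 + 0b0001 (LSB-first): carries out of bits 0, 1, 2
print(lookahead_carries([1, 1, 1, 0], [1, 0, 0, 0], 0))  # -> [0, 1, 1, 1, 0]
```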
- multiplication, division
- significantly more complicated than addition/subtraction
- takes many more CPU cycles to do
- communicate with buses
- method 1: use multiplexors to generate bus lines
- method 2: use tri-state devices
Datapath Design
- general concepts
- break up task of "executing an instruction" into stages
- connect the stages to create the entire datapath
- smaller stages are easier to design than entire "execution unit"
- easy to optimize/change one stage without touching the others
- typical stages (described with RTL)
- instruction fetch - all
- instruction decode / register fetch - all
- execution - varies according to need of instruction
- memory access - load and store only
- register write - ALU and load only
- standard components
- register file
- PC
- "add four" unit
- sign-extender
- instruction memory
- data memory
- ALU
- tie components together with multiplexors
- implementation choices
- single-cycle: perform all tasks in a single clock cycle
- multi-cycle: use a clock cycle for each stage
- control signals
- needed to direct operation of datapath components
- generate them combinationally based on opcode, function code
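The combinational generation of control signals from the opcode can be sketched as a lookup table standing in for the gate logic. Signal names follow the textbook's single-cycle MIPS datapath; the table is a sketch covering only a few opcodes:

```python
# Single-cycle main control sketch: control signals are a pure function
# of the opcode (the funct field is decoded separately by the ALU control).

CONTROL = {
    # opcode: (RegDst, ALUSrc, MemtoReg, RegWrite,
    #          MemRead, MemWrite, Branch, ALUOp)
    0x00: (1, 0, 0, 1, 0, 0, 0, 0b10),  # R-format (funct picks ALU op)
    0x23: (0, 1, 1, 1, 1, 0, 0, 0b00),  # lw
    0x2B: (0, 1, 0, 0, 0, 1, 0, 0b00),  # sw (RegDst/MemtoReg don't-care)
    0x04: (0, 0, 0, 0, 0, 0, 1, 0b01),  # beq
}

def main_control(opcode):
    return CONTROL[opcode]

print(main_control(0x23))  # lw asserts ALUSrc, MemtoReg, RegWrite, MemRead
```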
Control Implementation
- control signals depend upon
- instruction being executed
- which step is being performed
- use a finite state machine to design control
- set of states
- "next state" function
- output function
- generating states
- state register
- initially, 0
- combination of current state and opcode used to generate control signals
- current state is used to generate next state
- sequence generator
- a.k.a. counter
- feed output into a decoder to generate state signals
- combinational circuitry uses state signals to generate control signals
- problem: may have many unused states
- implementing control functions
- using a ROM - "address" is opcode and current state, "data" is control signal set
- using a PLA
- relative sizes of the two implementations
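The ROM implementation can be sketched as a table addressed by (current state, opcode) whose data word supplies both the control signals and the next state. The widths, encodings, and the two entries shown are illustrative assumptions, not the actual course FSM:

```python
# ROM-based multicycle control sketch.

STATE_BITS, OPCODE_BITS = 4, 6

def rom_address(state, opcode):
    # Concatenate current state and opcode to form the ROM address.
    return (state << OPCODE_BITS) | opcode

# Tiny ROM: address -> (control_word, next_state). A real ROM would
# hold an entry for every (state, opcode) pair the FSM can reach,
# which is why unused states make the ROM wastefully large.
ROM = {
    rom_address(0, 0x23): (0b1010, 1),   # fetch    -> decode   (lw)
    rom_address(1, 0x23): (0b0100, 2),   # decode   -> mem addr calc
}

def control_step(state, opcode):
    control_word, next_state = ROM[rom_address(state, opcode)]
    return control_word, next_state

print(control_step(0, 0x23))
```

A PLA implements the same function but stores only the product terms actually needed, which is the source of the size difference between the two.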
- microprogrammed control units
- background and basic concepts
- control word
- microinstruction, microprogram
- static vs. dynamic microprogram
- basic MCU configuration
- sequencing
- sequential execution
- branching: conditional, unconditional, call/return
- microinstruction formats
- vertical vs. horizontal vs. hybrid
- control fields
- grouping of micro-ops into control fields
- microinstruction source code
- symbolic microprogramming
- fetch cycle
- routines for machine instructions
- pros and cons of microprogramming
RISC Architectures
- issues in instruction set design
- evolution of ISAs and virtual machine view
- CISC architectures
- general characteristics
- common features
- many instructions, addressing modes
- runtime stack use
- specialized instructions
- great flexibility in operand choice for ALU instructions
- drawbacks
- examples: VAX, Intel x86
- RISC architectures
- primary goal: improved performance
- pipelining - concepts, application to computing
- common features
- simple instruction formats
- relatively few instructions, addressing modes
- dedicated registers (e.g., R0)
- register file
- delayed branching
- load/store architecture
- typical fetch/execute cycle: IF, ID, EX, MEM, WB
- example: MIPS R2000
- comparison of RISC and CISC programming
Pipelining
- basic concepts
- simple MIPS-like 5-stage pipeline
- pipeline registers
- hazards
- structural (physical structure of pipeline)
- data (need for results before they're ready)
- control (following branches)
- hazard resolution
- structural: stalls, change the pipeline
- data: forwarding
- control: branch prediction, delayed branch
- scheduling of instructions to avoid hazards (e.g., load, branch)
- static: by the compiler/assembler
- dynamic: by the pipeline hardware
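The load-use case that forwarding alone cannot fix can be sketched as the check a hazard-detection unit performs in ID; the instruction encoding here is an illustrative tuple, not a real MIPS format:

```python
# Load-use hazard check sketch: if the instruction in EX is a load whose
# destination matches a source of the instruction in ID, stall one cycle
# (the loaded value isn't available until MEM, so forwarding can't help).

def needs_stall(ex_instr, id_instr):
    """ex_instr/id_instr: (op, dest, src1, src2) with register numbers."""
    op, dest, _, _ = ex_instr
    _, _, src1, src2 = id_instr
    return op == "lw" and dest in (src1, src2)

lw  = ("lw",  8, 29, None)    # lw  $t0, 0($sp)
add = ("add", 9,  8, 10)      # add $t1, $t0, $t2 -- uses $t0 immediately
sub = ("sub", 9, 11, 12)      # sub $t1, $t3, $t4 -- independent

print(needs_stall(lw, add))   # -> True  (stall, or reschedule sub first)
print(needs_stall(lw, sub))   # -> False
```

A static scheduler would try to move an independent instruction (like the `sub` above) into the slot after the load so the stall never happens.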
- variations (general concepts only)
- pipeline
- super-pipeline
- superscalar
- VLIW (very long instruction word)
- vector processors
The Memory Hierarchy
- CPU/memory performance gap
- principle of locality
- terminology
- hit, hit rate, hit time
- miss, miss rate, miss penalty
- hierarchy: registers, cache, memory, secondary storage
- cache memory
- cache organization
- direct
- fully associative
- set-associative
- cache slot/line structure
- tag (unique to each block)
- index (selects the cache line or set that may hold this block)
- block offset (of first desired byte within the block)
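The tag/index/offset split can be sketched for a direct-mapped cache; the sizes here are illustrative (64 lines of 16-byte blocks, a 1 KiB cache):

```python
# Splitting a 32-bit address into tag / index / block offset.

OFFSET_BITS = 4          # 16-byte blocks -> 4 offset bits
INDEX_BITS  = 6          # 64 cache lines -> 6 index bits

def split_address(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index  = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag    = addr >> (OFFSET_BITS + INDEX_BITS)   # everything left over
    return tag, index, offset

# 0x1234 = 0b1_0010_0011_0100: tag 0x4, index 0x23, offset 0x4
print([hex(x) for x in split_address(0x1234)])
```

Growing the block size adds offset bits (taking them from the index), and growing associativity shrinks the index (moving bits into the tag), which is why those design choices change the address breakdown.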
- block size selection and tradeoffs
- cache misses
- compulsory
- conflict
- capacity
- coherence
- improving performance
- increase block size
- increase associativity
- multilevel cache
Input and Output
- important, but often neglected
- design affected by many factors
- types/speeds of devices
- types/speeds of interfaces
- many different types of devices
- device configuration
- types of device registers - control, data
- location of device registers - memory-mapped vs. specialized instructions
- multiplexed registers vs. separate addresses for each
- read-only and write-only registers
- methods
- polling
- interrupts
- fundamental differences between the two
- polled i/o
- basic concepts
- device configuration
- wait loops in i/o routines ("busy wait", "spinlock")
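The busy-wait loop can be sketched in Python with a simulated status register (all names are illustrative; a real driver would read a memory-mapped register or use a special I/O instruction):

```python
# Polled i/o sketch: spin on a device status register until the ready
# bit is set, then read the data register.

import random

READY_BIT = 0x1

def read_status_register():
    # Stand-in for a memory-mapped status register read; the simulated
    # device becomes ready on a random poll.
    return READY_BIT if random.random() < 0.2 else 0

def polled_read():
    # The busy-wait ("spinlock") loop: CPU cycles are burned doing
    # nothing useful until the device signals ready -- the key drawback
    # that motivates interrupt-driven i/o.
    while not (read_status_register() & READY_BIT):
        pass
    return 0x41                   # then read the (simulated) data register

print(chr(polled_read()))         # -> A
```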
- interrupts
- sources
- hardware devices (i/o, timers, clocks, etc.)
- software-generated (syscall, exceptions/faults, etc.)
- ISR concept and variations
- single ISR for all interrupts
- vectored interrupts
- time-critical nature of ISRs
- hardware mechanism - interrupt vector, etc.
- prioritized interrupts, interrupt masking, etc.
- device configuration
- CPU configuration
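The contrast between a single ISR and vectored interrupts, plus masking, can be sketched as a dispatch table; handler names and the mask layout are illustrative:

```python
# Vectored-interrupt dispatch sketch: the device supplies a vector
# number that indexes a table of handlers, so no software polling of
# devices is needed to find the source (unlike a single shared ISR).

def timer_isr():
    return "timer handled"

def disk_isr():
    return "disk handled"

VECTOR_TABLE = {0: timer_isr, 1: disk_isr}

MASK = {0: False, 1: True}        # interrupt masking: vector 1 disabled

def dispatch(vector):
    if MASK[vector]:
        return "masked - deferred"
    return VECTOR_TABLE[vector]()

print(dispatch(0))   # -> timer handled
print(dispatch(1))   # -> masked - deferred
```

By contrast, the MIPS approach noted below uses one ISR entry point (0x80000180) and coprocessor 0 registers to let software determine the cause.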
- examples
- SIO-1
- SIO-2
- MIPS method (coprocessor registers, single ISR at 0x80000180, etc.)