Computer Architecture -- Exam 2 Topics
Appendix A, Chapters 5
Last Updated 2005/05/05
------------------------------------------------------------------------

Pipelining (Appendix A) Part 2:

  * hazards
      o RAW, WAR, WAW
      o structural - stalls
      o data - forwarding
      o control - delayed branch
      o hazard resolution
          + stalls
          + forwarding
          + delayed branch
      o software scheduling to avoid hazards (e.g., load, branch)
      o handling branch hazards
          + stall until direction is clear
          + predict "not taken"
          + predict "taken"
          + delayed branch
              # filling branch delay slot - before, from target,
                from fall-through
  * multicycle operations
  * variations
      o pipeline
      o super-pipeline
      o superscalar
      o VLIW (very long/large instruction word)
      o vector processors
  * scoreboarding and out-of-order execution
  * scoreboard stages
      o issue
      o read operands
      o execution
      o write result
  * scoreboard functional units and functional unit status info

Memory (Chapter 5):

  * basic trends in technology
  * CPU/memory performance gap
  * principle of locality
      o temporal
      o spatial
  * terminology
      o hit, hit rate, hit time
      o miss, miss rate, miss penalty
      o hierarchy: registers, cache, memory, secondary storage
  * performance of main memory
      o latency: access time, cycle time
      o bandwidth
      o DRAM
  * cache memory
      o cache organization
          + direct
          + fully associative
          + set-associative
      o cache entry structure
          + tag
          + index
          + offset
      o block size selection and tradeoffs
      o cache misses
          + compulsory
          + conflict/collision
          + capacity
          + coherence/invalidation
      o cache design
          + set of operations
          + internal transfers
          + datapath
          + controller
      o improving cache performance
          + reduce miss rate (concept, methods)
          + reduce miss penalty (concept, methods)
          + reduce hit time (concept, methods)
          + Victim Cache
          + Trace Cache
      o memory hierarchy issues
          + block placement
          + block identification
          + block replacement
              # random
              # LRU
          + write strategy
              # write through vs. write back
              # write buffers
          + write-miss policies
              # write-allocate
              # write-noallocate
      o impact of hierarchy on algorithms
          + major code rewriting techniques to minimize cache misses
          + merging arrays, loop interchange, loop fusion, blocking
      o impact of cache on virtual memory
          + cache virtual or physical addresses?
          + synonyms/aliases
          + Translation Lookaside Buffer (TLB)
          + overlapped cache and TLB access

Input/Output

  * Buses
      o Processor-Memory
      o I/O Bus
      o Backplane Bus
  * I/O Types
      o Polled
      o Interrupt
      o DMA

Parallel Processing

  * Parallel Processing Categories
      o SISD
      o MISD
      o SIMD
      o MIMD
  * MIMD use of memory
      o Centralized Memory
          + Shared Memory Processor (SMP)
      o Decentralized Memory
          + Shared memory: Non-uniform Memory Access time (NUMA)
          + message passing
  * Cache Coherence
      o definition
      o Solutions
          + Snooping
              - Write Invalidate
              - Write Broadcast
          + Directory

------------------------------------------------------------------------

Talk topics 043

      o Reliable Return Address Stack
      o Cache Reuse
      o Asynchronous DRAM
      o Spatial Computing
      o Matrix Reduction on Cache
      o Cache Contention
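
------------------------------------------------------------------------

Worked example (cache entry structure: tag, index, offset)

The "cache entry structure" item under Memory can be illustrated with a
small sketch. It is not taken from the course material: the function name
split_address and its parameters are hypothetical, and it assumes a
byte-addressed cache whose block size and number of sets are both powers
of two. Under those assumptions the low bits of an address select the byte
within a block (offset), the next bits select the set (index), and the
remaining high bits are the tag.

```python
def split_address(addr, block_size, num_sets):
    """Split an address into (tag, index, offset).

    Hypothetical helper for illustration only. Assumes block_size (bytes)
    and num_sets are positive powers of two.
    """
    for name, value in (("block_size", block_size), ("num_sets", num_sets)):
        if value <= 0 or value & (value - 1):
            raise ValueError(name + " must be a positive power of two")

    # For a power of two 2**k, bit_length() is k + 1.
    offset_bits = block_size.bit_length() - 1
    index_bits = num_sets.bit_length() - 1

    offset = addr & (block_size - 1)                # byte within the block
    index = (addr >> offset_bits) & (num_sets - 1)  # which set to look in
    tag = addr >> (offset_bits + index_bits)        # compared on lookup
    return tag, index, offset
```

For instance, with 64-byte blocks and 256 sets,
split_address(0x12345678, 64, 256) yields tag 0x48D1, index 0x59 and
offset 0x38. Setting num_sets to 1 models a fully associative cache: the
index field disappears and every bit above the offset becomes tag.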