Held Mondays 11am in CAB 373.
All are invited to attend the whole series, or just the papers that
catch your imagination. I expect regular attendees to also take a turn
as discussion leader. This discussion group is held in conjunction with
EE602 Computer Architecture.
http://www.ee.ualberta.ca/~elliott/ee602/
Discussion leaders will make an informal 30-minute presentation of the main work in a paper, including a critique. Afterwards, everyone will jump in with their thoughts on the paper. Leaders can pick a paper on Computer Architecture from my list or search for their own favourite.
People are also welcome to use this forum to get feedback on their presentations prior to conferences.
Please let me know if you'd like to be added to the mailing list or to schedule a slot as discussion leader.
| Name | Date | Title | Papers |
| Duncan Elliott | Feb 28 | Missing the Memory Wall: The Case for Processor/Memory Integration | pdf, ps, ps.Z |
| Tyler Brandon | Mar 13 | TBA (C-RAM applications) | |
| Michael Bazzarelli | Mar 20 | TBA | |
View the directory of all papers.
Duncan Elliott
Missing the Memory Wall: The Case for Processor/Memory Integration
Ashley Saulsbury*, Fong Pong, Andreas Nowatzyk
Sun Microsystems Computer Corporation
*Swedish Institute of Computer Science
Abstract
Current high performance computer systems use complex,
large superscalar CPUs that interface to the main memory through
a hierarchy of caches and interconnect systems. These CPU-centric
designs invest a lot of power and chip area to bridge the widening
gap between CPU and main memory speeds. Yet, many large
applications do not operate well on these systems and are limited
by the memory subsystem performance.
This paper argues for an integrated system approach that uses
less-powerful CPUs that are tightly integrated with advanced
memory technologies to build competitive systems with greatly
reduced cost and complexity. Based on a design study using the
next generation 0.25µm, 256Mbit dynamic random-access memory
(DRAM) process and on the analysis of existing machines, we
show that processor memory integration can be used to build
competitive, scalable and cost-effective MP systems.
We present results from execution driven uni- and multi-processor
simulations showing that the benefits of lower latency and
higher bandwidth can compensate for the restrictions on the size
and complexity of the integrated processor. In this system, small
direct mapped instruction caches with long lines are very effective,
as are column buffer data caches augmented with a victim cache.
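
As a rough illustration of why a victim cache helps a direct-mapped cache with long lines, the Python sketch below models a direct-mapped array backed by a small fully associative victim buffer with LRU replacement. It is not the simulator used in the paper. The class name `DirectMappedWithVictim`, the helper `replay`, the 4-line / 512-byte / 2-entry configuration, and the address traces are all hypothetical choices made for this example.

```python
from collections import OrderedDict


class DirectMappedWithVictim:
    """Toy model: direct-mapped cache plus a small LRU victim buffer.

    All sizes are illustrative assumptions, not figures from the paper.
    """

    def __init__(self, num_lines, line_bytes, victim_entries):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.victim_entries = victim_entries
        self.lines = [None] * num_lines   # block number held by each line
        self.victim = OrderedDict()       # evicted blocks, oldest first
        self.hits = 0
        self.victim_hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_bytes   # which memory block is touched
        index = block % self.num_lines    # the single line it may occupy
        resident = self.lines[index]

        if resident == block:
            self.hits += 1
            return "hit"

        if block in self.victim:
            # Found in the victim buffer: swap it back into the main array.
            del self.victim[block]
            self.victim_hits += 1
            outcome = "victim_hit"
        else:
            self.misses += 1
            outcome = "miss"

        self.lines[index] = block
        if resident is not None and self.victim_entries > 0:
            # The displaced block becomes the newest victim entry.
            self.victim[resident] = True
            if len(self.victim) > self.victim_entries:
                self.victim.popitem(last=False)  # drop the oldest victim
        return outcome


def replay(trace, victim_entries):
    """Run a trace through a 4-line, 512-byte-line cache (assumed sizes)."""
    cache = DirectMappedWithVictim(
        num_lines=4, line_bytes=512, victim_entries=victim_entries
    )
    for addr in trace:
        cache.access(addr)
    return cache.hits, cache.victim_hits, cache.misses


# Two addresses that map to the same line and keep evicting each other.
conflict_trace = [0, 2048] * 3
# Word-by-word walk through one 512-byte line.
sequential_trace = list(range(0, 512, 4))

if __name__ == "__main__":
    print("conflict, victim cache:   ", replay(conflict_trace, 2))
    print("conflict, no victim cache:", replay(conflict_trace, 0))
    print("sequential, long line:    ", replay(sequential_trace, 0))
```

In this toy setup the ping-pong trace takes two full misses with the victim buffer and six without it, while the sequential walk misses once and then hits on the remaining 127 accesses to the same long line. The numbers only show the mechanism; they say nothing about the workloads or results reported in the paper.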