Processor Simulation Research and Performance Evaluation in the LAVA Lab

Photo courtesy of The Hawaii Center for Volcanology

Research on Processor Simulation

Approaches like those described above are typically evaluated using simulation. For our research, we use the SimpleScalar toolkit from Wisconsin and some of our own derivative simulators, especially HydraScalar and Wattch. HydraScalar models branch handling in more detail and reimplements the pipeline and out-of-order execution to permit multipath execution. Wattch adds a model of dynamic power dissipation on top of SimpleScalar's sim-outorder simulator. For thermal simulation, we will soon be publicly releasing our thermal model, HotSpot, and we have just released our MRRL tools for fast warm-up.

Regardless of the simulator, performance analysis is prone to numerous pitfalls. One classic pitfall is modeling perfect structures, and an especially severe form of it is assuming perfect branch prediction. Perfect prediction exposes much more instruction-level parallelism (ILP) than any realistic processor could expose. As a result, other optimizations, such as cache optimizations or greater degrees of out-of-order execution, may look far more profitable than they realistically should.

Another classic pitfall is to simulate only 50 million (M), 100 million, or even 1 billion instructions from the beginning of a program's execution. This is done because cycle-level simulations are slow, simulating only 100,000-500,000 instructions per second, while many programs run for billions or hundreds of billions of instructions. Unfortunately, many programs have very unrepresentative startup behavior that can last for as long as 1-2 billion instructions, much longer than the 50-100 M instructions sometimes simulated. We have shown that simulating only 50 M instructions does yield representative results, but only if the window is taken from a representative part of the program's execution. If the simulation window is instead taken from the beginning of the program's execution, wildly inaccurate results can occur.

We have also shown that the state of large structures like caches can be maintained across fast-forward periods without the need for expensive warm-up, using a technique that we call MRRL. This technique works with both single-sample and multiple-sample simulation styles. We have just released a set of tools for easily combining MRRL with your choice of sampling regime.
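To put the simulation-speed figures quoted above in perspective, the following back-of-the-envelope sketch estimates wall-clock time for a few instruction counts. The speeds and instruction counts are the ones mentioned in the text; the function name, variable names, and the choice of workloads to tabulate are illustrative assumptions, not part of any of the simulators.

```python
# Rough estimate of cycle-level simulation time.
# Speeds (100,000-500,000 simulated instructions per second) and instruction
# counts come from the text; all names here are hypothetical.

def simulation_hours(instructions: float, instrs_per_second: float) -> float:
    """Wall-clock hours needed to simulate `instructions` at a given speed."""
    return instructions / instrs_per_second / 3600.0

SLOW, FAST = 100_000, 500_000  # simulated instructions per second

workloads = {
    "50 M window": 50e6,
    "1 billion": 1e9,
    "100 billion (full run)": 100e9,
}

for name, count in workloads.items():
    best = simulation_hours(count, FAST)    # fastest quoted simulator speed
    worst = simulation_hours(count, SLOW)   # slowest quoted simulator speed
    print(f"{name:>24}: {best:8.2f} to {worst:8.2f} hours")
```

Under these assumptions a 50 M window takes a few minutes, while a 100-billion-instruction run takes roughly 2 to 12 days, which is why short simulation windows are so tempting and why choosing a representative one matters.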

Selected Publications

MRRL tools home page
K. Skadron, M.R. Stan, W. Huang, S. Velusamy, D. Tarjan, and K. Sankaranarayanan. "Temperature-Aware Microarchitecture." In Proceedings of the 30th International Symposium on Computer Architecture, June 2003, to appear.
J.W. Haskins, Jr. and K. Skadron. "Memory Reference Reuse Latency: Accelerated Sampled Microarchitecture Simulation." In Proceedings of the 2003 IEEE International Symposium on Performance Analysis of Systems and Software, pp. 195-203, Mar. 2003. (postscript | pdf | abstract)
J.W. Haskins, Jr. and K. Skadron. "Memory Reference Reuse Latency: Accelerated Sampled Microarchitecture Simulation." Tech Report CS-2002-19, Univ. of Virginia Dept. of Computer Science, July 2002. (postscript | abstract)
J.W. Haskins and K. Skadron. "Minimal Subset Evaluation: Rapid Warm-up for Simulated Hardware State." In Proceedings of the 2001 International Conference on Computer Design, pp. 32-39, Sept. 2001. (postscript | pdf | abstract)
K. Skadron and P.S. Ahuja. "HydraScalar: A Multipath-Capable Simulator." In the Newsletter of the IEEE Technical Committee on Computer Architecture, pp. 65-70, Jan. 2001. (postscript | pdf | abstract)
D.A.B. Weikle, S.A. McKee, K. Skadron, and W.A. Wulf. "Caches as Filters: A Framework for the Analysis of Caching Systems." In Proceedings of the Third Grace Hopper Celebration of Women in Computing Conference 2000, Sept. 2000. (pdf | abstract)
D.A.B. Weikle, K. Skadron, S.A. McKee, and W.A. Wulf. "Caches As Filters: A Unifying Model for Memory Hierarchy Analysis." Tech Report CS-2000-16, Univ. of Virginia Dept. of Computer Science, June 2000. (postscript | abstract)
K. Skadron, P.S. Ahuja, M. Martonosi, and D.W. Clark. "Branch Prediction, Instruction-Window Size, and Cache Size: Performance Tradeoffs and Simulation Techniques." IEEE Transactions on Computers, 48(11):1260-81, Nov. 1999. (postscript | pdf | abstract)
K. Skadron, P.S. Ahuja, M. Martonosi, and D.W. Clark. "Selecting a Single, Representative Sample for Accurate Simulation of SPECint Benchmarks." Tech Report TR-595-99, Princeton Dept. of Computer Science, Jan. 1999. (gzip'd postscript | pdf | abstract)

Last updated: 7 Mar., 2002

Photo courtesy of the USGS/Cascades Volcano Observatory

Department of Computer Science
School of Engineering and Applied Science
University of Virginia, Charlottesville, Virginia 22904
(434) 982-2200, FAX: (434) 982-2214