Research on Processor Simulation
Evaluating approaches like those described above is typically done using simulation. For our research we use the SimpleScalar toolkit from Wisconsin and some of our own derivative simulators, especially HydraScalar and Wattch. HydraScalar extends the detail of the branch handling and reimplements the pipeline and out-of-order execution to permit multipath execution. Wattch adds a model of dynamic power dissipation on top of SimpleScalar's sim-outorder simulator. For thermal simulation, we will soon be publicly releasing our thermal model, HotSpot, and we have just released our MRRL tools for fast warmup.
Regardless of the simulator, performance analysis is prone to numerous
pitfalls. A classic one is modeling perfect structures, and an especially
severe case is assuming perfect branch prediction, which exposes much more
instruction-level parallelism (ILP) than any realistic processor could exploit.
As a result, other optimizations, such as cache optimizations or greater degrees
of out-of-order execution, may look far more profitable than they realistically
would be. Another classic pitfall is to simulate only 50 million (M),
100 million, or even 1 billion instructions from the beginning of
a program's execution. This is done because cycle-level simulations
are slow, simulating only 100,000-500,000 instructions per second, while
many programs run for billions or hundreds of billions of instructions.
Unfortunately, many programs have very unrepresentative startup behavior
that can last for 1-2 billion instructions, much longer than the 50-100
M instructions sometimes simulated. We have shown that simulating only
50 M instructions does yield representative results, but only if the window
is taken from a representative part of the program's execution. If the
simulation window is instead taken from the beginning of the program's
execution, wildly inaccurate results can
occur. We have also shown that the state of large structures like caches
can be maintained across fast-forward periods without the need for expensive
warmup, using a technique that we call MRRL.
This technique works with both single-sample and multiple-sample simulation
styles. We have just released a set of tools
for easily combining MRRL with your choice of sampling regime.
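To make the cost argument above concrete, here is a rough back-of-the-envelope calculation using the simulation rates quoted in the text. The two workload sizes (a 50 M-instruction window and a 100-billion-instruction full run) are illustrative picks from the ranges mentioned above, and the function name is our own.

```python
# Back-of-the-envelope simulation cost, using the rates quoted in the text.
SECONDS_PER_DAY = 86_400

SLOW_RATE = 100_000  # simulated instructions per second (low end quoted above)
FAST_RATE = 500_000  # simulated instructions per second (high end quoted above)


def simulation_seconds(instructions, instructions_per_second):
    """Wall-clock seconds needed to simulate `instructions` at a given rate."""
    return instructions / instructions_per_second


# Illustrative workload sizes (assumptions, chosen from the ranges in the text).
window = 50_000_000          # a 50 M-instruction simulation window
full_run = 100_000_000_000   # a program that runs for 100 billion instructions

for label, count in (("50 M window", window), ("100 B full run", full_run)):
    best = simulation_seconds(count, FAST_RATE)
    worst = simulation_seconds(count, SLOW_RATE)
    print(f"{label}: {best:,.0f} s to {worst:,.0f} s "
          f"({best / SECONDS_PER_DAY:.2f} to {worst / SECONDS_PER_DAY:.2f} days)")
```

Under these assumptions the window takes 100 to 500 seconds, while the full run takes roughly 2.3 to 11.6 days, which is why short windows are so tempting.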
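The effect of window placement and cache state can be illustrated with a toy experiment. The sketch below is not MRRL and not taken from any of the simulators named above: it uses a synthetic address trace, a tiny direct-mapped cache, and plain functional warming over the entire fast-forward period (the expensive approach that MRRL is meant to avoid). All names and sizes are hypothetical.

```python
def make_trace(startup_len, steady_len, working_set):
    """Build a synthetic block-address trace (hypothetical workload).

    The startup phase streams through fresh blocks, as initialization code
    might; the steady phase loops over a small working set.
    """
    startup = list(range(1_000_000, 1_000_000 + startup_len))
    steady = [i % working_set for i in range(steady_len)]
    return startup + steady


class DirectMappedCache:
    """Minimal direct-mapped cache that tracks one block per set."""

    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.tags = [None] * num_sets

    def access(self, block):
        """Return True on a hit; install the block on a miss."""
        index = block % self.num_sets
        hit = self.tags[index] == block
        self.tags[index] = block
        return hit


def miss_rate(trace, start, length, warm):
    """Miss rate measured over trace[start:start + length].

    warm=True updates the cache during the fast-forward period
    (simple functional warming, not MRRL); warm=False starts the window cold.
    """
    cache = DirectMappedCache(num_sets=256)
    if warm:
        for block in trace[:start]:
            cache.access(block)
    misses = sum(not cache.access(block) for block in trace[start:start + length])
    return misses / length


trace = make_trace(startup_len=5_000, steady_len=20_000, working_set=200)

print("window at program start:      ", miss_rate(trace, 0, 1_000, warm=False))
print("steady-state window, cold:    ", miss_rate(trace, 10_000, 1_000, warm=False))
print("steady-state window, warmed:  ", miss_rate(trace, 10_000, 1_000, warm=True))
```

On this synthetic trace the window at program start misses on every access, the cold steady-state window reports a miss rate of 0.2, and the warmed steady-state window reports 0.0. The absolute numbers mean nothing for real programs; the point is only that both where the window is taken and what state the cache holds when it begins change the measured result.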
Last updated: 7 Mar., 2002
Department of Computer Science
School of Engineering and Applied
Science
University of Virginia, Charlottesville,
Virginia 22904
(434) 982-2200, FAX: (434) 982-2214
© Kevin Skadron, 2001, 2002