With the conventional superscalar approach delivering diminishing
returns, alternative designs that make better use of
increasing chip densities are being actively explored.
Whereas the conventional superscalar expends all its
resources on exploiting instruction-level parallelism (ILP),
much recent emphasis has been on architectures that can also
exploit thread-level parallelism in the application.
Chip-multiprocessor (CMP) architectures are one
promising approach in this direction that can also better
exploit the increasing transistor count on a chip.
In our work, we show that the wide-issue dynamic
processors that will soon populate CMPs make
fast communication at the register level a requirement for
high performance. Consequently, we propose
effective yet modest hardware that supports
communication and synchronization of registers between
on-chip processors. Furthermore, we
propose hardware support that handles
true memory dependence violations when the application is run
in a speculative execution mode. We also present
compiler support that enables automatic identification of threads
from sequential binaries. We show how this combined software-hardware
approach enables effective speculative execution of a sequential
binary on a CMP architecture without recompiling the source.
Overall, we augment the CMP with just enough support,
while still maintaining the generic CMP architecture
to a reasonable degree.
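
To make the notion of a true memory dependence violation concrete, the following is a minimal sketch, not the hardware proposed in this work. It assumes a toy model in which threads are numbered in sequential program order (thread 0 is the oldest, higher ids are more speculative) and memory is tracked per address. The class name `SpeculativeThreads` and its methods are hypothetical. A violation is flagged when an older thread stores to an address that a more speculative thread has already loaded; that thread and all its successors are then squashed.

```python
class SpeculativeThreads:
    """Toy tracker for true (read-after-write) memory dependence violations.

    Hypothetical model: thread 0 is the oldest thread; higher ids are
    more speculative successors in sequential program order.
    """

    def __init__(self, num_threads):
        self.num_threads = num_threads
        # Addresses each thread loaded before writing them itself.
        self.exposed_loads = [set() for _ in range(num_threads)]
        # Addresses each thread has written.
        self.own_stores = [set() for _ in range(num_threads)]
        # Log of squashed thread ids, in the order they were squashed.
        self.squashed = []

    def load(self, tid, addr):
        # A load is "exposed" only if this thread has not already
        # produced the value itself with an earlier store.
        if addr not in self.own_stores[tid]:
            self.exposed_loads[tid].add(addr)

    def store(self, tid, addr):
        """Record a store; return the first violating thread id, or None."""
        self.own_stores[tid].add(addr)
        # Conservative check: any more speculative thread with an exposed
        # load of addr is assumed to have read a stale value.
        for later in range(tid + 1, self.num_threads):
            if addr in self.exposed_loads[later]:
                self._squash_from(later)
                return later
        return None

    def _squash_from(self, first):
        # Squash the violating thread and every more speculative thread,
        # discarding their tracked state so they can restart.
        for victim in range(first, self.num_threads):
            self.exposed_loads[victim].clear()
            self.own_stores[victim].clear()
            self.squashed.append(victim)


threads = SpeculativeThreads(3)
threads.load(2, 0x10)               # thread 2 reads 0x10 speculatively
violator = threads.store(0, 0x10)   # thread 0 then writes 0x10
print(violator, threads.squashed)   # prints: 2 [2]
```

The check is deliberately conservative: it ignores the case where an intermediate thread overwrote the address first, so it may squash more often than a real design would need to.
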
Given that the amount of thread- and instruction-level
parallelism in applications varies widely, the
traditional CMP approach of statically
partitioning the chip resources between threads may lead to
wasted resources when one of the threads stalls
due to hazards or when the application lacks enough threads.
The Simultaneous Multithreading (SMT)
architecture addresses this problem by allowing complete
flexibility in resource sharing. Unfortunately, this approach,
like the conventional superscalar, is so centralized that
it may not be feasible to implement. In our work, we
explore a hybrid approach, namely the clustered SMT architecture.
We show that this restricted level of simultaneous multithreading
is able to capture most of the performance benefits of the
fully centralized approach while, at the same time, allowing
the design to be decentralized.
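
The trade-off between static partitioning and shared resources can be illustrated with a small sketch. It is a toy single-cycle model under stated assumptions, not the evaluation used in this work: each thread has some number of ready instructions, issue slots are divided equally among clusters, and threads inside a cluster share that cluster's slots freely. The function name `issued_per_cycle` and the ready counts are made up for illustration. A cluster size of 1 corresponds to static per-thread partitioning, a cluster size equal to the thread count corresponds to fully shared SMT, and values in between correspond to a clustered organization.

```python
def issued_per_cycle(ready, total_width, cluster_size):
    """Count instructions issued in one cycle under a toy slot-sharing model.

    ready        -- ready-instruction count per thread (hypothetical input)
    total_width  -- total issue slots on the chip
    cluster_size -- number of threads that share one cluster's slots
    """
    num_threads = len(ready)
    if cluster_size <= 0 or num_threads % cluster_size != 0:
        raise ValueError("cluster_size must evenly divide the thread count")
    num_clusters = num_threads // cluster_size
    if total_width % num_clusters != 0:
        raise ValueError("total_width must split evenly across clusters")
    width_per_cluster = total_width // num_clusters

    issued = 0
    for start in range(0, num_threads, cluster_size):
        # Threads in the same cluster pool their demand; slots left unused
        # by one cluster cannot be borrowed by another.
        demand = sum(ready[start:start + cluster_size])
        issued += min(demand, width_per_cluster)
    return issued


ready = [6, 0, 1, 1]  # made-up demand: one busy thread, one stalled thread
for size, label in [(1, "static partition"),
                    (2, "clustered"),
                    (4, "fully shared")]:
    print(label, issued_per_cycle(ready, 8, size))
```

For these made-up inputs the model issues 4, 6, and 8 instructions respectively. The numbers only show how unused slots arise under partitioning; they say nothing about real workloads.
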
Our work also focuses on simulation methodology.
Multiprocessor system evaluation has traditionally relied
on direct-execution-based execution-driven simulation (EDS).
In such environments, the processor component of the system
is not fully modeled.
With wide-issue superscalar processors being the
norm in today's multiprocessor nodes, there is an urgent
need to model the processor accurately. However, using
direct execution to model a superscalar processor has
been considered an open problem.
In our work, we propose a novel direct-execution framework
that allows accurate simulation of wide-issue superscalar processors
without the need for code interpretation. Overall, this approach
enables detailed yet fast EDS of superscalar processors in both
uniprocessor and multiprocessor configurations.
Publications:
- Speculative Multithreading Architectures,
  by Venkata Krishnan,
  Ph.D. Thesis, September 1998.
- A Chip Multiprocessor Architecture with Speculative Multithreading,
  by Venkata Krishnan and Josep Torrellas,
  IEEE Transactions on Computers, December 1999.
- A Characterization of Parallel SPECint Programs in Simultaneous
  Multithreading Architectures,
  by Daniel Ortega, Ivan Martel, Venkata Krishnan, Eduard Ayguade and Mateo Valero,
  International Conference on Parallel Architectures and
  Compilation Techniques (PACT), October 1999.
- The Need for Fast Communication in
  Hardware-Based Speculative Chip Multiprocessors,
  by Venkata Krishnan and Josep Torrellas,
  International Conference on Parallel Architectures and
  Compilation Techniques (PACT), October 1999.
- A Direct-Execution Framework for
  Fast and Accurate Simulation of Superscalar Processors,
  by Venkata Krishnan and Josep Torrellas,
  International Conference on Parallel Architectures and
  Compilation Techniques (PACT), October 1998.
- Hardware and Software Support for
  Speculative Execution of Sequential Binaries on a Chip-Multiprocessor,
  by Venkata Krishnan and Josep Torrellas,
  International Conference on Supercomputing (ICS), July 1998.
- Executing Sequential Binaries on a Multithreaded
  Architecture with Speculation Support,
  by Venkata Krishnan and Josep Torrellas,
  Workshop on Multi-Threaded
  Execution, Architecture and Compilation (MTEAC'98), January 1998.
- A Clustered Approach to Multithreaded Processors,
  by Venkata Krishnan and Josep Torrellas,
  International Parallel Processing Symposium (IPPS), March 1998.
- Efficient Use of Processing Transistors for Larger On-Chip Storage:
  Multithreading,
  by Venkata Krishnan and Josep Torrellas,
  Workshop on Mixing Logic and DRAM: Chips that Compute and Remember,
  June 1997.