FASE: The Fast and Accurate Simulation Environment at the University of Florida
As processing power and networking capabilities increase, so do the needs of the applications that
push these technologies to the limit. To satisfy the almost insatiable need for more performance,
applications are parallelized and executed on parallel machines such as clusters: collections of computers
connected by conventional or, in many cases, high-speed networks. The potential heterogeneity of
high-performance systems gives rise to many issues that can dictate the efficiency of hardware usage, the
execution time for a job, and the overhead involved in running a parallel or distributed program. Finding
the optimum configuration of hardware and software components and task distribution for a specific
application is nearly impossible, and certainly impractical, from an experimental standpoint. Therefore, simulation
tools can be employed to provide an affordable and effective means to determine system configurations for
particular applications. Currently, researchers and industry use these tools to determine high-level aspects
of systems such as resource allocation, network congestion, and job completion time. Many of these tools
sacrifice fidelity for speed or vice versa. This trade-off raises many questions and concerns when
developing a mission-level simulation environment. What level of fidelity should be used for modeling
components in a system to ensure reasonable execution times while providing an accurate portrayal of the
system? Is it feasible to use high-fidelity models for some components while sacrificing accuracy for others
to speed up simulations? What aspects of a system are the most important in determining the best
configuration on which to run an application? The FASE research aims to answer these questions while
providing an environment for the simulation of high-performance systems with mission-critical applications.
The ultimate goal of FASE is to provide a robust simulation environment that will allow a user to create
customized systems of any size and topology using a variety of components such as clusters composed of
symmetric multiprocessors, reconfigurable devices, and high-speed interconnects.
The first iteration of FASE focuses on the design and implementation of key elements involved in the
execution of a parallel program. In general, a parallel application's execution time can be broken down into
computation and communication. Computation is abstracted with simple timing functions that are started
and stopped between communication events. The measured times are then scaled to model other
computational units. Communication, by contrast, is modeled at a higher fidelity. Parameters such as source,
destination, and message size are collected from each communication event during the application's execution.
Currently, communication events must come from a selected subset of the MPI library; future iterations
will support more MPI functions and possibly SHMEM and UPC as well. These
events drive network models that accurately portray the actual interconnect. The current network library
consists of InfiniBand, RapidIO, Scalable Coherent Interface (SCI), Ethernet, and TCP/IP. These models have
been developed using the simulation tool Mission-Level Designer (MLD), which provides the foundation for FASE.
MLD is a commercial block-oriented, discrete-event simulator based on C++ that allows virtually anything to be
modeled at any level of fidelity. Future iterations of FASE will focus on advancing the scalability of the
methods and models used to conduct system evaluations. This work will investigate pre-simulation
methods for capturing relevant details of the applications under study, as well as the feasibility
of a hierarchical simulation methodology.
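The split between abstracted computation and recorded communication events can be illustrated with a small sketch. It is not FASE code: the `CommEvent` record layout, the linear `speed_ratio` scaling, and the latency-plus-bandwidth transfer cost are all assumptions made for illustration, and the transfer cost in particular is only a stand-in for the much more detailed network models described above. The sketch also sums events serially, ignoring overlap and network contention.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommEvent:
    """One recorded communication event (hypothetical record layout)."""
    source: int
    destination: int
    size_bytes: int
    compute_before_s: float  # compute time measured since the previous event


def scale_compute(measured_s: float, speed_ratio: float) -> float:
    """Scale a measured compute time to another computational unit.

    speed_ratio = target speed / measured-host speed (assumed linear).
    """
    return measured_s / speed_ratio


def transfer_time(size_bytes: int, latency_s: float, bandwidth_bps: float) -> float:
    """Placeholder cost model: fixed latency plus serialization time."""
    return latency_s + size_bytes * 8 / bandwidth_bps


def estimate_runtime(trace: List[CommEvent], speed_ratio: float,
                     latency_s: float, bandwidth_bps: float) -> float:
    """Sum scaled compute time and modeled transfer time over the trace."""
    total = 0.0
    for event in trace:
        total += scale_compute(event.compute_before_s, speed_ratio)
        total += transfer_time(event.size_bytes, latency_s, bandwidth_bps)
    return total


# Two 1 MB messages between ranks 0 and 1, with measured compute in between.
trace = [
    CommEvent(source=0, destination=1, size_bytes=1_000_000, compute_before_s=0.50),
    CommEvent(source=1, destination=0, size_bytes=1_000_000, compute_before_s=0.25),
]

# Target unit assumed twice as fast as the host; 10 us latency, 1 Gb/s link.
estimate = estimate_runtime(trace, speed_ratio=2.0,
                            latency_s=1e-5, bandwidth_bps=1e9)
print(f"estimated runtime: {estimate:.5f} s")
```

Changing `speed_ratio` or the link parameters while replaying the same trace shows, in miniature, how one recorded run can be used to compare candidate system configurations.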
The FASE work is being conducted under both the MS and ASC groups of the HCS Lab. The foundation of FASE
was built under the sponsorship of the DoD, while the use of FASE to model a dependable multiprocessor space
system will be conducted with matching funds from the University of Florida through the primary sponsors:
NASA and Honeywell Space Systems in Clearwater, FL.
FASE-related Publications
Related Research Links