The University of Florida 
High-performance Computing & Simulation Research Lab

HPC Group / Home
Scalable and Dependable Applications and Infrastructure for High-Performance Computing

The high-performance computing (HPC) group focuses primarily on critical services for efficient, scalable, and dependable high-performance computing. The HPC group is responsible for the GEMS (Gossip-Enabled Monitoring/Management System) project and related HPC projects in the HCS Lab at the University of Florida. The GEMS project develops key concepts and mechanisms for scalable failure detection, consensus, and resource performance monitoring and management in heterogeneous, distributed networks and systems and related HPC applications. The group addresses these issues for large-scale, heterogeneous clusters and grids and their key applications, and works with the international iVDGL group, led by the Department of Physics at UF, to adapt its methods for scalable resource health and performance monitoring to the needs of data-intensive scientific grids. Recently, the group has focused on two areas for iVDGL: hybrid forms of network monitoring, and new computational grids based on multiparadigm resources, including reconfigurable hardware. The HPC group also works to provide high-performance parallel solutions to complex problems, including simulation of joint mechanics, where it works closely with the Computational Biomechanics Laboratory in the Department of Mechanical and Aerospace Engineering.

Sponsor: NSF/iVDGL, NSF/UltraLight
Principal Investigator: Dr. Alan D. George
Spring 2006 meetings: 3pm (8th period) Mondays and 12:50pm (6th period) Thursdays, HCS conference room (LAR335)

Group Members
Raj Subramaniyan, PhD student, group leader
Ajit Apte, MS student
Adam Jacobs, PhD student, Alumni Fellow
Kyu Sang Park, PhD student
Sachin Sanap, MS student
Rahul Singh, BS student

Related Links
GEMS web page
Group materials maintained by Byung Il (password protected)
MonALISA - note: a version, MonALISA+GEMS, is under construction for grids of large sites and clusters
2nd Sandia Workshop on Scalable Fault Tolerance for Distributed Computing, Albuquerque, NM, April 2002
1st Sandia Workshop on Scalable Fault Tolerance for Distributed Computing, Livermore, CA, April 2001