Dynamic Resource Management System (DRMS)

Chapter 2 Literature Review

Communication Software

A cluster is a collection of machines, each with its own private memory and processor, connected by an interconnection network. Nodes communicate by passing messages across this network. The complexity of message passing is abstracted into a message passing software layer so that programmers of parallel applications can conveniently use the underlying system. PVM [12] and MPI [13] are two popular message passing libraries that provide such a layer.

Parallel Virtual Machine (PVM)

Historically, PVM was the first message passing library designed to work on a network of workstations (NOW). However, the popularity of PVM among developers of parallel applications has declined over the years with the advent of MPI. Although PVM is functionally equivalent to MPI, a few philosophical differences have made MPI more popular. Control of PVM has always rested primarily with its authors. MPI, on the other hand, is governed by a standards committee, and implementers of MPI are required to follow the MPI standard strictly. This has made MPI more portable across hardware platforms. Moreover, MPI offers a better interface to parallel application developers through constructs such as communicators, which PVM does not provide. Communicators are a convenient way of partitioning the processors involved in a computation into subgroups and of referencing these subgroups in order to perform a specific task. Nevertheless, the PVM legacy persists, and PVM implementations are still in use around the world.
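The partitioning role of communicators can be illustrated with a small model. The sketch below is plain Python, not MPI code: it only mimics the grouping rule of MPI_Comm_split, in which processes that supply the same color land in the same subgroup and are ordered within it by key, with ties broken by their rank in the original group. The function name split_communicator and the eight-process example are assumptions made for illustration.

```python
from collections import defaultdict

def split_communicator(colors, keys=None):
    """Model the grouping rule of MPI_Comm_split (illustrative only).

    colors[r] is the color chosen by world rank r. Ranks with equal color
    form one subgroup, ordered by (key, world rank).
    """
    if keys is None:
        keys = list(range(len(colors)))
    groups = defaultdict(list)
    for world_rank, color in enumerate(colors):
        groups[color].append(world_rank)
    for members in groups.values():
        members.sort(key=lambda r: (keys[r], r))
    return dict(groups)

# Eight hypothetical processes split into two subgroups of four.
colors = [rank // 4 for rank in range(8)]
subgroups = split_communicator(colors)

# Position of world rank 6 inside its own subgroup.
local_rank_of_6 = subgroups[colors[6]].index(6)
```

In a real MPI program each process would make the split call itself and receive a handle to its own subgroup; the model above computes all subgroups in one place only to make the rule visible.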

Message Passing Interface (MPI)

MPI is a message passing standard designed by a broad committee of vendors, implementers and users. Many free and vendor-supplied implementations of the MPI standard are available today. The most popular free implementations are MPICH from Argonne National Laboratory and LAM MPI from Indiana University. Recently, the MPI-2 standard [13] has been released, which, among other features, includes support for dynamic process management. This functionality has been incorporated in the latest release of LAM MPI [33]. MPICH [34], however, has not yet implemented this feature of the MPI-2 standard.

Dynamic process creation and management was added to the MPI-2 standard at the request of the user community. Users wanted to exploit the dynamic process model discussed in Chapter 1, and the request was given impetus by the PVM research effort [12], which had demonstrated the benefits of dynamic resource management.

The MPI-2 standard enables process spawning through the API function call MPI_COMM_SPAWN. The standard, however, neither places restrictions on nor provides guidelines for how the actual spawning takes place, because MPI needs to be portable across many different systems. Tasks such as process startup, reserving and scheduling resources, and returning information about available resources are therefore left to third-party vendors to build on top of MPI. In this thesis, we have built a scheduling framework on top of the MPI implementation provided by LAM in order to provide the resource management functionality required to make effective use of the dynamic process model of the MPI-2 standard.

Mathematical Libraries

A large majority of scientific and engineering applications that run on clusters involve computationally intensive numerical methods. Numerical libraries have been written that provide a unified API to application programmers and can be optimized for the specific hardware on which these applications run. The Basic Linear Algebra Subprograms (BLAS) are the building blocks of numerical software [14]. Level 1 BLAS provide vector-vector operations, Level 2 BLAS provide matrix-vector operations, and Level 3 BLAS provide matrix-matrix operations. Various vendor-supplied BLAS implementations exist today that are tuned to run optimally on many hardware configurations. Numerical libraries such as LAPACK [15] and SCALAPACK [16] employ the BLAS to provide methods for commonly used numerical algorithms.
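The three BLAS levels can be made concrete with one representative operation each. The following is a minimal pure-Python sketch, not a BLAS binding: the function names borrow the conventional routine names axpy, gemv and gemm, but the signatures are simplified assumptions (the actual routines take additional scaling, transpose and stride arguments).

```python
def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(A, x):
    """Level 2 (matrix-vector): return the product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def gemm(A, B):
    """Level 3 (matrix-matrix): return the product A B."""
    columns = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns]
            for row in A]

A = [[1, 2], [3, 4]]
x = [1, 1]
y = [10, 20]

level1 = axpy(2, x, y)   # [12, 22]
level2 = gemv(A, x)      # [3, 7]
level3 = gemm(A, A)      # [[7, 10], [15, 22]]
```

For n-by-n operands, the Level 3 operation performs on the order of n^3 arithmetic operations on n^2 data, whereas Levels 1 and 2 perform work proportional to the data they touch, so Level 3 offers the most opportunity to reuse data once it has been loaded.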

Linear Algebra Package (LAPACK)

LAPACK was designed to run efficiently on modern high-performance processors by taking into account the multi-layered memory hierarchies of these machines. Earlier numerical packages such as EISPACK and LINPACK disregarded the memory hierarchy and were thus prone to inefficiency, spending too much time transferring data rather than doing useful computation. LAPACK uses the BLAS to take advantage of optimizations for the underlying hardware. Further, it predominantly uses Level 3 BLAS, i.e., blocked matrix-matrix operations, in the innermost loops of its routines to make the most effective use of the memory hierarchy.
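The idea of a blocked matrix-matrix operation can be sketched as follows. This is an illustrative pure-Python loop nest, not LAPACK or BLAS code; the function names and the block size of 2 are assumptions. The blocked version works on small tiles of the operands at a time, so that in a compiled implementation each tile can stay in fast memory while it is reused. Python itself will not show the speedup; the sketch only shows the loop structure and that the result is unchanged.

```python
def matmul_naive(A, B):
    """Reference triple loop: C = A B."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_blocked(A, B, block=2):
    """Same product, computed tile by tile."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(n)]
    # Outer three loops walk over tiles of C, A and B.
    for i0 in range(0, n, block):
        for j0 in range(0, p, block):
            for k0 in range(0, m, block):
                # Inner loops multiply one tile of A by one tile of B.
                for i in range(i0, min(i0 + block, n)):
                    for j in range(j0, min(j0 + block, p)):
                        acc = C[i][j]
                        for k in range(k0, min(k0 + block, m)):
                            acc += A[i][k] * B[k][j]
                        C[i][j] = acc
    return C

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
C_blocked = matmul_blocked(A, B, block=2)
```

The `min(...)` bounds handle matrices whose dimensions are not multiples of the block size, as in the 3-by-3 example.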

Scalable LAPACK (SCALAPACK)

In order to run LAPACK effectively on distributed-memory parallel clusters, a layer was introduced above LAPACK that employs explicit message passing to incorporate off-processor memory into the memory hierarchy. SCALAPACK is written in Single-Program-Multiple-Data (SPMD) style and is well suited to use in parallel SPMD programs.
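The SPMD style can be illustrated with a toy example in which every process runs the same function but operates on its own portion of the data, selected by its rank. The sketch below is a serial stand-in under stated assumptions: the loop in run_all_ranks replaces the launch of separate processes and the gathering of their results by message passing, the simple row-block split is chosen for brevity and is not SCALAPACK's actual data layout, and all function names are hypothetical.

```python
def block_range(n, size, rank):
    """Rows owned by `rank` when n rows are split over `size` ranks."""
    base, extra = divmod(n, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def spmd_matvec(rank, size, A, x):
    """The single program: every rank runs this on its own rows of A."""
    start, stop = block_range(len(A), size, rank)
    return [sum(a * xj for a, xj in zip(A[i], x))
            for i in range(start, stop)]

def run_all_ranks(size, A, x):
    """Serial stand-in for launching `size` processes and gathering results."""
    result = []
    for rank in range(size):
        result.extend(spmd_matvec(rank, size, A, x))
    return result

A = [[1, 0], [0, 1], [2, 2], [1, 1], [3, 0]]
x = [1, 2]
y_two_ranks = run_all_ranks(2, A, x)
y_three_ranks = run_all_ranks(3, A, x)
```

The result is the same for any number of ranks; only the share of rows each rank computes changes.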

1. INTRODUCTION
2. LITERATURE REVIEW
2.1 Communication Software
2.1.1 Parallel Virtual Machine (PVM)
2.1.2 Message Passing Interface (MPI)
2.2 Mathematical Libraries
2.2.1 Linear Algebra Package (LAPACK)
2.2.2 Scalable LAPACK (SCALAPACK)
2.3 Dynamic Resizing Frameworks
2.3.1 Piranha
2.3.2 Adaptive Multiblock PARTI (AMP)
2.3.3 Adaptive MPI (AMPI)
2.3.4 Dynamic Resource Management System (DRMS)
3. DESIGN
3.1 Motivation and Approach
3.2 System Architecture
3.3 Resize Library and API
3.3.1 Contacting scheduler at a remap point
3.3.2 Expansion
3.3.3 Shrinking
3.4 Scheduler Components
3.4.1 Job Startup
3.4.2 Job Monitor
3.4.3 Performance Data Gatherer
3.4.4 Remap Scheduler
4. EXPERIMENTAL RESULTS
4.1 Experimental Setup
4.2 Adaptive Sweet Spot detection
4.2.1 Matrix Multiplication
4.2.2 LU Factorization
4.2.3 Improvement in iteration time
4.2.4 Redistribution overhead
4.3 Processor allocation decisions for job mixes
4.3.1 Job Mix 1
4.3.2 Job Mix 2
4.3.3 System throughput and job turnaround time
5. CONCLUSIONS AND FUTURE WORK
5.1 Summary
5.2 Future Work