The Concurrent Systems Architecture Group at the University of California at San Diego is proud to announce the latest release of the High-Performance Virtual Machines (HPVM) software for Myrinet- and Giganet-clustered Windows NT 4.0 compute nodes.

This release includes the following enhancements over previous HPVM code:

- Decreased latency. A zero-byte FM message now takes 8.8 usec, down from 10.4 usec.
- Giganet interconnect (a Virtual Interface Architecture implementation) communication substrate support.
- Shared-memory communication substrate support. SMP systems now achieve higher intra-machine performance by using their CPU/memory bus.
- Workstation development release. FM application programmers can now develop and initially test their FM-based programs on a Windows NT workstation. This feature uses main memory as the communication substrate.
- Integration with the Windows NT Performance Monitor. Users can visualize FM and/or MPI performance using NT's Performance Monitor GUI.
- Bulk Synchronous Parallel (BSP) API support.

Source and binary releases are available from:

http://www-csag.ucsd.edu/projects/hpvm/sw-distributions/index.html

For more information on HPVM, see:

http://www-csag.ucsd.edu/projects/hpvm.html

---------------------------
Supported Hardware/Software
---------------------------

HPVM 1.9 operates on Windows NT 4.0 (Workstation or Server). It can operate in "cluster mode" using a Myrinet or Giganet interconnect. Within an SMP system, FM programs achieve higher performance by passing messages through shared memory. FM can also operate in "development mode" on a standalone workstation, using main memory as the communication substrate.

HPVM 1.9 has been tested with Microsoft Visual C++ 5.0 and Digital Fortran 5.0. Support for ServerNet and socket-based implementations is under way. Support for GCC on Windows NT should be ready in the near term. Please visit our web page for updates and news.

-------------
Supplied APIs
-------------

HPVM 1.9 supports a number of APIs. Included in this release are: FM (Fast Messages), MPI-1 (MPICH-based), SHMEM, Global Arrays (PNNL), and BSP.

------------------
Project Objectives
------------------

The goal of the HPVM project is to enable high-performance computing on distributed computational resources. The applications HPVM supports include both traditional supercomputing applications and novel, emerging, high-performance distributed applications. We strive to achieve high performance on these applications without preventing programmers from using the standard APIs to which they are accustomed.

As microprocessors continue to increase in performance, they are approaching the performance of the fastest vector processors and, because of their low cost, have become the building block of choice for high-performance parallel computing systems. While parallel machines typically exploit custom high-performance networks, high-speed commodity networks are becoming widely available at reasonable cost. The combination of high-performance networks and gigaflop microprocessors makes high-performance computing on clusters an attractive alternative to more traditional, tightly integrated parallel computers. Given the speed of recent computer and network hardware, the critical performance challenges are:

- delivering high-performance communication through standard, high-level APIs (a brief example follows this list),
- coordinating scheduling and resource management, and
- managing heterogeneity.
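As an illustration of the first point above, the sketch below is an ordinary MPI-1 program in C. It is not taken from the HPVM distribution; it is the kind of generic MPI code that should compile and run unchanged against HPVM's MPICH-based MPI, regardless of whether the underlying substrate is Myrinet, Giganet, or shared memory.

    /* Minimal, generic MPI-1 example (not from the HPVM distribution).
     * Rank 0 sends a small message to rank 1, which prints it.
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        char buf[64];
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size >= 2) {
            if (rank == 0) {
                strcpy(buf, "hello from rank 0");
                MPI_Send(buf, (int)strlen(buf) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            } else if (rank == 1) {
                MPI_Recv(buf, (int)sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                printf("rank 1 received: %s\n", buf);
            }
        }

        MPI_Finalize();
        return 0;
    }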
HPVM 1.9 targets clusters of Pentium II computers interconnected with a Myrinet or Giganet network and running Windows NT. Other interconnects, such as sockets and ServerNet II, will likely be supported in future releases. Support for Linux-based PCs is also being considered.

The performance of HPVM is surprisingly competitive with MPP systems such as the IBM SP2 and Cray T3D. It achieves impressive low-level communication performance across the cluster: one-way latencies under 9 usec and bandwidths over 100 MBytes/sec, even for small packets (< 256 bytes).

HPVM 1.9 supports the following interfaces, with one-way latency (usec) and peak bandwidth (MBytes/sec) listed beside each:

- Fast Messages over Myrinet (8.8, 101)
- Fast Messages over Giganet (13.9, 81.5)
- Fast Messages over shared memory (3.3, 185)
- MPI over FM/Myrinet* (12.3, 84.0)
- SHMEM over FM/Myrinet* (22, 93.2)
- Global Arrays over FM/Myrinet* (27.1, 62.0)
- BSP (not measured)

* MPI, SHMEM, and Global Arrays also run over FM/Giganet and FM/shared memory, but those configurations were not measured. (A generic sketch of the kind of ping-pong test used to obtain such figures appears at the end of this announcement.)

------------------------------------------
Project Manager and Principal Investigator
------------------------------------------

Andrew A. Chien

----------------
Acknowledgements
----------------

HPVM research is supported in part by DARPA orders #E313 and #E524 through US Air Force Rome Laboratory contracts F30602-96-1-0286 and F30602-97-2-0121, and by NSF Young Investigator award CCR-94-57809. It is also supported in part by funds from the NSF Partnerships for Advanced Computational Infrastructure (the Alliance and NPACI). Support from Microsoft, Hewlett-Packard, Myricom, Intel Corporation, Tandem Computers, and Platform Computing is also gratefully acknowledged.
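For reference, one-way latency figures like those listed above are typically obtained with a ping-pong test: two processes bounce a message back and forth, and the one-way latency is taken as half the average round-trip time. The sketch below is a generic MPI-1 version of such a test, written for this announcement rather than taken from the HPVM benchmark suite; the iteration count and zero-byte message size are illustrative choices only.

    /* Generic MPI-1 ping-pong sketch (not the HPVM benchmark code).
     * One-way latency is estimated as half the average round-trip time.
     */
    #include <mpi.h>
    #include <stdio.h>

    #define REPS      1000
    #define MSG_BYTES 0        /* zero-byte messages, as in the FM latency figure */

    int main(int argc, char **argv)
    {
        int rank, size, i;
        char buf[1];           /* payload is unused when MSG_BYTES == 0 */
        double t0, t1;
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size >= 2 && rank < 2) {
            t0 = MPI_Wtime();
            for (i = 0; i < REPS; i++) {
                if (rank == 0) {
                    MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                    MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
                } else {
                    MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                    MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
                }
            }
            t1 = MPI_Wtime();

            if (rank == 0)
                printf("estimated one-way latency: %.2f usec\n",
                       (t1 - t0) / (2.0 * REPS) * 1.0e6);
        }

        MPI_Finalize();
        return 0;
    }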