HPVM is a layered system whose design exposes the performance of the underlying hardware, through its communication protocols, to the user's application.
Logically the system is split into three components:
The development environment contains example programs, include files, sample makefiles, prebuilt libraries, and documentation. It is installed through the Development Libraries install option and should be placed on the machine where users usually develop, compile, and link their applications. It is unnecessary to place these files on the compute nodes of the cluster.
All programs that run on the compute cluster use FM at the lowest level. MPI-FM, for example, is layered on top of FM, and MPI programs link with both FM.lib and MPI.lib (a minimal example appears after the compiler note below).
Important! Compiling user programs requires Microsoft Visual C++ 5.0/6.0, Digital Fortran, or both. The example makefiles assume that the environment variables for these compilers are properly set up. The file `DevStudio/VC/bin/vcvars32.bat' sets up the environment correctly for Visual C++, and `DevStudio/DF/bin/dfvars.bat' sets up the environment for Digital Fortran. These batch files are provided by the compiler setup program. The example makefiles also assume the use of Microsoft's nmake.
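As a concrete illustration, the following is a minimal sketch of the kind of MPI program the example makefiles build. It uses only standard MPI calls; the comments about building with nmake and linking against MPI.lib and FM.lib restate the conventions described above, and the exact library names, paths, and compiler switches are assumptions here -- take them from the supplied example makefiles.

    /* hello.c -- minimal MPI-FM example (sketch).
     * Build from a command prompt prepared with vcvars32.bat, using the
     * supplied example makefiles and nmake.  Per the text above, the link
     * step pulls in both MPI.lib and FM.lib; exact names and paths are
     * assumptions, so consult the example makefiles.
     */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                  /* join the FM-backed MPI job    */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank           */
        MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes     */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }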
New for HPVM 1.9! A Workstation Setup is now available that allows users to develop, debug, and test an FM program on a local workstation without the need for high-performance networking cards. This single-node setup uses a new shared-memory transport to provide interprocess communication. The setup automatically installs all of the services needed to run an FM job on your local workstation. Executables created in this environment can be run on an HPVM 1.9 cluster without recompiling or relinking.
Since all of the supplied APIs (MPI, SHMEM, GA, BSP) use FM either directly or indirectly, it is only the FM runtime that must be installed on the cluster. The FM runtime is split into three pieces:
Each node in the cluster must have two basic FM components installed: a high-speed interconnect and the Context Manager (fmcm.exe). In addition, if a Myrinet cluster is being built, the HPVM-supplied device driver, the network card control program (fmmcp.dat), and the network configuration file (fmconfig.txt) must also be present.
You should select the Compute Node setup type for each machine in your cluster. Aside from the usual NT-style setup questions, you should be prepared to answer a few HPVM-specific questions during setup.
Important! Once you have completed the Compute Node Setup and have chosen Myrinet, you will need to define a network configuration file. The default file is `\hpvm1.9\bin\fmconfig.txt'. A later section gives complete details on how to write a network configuration file. Setup will remind you to edit your own network configuration file. If you reboot the machine without making these changes, the context manager will probably not start because your machine name is not listed in the fmconfig.txt file. In this case, edit the configuration file appropriately and use the Services applet in the NT control panel to restart the service.
You need to select a single machine to run the global manager service. Because the global manager uses only TCP/IP, it does not need to be placed on a machine with the high-performance network.
Note that the FM runtime neither specifies how jobs are launched onto individual cluster nodes nor provides any built-in job management; some other mechanism must be used to start tasks on the individual nodes. Once tasks are started, the global manager is used for their initial synchronization.
This version of HPVM supports only Platform Computing's Load Sharing Facility (LSF) for launching parallel jobs onto cluster nodes. LSF provides a command-line interface for launching jobs, as well as a GUI if you are locally logged onto one of the cluster machines or the LSF server. HPVM provides a front-end client (written in Java) that gives remote users a GUI interface to some of the LSF services.
The client, when connected to the supplied server, allows users to submit batch jobs and monitor some aspects of the cluster such as compute load.
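For reference, in a standard LSF installation batch jobs are normally submitted from the command line with LSF's bsub command; a typical submission might look like the line below. The program name and processor count are illustrative only, and the exact invocation depends on how LSF and HPVM are configured at your site.

    bsub -n 8 -o myapp.out myapp.exe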
If you want LSF support, select the Java Server setup type. Setup asks only one extra question in this case.
Important! This setup also installs the "HPVM Job Submission" service and starts it. The service should run as the LSF Administrator. You should go to the Services applet in the NT control panel and set the proper username and password for your system (Setup will remind you to do this).