

Other functions

C function: void shmem_int_{sum,prod,min,max,and,or}_to_all (int *target, int *source, int nreduce, int PE_start, int logPE_stride, int PE_size, int *pWrk, long *pSync)
C function: void shmem_double_{sum,prod,min,max}_to_all (double *target, double *source, int nreduce, int PE_start, int logPE_stride, int PE_size, double *pWrk, long *pSync)
These functions are collective operations that implement an all-to-all reduction. Each element is reduced across the nodes in the active set, and the result is then broadcast to every node in that set. nreduce specifies the number of reduction operations, i.e. the number of elements in source and target.
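
As a rough illustration of these semantics, the following single-process sketch models what a sum reduction leaves in each node's target. It is not the HPVM implementation: model_int_sum_to_all() is a hypothetical name, the buffers of all nodes are stored back to back in one array, and the active set is assumed to follow the Cray SHMEM convention (nodes PE_start + i * 2^logPE_stride for i = 0 .. PE_size-1), which this manual does not spell out.

```c
#include <assert.h>

/*
 * Single-process model of shmem_int_sum_to_all(); NOT the real routine.
 * source and target hold the buffers of all nodes back to back, so node
 * `pe` owns elements [pe * nreduce, (pe + 1) * nreduce).
 * Assumption: the active set is PE_start + i * 2^logPE_stride,
 * i = 0 .. PE_size - 1 (Cray SHMEM convention).
 */
static void model_int_sum_to_all(int *target, const int *source, int nreduce,
                                 int PE_start, int logPE_stride, int PE_size)
{
    assert(nreduce >= 0 && PE_start >= 0 && logPE_stride >= 0 && PE_size >= 1);
    int stride = 1 << logPE_stride;

    for (int j = 0; j < nreduce; j++) {
        /* Reduce element j across the active set. */
        int sum = 0;
        for (int i = 0; i < PE_size; i++)
            sum += source[(PE_start + i * stride) * nreduce + j];

        /* "Broadcast": every node in the active set receives the result. */
        for (int i = 0; i < PE_size; i++)
            target[(PE_start + i * stride) * nreduce + j] = sum;
    }
}
```

With four nodes, nreduce = 2, PE_start = 0, logPE_stride = 1 and PE_size = 2, the model reduces over nodes 0 and 2 only; the target elements of nodes 1 and 3 are left untouched.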

The synchronization workspace (pSync) must be an array of _SHMEM_REDUCE_SYNC_SIZE elements, and each element must be initialized to _SHMEM_SYNC_VALUE before first use. shmem_X_Y_to_all() automatically resets the elements of pSync back to _SHMEM_SYNC_VALUE upon completion, so the same synchronization workspace can be immediately reused.

The data workspace buffer (pWrk) must hold at least as many elements as the maximum of _SHMEM_REDUCE_MIN_WRKDATA_SIZE and nreduce/2 + 1. The workspace buffer does not need to be initialized before first use.
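
The workspace rules above can be sketched as two small helpers. Both names (init_psync and reduce_wrk_elems) are hypothetical and not part of the SHMEM API. The caller is assumed to pass in _SHMEM_REDUCE_SYNC_SIZE, _SHMEM_SYNC_VALUE and _SHMEM_REDUCE_MIN_WRKDATA_SIZE from the SHMEM header, and nreduce/2 is assumed to mean C integer division.

```c
#include <assert.h>

/* Hypothetical helper: set every element of pSync to sync_value
 * (pass _SHMEM_SYNC_VALUE) before the first reduction call. */
static void init_psync(long *pSync, int sync_size, long sync_value)
{
    assert(pSync != 0 && sync_size >= 0);
    for (int i = 0; i < sync_size; i++)
        pSync[i] = sync_value;
}

/* Hypothetical helper: minimum number of pWrk elements for a reduction
 * over nreduce elements (pass _SHMEM_REDUCE_MIN_WRKDATA_SIZE as min_wrk). */
static int reduce_wrk_elems(int nreduce, int min_wrk)
{
    assert(nreduce >= 0 && min_wrk >= 0);
    int needed = nreduce / 2 + 1;   /* integer division, as assumed above */
    return needed > min_wrk ? needed : min_wrk;
}
```

A caller would typically run init_psync(pSync, _SHMEM_REDUCE_SYNC_SIZE, _SHMEM_SYNC_VALUE) once, and declare pWrk with at least reduce_wrk_elems(nreduce, _SHMEM_REDUCE_MIN_WRKDATA_SIZE) elements.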

The source and target buffers must be left intact until every node has finished referencing them. One way to guarantee this is to call a barrier function after the call to shmem_X_Y_to_all().

The example programs `collc.c' and `coll.f' in the HPVM 1.0 release show how to perform this operation.

C function: int my_pe (void)
Returns the local node ID.

C function: int num_pes (void)
Returns the number of nodes.

