next up previous
Next: Unstructured Concurrency Up: ICC++: A C++ Dialect Previous: ObjectsAbstractions and Concurrency

Arrays, Objects, and Collections

  ICC++ provides collections, which extend C++ arrays and integrate them into the object model. This allows array-level functionality to be expressed using member functions of an array class. Our collection classes support a wide variety of concurrency patterns, from a data-parallel array model to more complex concurrent abstractions. They are related to collections in pC++ [26] when used for data parallelism, but each element has access to the entire collection, which allows them to implement more complex composite behavior as well. Finally, collections allow distributions to be explicitly specified [17].

Defining Object Collections

  Collections are defined by adding ``[ ]'' to the type name in a standard class declaration. This declaration creates two distinct classes: one for the elements, called type, and the other for the collection itself, called type[ ], and declared members can pertain either to the element or to the collection class. To prevent ambiguity, declarations for the entire collection need explicit type qualification. Each instance of a collection class has the collection members, and a linearly addressable set of elements, each of which have the element members. Nested collections are defined using multiple sets of [ ], and intermediate levels of them are both collections themselves and elements of the enclosing level. A simple collection definition follows:

// a collection definition
class Counter[] {
  int elt_total;             // an element member variable
  int Counter[]::total;      // a collection member variable
 public:
  Counter(void);             // an element constructor
  Counter[](void);           // a collection constructor

  int count(int);            // element member functions
  int elt_sum(int);
  int Counter[]::sum(void);  // a collection member function
};

This declaration creates two classes: the Counter[] collection type and the Counter element type. The Counter element has just one field: elt_total. The Counter[] collection consists of a linearly addressable set of Counter objects and one field of collection state: total. Counter[] has the member function sum, and Counter has the member functions count and elt_sum. Notice that the collection declarations are qualified and the element declarations are not; all unqualified declarations in a collection definition belong to the element type, and declarations for the whole collection must be qualified with the collection type name.

Since collections are classes, derivation is supported for them. It works class-wise so that derived inherits from base and derived[] inherits from base[]. For nested collections, each level inherits class-wise from the corresponding level in the base collection class.

Using Collections

  Collections are declared and used just as arrays are, using identical [ ] syntax for both declaration and indexing. Additionally, collection members, both variables and functions, are used just like those of any other object, as illustrated in:

Counter Foo[15];
conc for(int i = 0; i < 15; i++)
  Foo[i].count(i);
cout << Foo.sum();

Note that since there are no syntactic dependencies between the iterations of the conc for loop, the count calls can proceed in parallel. In addition, because the execution of iterations is not specified, the compiler is free to reorganize them for higher efficiency, as parallel compilers traditionally do.

This example illustrates the simplest use of collections: to express data parallelism by simply applying the same method to every element of a collection. This construes data parallelism as object-parallelism as in pC++, a generalization of the vector operations in languages such as Fortran which is the traditional meaning of data parallelism.

Concurrent Abstractions

  Collections are a convenient way of expressing distributed abstractions which present a concurrent interface. Collections have predefined members which give elements access to the entire collection. Among these are index(), which yields an element's position in the collection, and <collection-type>::this, which refers to the collection object itself. The collection itself has pre-defined members besides operator [ ], including size(void), which returns the number of elements in the collection. Other predefined collection members are documented in 17].

Consider the following example template definition:

template<class Constituent>
class MultiSet[]{
 private :
  Constituent elts[];
  int elt_count;

 public :
  // constructors omitted

  Constituent add_elt(Constituent elt){
    return elts[elt_count++] = elt;
  }

  int find_elt(Constituent elt){
    return MultiSet[]::this->find_elt(elt);
  }

  int find_elt_internal(Constituent elt){
    int count = 0;
    for (int i=0; i<elt_count; i++)
      if (elts[i] == elt) count++;
     return count;
  }

  int MultiSet[]::find_elt(Constituent elt){
    int count = 0;
    conc for(int i = 0; i < size(); i++)
      count += (*this)[i].find_elt_internal(elt);
  }
}; 

The MultiSet abstraction is a distributed multi-set in which different constituents are stored in each collection element. Constituents are inserted into specific elements; looking up an element therefore involves looking across the entire collection. The ability of elements to access the entire collection allows the MultiSet elements to cooperatively implement a concurrent interface to the abstraction: multiple calls to add_elt can proceed simultaneously when called upon different elements of the collection, as in:

MultiSet<int> set[17];
int stuff[100];

conc for(int i=0; i<100; i++) {
  set[i%17].add_elt(stuff[i]);
}

Note that the use of templates and collections allows the MultiSet to be a reusable abstraction that presents a concurrent interface. This combination makes libraries of concurrent abstractions possible.

Discussion

  Collections in ICC++ represent a unification of collections as distributed arrays of objects [26, 37] and the aggregate approach 13]. The array approach is more compatible with the preexisting C++ notion of arrays, in part because it has distinct collection and element types. A drawback to the independence of the types is that the element members have no primitive mechanism to refer to the whole collection, making it harder to implement concurrent abstractions like the MultiSet above. The aggregate approach supports cooperation among the constituents by allowing them to name each other, but by combining the array and constituent types complicates deriving collection definitions from object definitions. By creating two distinct but related types, ICC++ collections combine the advantages of both approaches.

This approach to collections, which integrates arrays into the object model, also divorces arrays and pointers. C++ vitiates pointers, allowing arithmetic only within C++ aggregates (i.e. objects and arrays). ICC++ forbids using pointers as arrays, replacing them with array-typed references. This choice not only allows more precise program analysis, it also provides uniform typing for multidimensional arrays. In addition, it allows the implementation to distribute collections without complex implementations of pointer arithmetic, but it necessitates some changes to C++ syntax. The type [ ] must be used instead of type * wherever an array variable is declared or defined. Two dimensional arrays must be declared using type [ ][ ], and analogously for higher dimensionalities.


next up previous
Next: Unstructured Concurrency Up: ICC++: A C++ Dialect Previous: ObjectsAbstractions and Concurrency

Julian Dolby
dolby@cs.uiuc.edu