
Performance Statistics

The Concert runtime system collects statistics that measure the performance of your programs, in particular the amount of concurrency and contention in your code. Four different sets of statistics are collected; this tutorial works through an example of the concurrency measurements.

Using Tracing

The last program in Section 4 was a distributed counters aggregate, shown in Figure 11. Since aggregates are non-serializing, there might be some concurrency even in this small program. The performance statistics provide a way to see whether this actually happens. Using the performance tools takes several steps, as this example illustrates.

  1. Compile the program with the -p flag. Load the program into Emacs and use the M-c mechanism, adding the -p flag to the default compile command, as shown in Figure 20.

  2. Run the compiled program with the -p command line option. Since this is a very small program, also use the -s 1 command line option to produce very fine samples (see Figure 21).

  3. After the program terminates, you will find a file called counters-example.trace in the directory. Run genProcTrace and enter the name of this file when it prompts for a trace file. Enter 2 when it prompts for a record number, and enter 50 when it asks for a smoothing factor. This is shown in Figure 22.

  4. You will now see a file called counters-example.2.avg in the directory. Run gnuplot and, at its command line, type plot 'counters-example.2.avg' w line. You will see a line graph of the concurrency measurements, like Figure 23.
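
This tutorial does not define the smoothing factor entered in step 3. One plausible reading is a moving average over that many consecutive samples, which turns very fine samples into a readable curve. The Python sketch below illustrates only that idea on synthetic data. The function name moving_average and the sample values are hypothetical; the actual genProcTrace algorithm and the trace file formats may differ.

```python
from collections import deque

def moving_average(samples, window):
    """Return the running mean of the last `window` samples (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()   # the samples currently inside the window
    total = 0.0        # running sum of the samples in `recent`
    smoothed = []
    for value in samples:
        recent.append(value)
        total += value
        if len(recent) > window:
            total -= recent.popleft()
        smoothed.append(total / len(recent))
    return smoothed

# Synthetic concurrency samples (hypothetical, not real Concert output).
raw = [1, 1, 4, 8, 8, 3, 1, 1]
smooth = moving_average(raw, window=4)
```

Under this reading, a larger window flattens short spikes, which may explain why a fairly large factor such as 50 is suggested for a run sampled with -s 1.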

Using Registration

The graph in Figure 23 does not contain all the information that could be desired; it would be useful to know what portions of the program cause the high and low points in the concurrency graph.

The runtime system provides a mechanism known as registration to do this; all messages sent to the (global trace_monitor) object are recorded, and appear in the trace at the point at which they executed in the program. This provides a way of annotating the trace to determine which parts of the program are responsible for high and low points of concurrency.

To see where the peaks and valleys of concurrency occur in this example, a few messages to (global trace_monitor) have been placed in the initial_message. One would expect the peak of concurrency to occur around Two, because it is recorded right around the forall loop over the aggregate representatives, which is the only source of parallelism in this example. The resulting source code is shown in Figure 24, and the trace produced by gnuplot is shown in Figure 25. As you can see, the maximum concurrency does indeed occur around Two.
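
Reading the annotated graph amounts to asking which registered message most recently preceded the peak. The Python sketch below shows that lookup on synthetic data. It is an illustration only: the function marker_before_peak, the sample values, and the labels One and Three are hypothetical (only Two appears in this tutorial), and the real trace format is not assumed.

```python
import bisect

def marker_before_peak(times, concurrency, markers):
    """Return the label of the last marker recorded at or before the peak sample.

    `markers` is a list of (time, label) pairs sorted by time.
    Returns None if the peak precedes every marker.
    """
    # Index of the first sample with the highest concurrency.
    peak_index = max(range(len(concurrency)), key=concurrency.__getitem__)
    peak_time = times[peak_index]
    marker_times = [t for t, _ in markers]
    # Number of markers whose time is <= peak_time.
    pos = bisect.bisect_right(marker_times, peak_time)
    return markers[pos - 1][1] if pos > 0 else None

# Synthetic samples and registered messages (hypothetical values).
times = [0, 1, 2, 3, 4, 5]
concurrency = [1, 1, 2, 6, 2, 1]
markers = [(0, "One"), (2, "Two"), (4, "Three")]
label = marker_before_peak(times, concurrency, markers)
```

With these made-up numbers the peak falls between the second and third markers, mirroring the observation that the maximum concurrency occurs around Two.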

Although the concurrency profile is the most commonly used statistic, the runtime can produce several others. Complete documentation of these statistics, and instructions for viewing all of them with the Pablo performance visualization system, are provided in [4].





Julian Dolby
Vijay Karamcheti
John Plevyak
Xingbin Zhang
Concurrent Systems Architecture Group