Search                        Top                                  Index
HELP PROFILE                                Robert Duncan, February 1996

    uses profile;

LIB * PROFILE shows how much time a program spends executing each of its

         CONTENTS - (Use <ENTER> g to access required sections)

  1   Introduction
  2   Using the profiler
  3   How the profiler works
  4   Why profiling may be inaccurate
  5   Counting all active callers
  6   Choosing which procedures get profiled
  7   Customising the report
  8   Using profile_display
  9   Getting a graphical report
 10   Tracing garbage collections

1  Introduction

LIB * PROFILE helps you determine where a program is spending its time
by showing how long it takes executing inside each procedure. For
example, look at this simple but time-consuming program:

    define example();
        define procB();
            repeat 1e7 times endrepeat;
        define procA();
            repeat 1e7 times endrepeat;

To run this under the profiler do:

    uses profile;

    profile example();

This will execute the example program and produce a report like this:

    Number of interrupts: 598
    CPU time: 6.00 seconds, GC time: 0.00 seconds

    Count  %Time  Count  %Time
      310   51.8    310   51.8  procB
      288   48.2    598  100.0  procA

This shows that the program took exactly 6 seconds to run and that the
time was divided pretty much equally between executing the two local
procedures procA and procB.

If you try this example now, the precise figures you get will be
different depending on the machine you are running on, but the
percentage values should remain much the same.

2  Using the profiler

To use the profiler you must first load the library with the command

    uses profile;

This defines two entry points to the profiler: a macro profile
(demonstrated above) and a procedure profile_apply. The macro is more
convenient for interactive use. It has the form:

    profile expression

where the expression extends to the end of the line. The expression is
evaluated under the control of the profiler.

The procedure profile_apply works like ordinary apply except that the
application is done under the control of the profiler. The profile macro
expands to a call to profile_apply.

To profile the application

    test(arg1, arg2)

use either of the forms

    profile test(arg1, arg2);

    profile_apply(arg1, arg2, test);

3  How the profiler works

The profiler uses a sampling technique to estimate the time distribution
of a program. It sets a timer to interrupt the program at regular
intervals and at each interrupt increments a counter for the procedure
which is executing at that moment. Over one complete run of a program,
the counts accumulated against each procedure give an estimate of the
relative amounts of time spent inside each one.

The first line of the report shows how many interrupts were counted
during the program's execution:

    Number of interrupts: 598

The higher this number, the more accurate the rest of the results are
likely to be. The default interrupt interval is 1/100th of a second; you
can change this by assigning

    N -> profile_interval;

where N is the time in microseconds, but the minimum interval is limited
by the resolution of the system clock (see REF * sys_timer).

The main body of the report shows the results for individual procedures:

    Count  %Time  Count  %Time
      310   51.8    310   51.8  procB
      288   48.2    598  100.0  procA

The first column contains the raw count data and is sorted with the most
frequently occurring procedure at the top. The next column shows the
count divided through by the total number of interrupts, to give an
estimate of the percentage of time spent within that procedure. The last
two columns simply accumulate the counts and percentage times.

The standard report shows results for no more than 10 procedures because
typically these will account for the major proportion of work within the
program, with the remainder contributing insignificant amounts. To
change the maximum number of results displayed to N do

    N -> profile_show_max;

or make it <false> to show results for all procedures.

4  Why profiling may be inaccurate

Using a sampling technique means that you can profile any program
without having to compile it specially for profiling, but it does have

Like any sample-based tool, the profiler can only show an approximation
to reality, and while that approximation is likely to get better as the
number of interrupts increases, there is no guarantee that this will be
so because the sampling method is not statistically sound. Every program
has a pattern of execution unique to itself, and it is quite possible
that the profiler's pattern of regular interrupts could interlock with
the program's execution path in such a way as to consistently or
frequently miss significant execution steps, biasing the results.

Also, the interrupts are rarely serviced at the precise moment at which
they arise. This is a property of all asynchronous events within Poplog:
they can only be handled at discrete execution points where it is known
to be safe to do so. Many system procedures don't service interrupts at
all and so will never appear in the profile report. And user procedures
only allow for interrupts at specific points: typically, once on entry
to the procedure and once on each pass through a loop. So the execution
points at which the profiler gets to examine the execution state are not
randomly distributed at all.

The fact that many system procedures don't service interrupts is
generally a good thing, because it's not often that the time spent
inside a particular system procedure is of much interest: it's usually
better that the time should be allocated to the procedure's caller, and
that's typically what the profiler will do.

The problem with user procedures is more disruptive. A procedure which
does a lot of work but which doesn't contain any loops has at most one
chance to be interrupted on procedure entry. An interrupt arriving after
that but while the procedure is still executing will not be serviced
until the procedure returns or until another interruptable procedure is
called; in either case, the interrupt will be charged to a procedure
other than the one actually doing the work. As an extreme example,
consider the loop:

    while expression do work() endwhile

where work() is an expensive but loop-free procedure call. It is most
likely that interrupts occurring during the execution of work() will be
serviced on its return, in the interrupt check of the while loop, and
the time taken by work() will be charged to its caller.

In defence of the profiler it may be argued that it is rare to find
procedures which do significant amounts of work but which contain no
loops (recursion is OK, because each recursive call will do an interrupt
check) and that generally the profiler will work better for programs
constructed from lots of small procedures, which many would consider to
be good programming style. But all in all, profiling results should be
taken as an indication rather than the last word.

5  Counting all active callers

A different kind of profile can be generated by setting the variable
profile_count_all. If this is non-false, then at each interrupt the
profiler will increment a counter not just for the procedure which is
actually executing at that point, but for all procedures which are
active in the call chain. Returning to the original example:

    true -> profile_count_all;

    profile example();

    Number of interrupts: 597
    CPU time: 5.98 seconds, GC time: 0.00 seconds

    Count  %Time  Count  %Time
      281   47.1    597  100.0  procA
        0    0.0    597  100.0  example
      316   52.9    316   52.9  procB

The first two columns of this modified report are as before -- though no
longer sorted -- but the third column now shows the number of times that
each procedure was found to be active in the call chain, expressed as a
time percentage in column four.

To see the difference, observe how this table includes an entry for the
main example procedure which did not appear before. This is because
example does no real work itself -- it calls procA to do it. But example
is clearly active throughout the whole execution of the program, and
that is properly reflected here. The active time records not only the
time spent executing a procedure's own code, but also that of any other
procedures it calls. So the active time for procA is also recorded at
100%, made up of 47% from executing its own loop code and 53% from
calling procB. Only the results for procB are similar to before, because
it is a leaf procedure and makes no other calls.

The profile_count_all option gives a more top-down view of program
execution. It treats each procedure as a black box and shows the amount
of time allocated to each irrespective of what may be going on inside.
Collecting this information can be expensive: profiling with the
count-all option can be any amount slower than without, depending on the
maximum depth of the call graph.

6  Choosing which procedures get profiled

You can focus attention on particular procedures in a program by
excluding or ignoring those procedures which are of no interest.

Excluding and ignoring mean different things: both will stop a procedure
from appearing in the final report, but they differ in what happens to
its profile count. If a procedure is excluded, its count is added to
that of its caller (or its caller's caller if the caller is itself
excluded, and so on) whereas if it is ignored the count is discounted
from the report altogether. For a given program, excluding a procedure
will not change the (average) number of profile interrupts recorded, but
just redistribute those interrupts between the remaining procedures;
ignoring a procedure will reduce the number of interrupts overall,
because interrupts occurring while the ignored procedure is executing
will not be counted.

In the normal case the profiler determines which procedures are to be
excluded and ignored by looking at the global variables


Both of these are user-assignable. The profiler excludes all anonymous
procedures and those whose names occur in profile_exclude_list and
ignores those named in profile_ignore_list.

You might add to profile_exclude_list the names of frequently-called
utility procedures whose times you would prefer to have allocated to
their point of call. On this basis, the default value of
profile_exclude_list already contains the procedure null:

    profile_exclude_list =>
    ** [null]

null is called frequently from Pop-11 code, often from unexpected
places, and the time it takes is of little interest in most programs.
(While there are many other system procedures which are equally
uninteresting, they are effectively always excluded by virtue of their
not servicing interrupts so there's no need to list them all

Use profile_ignore_list for procedures which spend much of their time
waiting (for user input, asynchronous events, etc.) or printing. Waiting
time is not counted as part of execution time (CPU time) but will
distort the profiling figures because the profile interrupts will occur
just as often -- and possibly more often -- while waiting as they do
when doing useful work. The default value of profile_ignore_list is

    profile_ignore_list =>
    ** []

You can discount most interactive (terminal) I/O time by assigning:

    [charin charout rawcharin rawcharout cucharout pr syspr pretty
        sysflush] -> profile_ignore_list;

although in Ved immediate mode there will be various Ved procedures to
take account of too.

For more sophisticated effects you can redefine the procedures

    profile_include(p) -> item

    profile_ignore(item) -> bool

In deciding whether to exclude or ignore a particular procedure p, the
profiler first calls profile_include and then passes any non-false item
it returns to profile_ignore. In the standard definitions, the returned
item is the procedure name (its pdprops) which is checked against
profile_exclude_list and profile_ignore_list respectively. But you can
redefine these procedures to be more selective about what they accept
and reject. Some examples:

(a) To exclude all Ved procedures

        define profile_include(pdr) -> props;
            lvars props = pdprops(pdr);
            if isword(props) and isstartstring("ved", props) then
                false -> props;

(b) To exclude all system procedures (NB: this category includes
    anything locked into an image with sys_lock_system)

        define profile_include(pdr);
            isinheap(pdr) and pdprops(pdr);

(c) To restrict profiling to a particular section of a program, entered
    through the procedure MyProc

        define profile_ignore(x);

7  Customising the report

Profile output is generated by the procedure

    profile_report(cpu_time, gc_time, counts, active_counts)

You can redefine this to produce your own kind of report. The arguments

        The CPU time and GC time used by the program, in hundredths of a
        second (derived from *systime and *popgctime).

        The raw count data collected by the profiler. A property.

        The active count data collected by the profiler. A property if
        profile_count_all is true, or <false> otherwise.

Arguments to the counts property are not the procedures themselves but
the results returned by profile_include, typically the pdprops. This
means that all procedures with the same pdprops will get combined into a
single count. If you want to have the procedures themselves in the
counts property, you must redefine profile_include accordingly.

Confusingly, arguments to the active_counts property (if present) are
always procedures; to combine this data with the standard counts you
must run them through profile_include first.

A redefined version of profile_report need not produce any output: it
could, for example, save away the results to be combined over a number
of runs. If you want to generate the standard report at any time, use
the procedure:

    profile_print(cpu_time, gc_time, counts, active_counts)

This is sensitive to the variables profile_show_max and profile_include
so make sure these are defined as you would like them before you call

8  Using profile_display

Older versions of LIB * PROFILE used the procedure

    profile_display(cpu_time, counts)

to produce their reports. This is no longer part of the library, but if
you have redefined it as part of some existing customisation, the
default value of profile_report will call it in preference to

9  Getting a graphical report

You can obtain a graphical display of the results of profiling by
loading the library LIB * PROFILE_GRAPHICAL. The results are presented
as a horizontal bar chart in a Ved window. See HELP * PROFILE_GRAPHICAL
for full details.

10  Tracing garbage collections

If *popgctrace is set while profiling, the profiler will print a message
after each garbage collection showing which procedure was executing when
the garbage collection occurred. You can arrange for popgctrace to be
always set during profiling by doing

    true -> profile_gc_trace;

--- C.all/help/profile
--- Copyright University of Sussex 1996. All rights reserved.