Profiling an application means investigating its
runtime performance by collecting metrics during its execution. One
of the most popular metrics is the method call count: the number of times each function (method) of the program
was called during a run. Another useful metric is method
clock time, the actual elapsed time spent in each of the methods
of the program. You can also measure CPU (central processing unit)
time, which directly reflects the work done on behalf of the method
by any of the computer's processors; it does not include
I/O, sleep, context-switch, or wait time.
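To see the difference between wall-clock time and CPU time concretely, here is a minimal sketch using the standard java.lang.management API (the workload and numbers are illustrative):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class TimeMetrics {
        public static void main(String[] args) throws InterruptedException {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();

            long wallStart = System.nanoTime();
            long cpuStart = threads.getCurrentThreadCpuTime(); // -1 if unsupported

            // CPU-bound work followed by a sleep: the sleep adds to the
            // wall-clock time but contributes almost nothing to CPU time.
            long sum = 0;
            for (int i = 0; i < 50_000_000; i++) {
                sum += i;
            }
            Thread.sleep(200);

            long wallMillis = (System.nanoTime() - wallStart) / 1_000_000;
            long cpuMillis = (threads.getCurrentThreadCpuTime() - cpuStart) / 1_000_000;
            System.out.println("wall clock: " + wallMillis + " ms, CPU: "
                    + cpuMillis + " ms (sum=" + sum + ")");
        }
    }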
Generally, a metric is a
mapping that associates numerical values with static or dynamic
program elements such as functions, variables, classes, objects, types, or
threads. The numerical values may represent various resources used
by the program.
For in-depth analysis of program performance,
it is useful to examine the call graph, which
captures the “call” relationships between the methods.
The nodes of the call graph represent the program methods, while the
directed arcs represent calls made from one method to another. In
a call graph, the call counts or the timing data are collected for
the arcs.
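As an illustration of this structure, the following sketch shows one possible in-memory representation of a call graph with per-arc call counts and times (the class and field names are hypothetical, not taken from any particular profiler):

    import java.util.HashMap;
    import java.util.Map;

    // A directed arc of the call graph: caller -> callee.
    class CallArc {
        long callCount;  // number of calls made along this arc
        long totalNanos; // accumulated time spent in those calls
    }

    class CallGraph {
        // Method names are the nodes; each map entry is an arc
        // carrying the metrics collected for that caller/callee pair.
        private final Map<String, CallArc> arcs = new HashMap<>();

        void record(String caller, String callee, long elapsedNanos) {
            CallArc arc = arcs.computeIfAbsent(caller + "->" + callee,
                                               k -> new CallArc());
            arc.callCount++;
            arc.totalNanos += elapsedNanos;
        }
    }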
Tracing
Tracing is one of the two methods discussed here for
collecting profile data. Java virtual machines use tracing with reduction.
Here is how it works: profile data is collected whenever the application
makes a function call. The names of the calling method and the called method
(the “callee”) are recorded, along with the time
spent in the call. The data is accumulated (this is the “reduction”) so that consecutive calls from the same caller to the same callee
increase the recorded time value. The number of calls is also recorded.
Tracing requires frequent reading of the current
time (or frequent measurement of other resources consumed by the program), and can
introduce large overhead. It produces accurate call counts and a complete
call graph, but the timing data can be substantially distorted by
the additional overhead.
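As a rough sketch of tracing with reduction (illustrative only; a real JVM profiler injects the equivalents of these enter/exit hooks at every method entry and return, for example through bytecode instrumentation):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    class Tracer {
        private static final Deque<String> callStack = new ArrayDeque<>();
        private static final Deque<Long> entryTimes = new ArrayDeque<>();
        // "caller->callee" -> {call count, accumulated nanos}: the reduction target.
        private static final Map<String, long[]> arcs = new HashMap<>();

        static void enter(String method) {
            callStack.push(method);
            entryTimes.push(System.nanoTime()); // frequent clock reads = overhead
        }

        static void exit() {
            long elapsed = System.nanoTime() - entryTimes.pop();
            String callee = callStack.pop();
            String caller = callStack.isEmpty() ? "<root>" : callStack.peek();
            // Reduction: consecutive calls on the same arc accumulate into
            // one record instead of producing one entry per call.
            long[] data = arcs.computeIfAbsent(caller + "->" + callee,
                                               k -> new long[2]);
            data[0]++;          // call count
            data[1] += elapsed; // accumulated time
        }
    }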
Sampling
In sampling, the program runs at its own pace,
but from time to time the profiler examines the application state more
closely by briefly interrupting the program's progress and
determining which method is executing. The sampling interval is the
elapsed time between two consecutive status checks. Sampling uses “wall clock time” as the basis for the sampling interval, but
only collects data for the CPU-scheduled threads. Methods that
consume more CPU time will be detected more frequently, so with a large
number of samples the CPU time for each function is estimated quite
well.
Sampling is a technique complementary to tracing.
It is characterized by relatively low overhead and produces fairly accurate
timing data (at least for long-running applications), but it cannot produce
call counts. Also, the call graph is only partial; usually a number
of less significant arcs and nodes will be missing.
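The following sketch shows the idea using the standard Thread.getAllStackTraces() API; the interval handling and bookkeeping are much simplified compared to a real profiler:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Periodically samples all thread stacks and counts which method is
    // on top; methods that use more CPU time are seen proportionally
    // more often.
    class Sampler {
        private final Map<String, Long> hits = new HashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        void start(long intervalMillis) {
            scheduler.scheduleAtFixedRate(this::sample, 0,
                    intervalMillis, TimeUnit.MILLISECONDS);
        }

        private void sample() {
            for (Map.Entry<Thread, StackTraceElement[]> e
                    : Thread.getAllStackTraces().entrySet()) {
                StackTraceElement[] stack = e.getValue();
                // Count only RUNNABLE threads, approximating the restriction
                // to CPU-scheduled threads described above.
                if (e.getKey().getState() == Thread.State.RUNNABLE
                        && stack.length > 0) {
                    String top = stack[0].getClassName()
                            + "." + stack[0].getMethodName();
                    hits.merge(top, 1L, Long::sum);
                }
            }
        }

        void stop() {
            scheduler.shutdown();
            // Estimated CPU share per method ~ hits / total samples.
            hits.forEach((m, n) -> System.out.println(m + ": " + n));
        }
    }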
See also Data Sampling Considerations.
Tuning Performance
The application tuning process consists of three
major steps:
1. Run the application and generate profile data.
2. Analyze the profile data and identify any performance bottlenecks.
3. Modify the application to eliminate the problem.
In most cases you should check whether the performance
problem has been eliminated by running the application again and comparing
the new profile data with the previous data. In fact, the whole process
should be repeated until reasonable performance expectations are met.
To compare the profile data meaningfully,
you need to run the application with the same input data or load
(which is called a benchmark) and in the same environment. See also Preparing a Benchmark.
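As a minimal sketch of such a benchmark (illustrative; for rigorous measurement a dedicated harness such as JMH is a better choice), fixing the input data and repeating the run makes the timings comparable across code changes:

    public class Benchmark {
        static long workload(int n) {        // the code under test
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += (long) i * i;
            }
            return sum;
        }

        public static void main(String[] args) {
            final int INPUT = 10_000_000;    // same input data every run
            final int RUNS = 5;              // repeat to smooth out noise

            workload(INPUT);                 // warm-up run (JIT compilation)

            for (int run = 1; run <= RUNS; run++) {
                long start = System.nanoTime();
                long result = workload(INPUT);
                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.printf("run %d: %d ms (result=%d)%n",
                        run, millis, result);
            }
        }
    }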
Remember the 80-20 rule: in most cases, 80% of
an application's resources are used by only 20% of the program code.
Concentrate on tuning the parts of the code that will have the largest impact on performance.
There are two important rules to remember when
modifying programs to improve performance. These might seem obvious,
but in practice they are often forgotten.
Don't put performance above correctness.
When you modify the code, and especially when you change some
of the algorithms, always take care to preserve program correctness.
After you change the code, you'll want to test its performance;
do not forget to test its correctness as well. You don't have to perform
thorough testing after each minor change, but it is certainly a good
idea to do so after you're done with the tuning.
Measure your progress.
Try to keep track
of the performance improvements you make. Some code changes,
although small, can bring great improvements. On the other hand, extensive
changes that seemed very promising may yield only minor improvements,
or improvements that are offset by a degradation in another part of
the application. Remember that good performance is only one of the
factors determining software quality. Some changes (for example, inlining)
may not be worth making if they compromise code readability or flexibility.