TECH : Function-level Profiling

ARTICLE BY: Performance Analysis Team

Part of the Performance Analysis team’s job is to visit customers to gather feedback and feature requests, and to ensure we're providing the best tools we can.

One of the questions we hear most often is "Which are my hottest functions, and why?"

In this article we'll cover some of the features we provide to help answer that question.

PC Sampling

The Razor performance analysis tools have a feature called PC (or Program Counter) sampling. This feature indicates which functions are executing most often; these are sometimes referred to as "hot" functions.

PC sampling measures performance by regularly determining which function the program counter is in. Keeping a count of how many times we are in each function builds up a picture of application performance.
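The counting step described above can be sketched as follows. This is a minimal illustration, not Razor's implementation: the symbol table, the addresses, and the sample values are all made up, and each function is assumed to occupy the address range up to the start of the next one.

```python
import bisect
from collections import Counter

# Hypothetical symbol table: (start_address, function_name), sorted by address.
# Each function is assumed to extend up to the next entry's start address.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "doWorkA"),
    (0x1900, "doWorkB"),
    (0x2000, "<end>"),  # sentinel marking the end of the code region
]
STARTS = [start for start, _ in SYMBOLS]


def function_for_pc(pc):
    """Return the name of the function containing pc, or None if outside."""
    index = bisect.bisect_right(STARTS, pc) - 1
    if index < 0 or index >= len(SYMBOLS) - 1:
        return None
    return SYMBOLS[index][1]


def histogram(samples):
    """Count how many samples landed in each function."""
    counts = Counter()
    for pc in samples:
        name = function_for_pc(pc)
        if name is not None:
            counts[name] += 1
    return counts


# Made-up program counter values, as if captured at regular intervals.
samples = [0x1404, 0x1410, 0x1904, 0x1500, 0x1010, 0x18FC, 0x1A00, 0x1450]
hot = histogram(samples)
for name, count in hot.most_common():
    print(f"{name:10s} {count:3d} samples ({100 * count / len(samples):.1f}%)")
```

The result is statistical: a function that runs briefly between two samples may never appear, which is why the counts give an overview rather than exact timings.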

In the screenshot below, the dark green bars show which function the PC was in when each sample was taken; doWorkA() and doWorkB() are examples in this case.

Razor showing PC samples

The statistics provided by PC sampling give a good overview of application performance, but sometimes it is preferable to get exact timings for each and every function. This is where function-level profiling, and more specifically function instrumentation, comes in.

Function Instrumentation

Some hardware, such as the PlayStation®Vita DevKit, provides a zero-intrusion hardware trace which gives us the absolute timings for each and every function. This enables us to build detailed call graph and timeline data, with no runtime cost to the game.
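To illustrate how call graph and timeline data can be derived from a stream of function entry and exit records, here is a minimal sketch. The trace format, the timestamps, and the function names are assumptions made for the example; the real hardware trace format is not described here.

```python
from collections import defaultdict

# Hypothetical trace: (timestamp_us, kind, function_name), in time order.
TRACE = [
    (0, "enter", "main"),
    (10, "enter", "doWorkA"),
    (40, "exit", "doWorkA"),
    (45, "enter", "doWorkB"),
    (50, "enter", "doWorkA"),
    (70, "exit", "doWorkA"),
    (90, "exit", "doWorkB"),
    (100, "exit", "main"),
]


def build_call_graph(trace):
    """Return (edges, timeline).

    edges maps (caller, callee) to the number of calls observed.
    timeline is a list of (function, start_us, end_us, depth) spans.
    """
    edges = defaultdict(int)
    timeline = []
    stack = []  # entries are (function_name, start_us)
    for timestamp, kind, name in trace:
        if kind == "enter":
            caller = stack[-1][0] if stack else None
            edges[(caller, name)] += 1
            stack.append((name, timestamp))
        else:
            open_name, start = stack.pop()
            if open_name != name:
                raise ValueError(f"unbalanced trace at {timestamp} us")
            # After the pop, the stack length is the nesting depth of this span.
            timeline.append((name, start, timestamp, len(stack)))
    return dict(edges), timeline


edges, timeline = build_call_graph(TRACE)
for (caller, callee), count in edges.items():
    print(f"{caller} -> {callee}: {count} call(s)")
```

Each timeline span carries its nesting depth, which is enough to draw the stacked bars of a timeline view.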

Where hardware support is not available, we use function instrumentation, a technology that has evolved from the one we originally implemented in our PlayStation®2 and PlayStation®3 CPU profilers.

Function instrumentation works by patching functions at runtime with code that emits profiling data on entry and exit. This gives us absolute timings for every execution of each function. As this process is performed at runtime, it works with both debug and release builds.
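The idea of patching at runtime can be shown with a high-level analogy. The sketch below replaces functions in a namespace with wrappers that emit entry and exit events; the real feature patches machine code, so this is only an illustration of the principle, and every name in it (instrument, game, events) is hypothetical.

```python
import time
import types

events = []  # (timestamp_ns, kind, function_name)


def instrument(namespace, name):
    """Replace namespace.name with a wrapper that logs entry and exit."""
    original = getattr(namespace, name)

    def wrapper(*args, **kwargs):
        events.append((time.perf_counter_ns(), "enter", name))
        try:
            return original(*args, **kwargs)
        finally:
            # The exit event is emitted even if the function raises.
            events.append((time.perf_counter_ns(), "exit", name))

    setattr(namespace, name, wrapper)
    return original  # returned so the patch can be undone later


# Hypothetical game code, held in a namespace so it can be patched in place.
game = types.SimpleNamespace()
game.doWorkA = lambda n: sum(range(n))
game.doWorkB = lambda n: game.doWorkA(n) * 2

# The functions are patched after they were defined, without editing them.
instrument(game, "doWorkA")
instrument(game, "doWorkB")
result = game.doWorkB(1000)
```

Because the patch is applied to code that already exists, the functions themselves do not need to be rebuilt with profiling hooks, which mirrors why the approach works on both debug and release builds.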

Patching the code in this way improves on the PC sampling method, as we can now determine exactly how many times each function is called, and how long each call ran.
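As a rough sketch of how exact call counts and durations fall out of entry and exit events, the example below totals them per function. The event format and values are assumptions, and the split into inclusive time (including callees) and self time (excluding callees) is a common profiling convention added here for illustration.

```python
from collections import defaultdict

# Hypothetical entry/exit events: (timestamp_us, kind, function_name).
EVENTS = [
    (0, "enter", "update"),
    (5, "enter", "doWorkA"),
    (25, "exit", "doWorkA"),
    (30, "enter", "doWorkA"),
    (42, "exit", "doWorkA"),
    (50, "exit", "update"),
]


def summarize(events):
    """Return {function: {"calls", "inclusive_us", "self_us"}}."""
    stats = defaultdict(lambda: {"calls": 0, "inclusive_us": 0, "self_us": 0})
    stack = []  # each entry: [function_name, start_us, time_in_callees_us]
    for timestamp, kind, name in events:
        if kind == "enter":
            stack.append([name, timestamp, 0])
            continue
        open_name, start, child_us = stack.pop()
        if open_name != name:
            raise ValueError(f"unbalanced events at {timestamp} us")
        duration = timestamp - start
        entry = stats[name]
        entry["calls"] += 1
        entry["inclusive_us"] += duration
        entry["self_us"] += duration - child_us
        if stack:
            # Charge this call's duration to the caller's callee time.
            stack[-1][2] += duration
    return dict(stats)


summary = summarize(EVENTS)
for name, entry in sorted(summary.items()):
    average = entry["inclusive_us"] / entry["calls"]
    print(f"{name}: {entry['calls']} calls, {average:.1f} us average")
```

Note that summing inclusive time this way counts nested time more than once for recursive functions; a production tool would need to handle that case.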

Having the exact timings for functions helps to determine where code optimizations can, and should, be made. The function instrumentation feature can also be used at each iteration of the code, to ensure any changes haven't negatively affected the performance of the application.
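One way such a per-iteration check might look is sketched below: compare the per-function totals from two captures and flag anything that grew beyond a tolerance. The function name, the 5% tolerance, and the timing figures are all made up for the example.

```python
def find_regressions(baseline, current, tolerance=0.05):
    """Return functions whose total time grew by more than tolerance.

    Both arguments map function name to total time in microseconds.
    The result is a list of (name, before_us, after_us), largest growth first.
    """
    regressions = []
    for name, after in current.items():
        before = baseline.get(name)
        if before is None or before == 0:
            continue  # new function, nothing to compare against
        if after > before * (1 + tolerance):
            regressions.append((name, before, after))
    regressions.sort(key=lambda item: item[2] / item[1], reverse=True)
    return regressions


# Made-up totals from two builds running the same scene.
baseline = {"doWorkA": 320, "doWorkB": 150, "render": 9000}
current = {"doWorkA": 310, "doWorkB": 210, "render": 9100, "newEffect": 40}

regressions = find_regressions(baseline, current)
for name, before, after in regressions:
    print(f"{name}: {before} us -> {after} us")
```

A tolerance is used because timings vary slightly from run to run, so a small increase does not necessarily indicate a real regression.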

In the game development world, where a few microseconds could be the difference between running at 60 frames per second and 30 frames per second, the level of detail provided by function instrumentation is vital.
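The arithmetic behind that claim can be sketched as follows, under the assumption (not stated above) that the game presents frames on a 60 Hz display and each finished frame waits for the next refresh. A frame that overruns the roughly 16,667 microsecond budget, even slightly, then occupies two refresh intervals.

```python
import math

REFRESH_HZ = 60  # assumed display refresh rate


def displayed_fps(frame_time_us):
    """Frame rate seen on screen if each frame must wait for the next refresh."""
    refresh_interval_us = 1_000_000 / REFRESH_HZ  # about 16,667 us at 60 Hz
    intervals = max(1, math.ceil(frame_time_us / refresh_interval_us))
    return REFRESH_HZ / intervals


print(displayed_fps(16_600))  # fits within one refresh interval
print(displayed_fps(16_700))  # overruns, so the frame spans two intervals
```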
