| <?xml version="1.0" encoding="UTF-8" standalone="no"?> |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Design</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="prev" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="next" href="profile_mode_api.html" title="Extensions for Custom Containers" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><th width="60%" align="center">Chapter 19. Profile Mode</th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr></table><hr /></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.design"></a>Design</h2></div></div></div><p> |
| </p><div class="table"><a id="table.profile_code_loc"></a><p class="title"><strong>Table 19.1. Profile Code Location</strong></p><div class="table-contents"><table class="table" summary="Profile Code Location" border="1"><colgroup><col align="left" class="c1" /><col align="left" class="c2" /></colgroup><thead><tr><th align="left">Code Location</th><th align="left">Use</th></tr></thead><tbody><tr><td align="left"><code class="code">libstdc++-v3/include/std/*</code></td><td align="left">Preprocessor code to redirect to profile extension headers.</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/*</code></td><td align="left">Profile extension public headers (map, vector, ...).</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/impl/*</code></td><td align="left">Profile extension internals. Implementation files are |
| only included from <code class="code">impl/profiler.h</code>, which is the only |
| file included from the public headers.</td></tr></tbody></table></div></div><br class="table-break" /><p> |
| </p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.wrapper"></a>Wrapper Model</h3></div></div></div><p> |
| In order to get our instrumented library version included instead of the |
| release one, |
| we use the same wrapper model as the debug mode. |
| We subclass entities from the release version. Wherever |
| <code class="code">_GLIBCXX_PROFILE</code> is defined, the release namespace is |
| <code class="code">std::__norm</code>, whereas the profile namespace is |
| <code class="code">std::__profile</code>. Using plain <code class="code">std</code> translates |
| into <code class="code">std::__profile</code>. |
| </p><p> |
| Whenever possible, we try to wrap at the public interface level, e.g., |
| in <code class="code">unordered_set</code> rather than in <code class="code">hashtable</code>, |
| in order not to depend on implementation. |
| </p><p> |
| Mixing object files built with and without the profile mode must |
| not affect the program execution. However, there are no guarantees to |
| the accuracy of diagnostics when using even a single object not built with |
| <code class="code">-D_GLIBCXX_PROFILE</code>. |
| Currently, mixing the profile mode with debug and parallel extensions is |
| not allowed. Mixing them at compile time will result in preprocessor errors. |
| Mixing them at link time is undefined. |
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.instrumentation"></a>Instrumentation</h3></div></div></div><p> |
| Instead of instrumenting every public entry and exit point, |
| we chose to add instrumentation on demand, as needed |
| by individual diagnostics. |
| The main reason is that some diagnostics require us to extract bits of |
| internal state that are particular only to that diagnostic. |
| We plan to formalize this later, after we learn more about the requirements |
| of several diagnostics. |
| </p><p> |
| All the instrumentation points can be switched on and off using |
| <code class="code">-D[_NO]_GLIBCXX_PROFILE_<diagnostic></code> options. |
| With all the instrumentation calls off, there should be negligible |
| overhead over the release version. This property is needed to support |
| diagnostics based on timing of internal operations. For such diagnostics, |
| we anticipate turning most of the instrumentation off in order to prevent |
| profiling overhead from polluting time measurements, and thus diagnostics. |
| </p><p> |
| All the instrumentation on/off compile time switches live in |
| <code class="code">include/profile/profiler.h</code>. |
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.rtlib"></a>Run Time Behavior</h3></div></div></div><p> |
| For practical reasons, the instrumentation library processes the trace |
| partially |
| rather than dumping it to disk in raw form. Each event is processed when |
| it occurs. It is usually attached a cost and it is aggregated into |
| the database of a specific diagnostic class. The cost model |
| is based largely on the standard performance guarantees, but in some |
| cases we use knowledge about GCC's standard library implementation. |
| </p><p> |
| Information is indexed by (1) call stack and (2) instance id or address |
| to be able to understand and summarize precise creation-use-destruction |
| dynamic chains. Although the analysis is sensitive to dynamic instances, |
| the reports are only sensitive to call context. Whenever a dynamic instance |
| is destroyed, we accumulate its effect to the corresponding entry for the |
| call stack of its constructor location. |
| </p><p> |
| For details, see |
| <a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">paper presented at |
| CGO 2009</a>. |
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.analysis"></a>Analysis and Diagnostics</h3></div></div></div><p> |
| Final analysis takes place offline, and it is based entirely on the |
| generated trace and debugging info in the application binary. |
| See section Diagnostics for a list of analysis types that we plan to support. |
| </p><p> |
| The input to the analysis is a table indexed by profile type and call stack. |
| The data type for each entry depends on the profile type. |
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.cost-model"></a>Cost Model</h3></div></div></div><p> |
| While it is likely that cost models become complex as we get into |
| more sophisticated analysis, we will try to follow a simple set of rules |
| at the beginning. |
| </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="emphasis"><em>Relative benefit estimation:</em></span> |
| The idea is to estimate or measure the cost of all operations |
| in the original scenario versus the scenario we advise to switch to. |
| For instance, when advising to change a vector to a list, an occurrence |
| of the <code class="code">insert</code> method will generally count as a benefit. |
| Its magnitude depends on (1) the number of elements that get shifted |
| and (2) whether it triggers a reallocation. |
| </p></li><li class="listitem"><p><span class="emphasis"><em>Synthetic measurements:</em></span> |
| We will measure the relative difference between similar operations on |
| different containers. We plan to write a battery of small tests that |
| compare the times of the executions of similar methods on different |
| containers. The idea is to run these tests on the target machine. |
| If this training phase is very quick, we may decide to perform it at |
| library initialization time. The results can be cached on disk and reused |
| across runs. |
| </p></li><li class="listitem"><p><span class="emphasis"><em>Timers:</em></span> |
| We plan to use timers for operations of larger granularity, such as sort. |
| For instance, we can switch between different sort methods on the fly |
| and report the one that performs best for each call context. |
| </p></li><li class="listitem"><p><span class="emphasis"><em>Show stoppers:</em></span> |
| We may decide that the presence of an operation nullifies the advice. |
| For instance, when considering switching from <code class="code">set</code> to |
| <code class="code">unordered_set</code>, if we detect use of operator <code class="code">++</code>, |
| we will simply not issue the advice, since this could signal that the use |
| care require a sorted container.</p></li></ul></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.reports"></a>Reports</h3></div></div></div><p> |
| There are two types of reports. First, if we recognize a pattern for which |
| we have a substitute that is likely to give better performance, we print |
| the advice and estimated performance gain. The advice is usually associated |
| to a code position and possibly a call stack. |
| </p><p> |
| Second, we report performance characteristics for which we do not have |
| a clear solution for improvement. For instance, we can point to the user |
| the top 10 <code class="code">multimap</code> locations |
| which have the worst data locality in actual traversals. |
| Although this does not offer a solution, |
| it helps the user focus on the key problems and ignore the uninteresting ones. |
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.testing"></a>Testing</h3></div></div></div><p> |
| First, we want to make sure we preserve the behavior of the release mode. |
| You can just type <code class="code">"make check-profile"</code>, which |
| builds and runs the whole test suite in profile mode. |
| </p><p> |
| Second, we want to test the correctness of each diagnostic. |
| We created a <code class="code">profile</code> directory in the test suite. |
| Each diagnostic must come with at least two tests, one for false positives |
| and one for false negatives. |
| </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 19. Profile Mode </td><td width="20%" align="center"><a accesskey="h" href="../index.html">Home</a></td><td width="40%" align="right" valign="top"> Extensions for Custom Containers</td></tr></table></div></body></html> |