 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Design</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="prev" href="profile_mode.html" title="Chapter 19. Profile Mode" /><link rel="next" href="profile_mode_api.html" title="Extensions for Custom Containers" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Design</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><th width="60%" align="center">Chapter 19. Profile Mode</th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr></table><hr /></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.design"></a>Design</h2></div></div></div><p>
 </p><div class="table"><a id="table.profile_code_loc"></a><p class="title"><strong>Table 19.1. Profile Code Location</strong></p><div class="table-contents"><table class="table" summary="Profile Code Location" border="1"><colgroup><col align="left" class="c1" /><col align="left" class="c2" /></colgroup><thead><tr><th align="left">Code Location</th><th align="left">Use</th></tr></thead><tbody><tr><td align="left"><code class="code">libstdc++-v3/include/std/*</code></td><td align="left">Preprocessor code to redirect to profile extension headers.</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/*</code></td><td align="left">Profile extension public headers (map, vector, ...).</td></tr><tr><td align="left"><code class="code">libstdc++-v3/include/profile/impl/*</code></td><td align="left">Profile extension internals.  Implementation files are
      only included from <code class="code">impl/profiler.h</code>, which is the only
      file included from the public headers.</td></tr></tbody></table></div></div><br class="table-break" /><p>
 </p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.wrapper"></a>Wrapper Model</h3></div></div></div><p>
   In order to get our instrumented library version included instead of the
   release one,
   we use the same wrapper model as the debug mode.
   We subclass entities from the release version.  Wherever
   <code class="code">_GLIBCXX_PROFILE</code> is defined, the release namespace is
   <code class="code">std::__norm</code>, whereas the profile namespace is
   <code class="code">std::__profile</code>.  Using plain <code class="code">std</code> translates
   into <code class="code">std::__profile</code>.
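</p><p>
   A minimal sketch of the wrapper model follows.  It is not the library's
   code: the namespaces <code class="code">demo::norm</code> and
   <code class="code">demo::profile</code> and the macro
   <code class="code">DEMO_PROFILE</code> are hypothetical stand-ins for
   <code class="code">std::__norm</code>, <code class="code">std::__profile</code> and
   <code class="code">_GLIBCXX_PROFILE</code>, which user code cannot declare, and
   the <code class="code">push_back</code> counter is only an illustrative
   instrumentation point.
</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for namespace std; the names below are
// assumptions made for illustration, not libstdc++ identifiers.
namespace demo {

namespace norm {
// "Release" container: here simply a thin subclass of std::vector.
template <typename T>
class vector : public std::vector<T> {
public:
    using std::vector<T>::vector;
};
}  // namespace norm

namespace profile {
// Instrumented wrapper: subclasses the release container and records
// events at the public interface level.
template <typename T>
class vector : public norm::vector<T> {
    using base = norm::vector<T>;

public:
    using base::base;

    void push_back(const T& value) {
        ++push_back_calls_;      // instrumentation point
        base::push_back(value);  // forward to the release version
    }

    std::size_t push_back_calls() const { return push_back_calls_; }

private:
    std::size_t push_back_calls_ = 0;
};
}  // namespace profile

// Plain demo::vector names the profile or the release container,
// depending on the (hypothetical) DEMO_PROFILE switch.
#ifdef DEMO_PROFILE
using profile::vector;
#else
using norm::vector;
#endif

}  // namespace demo
```

<p>
   In this sketch, compiling with <code class="code">-DDEMO_PROFILE</code> makes
   plain <code class="code">demo::vector</code> name the instrumented class, which
   mirrors how plain <code class="code">std</code> maps to
   <code class="code">std::__profile</code>.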
   </p><p>
   Whenever possible, we try to wrap at the public interface level, e.g.,
   in <code class="code">unordered_set</code> rather than in <code class="code">hashtable</code>,
   in order not to depend on implementation.
   </p><p>
   Mixing object files built with and without the profile mode must
   not affect the program execution.  However, there are no guarantees about
   the accuracy of diagnostics when using even a single object not built with
   <code class="code">-D_GLIBCXX_PROFILE</code>.
   Currently, mixing the profile mode with debug and parallel extensions is
   not allowed.  Mixing them at compile time will result in preprocessor errors.
   Mixing them at link time is undefined.
   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.instrumentation"></a>Instrumentation</h3></div></div></div><p>
   Instead of instrumenting every public entry and exit point,
   we chose to add instrumentation on demand, as needed
   by individual diagnostics.
   The main reason is that some diagnostics require us to extract bits of
   internal state that are particular only to that diagnostic.
   We plan to formalize this later, after we learn more about the requirements
   of several diagnostics.
   </p><p>
   All the instrumentation points can be switched on and off using
   <code class="code">-D[_NO]_GLIBCXX_PROFILE_&lt;diagnostic&gt;</code> options.
   With all the instrumentation calls off, there should be negligible
   overhead over the release version.  This property is needed to support
   diagnostics based on timing of internal operations.  For such diagnostics,
   we anticipate turning most of the instrumentation off in order to prevent
   profiling overhead from polluting time measurements, and thus diagnostics.
   </p><p>
   All the instrumentation on/off compile time switches live in
   <code class="code">include/profile/profiler.h</code>.
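</p><p>
   The switch pattern can be sketched as follows.  The macro names
   <code class="code">DEMO_NO_PROFILE_RESIZE</code>,
   <code class="code">DEMO_PROFILE_RESIZE</code> and
   <code class="code">DEMO_HOOK_RESIZE</code> are hypothetical, modeled on the
   <code class="code">-D[_NO]_GLIBCXX_PROFILE_&lt;diagnostic&gt;</code> options; they
   are not taken from <code class="code">profiler.h</code>, and the counter is only
   a stand-in for a diagnostic database.
</p>

```cpp
#include <cstddef>

// Hypothetical diagnostic switch: on by default, compiled out when
// DEMO_NO_PROFILE_RESIZE is defined on the command line.
#if !defined(DEMO_NO_PROFILE_RESIZE)
#  define DEMO_PROFILE_RESIZE 1
#endif

// Stand-in for the database of one diagnostic.
inline std::size_t& demo_resize_events() {
    static std::size_t count = 0;
    return count;
}

// The hook expands to a real call only when the diagnostic is enabled.
#ifdef DEMO_PROFILE_RESIZE
#  define DEMO_HOOK_RESIZE() (++demo_resize_events())
#else
#  define DEMO_HOOK_RESIZE() ((void)0)  // no instrumentation emitted
#endif

// Library-side operation containing one instrumentation point.
inline std::size_t demo_grow(std::size_t capacity) {
    DEMO_HOOK_RESIZE();
    return capacity == 0 ? 1 : capacity * 2;
}
```

<p>
   Building this sketch with <code class="code">-DDEMO_NO_PROFILE_RESIZE</code>
   turns the hook into an expression with no effect, so
   <code class="code">demo_grow</code> carries no instrumentation cost.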
   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.rtlib"></a>Run Time Behavior</h3></div></div></div><p>
   For practical reasons, the instrumentation library processes the trace
   partially
   rather than dumping it to disk in raw form.  Each event is processed when
   it occurs.  It is usually assigned a cost and aggregated into
   the database of a specific diagnostic class.  The cost model
   is based largely on the standard performance guarantees, but in some
   cases we use knowledge about GCC's standard library implementation.
   </p><p>
   Information is indexed by (1) call stack and (2) instance id or address
   to be able to understand and summarize precise creation-use-destruction
   dynamic chains.  Although the analysis is sensitive to dynamic instances,
   the reports are only sensitive to call context.  Whenever a dynamic instance
   is destroyed, we accumulate its effect into the corresponding entry for the
   call stack of its constructor location.
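</p><p>
   The indexing scheme could be sketched as below.  The class
   <code class="code">DemoTracer</code> and its member names are hypothetical, and a
   call stack is reduced to a string key for brevity; this is an illustration
   under those assumptions, not the data structure used by the library.
</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical tracer. A call stack is reduced to a string key here;
// a real implementation would capture return addresses.
class DemoTracer {
public:
    // Creation: remember which call context constructed the instance.
    void on_construct(std::uintptr_t instance, const std::string& stack) {
        live_[instance] = Live{stack, 0};
    }

    // Use: attach a cost to the live instance as the event occurs.
    void on_event(std::uintptr_t instance, std::size_t cost) {
        auto it = live_.find(instance);
        if (it != live_.end()) it->second.cost += cost;
    }

    // Destruction: fold the instance into the entry for the call stack
    // of its constructor, then forget the instance.
    void on_destruct(std::uintptr_t instance) {
        auto it = live_.find(instance);
        if (it == live_.end()) return;
        by_stack_[it->second.stack] += it->second.cost;
        live_.erase(it);
    }

    // Report view: sensitive to call context only, not to instances.
    std::size_t cost_for(const std::string& stack) const {
        auto it = by_stack_.find(stack);
        return it == by_stack_.end() ? 0 : it->second;
    }

private:
    struct Live {
        std::string stack;  // constructor call context
        std::size_t cost;   // cost accumulated while alive
    };
    std::unordered_map<std::uintptr_t, Live> live_;  // keyed by instance
    std::map<std::string, std::size_t> by_stack_;    // keyed by call stack
};
```

<p>
   In this sketch, two instances constructed from the same call stack end up
   in a single report entry once both have been destroyed.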
   </p><p>
   For details, see the
    <a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">paper presented at
    CGO 2009</a>.
   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.analysis"></a>Analysis and Diagnostics</h3></div></div></div><p>
   Final analysis takes place offline, and it is based entirely on the
   generated trace and debugging info in the application binary.
   See section Diagnostics for a list of analysis types that we plan to support.
   </p><p>
   The input to the analysis is a table indexed by profile type and call stack.
   The data type for each entry depends on the profile type.
   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.cost-model"></a>Cost Model</h3></div></div></div><p>
   While it is likely that cost models will become complex as we get into
   more sophisticated analysis, we will try to follow a simple set of rules
   at the beginning.
   </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p><span class="emphasis"><em>Relative benefit estimation:</em></span>
   The idea is to estimate or measure the cost of all operations
   in the original scenario versus the scenario we advise to switch to.
   For instance, when advising to change a vector to a list, an occurrence
   of the <code class="code">insert</code> method will generally count as a benefit.
   Its magnitude depends on (1) the number of elements that get shifted
   and (2) whether it triggers a reallocation.
   </p></li><li class="listitem"><p><span class="emphasis"><em>Synthetic measurements:</em></span>
   We will measure the relative difference between similar operations on
   different containers.  We plan to write a battery of small tests that
   compare the times of the executions of similar methods on different
   containers.  The idea is to run these tests on the target machine.
   If this training phase is very quick, we may decide to perform it at
   library initialization time.  The results can be cached on disk and reused
   across runs.
   </p></li><li class="listitem"><p><span class="emphasis"><em>Timers:</em></span>
   We plan to use timers for operations of larger granularity, such as sort.
   For instance, we can switch between different sort methods on the fly
   and report the one that performs best for each call context.
   </p></li><li class="listitem"><p><span class="emphasis"><em>Show stoppers:</em></span>
   We may decide that the presence of an operation nullifies the advice.
   For instance, when considering switching from <code class="code">set</code> to
   <code class="code">unordered_set</code>, if we detect use of operator <code class="code">++</code>,
   we will simply not issue the advice, since this could signal that the use
   case requires a sorted container.</p></li></ul></div></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.reports"></a>Reports</h3></div></div></div><p>
 There are two types of reports.  First, if we recognize a pattern for which
 we have a substitute that is likely to give better performance, we print
 the advice and estimated performance gain.  The advice is usually associated
 with a code position and possibly a call stack.
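</p><p>
 As an illustration of how an estimated gain for this first kind of report
 might be derived, the sketch below applies the relative benefit rule from
 the cost model to recorded <code class="code">insert</code> events on a vector.
 The record layout <code class="code">InsertEvent</code>, the unit costs in
 <code class="code">DemoCosts</code> and the function
 <code class="code">estimated_gain</code> are all assumptions made for this
 example; the numbers are placeholders, not measurements.
</p>

```cpp
#include <cstddef>
#include <vector>

// One recorded vector insert event (hypothetical record layout).
struct InsertEvent {
    std::size_t shifted;      // elements moved to make room
    bool reallocated;         // whether the insert triggered a reallocation
    std::size_t size_before;  // elements copied if it reallocated
};

// Assumed unit costs. These are placeholders; such values would have to
// be calibrated, for example by synthetic measurements on the target.
struct DemoCosts {
    double per_shift = 1.0;
    double per_realloc_copy = 1.0;
    double list_insert = 4.0;  // node allocation and relinking
};

// Estimated gain of a list over a vector for the recorded inserts.
// A positive result means the list is expected to be cheaper.
inline double estimated_gain(const std::vector<InsertEvent>& events,
                             const DemoCosts& c = DemoCosts{}) {
    double vector_cost = 0.0;
    double list_cost = 0.0;
    for (const InsertEvent& e : events) {
        // (1) cost of the elements that get shifted
        vector_cost += c.per_shift * static_cast<double>(e.shifted);
        // (2) extra cost when the insert triggers a reallocation
        if (e.reallocated) {
            vector_cost += c.per_realloc_copy * static_cast<double>(e.size_before);
        }
        list_cost += c.list_insert;
    }
    return vector_cost - list_cost;
}
```

<p>
 A report generator built along these lines would print the advice only when
 the estimated gain for a call context is positive.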
   </p><p>
 Second, we report performance characteristics for which we do not have
 a clear solution for improvement.  For instance, we can point the user to
 the top 10 <code class="code">multimap</code> locations
 which have the worst data locality in actual traversals.
 Although this does not offer a solution,
 it helps the user focus on the key problems and ignore the uninteresting ones.
   </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.design.testing"></a>Testing</h3></div></div></div><p>
   First, we want to make sure we preserve the behavior of the release mode.
   You can just type <code class="code">make check-profile</code>, which
   builds and runs the whole test suite in profile mode.
   </p><p>
   Second, we want to test the correctness of each diagnostic.
   We created a <code class="code">profile</code> directory in the test suite.
   Each diagnostic must come with at least two tests, one for false positives
   and one for false negatives.
   </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="profile_mode.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="profile_mode.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="profile_mode_api.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 19. Profile Mode </td><td width="20%" align="center"><a accesskey="h" href="../index.html">Home</a></td><td width="40%" align="right" valign="top"> Extensions for Custom Containers</td></tr></table></div></body></html>