| <?xml version="1.0" encoding="UTF-8" standalone="no"?> |
| <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Chapter 19. Profile Mode</title><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><meta name="keywords" content="C++, library, profile" /><meta name="keywords" content="ISO C++, library" /><meta name="keywords" content="ISO C++, runtime, library" /><link rel="home" href="../index.html" title="The GNU C++ Library" /><link rel="up" href="extensions.html" title="Part III. Extensions" /><link rel="prev" href="parallel_mode_test.html" title="Testing" /><link rel="next" href="profile_mode_design.html" title="Design" /></head><body><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter 19. Profile Mode</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="parallel_mode_test.html">Prev</a> </td><th width="60%" align="center">Part III. |
| Extensions |
| |
| </th><td width="20%" align="right"> <a accesskey="n" href="profile_mode_design.html">Next</a></td></tr></table><hr /></div><div class="chapter"><div class="titlepage"><div><div><h2 class="title"><a id="manual.ext.profile_mode"></a>Chapter 19. Profile Mode</h2></div></div></div><div class="toc"><p><strong>Table of Contents</strong></p><dl class="toc"><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.intro">Intro</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.using">Using the Profile Mode</a></span></dt><dt><span class="section"><a href="profile_mode.html#manual.ext.profile_mode.tuning">Tuning the Profile Mode</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_design.html">Design</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.wrapper">Wrapper Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.instrumentation">Instrumentation</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.rtlib">Run Time Behavior</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.analysis">Analysis and Diagnostics</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.cost-model">Cost Model</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.reports">Reports</a></span></dt><dt><span class="section"><a href="profile_mode_design.html#manual.ext.profile_mode.design.testing">Testing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_api.html">Extensions for Custom Containers</a></span></dt><dt><span class="section"><a href="profile_mode_cost_model.html">Empirical Cost Model</a></span></dt><dt><span class="section"><a 
href="profile_mode_impl.html">Implementation Issues</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stack">Stack Traces</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.symbols">Symbolization of Instruction Addresses</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.concurrency">Concurrency</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.stdlib-in-proflib">Using the Standard Library in the Instrumentation Implementation</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.malloc-hooks">Malloc Hooks</a></span></dt><dt><span class="section"><a href="profile_mode_impl.html#manual.ext.profile_mode.implementation.construction-destruction">Construction and Destruction of Global Objects</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_devel.html">Developer Information</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.bigpic">Big Picture</a></span></dt><dt><span class="section"><a href="profile_mode_devel.html#manual.ext.profile_mode.developer.howto">How To Add A Diagnostic</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html">Diagnostics</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.template">Diagnostic Template</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.containers">Containers</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_small">Hashtable Too Small</a></span></dt><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_too_large">Hashtable Too Large</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.inefficient_hash">Inefficient Hash</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_small">Vector Too Small</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_too_large">Vector Too Large</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_hashtable">Vector to Hashtable</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.hashtable_to_vector">Hashtable to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.vector_to_list">Vector to List</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_vector">List to Vector</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.list_to_slist">List to Forward List (Slist)</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.assoc_ord_to_unord">Ordered to Unordered Associative Container</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms">Algorithms</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.algorithms.sort">Sort Algorithm Performance</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality">Data Locality</a></span></dt><dd><dl><dt><span class="section"><a 
href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.sw_prefetch">Need Software Prefetch</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.locality.linked">Linked Structure Locality</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread">Multithreaded Data Access</a></span></dt><dd><dl><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.ddtest">Data Dependence Violations at Container Level</a></span></dt><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.mthread.false_share">False Sharing</a></span></dt></dl></dd><dt><span class="section"><a href="profile_mode_diagnostics.html#manual.ext.profile_mode.analysis.statistics">Statistics</a></span></dt></dl></dd><dt><span class="bibliography"><a href="profile_mode.html#profile_mode.biblio">Bibliography</a></span></dt></dl></div><div class="section"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a id="manual.ext.profile_mode.intro"></a>Intro</h2></div></div></div><p> |
| <span class="emphasis"><em>Goal: </em></span>Give performance improvement advice based on |
| recognition of suboptimal usage patterns of the standard library. |
| </p><p> |
| <span class="emphasis"><em>Method: </em></span>Wrap the standard library code. Insert |
| calls to an instrumentation library to record the internal state of |
| various components at interesting entry/exit points to/from the standard |
| library. Process trace, recognize suboptimal patterns, give advice. |
| For details, see the |
| <a class="link" href="https://ieeexplore.ieee.org/document/4907670/" target="_top">Perflint |
| paper presented at CGO 2009</a>. |
| </p><p> |
| <span class="emphasis"><em>Strengths: </em></span> |
| </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> |
| Unintrusive solution. The application code does not require any |
| modification. |
| </p></li><li class="listitem"><p> The advice is call-context sensitive, so it can pinpoint |
| where interesting dynamic performance behavior occurs. |
| </p></li><li class="listitem"><p> |
| The overhead model is pay-per-view. When you turn off a diagnostic class |
| at compile time, its overhead disappears. |
| </p></li></ul></div><p> |
| </p><p> |
| <span class="emphasis"><em>Drawbacks: </em></span> |
| </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> |
| You must recompile the application code with custom options. |
| </p></li><li class="listitem"><p>You must run the application on representative input. |
| The advice is input dependent. |
| </p></li><li class="listitem"><p> |
| The execution time will increase, in some cases severalfold. |
| </p></li></ul></div><p> |
| </p><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.using"></a>Using the Profile Mode</h3></div></div></div><p> |
| This is the anticipated common workflow for program <code class="code">foo.cc</code>: |
| </p><pre class="programlisting"> |
| $ cat foo.cc |
| #include <vector> |
| int main() { |
|   std::vector<int> v; |
|   for (int k = 0; k < 1024; ++k) v.insert(v.begin(), k); |
| } |
| |
| $ g++ -D_GLIBCXX_PROFILE foo.cc |
| $ ./a.out |
| $ cat libstdcxx-profile.txt |
| vector-to-list: improvement = 5: call stack = 0x804842c ... |
| : advice = change std::vector to std::list |
| vector-size: improvement = 3: call stack = 0x804842c ... |
| : advice = change initial container size from 0 to 1024 |
| </pre><p> |
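The following is a minimal sketch of how the two pieces of advice above could be applied to <code class="code">foo.cc</code>. The function names <code class="code">fill_front_list</code> and <code class="code">fill_reserved_vector</code> are hypothetical, and reading "initial container size" as a call to <code class="code">reserve()</code> is an assumption, not something the report states.</p>

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Advice "change std::vector to std::list": inserting at the front of a
// list takes constant time per element, with no shifting of existing data.
std::list<int> fill_front_list(int n) {
  std::list<int> l;
  for (int k = 0; k < n; ++k) l.push_front(k);
  return l;
}

// Advice "change initial container size from 0 to 1024" (assumed here to
// mean reserving capacity up front): the vector allocates once, so the
// insertions below never trigger a reallocation. The per-insertion shift
// of existing elements remains.
std::vector<int> fill_reserved_vector(int n) {
  std::vector<int> v;
  v.reserve(static_cast<std::size_t>(n));
  for (int k = 0; k < n; ++k) v.insert(v.begin(), k);
  return v;
}
```

<p>Both functions produce the same element sequence as the original loop. Only the first removes the quadratic shifting cost, which is consistent with the larger estimated improvement reported for vector-to-list.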
| </p><p> |
| Anatomy of a warning: |
| </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> |
| Warning id. This is a short descriptive string for the class |
| that this warning belongs to. E.g., "vector-to-list". |
| </p></li><li class="listitem"><p> |
| Estimated improvement. This is an approximation of the benefit expected |
| from implementing the change suggested by the warning. It is given on |
| a log10 scale. Negative values mean that the alternative would actually |
| do worse than the current choice. |
| In the example above, 5 comes from the fact that the overhead of |
| inserting at the beginning of a vector vs. a list is around 1024 * 1024 / 2, |
| which is around 10<sup>5</sup>. The improvement from setting the initial size to |
| 1024 is in the range of 10<sup>3</sup>, since the overhead of dynamic resizing is |
| linear in this case. |
| </p></li><li class="listitem"><p> |
| Call stack. Currently, the addresses are printed without |
| symbol name or code location attribution. |
| Users are expected to postprocess the output using, for instance, addr2line. |
| </p></li><li class="listitem"><p> |
| The warning message. For some warnings, this is static text, e.g., |
| "change std::vector to std::list". For other warnings, such as the |
| vector-size warning above, the message contains numeric advice, e.g., |
| the suggested initial size of the vector. |
| </p></li></ul></div><p> |
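The estimated improvement figures in the example can be checked with a little arithmetic. The sketch below counts element moves for the sample program and takes the floor of the base-10 logarithm. It is an illustration only: the capacity-doubling growth policy, the use of floor, and all function names are assumptions, and the cost model used by the profile mode itself may differ.</p>

```cpp
#include <cmath>
#include <cstddef>

// Element moves when n items are inserted one by one at the front of a
// vector: the k-th insertion shifts the k elements already present.
std::size_t front_insert_moves(std::size_t n) {
  std::size_t moves = 0;
  for (std::size_t k = 0; k < n; ++k) moves += k;
  return moves;  // equals n * (n - 1) / 2
}

// Element copies caused by reallocation while a vector grows from empty
// to n elements, assuming (hypothetically) that capacity doubles each time.
std::size_t doubling_realloc_copies(std::size_t n) {
  std::size_t copies = 0, capacity = 0, size = 0;
  while (size < n) {
    if (size == capacity) {
      copies += size;  // existing elements move to the new buffer
      capacity = capacity ? capacity * 2 : 1;
    }
    ++size;
  }
  return copies;
}

// Order of magnitude of a cost, as the floor of its base-10 logarithm.
// A cost of zero is reported as 0 to avoid taking log10(0).
int log10_estimate(std::size_t cost) {
  if (cost == 0) return 0;
  return static_cast<int>(std::floor(std::log10(static_cast<double>(cost))));
}
```

<p>For 1024 insertions, <code class="code">front_insert_moves</code> gives 523776 (order of magnitude 5) and <code class="code">doubling_realloc_copies</code> gives 1023 (order of magnitude 3), matching the two values in the sample report.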
| </p><p>Three files are generated. <code class="code">libstdcxx-profile.txt</code> |
| contains human readable advice. <code class="code">libstdcxx-profile.raw</code> |
| contains implementation specific data about each diagnostic. |
| Its format is not documented, but the data is sufficient to generate |
| all the advice given in <code class="code">libstdcxx-profile.txt</code>. The advantage |
| of keeping this raw format is that traces from multiple executions can |
| be aggregated simply by concatenating the raw traces. We intend to |
| offer an external utility program that can issue advice from a trace. |
| <code class="code">libstdcxx-profile.conf.out</code> lists the actual diagnostic |
| parameters used. To alter parameters, edit this file and rename it to |
| <code class="code">libstdcxx-profile.conf</code>. |
| </p><p>Advice is given regardless of whether the transformation is valid. |
| For instance, we advise changing a map to an unordered_map even if the |
| application semantics require that data be ordered. |
| We believe such warnings can help users understand the performance |
| behavior of their application better, which can lead to changes |
| at a higher abstraction level. |
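</p><p>The map example can be made concrete with a small sketch. The helper names below are hypothetical. They show the property an application may silently rely on: <code class="code">std::map</code> iterates in key order, while <code class="code">std::unordered_map</code> gives no ordering guarantee.</p>

```cpp
#include <algorithm>
#include <map>
#include <unordered_map>
#include <vector>

// Collect the keys of any map-like container in iteration order.
template <typename Map>
std::vector<int> keys_in_iteration_order(const Map& m) {
  std::vector<int> keys;
  for (const auto& kv : m) keys.push_back(kv.first);
  return keys;
}

// True when iterating the container visits keys in ascending order.
// Always true for std::map<int, T>; unspecified for std::unordered_map.
template <typename Map>
bool iterates_in_key_order(const Map& m) {
  const std::vector<int> keys = keys_in_iteration_order(m);
  return std::is_sorted(keys.begin(), keys.end());
}
```

<p>Code that depends on <code class="code">iterates_in_key_order</code> holding cannot switch containers as advised without an explicit sort, which is the kind of higher-level change the warning is meant to prompt.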
| </p></div><div class="section"><div class="titlepage"><div><div><h3 class="title"><a id="manual.ext.profile_mode.tuning"></a>Tuning the Profile Mode</h3></div></div></div><p>The profile mode is tuned through compile time switches and environment variables (see also file |
| <code class="code">profiler.h</code>). Unless specified otherwise, they can be set at compile time |
| using -D_<name> or by setting variable <name> |
| in the environment where the program is run, before starting execution. |
| </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_NO_<diagnostic></code>: |
| disable specific diagnostics. |
| See section Diagnostics for possible values. |
| (Environment variables not supported.) |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_TRACE_PATH_ROOT</code>: set an alternative root |
| path for the output files. |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_MAX_WARN_COUNT</code>: set it to the maximum |
| number of warnings desired. The default value is 10. |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_MAX_STACK_DEPTH</code>: if set to 0, |
| the advice will |
| be collected and reported for the program as a whole, and not for each |
| call context. |
| This could also be used in continuous regression tests, where you |
| just need to know whether there is a regression or not. |
| The default value is 32. |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_MEM_PER_DIAGNOSTIC</code>: |
| set a limit on how much memory to use for the accounting tables for each |
| diagnostic type. When this limit is reached, new events are ignored |
| until the memory usage decreases under the limit. Generally, this means |
| that newly created containers will not be instrumented until some |
| live containers are deleted. The default is 128 MB. |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_PROFILE_NO_THREADS</code>: |
| prevent the library from using threads. If thread local storage (TLS) is not |
| available, you will get a preprocessor error asking you to set |
| <code class="code">-D_GLIBCXX_PROFILE_NO_THREADS</code> if your program is single-threaded. |
| Multithreaded execution without TLS is not supported. |
| (Environment variable not supported.) |
| </p></li><li class="listitem"><p> |
| <code class="code">_GLIBCXX_HAVE_EXECINFO_H</code>: |
| This name should be defined automatically at library configuration time. |
| If your library was configured without <code class="code">execinfo.h</code>, but |
| you have it in your include path, you can define it explicitly. Without |
| it, advice is collected for the program as a whole, and not for each |
| call context. |
| (Environment variable not supported.) |
| </p></li></ul></div><p> |
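As an illustration of how a numeric setting with a default can be overridden from the environment, consider the sketch below. The helper <code class="code">setting_from_env</code> is hypothetical and is not part of the library; the actual lookup code in the profile mode headers may behave differently, for example in how it treats malformed values.</p>

```cpp
#include <cctype>
#include <cstdlib>

// Hypothetical helper (not a libstdc++ function): read a non-negative
// integer setting from the environment. The fallback is returned when the
// variable is unset, empty, or not made up entirely of decimal digits.
unsigned long setting_from_env(const char* name, unsigned long fallback) {
  const char* text = std::getenv(name);
  if (text == nullptr || *text == '\0') return fallback;
  for (const char* p = text; *p != '\0'; ++p) {
    if (!std::isdigit(static_cast<unsigned char>(*p))) return fallback;
  }
  return std::strtoul(text, nullptr, 10);
}
```

<p>With the defaults listed above, a call such as <code class="code">setting_from_env("_GLIBCXX_PROFILE_MAX_WARN_COUNT", 10)</code> would yield 10 unless the variable is set in the environment before the program starts.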
| </p></div></div><div class="bibliography"><div class="titlepage"><div><div><h2 class="title"><a id="profile_mode.biblio"></a>Bibliography</h2></div></div></div><div class="biblioentry"><a id="id-1.3.5.6.9.2"></a><p><span class="citetitle"><em class="citetitle"> |
| Perflint: A Context Sensitive Performance Advisor for C++ Programs |
| </em>. </span><span class="author"><span class="firstname">Lixia</span> <span class="surname">Liu</span>. </span><span class="author"><span class="firstname">Silvius</span> <span class="surname">Rus</span>. </span><span class="copyright">Copyright © 2009 . </span><span class="publisher"><span class="publishername"> |
| Proceedings of the 2009 International Symposium on Code Generation |
| and Optimization |
| . </span></span></p></div></div></div></body></html> |