ipa-cp: Select saner profile count to base heuristics on
When profile feedback is available, IPA-CP takes the count of the
hottest node and then evaluates all call contexts relative to it.
This means that typically almost no clones for specialized contexts
are ever created because the maximum is some special function, called
from everywhere (that is likely to get inlined anyway) and all the
examined edges look cold compared to it.
This patch changes the selection. It simply sorts counts of all edges
eligible for cloning in a vector and then picks the count at the 90th
percentile (the actual number is configurable via a parameter).
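To illustrate the selection described above, here is a minimal
standalone sketch. It is not the code in ipa-cp.c: the helper name
pick_base_count, the use of plain 64-bit integers instead of GCC's
profile count type, the ascending sort order and the rounding of the
index are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the GCC implementation: return the count found
// at the given percentile of all eligible edge counts.
std::uint64_t
pick_base_count (std::vector<std::uint64_t> counts, unsigned percentile)
{
  assert (!counts.empty ());
  assert (percentile <= 100);

  // Sort ascending so that higher percentiles mean hotter edges.
  std::sort (counts.begin (), counts.end ());

  // Zero-based index of the chosen element, rounded down (an assumption).
  std::size_t pos = (counts.size () - 1) * percentile / 100;
  return counts[pos];
}
```

With ten edge counts of which one is extreme, for example 1 to 9 plus
1000000, the maximum would make every other edge look cold, whereas
the count picked at the 90th percentile stays close to the bulk of
the edges.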
I also tried more complex approaches which were summing the counts and
picking the edge which together with all hotter edges accounted for a
given portion of the total sum of all edge counts. But first, it was
not clear to me that they make more logical sense than the simple
method, and second, in practice I always also had to ignore a few
percent of the hottest edges with really extreme counts (looking at
bash and python). And when I had to do that anyway, it seemed simpler
to just "ignore" more and take the first non-ignored count as the
base.
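For comparison, the more complex approach mentioned above could look
roughly like the following sketch. It is a reconstruction from the
description only, not code that was ever part of the patch: the name
pick_base_count_by_mass and both percentage parameters are
hypothetical, and counts are assumed small enough that multiplying
their sum by 100 does not overflow 64 bits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: skip the hottest skip_percent of edges, then return
// the count of the first edge which, together with all hotter non-skipped
// edges, accounts for at least mass_percent of the remaining total.
std::uint64_t
pick_base_count_by_mass (std::vector<std::uint64_t> counts,
                         unsigned skip_percent, unsigned mass_percent)
{
  assert (!counts.empty ());
  assert (skip_percent < 100 && mass_percent <= 100);

  // Hottest edges first.
  std::sort (counts.begin (), counts.end (), std::greater<std::uint64_t> ());

  // Index of the first edge that is not ignored as an extreme outlier.
  std::size_t first = counts.size () * skip_percent / 100;
  std::uint64_t total
    = std::accumulate (counts.begin () + static_cast<std::ptrdiff_t> (first),
                       counts.end (), std::uint64_t{0});

  std::uint64_t running = 0;
  for (std::size_t i = first; i < counts.size (); ++i)
    {
      running += counts[i];
      // Sketch only: assumes running * 100 fits in 64 bits.
      if (running * 100 >= total * mass_percent)
        return counts[i];
    }
  return counts.back ();
}
```

Without skipping anything, a single extreme edge dominates the sum
and is returned itself, which is why some of the hottest edges had to
be ignored in this scheme as well.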
Nevertheless, if people think some more sophisticated method should be
used anyway, I am willing to be persuaded. But this patch is a clear
improvement over the current situation.
2021-10-26 Martin Jambor <email@example.com>
* params.opt (param_ipa_cp_profile_count_base): New parameter.
* doc/invoke.texi (Optimize Options): Add entry for the new parameter.
* ipa-cp.c (max_count): Replace with base_count, replace all
occurrences too, unless otherwise stated.
(ipcp_cloning_candidate_p): Identify mostly-directly called
functions based on their counts, not max_count.
(compare_edge_profile_counts): New function.
(ipcp_propagate_stage): Instead of setting max_count, find the
appropriate edge count in a sorted vector of counts of eligible
edges and make it the base_count.
3 files changed