These tools currently focus on Android. They somewhat work with Linux builds, but not as well. As for Windows, some great tools already exist and are documented here:
There is also a dedicated mailing-list for binary size discussions:
Bugs are tracked here:
[TOC]
Determine the cause of binary size bloat between two commits. Works for Android and Linux (although Linux symbol diffs have issues, as noted below).
- Builds multiple revisions using release GN args.
  - Default is to build just two revisions (before & after commit)
  - Rather than building, can fetch build artifacts and .size files from perf bots (--cloud)
- Measures all outputs using resource_sizes.py and supersize.
- Saves & displays a breakdown of the difference in binary sizes.
# Build and diff HEAD^ and HEAD.
tools/binary_size/diagnose_bloat.py HEAD -v
# Diff BEFORE_REV and AFTER_REV using build artifacts downloaded from perf bots.
tools/binary_size/diagnose_bloat.py AFTER_REV --reference-rev BEFORE_REV --cloud -v
# Fetch a .size, libmonochrome.so, and MonochromePublic.apk from perf bots (Googlers only):
tools/binary_size/diagnose_bloat.py AFTER_REV --cloud --unstripped --single
# Build and diff all contiguous revs in range BEFORE_REV..AFTER_REV for src/v8.
tools/binary_size/diagnose_bloat.py AFTER_REV --reference-rev BEFORE_REV --subrepo v8 --all -v
# Display detailed usage info (there are many options).
tools/binary_size/diagnose_bloat.py -h
Collect, archive, and analyze Chrome's binary size. Supports Android and Linux (although Linux has issues).
.size files are archived on perf builders so that regressions can be quickly analyzed (via diagnose_bloat.py --cloud).
.size files are archived on official builders so that symbols can be diff'ed between milestones.
.size files are gzipped plain text files that contain:
- A list of .so section sizes, as reported by readelf -S,
- Metadata (GN args, filenames, timestamps, git revision, build id),
- A list of symbols, including name, address, size, padding (caused by alignment), and associated .o / .cc files.
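Because .size files are gzipped plain text, they can be inspected with standard tooling. The snippet below is a minimal sketch, not part of SuperSize: the helper name peek_gzipped_text is hypothetical, and the demo file it writes is made-up content that does not reflect the real .size layout.

```python
import gzip
import itertools
import os
import tempfile


def peek_gzipped_text(path, num_lines=5):
    """Returns the first num_lines lines of a gzipped text file."""
    with gzip.open(path, 'rt', encoding='utf-8', errors='replace') as f:
        return [line.rstrip('\n') for line in itertools.islice(f, num_lines)]


# Demo on a stand-in file, so the sketch runs without a real .size file.
# The contents below are made up and do not reflect the real .size layout.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.size')
with gzip.open(demo_path, 'wt', encoding='utf-8') as f:
    f.write('first line\nsecond line\nthird line\n')

first_lines = peek_gzipped_text(demo_path, num_lines=2)
print(first_lines)
```

To look at a real archive, pass its path instead, for example peek_gzipped_text('chrome.size').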
- Symbol list is extracted from linker .map file.
  - Map files contain some unique pieces of information compared to nm output, such as ** merge strings entries, and some unnamed symbols (which although unnamed, contain the .o path).
- .o files are mapped to .cc files by parsing .ninja files.
  - This means that .h files are never listed as sources. No information about inlined symbols is gathered.
- Symbol aliases (when multiple symbols share an address) are collected from debug information via nm elf-file.
  - Aliases are created by identical code folding (linker optimization).
  - Aliases have the same address and size, but report their .pss as .size / .num_aliases.
- ** merge strings symbols are further broken down into individual string literal symbols. This is done by reading string literals from .o files, and then searching for them within the ** merge strings sections.
- "Shared symbols" are those that are owned by multiple .o files. These include inline functions defined in .h files, and string literals that are de-duped at link-time. Shared symbols are normally represented using one symbol alias per path, but are sometimes collapsed into a single symbol where the path is set to {shared}/$SYMBOL_COUNT. This collapsing is done only for symbols owned by a large number of paths.
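The alias accounting described above (.pss reported as .size / .num_aliases) can be illustrated with a small sketch. The Symbol class and assign_alias_counts helper here are hypothetical stand-ins, not SuperSize's actual data model; the only assumption taken from the text is that aliases share an address and size.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Symbol:
    # Hypothetical stand-in for a symbol record; not SuperSize's real class.
    name: str
    address: int
    size: int
    num_aliases: int = 1

    @property
    def pss(self) -> float:
        # Each alias reports an equal share of the bytes at its address.
        return self.size / self.num_aliases


def assign_alias_counts(symbols):
    """Sets num_aliases on every symbol based on how many share its address."""
    by_address = defaultdict(list)
    for sym in symbols:
        by_address[sym.address].append(sym)
    for group in by_address.values():
        for sym in group:
            sym.num_aliases = len(group)
    return symbols


symbols = assign_alias_counts([
    Symbol('Foo::Get()', 0x1000, 24),
    Symbol('Bar::Get()', 0x1000, 24),  # Folded into the same code as Foo::Get().
    Symbol('Baz::Run()', 0x2000, 40),
])
total_pss = sum(s.pss for s in symbols)
print(total_pss)
```

Summing .pss rather than .size keeps folded code from being counted once per alias: the two aliases above contribute 24 bytes in total, not 48.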
- Path normalization:
  - Prefixes are removed: out/Release/, gen/, obj/
  - Archive names made more pathy: foo/bar.a(baz.o) -> foo/bar.a/baz.o
  - Shared symbols do not store the complete source paths. Instead, the common ancestor is computed and stored as the path.
    - Example: base/{shared}/3 (the "3" means three different files contain the symbol)
- Name normalization:
  - (anonymous::) is removed from names (and stored as a symbol flag).
  - [clone] suffix removed (and stored as a symbol flag).
  - vtable for FOO -> Foo [vtable]
  - Mangling done by linkers is undone (e.g. prefixing with "unlikely.")
  - Names are processed into:
    - name: Name without template and argument parameters.
    - template_name: Name without argument parameters.
    - full_name: Name with all parameters.
- Clustering
  - Compiler & linker optimizations can cause symbols to be broken into multiple parts to become candidates for inlining ("partial inlining").
  - These symbols are sometimes suffixed with "[clone]" (removed by normalization).
  - Clustering creates groups containing all pieces of a symbol (in the case where multiple pieces remain after inlining).
  - Clustering is done by default on SizeInfo.symbols. To view unclustered symbols, use SizeInfo.raw_symbols.
- Diffing
  - Some heuristics for matching up before/after symbols.
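Several of the path and name normalization rules above can be sketched in a few lines. This is an illustration under stated assumptions, not SuperSize's implementation: the function names are hypothetical, the [clone] pattern is assumed to sit at the end of the name, and the (anonymous::) and template/argument splitting rules are left out.

```python
import posixpath
import re

# Prefixes listed in the text above.
_PATH_PREFIXES = ('out/Release/', 'gen/', 'obj/')
_ARCHIVE_RE = re.compile(r'^(.*\.a)\((.*)\)$')
# Assumption: the clone marker is a trailing bracketed suffix.
_CLONE_RE = re.compile(r'\s*\[clone[^\]]*\]$')


def normalize_path(path):
    """Strips build-output prefixes and rewrites foo.a(bar.o) as foo.a/bar.o."""
    changed = True
    while changed:
        changed = False
        for prefix in _PATH_PREFIXES:
            if path.startswith(prefix):
                path = path[len(prefix):]
                changed = True
    match = _ARCHIVE_RE.match(path)
    if match:
        path = match.group(1) + '/' + match.group(2)
    return path


def shared_path(paths):
    """Collapses owning paths into <common ancestor>/{shared}/<count>."""
    ancestor = posixpath.commonpath([posixpath.dirname(p) for p in paths])
    return posixpath.join(ancestor, '{shared}', str(len(paths)))


def normalize_name(name):
    """Returns (normalized_name, flags) for a demangled symbol name."""
    flags = set()
    if name.startswith('unlikely.'):
        # Undo the linker-added prefix mentioned in the text.
        name = name[len('unlikely.'):]
    if _CLONE_RE.search(name):
        name = _CLONE_RE.sub('', name)
        flags.add('clone')
    if name.startswith('vtable for '):
        name = name[len('vtable for '):] + ' [vtable]'
    return name, flags


print(normalize_path('out/Release/obj/foo/bar.a(baz.o)'))
print(shared_path(['base/files/a.o', 'base/strings/b.o', 'base/c.o']))
print(normalize_name('Foo::Bar() [clone]'))
```

Storing the removed decorations as flags, rather than discarding them, keeps the information available for queries while letting otherwise identical names compare equal.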
No. Most of the logic would work for any ELF executable. However, being a generic tool is not a goal. Some examples of existing Chrome-specific logic:
- Assumes .ninja build rules are available.
- Heuristic for locating .so given .apk.
- Roadmap includes .pak file analysis.
Collect size information and dump it into a .size file.
*** note Note: Refer to diagnose_bloat.py for list of GN args to build a Release binary (or just use the tool with --single).
Googlers: If you just want a .size for a commit on master:
GIT_REV="HEAD~200"
tools/binary_size/diagnose_bloat.py --single --cloud --unstripped $GIT_REV
Example Usage:
# Android:
ninja -C out/Release -j 1000 apks/ChromePublic.apk
tools/binary_size/supersize archive chrome.size --apk-file out/Release/apks/ChromePublic.apk -v
# Linux:
ninja -C out/Release -j 1000 chrome
tools/binary_size/supersize archive chrome.size --elf-file out/Release/chrome -v
Creates an interactive size breakdown (by source path) as a stand-alone html report.
Example output: https://agrieve.github.io/chrome/
Example Usage:
tools/binary_size/supersize html_report chrome.size --report-dir size-report -v
xdg-open size-report/index.html
A convenience command equivalent to: console before.size after.size --query='Print(Diff(size_info1, size_info2))'
Example Usage:
tools/binary_size/supersize diff before.size after.size --all
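Conceptually, a diff pairs up symbols from the two snapshots and reports the size change for each. The sketch below assumes symbols are matched purely on (section, full name), which is a simplification: the real tool applies additional matching heuristics that are not described here. The diff_sizes function and the tuple layout are hypothetical.

```python
from collections import Counter


def diff_sizes(before, after):
    """Returns [((section, name), delta)] for symbols whose size changed.

    before/after are iterables of (section, full_name, size) tuples.
    Results are ordered by largest absolute change first.
    """
    totals = Counter()
    for section, name, size in after:
        totals[(section, name)] += size
    for section, name, size in before:
        totals[(section, name)] -= size
    deltas = {key: delta for key, delta in totals.items() if delta != 0}
    return sorted(deltas.items(), key=lambda kv: (-abs(kv[1]), kv[0]))


before = [('t', 'Foo()', 100), ('t', 'Bar()', 40), ('r', 'kTable', 64)]
after = [('t', 'Foo()', 120), ('t', 'Baz()', 16), ('r', 'kTable', 64)]

changes = diff_sizes(before, after)
net_change = sum(delta for _, delta in changes)
for (section, name), delta in changes:
    print(f'{delta:+d} {section} {name}')
print(f'net: {net_change:+d}')
```

Exact-name matching treats a renamed symbol as one removal plus one addition, which is one reason a production diff needs extra heuristics.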
Starts a Python interpreter where you can run custom queries, or run pre-made queries from canned_queries.py.
Example Usage:
# Prints size information and exits (does not enter interactive mode).
tools/binary_size/supersize console chrome.size --query='Print(size_info)'
# Enters a Python REPL (it will print more guidance).
tools/binary_size/supersize console chrome.size
Example session:
>>> ShowExamples() # Get some inspiration.
...
>>> sorted = size_info.symbols.WhereInSection('t').Sorted()
>>> Print(sorted) # Have a look at the largest symbols.
...
>>> sym = sorted.WhereNameMatches('TrellisQuantizeBlock')[0]
>>> Disassemble(sym) # Time to learn assembly.
...
>>> help(canned_queries)
...
>>> Print(canned_queries.TemplatesByName(depth=-1))
...
>>> syms = size_info.symbols.WherePathMatches(r'skia').Sorted()
>>> Print(syms, verbose=True) # Show full symbol names with parameter types.
...
>>> # Dump all string literals from skia files to "strings.txt".
>>> Print((t[1] for t in ReadStringLiterals(syms)), to_file='strings.txt')
- Better Linux support (clang+lld+lto vs gcc+gold).
- More archive features:
  - Find out more about 0xffffffffffffffff addresses, and why such large gaps exist after them. (crbug/709050)
  - Collect .pak file information (using .o.whitelist files)
  - Collect java symbol information
  - Collect .apk entry information
- More console features:
  - Add SplitByName() - Like GroupByName(), but recursive.
  - A canned query, that does what ShowGlobals does (as described in Windows Binary Sizes).
- More html_report features:
  - Able to render size diffs (tint negative size red).
  - Break down by other groupings (Create from result of SplitByName())
  - Render as simple tree view rather than 2d boxes
- Integrate with resource_sizes.py so that it tracks size of major components separately: chrome vs blink vs skia vs v8.
- Add dependency graph info, perhaps just on a per-file basis.
  - No idea how to do this, but Windows can do it via tools\win\linker_verbose_tracking.py