These tools currently focus on supporting Android. They somewhat work with Linux builds. As for Windows, some great tools already exist and are documented here:
There is also a dedicated mailing-list for binary size discussions:
Bugs are tracked here:
[TOC]
Determine the cause of binary size bloat between two commits. Works for Android and Linux (although Linux symbol diffs have issues, as noted below).
- Builds multiple revisions using release GN args.
  - Default is to build just two revisions (before & after commit).
  - Rather than building, can fetch build artifacts and `.size` files from perf bots (`--cloud`).
- Measures all outputs using `resource_sizes.py` and `supersize`.
- Saves & displays a breakdown of the difference in binary sizes.
# Build and diff monochrome_public_apk HEAD^ and HEAD.
tools/binary_size/diagnose_bloat.py HEAD -v
# Build and diff monochrome_apk HEAD^ and HEAD.
tools/binary_size/diagnose_bloat.py HEAD --enable-chrome-android-internal -v
# Build and diff monochrome_public_apk HEAD^ and HEAD without is_official_build.
tools/binary_size/diagnose_bloat.py HEAD --gn-args="is_official_build=false" -v
# Diff BEFORE_REV and AFTER_REV using build artifacts downloaded from perf bots.
tools/binary_size/diagnose_bloat.py AFTER_REV --reference-rev BEFORE_REV --cloud -v
# Fetch a .size, libmonochrome.so, and MonochromePublic.apk from perf bots (Googlers only):
tools/binary_size/diagnose_bloat.py AFTER_REV --cloud --unstripped --single
# Build and diff all contiguous revs in range BEFORE_REV..AFTER_REV for src/v8.
tools/binary_size/diagnose_bloat.py AFTER_REV --reference-rev BEFORE_REV --subrepo v8 --all -v
# Display detailed usage info (there are many options).
tools/binary_size/diagnose_bloat.py -h
Collect, archive, and analyze Chrome's binary size. Supports Android and Linux (although Linux has issues).
`.size` files are archived on perf builders so that regressions can be quickly analyzed (via `diagnose_bloat.py --cloud`).

`.size` files are archived on official builders so that symbols can be diff'ed between milestones.

`.size` files are gzipped plain text files that contain:
- A list of section sizes, including:
  - `.so` sections as reported by `readelf -S`
  - `.pak` and `.dex` sections for apk files
- Metadata (apk size, GN args, filenames, timestamps, git revision, build id).
- A list of symbols, including name, address, size, padding (caused by alignment), and associated source/object files.
- Symbol list is extracted from the linker `.map` file.
  - Map files contain some unique pieces of information compared to `nm` output, such as `** merge strings` entries, and some unnamed symbols (which, although unnamed, contain the `.o` path).
- `.o` files are mapped to `.cc` files by parsing `.ninja` files.
  - This means that `.h` files are never listed as sources. No information about inlined symbols is gathered.
- `** merge strings` symbols are further broken down into individual string literal symbols. This is done by reading string literals from `.o` files, and then searching for them within the `** merge strings` sections.
- Symbol aliases:
  - Aliases have the same address and size, but report their `.pss` as `.size / .num_aliases`.
  - Type 1: Different names. Caused by identical code folding.
    - These are collected from debug information via `nm elf-file`.
  - Type 2: Same names, different paths. Caused by inline functions defined in `.h` files.
    - These are collected by running `nm` on each `.o` file.
    - Normally represented using one alias per path, but are sometimes collapsed into a single symbol with a path of `{shared}/$SYMBOL_COUNT`. This collapsing is done only for symbols owned by a large number of paths.
  - Type 3: String literals that are de-duped at link-time.
    - These are found as part of the string literal extraction process.
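The alias accounting described above can be illustrated with a minimal sketch. The `Symbol` class and `assign_aliases` helper are hypothetical names made up for this example; they are not SuperSize's actual data model, and only the `size / num_aliases` rule comes from the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical, simplified symbol record (not SuperSize's real class)."""
    name: str
    address: int
    size: int
    num_aliases: int = 1

    @property
    def pss(self) -> float:
        # Each alias reports an equal share of the bytes they all occupy.
        return self.size / self.num_aliases


def assign_aliases(symbols):
    """Marks symbols that share an address as aliases of one another."""
    by_address = defaultdict(list)
    for sym in symbols:
        by_address[sym.address].append(sym)
    for group in by_address.values():
        for sym in group:
            sym.num_aliases = len(group)
    return symbols


symbols = assign_aliases([
    Symbol('Foo::Get()', 0x1000, 24),
    Symbol('Bar::Get()', 0x1000, 24),  # Same address: folded with Foo::Get().
    Symbol('Baz::Run()', 0x2000, 40),
])
total_pss = sum(s.pss for s in symbols)
```

Summing `pss` rather than `size` avoids counting the folded bytes twice, which is why aliases report a proportional share.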
- Grit creates a mapping between numeric id and textual id for grd files.
  - A side effect of pak whitelist generation is a mapping of `.cc` to numeric id.
  - A complete per-apk mapping of numeric id to textual id is stored in the `output_dir/size-info` dir.
- `supersize` uses these two mappings to find associated source files for the pak entries found in all of the apk's `.pak` files.
  - Pak entries with the same name are merged into a single symbol.
    - This is the case of pak files for translations.
  - The original grd file paths are stored in the full name of each symbol.
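As a rough sketch of how the two pak mappings could be joined, the example below merges same-named entries and attaches source files. The dictionary shapes, ids, file names, and the `build_pak_symbols` helper are all assumptions made for illustration; the real on-disk format of the `size-info` files is not described here.

```python
from collections import defaultdict

# Assumed shapes for the two mappings described above (made-up sample data).
textual_id_by_numeric_id = {
    101: 'IDS_APP_TITLE',
    102: 'IDR_LOGO_PNG',
}
sources_by_numeric_id = {
    101: ['chrome/browser/ui/title.cc'],
    102: ['chrome/browser/ui/logo.cc', 'components/foo/logo_view.cc'],
}

# Hypothetical pak contents: (pak file, numeric id, entry size in bytes).
pak_entries = [
    ('en-US.pak', 101, 20),
    ('fr.pak', 101, 26),
    ('resources.pak', 102, 512),
]


def build_pak_symbols(entries):
    """Merges same-named pak entries and attaches their source files."""
    symbols = defaultdict(lambda: {'size': 0, 'sources': set()})
    for _pak_file, numeric_id, size in entries:
        # Fall back to the numeric id when no textual id is known.
        name = textual_id_by_numeric_id.get(numeric_id, str(numeric_id))
        symbol = symbols[name]
        symbol['size'] += size
        symbol['sources'].update(sources_by_numeric_id.get(numeric_id, []))
    return dict(symbols)


pak_symbols = build_pak_symbols(pak_entries)
```

Here the two translated copies of `IDS_APP_TITLE` collapse into one symbol, mirroring the merging of translation pak entries.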
- Java compile targets create a mapping between java fully qualified names (FQN) and source files.
  - For `.java` files the FQN of the public class is mapped to the file.
  - For `.srcjar` files the FQN of the public class is mapped to the `.srcjar` file path.
  - A complete per-apk class FQN to source mapping is stored in the `output_dir/size-info` dir.
- The `apkanalyzer` sdk tool is used to find the size and FQN of entries in the dex file.
  - If a proguard `.mapping` file is available, that is used to get back the original FQN.
- The output from `apkanalyzer` is used by `supersize` along with the mapping file to find associated source files for the dex entries found in all of the apk's `.dex` files.
- Shared bytes are stored in symbols with names starting with `Overhead:`.
  - Elf file, dex file, pak files, apk files all have compression overhead.
  - These are treated as padding-only symbols to de-emphasize them in diffs.
  - It is expected that these symbols have minor fluctuations since they are affected by changes in compressibility.
- All other files in an apk have one symbol each under the `.other` section with their corresponding path in the apk as their associated path.
- Path normalization:
  - Prefixes are removed: `out/Release/`, `gen/`, `obj/`
  - Archive names made more pathy: `foo/bar.a(baz.o)` -> `foo/bar.a/baz.o`
  - Shared symbols do not store the complete source paths. Instead, the common ancestor is computed and stored as the path.
    - Example: `base/{shared}/3` (the "3" means three different files contain the symbol)
- Name normalization:
  - `(anonymous::)` is removed from names (and stored as a symbol flag).
  - `[clone]` suffix removed (and stored as a symbol flag).
  - `vtable for FOO` -> `FOO [vtable]`
  - Mangling done by linkers is undone (e.g. prefixing with "unlikely.")
  - Names are processed into:
    - `name`: Name without template and argument parameters.
    - `template_name`: Name without argument parameters.
    - `full_name`: Name with all parameters.
- Clustering:
  - Compiler & linker optimizations can cause symbols to be broken into multiple parts to become candidates for inlining ("partial inlining").
  - These symbols are sometimes suffixed with "`[clone]`" (removed by normalization).
  - Clustering creates groups containing all pieces of a symbol (in the case where multiple pieces remain after inlining).
  - Clustering is done by default on `SizeInfo.symbols`. To view unclustered symbols, use `SizeInfo.raw_symbols`.
- Diffing:
  - Some heuristics for matching up before/after symbols.
- Simulated compression:
  - Only some `.pak` files are compressed and others are kept uncompressed.
  - To get a reasonable idea of actual impact to final apk size, we use a constant compression factor for all the compressed `.pak` files.
    - This prevents swings in compressed sizes for all symbols when new entries are added or old entries are removed.
    - The constant is chosen so that it minimizes overall discrepancy with actual total compressed sizes.
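The path normalization, name normalization, and simulated compression steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not SuperSize's implementation: all function names are hypothetical, the anonymous-namespace and `[clone ...]` spellings are assumed to be those emitted by common demanglers, and the compression factor is computed as a simple ratio of totals, which is only one way to keep the overall discrepancy small.

```python
import posixpath
import re

_PREFIXES = ('out/Release/', 'gen/', 'obj/')
_ARCHIVE_RE = re.compile(r'^(.*\.a)\((.*)\)$')


def normalize_path(path):
    """Strips build-directory prefixes and makes archive members path-like."""
    changed = True
    while changed:
        changed = False
        for prefix in _PREFIXES:
            if path.startswith(prefix):
                path = path[len(prefix):]
                changed = True
    match = _ARCHIVE_RE.match(path)
    if match:
        path = '%s/%s' % match.groups()
    return path


def shared_path(paths):
    """Collapses source paths into '<common ancestor>/{shared}/<count>'."""
    ancestor = posixpath.commonpath(paths)
    parts = (ancestor, '{shared}', str(len(paths)))
    return '/'.join(p for p in parts if p)


def normalize_name(name):
    """Returns (name, flags) after applying the renames listed above."""
    flags = set()
    # Assumption: the demangler spells anonymous namespaces this way.
    if '(anonymous namespace)::' in name:
        name = name.replace('(anonymous namespace)::', '')
        flags.add('anonymous')
    # Assumption: clone suffixes look like " [clone .part.0]".
    clone_match = re.search(r' \[clone[^\]]*\]$', name)
    if clone_match:
        name = name[:clone_match.start()]
        flags.add('clone')
    if name.startswith('vtable for '):
        name = name[len('vtable for '):] + ' [vtable]'
    return name, flags


def compression_factor(pak_sizes):
    """pak_sizes: list of (uncompressed_bytes, compressed_bytes) per pak file."""
    total_raw = sum(raw for raw, _ in pak_sizes)
    total_compressed = sum(comp for _, comp in pak_sizes)
    return total_compressed / total_raw
```

Multiplying each pak entry's uncompressed size by the single factor keeps per-symbol sizes stable when unrelated entries are added or removed, since no entry's size depends on how its neighbours happened to compress.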
`supersize` is not a generic tool. Most of the logic would work for any ELF executable, but being a generic tool is not a goal. Some examples of existing Chrome-specific logic:
- Assumes `.ninja` build rules are available.
- Heuristic for locating `.so` given `.apk`.
- Requires `size-info` dir in output directory to analyze `.pak` and `.dex` files.
Collect size information and dump it into a `.size` file.

*** note
**Note:** Refer to `diagnose_bloat.py` for the list of GN args to build a Release binary (or just use the tool with `--single`).
***

Googlers: If you just want a `.size` for a commit on master:
GIT_REV="HEAD~200"
tools/binary_size/diagnose_bloat.py --single --cloud --unstripped $GIT_REV
Example Usage:
# Android:
ninja -C out/Release -j 1000 apks/ChromePublic.apk
tools/binary_size/supersize archive chrome.size --apk-file out/Release/apks/ChromePublic.apk -v
# Linux:
ninja -C out/Release -j 1000 chrome
tools/binary_size/supersize archive chrome.size --elf-file out/Release/chrome -v
Creates an interactive size breakdown (by source path) as a stand-alone html report.
Example output: https://agrieve.github.io/chrome/
Example Usage:
tools/binary_size/supersize html_report chrome.size --report-dir size-report -v
xdg-open size-report/index.html
A convenience command equivalent to: `console before.size after.size --query='Print(Diff(size_info1, size_info2))'`
Example Usage:
tools/binary_size/supersize diff before.size after.size --all
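To make the idea of a symbol diff concrete, here is a minimal sketch that matches before/after symbols by exact name and reports size deltas. It is a deliberate simplification: SuperSize's real matching relies on heuristics that are not spelled out in this document, and `diff_symbols` and the sample data are hypothetical.

```python
def diff_symbols(before, after):
    """Returns {name: size_delta} for symbols whose size changed.

    before/after: dicts mapping a symbol's full name to its size in bytes.
    Symbols present on only one side count as fully added or removed.
    """
    deltas = {}
    for name in before.keys() | after.keys():
        delta = after.get(name, 0) - before.get(name, 0)
        if delta:
            deltas[name] = delta
    return deltas


before = {'Foo::Run()': 120, 'Bar::Init()': 64, 'kTable': 256}
after = {'Foo::Run()': 152, 'kTable': 256, 'Baz::Stop()': 40}

deltas = diff_symbols(before, after)
# Print the largest growth first.
for name, delta in sorted(deltas.items(), key=lambda item: -item[1]):
    print('%+6d %s' % (delta, name))
net_change = sum(deltas.values())
```

Unchanged symbols such as `kTable` drop out of the result, so the output stays focused on what grew, shrank, appeared, or disappeared.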
Starts a Python interpreter where you can run custom queries, or run pre-made queries from `canned_queries.py`.
Example Usage:
# Prints size information and exits (does not enter interactive mode).
tools/binary_size/supersize console chrome.size --query='Print(size_info)'
# Enters a Python REPL (it will print more guidance).
tools/binary_size/supersize console chrome.size
Example session:
>>> ShowExamples() # Get some inspiration.
...
>>> sorted = size_info.symbols.WhereInSection('t').Sorted()
>>> Print(sorted) # Have a look at the largest symbols.
...
>>> sym = sorted.WhereNameMatches('TrellisQuantizeBlock')[0]
>>> Disassemble(sym) # Time to learn assembly.
...
>>> help(canned_queries)
...
>>> Print(canned_queries.TemplatesByName(depth=-1))
...
>>> syms = size_info.symbols.WherePathMatches(r'skia').Sorted()
>>> Print(syms, verbose=True) # Show full symbol names with parameter types.
...
>>> # Dump all string literals from skia files to "strings.txt".
>>> Print((t[1] for t in ReadStringLiterals(syms)), to_file='strings.txt')
- Better Linux support (clang+lld+lto vs gcc+gold).
- More `console` features:
  - Add `SplitByName()`: like `GroupByName()`, but recursive.
  - A canned query that does what ShowGlobals does (as described in Windows Binary Sizes).
- More `html_report` features:
  - Able to render size diffs (tint negative size red).
  - Break down by other groupings (create from result of `SplitByName()`).
  - Render as simple tree view rather than 2d boxes.
- Integrate with `resource_sizes.py` so that it tracks size of major components separately: chrome vs blink vs skia vs v8.
- Add dependency graph info, perhaps just on a per-file basis.
  - No idea how to do this, but Windows can do it via `tools\win\linker_verbose_tracking.py`.