These tools currently focus on supporting Android. They somewhat work with Linux builds. As for Windows, some great tools already exist and are documented here:
There is also a dedicated mailing-list for binary size discussions:
Bugs and feature requests are tracked in crbug under:
Per-Milestone Binary Size Breakdowns:
Guide to dealing with chrome-perf size alerts:
[TOC]
- Introduced October 2018 as a mandatory CQ bot.
- Documented here.
- Introduced February 2020 to surface results from android-binary-size.
- Documented here.
- //build/android/resource_sizes.py
- Able to run on an `.apk` without having the build directory available.
- Reports the size metrics captured by our perf builders. Viewable at chromeperf under `Test suite="resource_sizes ($APK)"`.
- Metrics reported by this tool are described in //docs/speed/binary_size/metrics.md.
Collects, archives, and analyzes Chrome's binary size. Supports Android and Linux (although Linux has issues).
`.size` files are gzipped plain text files that contain:
- A list of section sizes, including:
  - .so sections as reported by `readelf -S`
  - .pak and .dex sections for apk files
- Metadata (apk size, GN args, filenames, timestamps, git revision, build id),
- A list of symbols, including name, address, size, padding (caused by alignment), and associated source/object files.
- Symbol list is extracted from linker `.map` file.
  - Map files contain some unique pieces of information compared to `nm` output, such as `** merge strings` entries, and some unnamed symbols (which although unnamed, contain the `.o` path).
  - Generated in `is_official_build=true` builds if `generate_linker_map` is true. In official builds on Android `generate_linker_map` is true by default.
- `.o` files are mapped to `.cc` files by parsing `.ninja` files.
  - This means that `.h` files are never listed as sources. No information about inlined symbols is gathered.
- `** merge strings` symbols are further broken down into individual string literal symbols. This is done by reading string literals from `.o` files, and then searching for them within the `** merge strings` sections.
  - For LLD with ThinLTO, `llvm-bcanalyzer` is used to extract string literals.
- Symbol aliases:
  - Aliases have the same address and size, but report their `.pss` as `.size / .num_aliases`.
  - Type 1: Different names. Caused by identical code folding.
    - These are collected from debug information via `nm elf-file`.
  - Type 2: Same names, different paths. Caused by inline functions defined in `.h` files.
    - These are collected by running `nm` on each `.o` file.
      - For LLD with ThinLTO, `llvm-bcanalyzer` is used to process `.o` files, which are actually LLVM Bitcode files.
    - Normally represented using one alias per path, but are sometimes collapsed into a single symbol with a path of `{shared}/$SYMBOL_COUNT`. This collapsing is done only for symbols owned by a large number of paths.
  - Type 3: String literals that are de-duped at link-time.
    - These are found as part of the string literal extraction process.
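The alias accounting described above can be sketched in a few lines of Python. This is only an illustration: the `Sym` class and `mark_aliases` helper are hypothetical stand-ins, not SuperSize's actual data model, and the sample symbols are made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sym:
    """Hypothetical stand-in for a symbol record (not SuperSize's real class)."""
    name: str
    address: int
    size: int
    num_aliases: int = 1

    @property
    def pss(self) -> float:
        # Each alias reports an equal share of the bytes it shares with others.
        return self.size / self.num_aliases


def mark_aliases(symbols):
    """Groups symbols by (address, size) and stores the group size on each."""
    groups = defaultdict(list)
    for sym in symbols:
        groups[(sym.address, sym.size)].append(sym)
    for group in groups.values():
        for sym in group:
            sym.num_aliases = len(group)
    return symbols


# Made-up symbols: the first two share an address, as after code folding.
syms = mark_aliases([
    Sym('Foo::Bar()', 0x1000, 48),
    Sym('Baz::Qux()', 0x1000, 48),
    Sym('main', 0x2000, 100),
])
total_pss = sum(s.pss for s in syms)
```

Summing `.pss` rather than `.size` counts the 48 shared bytes once, which is why the total here is 148 rather than 196.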
- Grit creates a mapping between numeric id and textual id for grd files.
  - A side effect of pak whitelist generation is a mapping of `.cc` to numeric id.
  - A complete per-apk mapping of numeric id to textual id is stored in the `output_dir/size-info` dir.
- `supersize` uses these two mappings to find associated source files for the pak entries found in all of the apk's `.pak` files.
  - Pak entries with the same name are merged into a single symbol.
    - This is the case of pak files for translations.
  - The original grd file paths are stored in the full name of each symbol.
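As a rough sketch of how the two mappings could be joined, assuming simple in-memory dictionaries: the on-disk format of the `size-info` files is not shown here, and every id, file name, and the `build_pak_symbols` helper below is hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the two mappings.
id_to_textual = {101: 'IDS_APP_TITLE', 102: 'IDR_LOGO_PNG'}
cc_to_ids = {
    'chrome/browser/ui/title.cc': [101],
    'chrome/browser/ui/logo.cc': [102],
}
# (pak file, numeric id, size in bytes) for each entry found in the apk.
pak_entries = [
    ('locales/en-US.pak', 101, 20),
    ('locales/fr.pak', 101, 24),
    ('resources.pak', 102, 900),
]


def build_pak_symbols(entries, id_to_name, source_to_ids):
    """Merges same-named pak entries and attaches their source files."""
    # Invert the source -> ids mapping so sources can be looked up by id.
    id_to_sources = defaultdict(list)
    for source, ids in source_to_ids.items():
        for numeric_id in ids:
            id_to_sources[numeric_id].append(source)

    symbols = {}
    for _pak_file, numeric_id, size in entries:
        name = id_to_name[numeric_id]
        sym = symbols.setdefault(
            name, {'size': 0, 'sources': sorted(id_to_sources[numeric_id])})
        # Entries with the same textual id (e.g. translations) are summed.
        sym['size'] += size
    return symbols


pak_symbols = build_pak_symbols(pak_entries, id_to_textual, cc_to_ids)
```

In this sample, the two translated copies of `IDS_APP_TITLE` collapse into one symbol of 44 bytes.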
- Java compile targets create a mapping between java fully qualified names (FQN) and source files.
  - For `.java` files the FQN of the public class is mapped to the file.
  - For `.srcjar` files the FQN of the public class is mapped to the `.srcjar` file path.
  - A complete per-apk class FQN to source mapping is stored in the `output_dir/size-info` dir.
- The `apkanalyzer` sdk tool is used to find the size and FQN of entries in the dex file.
  - If a proguard `.mapping` file is available, that is used to get back the original FQN.
- The output from `apkanalyzer` is used by `supersize` along with the mapping file to find associated source files for the dex entries found in all of the apk's `.dex` files.
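The lookup chain above (obfuscated name, then original FQN, then source path) might look roughly like the following. The dictionaries stand in for the parsed `.mapping` file and the `size-info` data; their contents, the entry sizes, and the `attribute_dex_entries` helper are all invented for illustration.

```python
# Hypothetical stand-in for a parsed proguard .mapping file.
obfuscated_to_original = {
    'a.b': 'org.chromium.foo.Bar',
    'a.c': 'org.chromium.foo.Baz',
}
# Hypothetical stand-in for the per-apk class FQN -> source mapping.
fqn_to_source = {
    'org.chromium.foo.Bar': 'chrome/android/java/src/org/chromium/foo/Bar.java',
    'org.chromium.foo.Baz': 'gen/foo/baz.srcjar',
}
# (FQN as it appears in the dex file, size in bytes); made-up values.
dex_entries = [('a.b', 1200), ('a.c', 300), ('org.chromium.Unmapped', 50)]


def attribute_dex_entries(entries, deobfuscation, sources):
    """Returns (original FQN, size, source path) for each dex entry."""
    result = []
    for fqn, size in entries:
        # Fall back to the name as-is when it is not in the mapping file.
        original = deobfuscation.get(fqn, fqn)
        # An empty path marks entries with no known source file.
        result.append((original, size, sources.get(original, '')))
    return result


attributed = attribute_dex_entries(
    dex_entries, obfuscated_to_original, fqn_to_source)
```

Leaving unmapped entries with an empty path, rather than dropping them, keeps their bytes in the total.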
All files in an apk that are not broken down into sub-entries are tracked by a symbol within the `.other` section.
Overhead symbols track bytes that are generally unactionable. They are recorded as `size=0, padding=$size` (padding-only symbols) to de-emphasize them in diffs.
Star symbols track sections of the binary that are not padding, but which the tool is not able to break down further (e.g. "** Merge Globals").
- ** symbol gap: A gap between symbols that is larger than what could be due to alignment.
- Overhead: ELF file: `elf_file_size - sum(elf_sections)`.
  - Captures bytes taken up by ELF headers and section alignment.
- Overhead: APK file: `apk_file_size - sum(compressed_file_sizes)`
  - Captures bytes taken up by `.zip` metadata and zipalign padding.
- Overhead: ${NAME}.pak: `pak_file_size - sum(pak_entries)`
- Overhead: Pak compression artifacts: `compressed_size_of_paks - sum(pak_entries)`
  - It would be possible to correctly attribute compressed size to pak symbols, but doing so makes diffs very noisy (any change in compression ratio causes every symbol to change by a small amount). Instead, SuperSize uses a hard-coded compression ratio for compressed .pak symbols, and captures any remainder in this overhead symbol.
  - TODO(crbug/894320): Improve how compression is tracked.
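To make the arithmetic concrete, here is a minimal sketch of how such overhead symbols could be derived. The `overhead_symbol` helper, the dictionary layout, the ratio of 0.5, and all byte counts are assumptions chosen for illustration; the real ratio and data structures are not given in this document.

```python
def overhead_symbol(name, total_size, accounted_sizes):
    """Builds a padding-only symbol for bytes not covered by other symbols."""
    overhead = total_size - sum(accounted_sizes)
    # Recorded as size=0, padding=$size so it is de-emphasized in diffs.
    return {'name': name, 'size': 0, 'padding': overhead}


# ELF overhead: file size minus the sum of its section sizes (made-up numbers).
elf = overhead_symbol('Overhead: ELF file', 10_000, [6_000, 3_000, 500])

# Pak compression artifacts: each compressed pak entry is reported at a fixed
# fraction of its raw size, and the remainder lands in the overhead symbol.
HYPOTHETICAL_RATIO = 0.5  # Assumed value, not SuperSize's actual constant.
raw_entry_sizes = [1_000, 3_000]
reported_sizes = [int(size * HYPOTHETICAL_RATIO) for size in raw_entry_sizes]
paks = overhead_symbol(
    'Overhead: Pak compression artifacts', 2_100, reported_sizes)
```

Because the per-entry sizes depend only on the fixed ratio, a change in the real compression ratio moves only the single overhead symbol instead of every pak symbol.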
- Path normalization:
  - Prefixes are removed: `out/Release/`, `gen/`, `obj/`
  - Archive names made more pathy: `foo/bar.a(baz.o)` -> `foo/bar.a/baz.o`
  - Shared symbols do not store the complete source paths. Instead, the common ancestor is computed and stored as the path.
    - Example: `base/{shared}/3` (the "3" means three different files contain the symbol)
- Name normalization:
  - `(anonymous::)` is removed from names (and stored as a symbol flag).
  - `[clone]` suffix removed (and stored as a symbol flag).
  - `vtable for FOO` -> `Foo [vtable]`
  - Mangling done by linkers is undone (e.g. prefixing with "unlikely.")
  - Names are processed into:
    - `name`: Name without template and argument parameters.
    - `template_name`: Name without argument parameters.
    - `full_name`: Name with all parameters.
- Special cases:
  - LLVM function outlining creates many OUTLINED_FUNCTION_* symbols. These are renamed to '** outlined functions' or '** outlined functions * (count)', and are deduped so an address can have at most one such symbol.
- Clustering:
  - Compiler & linker optimizations can cause symbols to be broken into multiple parts to become candidates for inlining ("partial inlining").
  - These symbols are sometimes suffixed with "`[clone]`" (removed by normalization).
  - Clustering creates groups containing all pieces of a symbol (in the case where multiple pieces remain after inlining).
  - Clustering is done by default on `SizeInfo.symbols`. To view unclustered symbols, use `SizeInfo.raw_symbols`.
- Diffing:
  - Some heuristics for matching up before/after symbols.
- Simulated compression:
  - Only some `.pak` files are compressed and others are kept uncompressed.
  - To get a reasonable idea of actual impact to final apk size, we use a constant compression factor for all the compressed `.pak` files.
    - This prevents swings in compressed sizes for all symbols when new entries are added or old entries are removed.
    - The constant is chosen so that it minimizes overall discrepancy with actual total compressed sizes.
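The path normalization rules listed above can be approximated as follows. This is a sketch, not SuperSize's actual code: the helper names are hypothetical, and stripping the prefixes in one pass in the listed order is an assumption.

```python
import posixpath
import re

# Prefixes named in the path normalization rules, tried once each, in order.
_PREFIXES = ('out/Release/', 'gen/', 'obj/')


def normalize_path(path):
    """Strips build-directory prefixes and makes archive members path-like."""
    for prefix in _PREFIXES:
        if path.startswith(prefix):
            path = path[len(prefix):]
    # foo/bar.a(baz.o) -> foo/bar.a/baz.o
    return re.sub(r'^(.*\.a)\((.*)\)$', r'\1/\2', path)


def shared_path(paths):
    """Collapses several source paths into ANCESTOR/{shared}/COUNT."""
    ancestor = posixpath.commonpath(paths)
    suffix = f'{{shared}}/{len(paths)}'
    return f'{ancestor}/{suffix}' if ancestor else suffix


normalized = normalize_path('out/Release/obj/foo/bar.a(baz.o)')
shared = shared_path(['base/a/x.h', 'base/b/y.h', 'base/z.h'])
```

With these inputs, `normalized` is `foo/bar.a/baz.o` and `shared` is `base/{shared}/3`, matching the examples in the list.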
No. Some examples of why it's Chrome-specific:
- Assumes `.ninja` build rules are available.
- Heuristic for locating `.so` given `.apk`.
- Requires `size-info` dir in output directory to analyze `.pak` and `.dex` files.
Collect size information and dump it into a `.size` file.
*** note Note: Refer to diagnose_bloat.py for list of GN args to build a Release binary (or just use the tool with --single).
Example Usage:
# Android:
ninja -C out/Release -j 1000 apks/ChromePublic.apk
tools/binary_size/supersize archive chrome.size --apk-file out/Release/apks/ChromePublic.apk -v
# Linux:
ninja -C out/Release -j 1000 chrome
tools/binary_size/supersize archive chrome.size --elf-file out/Release/chrome -v
Creates an `.ndjson` (newline-delimited JSON) file that the SuperSize viewer is able to load.
Example Usage:
# Creates the data file ./report.ndjson, generated based on ./chrome.size
tools/binary_size/supersize html_report chrome.size report.ndjson -v
# Includes every symbol in the data file, although it will take longer to load.
tools/binary_size/supersize html_report chrome.size report.ndjson --all-symbols
# Create a data file showing a diff between two .size files.
tools/binary_size/supersize html_report after.size --diff-with before.size report.ndjson
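Since newline-delimited JSON is simply one JSON document per line, a generated file can be inspected with a few lines of Python. The sketch below assumes nothing about the schema that `html_report` writes; the sample records and the `read_ndjson` helper are invented purely to show the file format.

```python
import json
import os
import tempfile


def read_ndjson(path):
    """Parses a newline-delimited JSON file into a list of Python objects."""
    records = []
    with open(path, encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:  # Skip blank lines.
                records.append(json.loads(line))
    return records


# Made-up records; these keys are NOT the real report.ndjson schema.
sample = [{'example_key': 'a', 'value': 1}, {'example_key': 'b', 'value': 2}]
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.ndjson')
    with open(sample_path, 'w', encoding='utf-8') as f:
        for record in sample:
            f.write(json.dumps(record) + '\n')
    loaded = read_ndjson(sample_path)
```

Pointing `read_ndjson` at a real `report.ndjson` would be a quick way to see what the viewer consumes, though large reports may take a while to parse.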
Locally view the `.ndjson` file generated by `html_report`, by starting a web server that links to the file.
Example Usage:
# Starts a local server to view the data in ./report.ndjson
tools/binary_size/supersize start_server report.ndjson
# Set a custom address and port.
tools/binary_size/supersize start_server report.ndjson -a localhost -p 8080
A convenience command equivalent to:
console before.size after.size --query='Print(Diff(size_info1, size_info2))'
Example Usage:
tools/binary_size/supersize diff before.size after.size --all
Starts a Python interpreter where you can run custom queries, or run pre-made queries from `canned_queries.py`.
Example Usage:
# Prints size information and exits (does not enter interactive mode).
tools/binary_size/supersize console chrome.size --query='Print(size_info)'
# Enters a Python REPL (it will print more guidance).
tools/binary_size/supersize console chrome.size
Example session:
>>> ShowExamples() # Get some inspiration.
...
>>> sorted = size_info.symbols.WhereInSection('t').Sorted()
>>> Print(sorted) # Have a look at the largest symbols.
...
>>> sym = sorted.WhereNameMatches('TrellisQuantizeBlock')[0]
>>> Disassemble(sym) # Time to learn assembly.
...
>>> help(canned_queries)
...
>>> Print(canned_queries.TemplatesByName(depth=-1))
...
>>> syms = size_info.symbols.WherePathMatches(r'skia').Sorted()
>>> Print(syms, verbose=True) # Show full symbol names with parameter types.
...
>>> # Dump all string literals from skia files to "strings.txt".
>>> Print((t[1] for t in ReadStringLiterals(syms)), to_file='strings.txt')
Determines the cause of binary size bloat between two commits. Works for Android and Linux (although Linux symbol diffs have issues, as noted below).
- Builds multiple revisions using release GN args.
  - Default is to build just two revisions (before & after commit).
- Measures all outputs using `resource_sizes.py` and `supersize`.
- Saves & displays a breakdown of the difference in binary sizes.
# Build and diff monochrome_public_apk HEAD^ and HEAD.
tools/binary_size/diagnose_bloat.py HEAD -v
# Build and diff monochrome_apk HEAD^ and HEAD.
tools/binary_size/diagnose_bloat.py HEAD --enable-chrome-android-internal -v
# Build and diff monochrome_public_apk HEAD^ and HEAD without is_official_build.
tools/binary_size/diagnose_bloat.py HEAD --gn-args="is_official_build=false" -v
# Build and diff all contiguous revs in range BEFORE_REV..AFTER_REV for src/v8.
tools/binary_size/diagnose_bloat.py AFTER_REV --reference-rev BEFORE_REV --subrepo v8 --all -v
# Build and diff system_webview_apk HEAD^ and HEAD with arsc obfuscation disabled.
tools/binary_size/diagnose_bloat.py HEAD --target system_webview_apk --gn-args enable_arsc_obfuscation=false
# Display detailed usage info (there are many options).
tools/binary_size/diagnose_bloat.py -h
- https://github.com/google/bloaty
- Our usage tracked in crbug/698733