-
Notifications
You must be signed in to change notification settings - Fork 4k
/
aquery.html
452 lines (355 loc) · 15.9 KB
/
aquery.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
---
layout: documentation
title: Aquery (action graph query)
---
<h1>Action Graph Query (aquery)</h1>
<p>
The <code>aquery</code> command allows you to query for actions in your build graph.
It operates on the post-analysis Configured Target Graph and exposes
information about <b>Actions, Artifacts and their relationships.</b>
</p>
<p>
<code>aquery</code> is useful when you are interested in the properties of the Actions/Artifacts
generated from the Configured Target Graph. For example, the actual commands run
and their inputs/outputs/mnemonics.
</p>
<p>
The tool accepts several command-line <a href="#aquery-options">options</a>.
Notably, the aquery command runs on top of a regular Bazel build and inherits
the set of options available during a build.
</p>
<p>
It supports the same set of functions that is also available to traditional
<code>query</code> but <code>siblings</code>, <code>buildfiles</code> and
<code>tests</code>.
</p>
<p>An example <code>aquery</code> output (without specific details):</p>
<pre>
$ bazel aquery 'deps(//some:label)'
action 'Writing file some_file_name'
Mnemonic: ...
Target: ...
Configuration: ...
ActionKey: ...
Inputs: [...]
Outputs: [...]
</pre>
<h2 id='basic-syntax'>Basic syntax</h2>
<p>A simple example of the syntax for <code>aquery</code> is as follows:</p>
<p><code>bazel aquery "aquery_function(function(//target))"</code></p>
<p>The query expression (in quotes) consists of the following:
<ul>
<li>
<code>aquery_function(...)</code>: functions specific to <code>aquery</code>.
More details <a href="#functions">below</a>.
</li>
<li>
<code>function(...)</code>: the standard <a href="query.html#functions">functions</a>
as traditional <code>query</code>.
</li>
<li>
<code>//target</code> is the label to the interested target.
</li>
</ul>
<pre>
# aquery examples:
# Get the action graph generated while building //src/target_a
$ bazel aquery '//src/target_a'
# Get the action graph generated while building all dependencies of //src/target_a
$ bazel aquery 'deps(//src/target_a)'
# Get the action graph generated while building all dependencies of //src/target_a
# whose inputs filenames match the regex ".*cpp".
$ bazel aquery 'inputs(".*cpp", deps(//src/target_a))'
</pre>
<h2 id='functions'>Using aquery functions</h2>
<p>There are three <code>aquery</code> functions:</p>
<ul>
<li><code>inputs</code>: filter actions by inputs.</li>
<li><code>outputs</code>: filter actions by outputs</li>
<li><code>mnemonic</code>: filter actions by mnemonic</li>
</ul>
<p><code>expr ::= inputs(word, expr)</code></p>
<p>
The <code>inputs</code> operator returns the actions generated from building <code>expr</code>,
whose input filenames match the regex provided by <code>word</code>.
</p>
<p><code>$ bazel aquery 'inputs(".*cpp", deps(//src/target_a))'</code></p>
<p><code>outputs</code> and <code>mnemonic</code> functions share a similar syntax.</p>
<p>You can also combine functions to achieve the AND operation. For example:</p>
<pre>
$ bazel aquery 'mnemonic("Cpp.*", (inputs(".*cpp", inputs("foo.*", //src/target_a))))'
</pre>
<p>
The above command would find all actions involved in building <code>//src/target_a</code>,
whose mnemonics match <code>"Cpp.*"</code> and inputs match the patterns
<code>".*cpp"</code> and <code>"foo.*"</code>.
</p>
<h3>Important: aquery functions can't be nested inside non-aquery functions</h3>
<ul>
<li>Conceptually this makes sense since the output of aquery functions is Actions,
not Configured Targets.</li>
<li>
An example of the syntax error produced:
<pre>
$ bazel aquery 'deps(inputs(".*cpp", //src/target_a))'
ERROR: aquery filter functions (inputs, outputs, mnemonic) produce actions,
and therefore can't be the input of other function types: deps
deps(inputs(".*cpp", //src/target_a))
</pre>
</li>
</ul>
<h2 id='options'>Options</h2>
<h3>Build options</h3>
<p>
<code>aquery</code> runs on top of a regular Bazel build and thus inherits the set of
<a href="command-line-reference.html#build-options">options</a>
available during a build.
</p>
<h3 id="aquery-options">Aquery options</h3>
<h4><code class='flag'>--output=(text|proto|textproto|jsonproto), default=text</code></h4>
<p>
The default output format (<code>text</code>) is human-readable,
use <code>proto</code>, <code>textproto</code>, or <code>jsonproto</code> for machine-readable format.
The proto message is <code>analysis.ActionGraphContainer</code>.
</p>
<h4><code class='flag'>--include_commandline, default=true</code></h4>
<p>
Includes the content of the action command lines in the output (potentially large).
</p>
<h4><code class='flag'>--include_artifacts, default=true</code></h4>
<p>
Includes names of the action inputs and outputs in the output (potentially large).
</p>
<h4><code class='flag'>--include_aspects, default=true</code></h4>
<p>
Whether to include Aspect-generated actions in the output.
</p>
<h4><code class='flag'>--include_param_files, default=false</code></h4>
<p>
Include the content of the param files used in the command (potentially large).
Warning: Enabling this flag will automatically enable the <code>--include_commandline</code> flag.
</p>
<h4><code class='flag'>--skyframe_state, default=false</code></h4>
<p>
Without performing extra analysis, dump the Action Graph from Skyframe.
Note: Specifying a target with <code>--skyframe_state</code> is currently not supported.
This flag is only available with <code>--output=proto</code> or <code>--output=textproto</code>.
</p>
<h2 id='misc'>Other tools and features</h2>
<h3 id="skyframe-state">Querying against the state of Skyframe</h3>
<p>
<a href="https://bazel.build/designs/skyframe.html">Skyframe</a> is the evaluation and
incrementality model of Bazel. On each instance of Bazel server, Skyframe stores the dependency graph
constructed from the previous runs of the <a href="guide.html#analysis-phase">Analysis phase</a>.
</p>
<p>
In some cases, it is useful to query the Action Graph on Skyframe.
An example use case would be:
</p>
<ol>
<li>Run <code>bazel build //target_a</code></li>
<li>Run <code>bazel build //target_b</code></li>
<li>File <code>foo.out</code> was generated.</li>
</ol>
<p>
<i>As a Bazel user, I want to determine if <code>foo.out</code> was generated from building
<code>//target_a</code> or <code>//target_b</code></i>.
</p>
<p>
One could run <code>bazel aquery 'outputs("foo.out", //target_a)'</code> and
<code>bazel aquery 'outputs("foo.out", //target_b)'</code> to figure out the action responsible
for creating <code>foo.out</code>, and in turn the target. However, the number of different
targets previously built can be larger than 2, which makes running multiple <code>aquery</code>
commands a hassle.
</p>
<p>As an alternative, the <code>--skyframe_state</code> flag can be used:</p>
<pre>
# List all actions on Skyframe's action graph
$ bazel aquery --output=proto --skyframe_state
# or
# List all actions on Skyframe's action graph, whose output matches "foo.out"
$ bazel aquery --output=proto --skyframe_state 'outputs("foo.out")'
</pre>
<p>
With <code>--skyframe_state</code> mode, <code>aquery</code> takes the content of the Action Graph
that Skyframe keeps on the instance of Bazel, (optionally) performs filtering on it and
outputs the content, without re-running the analysis phase.
</p>
<h4>Special considerations</h4>
<h5>Output format</h5>
<p><code>--skyframe_state</code> is currently only available for <code>--output=proto</code>
and <code>--output=textproto</code></p>
<h5>Non-inclusion of target labels in the query expression</h5>
<p>
Currently, <code>--skyframe_state</code> queries the whole action graph that exists on Skyframe,
regardless of the targets. Having the target label specified in the query together with
<code>--skyframe_state</code> is considered a syntax error:
</p>
<pre>
# WRONG: Target Included
$ bazel aquery --output=proto --skyframe_state <b>//target_a</b>
ERROR: Error while parsing '//target_a)': Specifying build target(s) [//target_a] with --skyframe_state is currently not supported.
# WRONG: Target Included
$ bazel aquery --output=proto --skyframe_state 'inputs(".*.java", <b>//target_a</b>)'
ERROR: Error while parsing '//target_a)': Specifying build target(s) [//target_a] with --skyframe_state is currently not supported.
# CORRECT: Without Target
$ bazel aquery --output=proto --skyframe_state
$ bazel aquery --output=proto --skyframe_state 'inputs(".*.java")'
</pre>
<h3 id="diff-tool">Comparing aquery outputs</h3>
<p>
You can compare the outputs of two different aquery invocations using the <code>aquery_differ</code> tool.
For instance: when you make some changes to your rule definition and want to verify that the
command lines being run did not change. <code>aquery_differ</code> is the tool for that.
</p>
<p>
The tool is available in the <a href="https://github.com/bazelbuild/bazel/tree/master/tools/aquery_differ">bazelbuild/bazel</a> repository.
To use it, clone the repository to your local machine. An example usage:
</p>
<pre>
$ bazel run //tools/aquery_differ -- \
--before=/path/to/before.proto \
--after=/path/to/after.proto \
--input_type=proto \
--attrs=cmdline \
--attrs=inputs
</pre>
<p>
The above command returns the difference between the <code>before</code> and <code>after</code> aquery outputs:
which actions were present in one but not the other, which actions have different
command line/inputs in each aquery output, ...). The result of running the above command would be:
</p>
<pre>
Aquery output 'after' change contains an action that generates the following outputs that aquery output 'before' change doesn't:
...
/list of output files/
...
[cmdline]
Difference in the action that generates the following output(s):
/path/to/abc.out
--- /path/to/before.proto
+++ /path/to/after.proto
@@ -1,3 +1,3 @@
...
/cmdline diff, in unified diff format/
...
</pre>
<h4>Command options</h4>
<p><code class='flag'>--before, --after</code>: The aquery output files to be compared</p>
<p>
<code class='flag'>--input_type=(proto|text_proto), default=proto</code>: the format of the input
files. Support is provided for <code>proto</code> and <code>textproto</code> aquery output.
</p>
<p>
<code class='flag'>--attrs=(cmdline|inputs), default=cmdline</code>: the attributes of actions
to be compared.
</p>
<h3 id="aspect-on-aspect">Aspect-on-aspect</h3>
<p>
It is possible for <a href="https://docs.bazel.build/versions/main/skylark/aspects.html">Aspects</a>
to be applied on top of each other. The aquery output of the action generated by these Aspects would
then include the <i>Aspect path</i>, which is the sequence of Aspects applied to the target which generated the action.
</p>
<p>An example of Aspect-on-Aspect:</p>
<pre>
t0
^
| <- a1
t1
^
| <- a2
t2
</pre>
<p>
Let t<sub>i</sub> be a target of rule r<sub>i</sub>, which applies an Aspect a<sub>i</sub>
to its dependencies.
</p>
<p>Assume that a2 generates an action X when applied to target t0. The text output of
<code>bazel aquery --include_aspects 'deps(//t2)'</code> for action X would be:</p>
<pre>
action ...
Mnemonic: ...
Target: //my_pkg:t0
Configuration: ...
AspectDescriptors: [//my_pkg:rule.bzl%<b>a2</b>(foo=...)
-> //my_pkg:rule.bzl%<b>a1</b>(bar=...)]
...
</pre>
<p>
This means that action <code>X</code> was generated by Aspect <code>a2</code> applied onto
<code>a1(t0)</code>, where <code>a1(t0)</code> is the result of Aspect <code>a1</code> applied
onto target <code>t0</code>.
</p>
<p>Each <code>AspectDescriptor</code> has the following format:</p>
<pre>
AspectClass([param=value,...])
</pre>
<p>
<code>AspectClass</code> could be the name of the Aspect class (for native Aspects) or
<code>bzl_file%aspect_name</code> (for Starlark Aspects). <code>AspectDescriptor</code> are
sorted in topological order of the
<a href="https://docs.bazel.build/versions/main/skylark/aspects.html#aspect-basics">dependency graph</a>.
</p>
<h3 id="linking-json-profile">Linking with the JSON profile</h3>
<p>
While aquery provides information about the actions being run in a build (why they're being run,
their inputs/outputs), the <a href="https://docs.bazel.build/versions/main/skylark/performance.html#performance-profiling">JSON profile</a>
tells us the timing and duration of their execution.
It is possible to combine these 2 sets of information via a common denominator: an action's primary output.
</p>
<p>
To include actions' outputs in the JSON profile, generate the profile with
<code>--experimental_include_primary_output --noexperimental_slim_json_profile</code>.
Slim profiles are incompatible with the inclusion of primary outputs. An action's primary output
is included by default by aquery.
</p>
<p>
We don't currently provide a canonical tool to combine these 2 data sources, but you should be
able to build your own script with the above information.
</p>
<h2 id='known-issues'>Known issues</h2>
<h3 id='shared-actions'>Handling shared actions</h3>
<p>
Sometimes actions are
<a href="https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/actions/Actions.java;l=59;drc=146d51aa1ec9dcb721a7483479ef0b1ac21d39f1">shared</a>
between configured targets.
In the execution phase, those shared actions are
<a href="https://source.bazel.build/bazel/+/master:src/main/java/com/google/devtools/build/lib/actions/Actions.java;l=241;drc=003b8734036a07b496012730964ac220f486b61f">simply considered as one</a> and only executed once.
However, aquery operates on the pre-execution, post-analysis action graph, and hence treats these
like separate actions whose output Artifacts have the exact same <code>execPath</code>. As a result,
equivalent Artifacts appear duplicated.
</p>
<p>
The list of aquery issues/planned features can be found on
<a href="https://github.com/bazelbuild/bazel/labels/team-Performance">GitHub</a>.
</p>
<h2 id='faqs'>FAQs</h2>
<h3 id='action-key'>The ActionKey remains the same even though the content of an input file changed.</h3>
<p>
In the context of aquery, the <code>ActionKey</code> refers to the <code>String</code> gotten from
<code><a href="https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/actions/ActionAnalysisMetadata.java;l=89;drc=8b856f5484f0117b2aebc302f849c2a15f273310">ActionAnalysisMetadata#getKey</a></code>:
</p>
<pre>
Returns a string encoding all of the significant behaviour of this Action that might affect the
output. The general contract of <code>getKey</code> is this: if the work to be performed by the
execution of this action changes, the key must change.
...
Examples of changes that should affect the key are:
- Changes to the BUILD file that materially affect the rule which gave rise to this Action.
- Changes to the command-line options, environment, or other global configuration resources
which affect the behaviour of this kind of Action (other than changes to the names of the
input/output files, which are handled externally).
- An upgrade to the build tools which changes the program logic of this kind of Action
(typically this is achieved by incorporating a UUID into the key, which is changed each
time the program logic of this action changes).
Note the following exception: for actions that discover inputs, the key must change if any
input names change or else action validation may falsely validate.
</pre>
<p>
This excludes the changes to the content of the input files, and is not to be confused with
<code><a href="https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/common/RemoteCacheClient.java;l=38;drc=21577f202eb90ce94a337ebd2ede824d609537b6">RemoteCacheClient#ActionKey</a></code>.
</p>
<h2 id='updates'>Updates</h2>
<p>
Please contact twerth@google.com and leba@google.com for any issue/feature request.
</p>