Commit f4bb9b5

[MCA] Extend -instruction-tables option with verbosity levels (#130574)

Option becomes: -instruction-tables=`<level>`

The choice of `<level>` controls number of printed information.
`<level>` may be `none` (default), `normal`, `full`.
Note: If the option is used without `<label>`, default is `normal` (legacy).

When `<level>` is `full`, additional information are:
- `<Bypass Latency>`: Latency when a bypass is implemented between operands
  in pipelines (see SchedReadAdvance).
- `<LLVM Opcode Name>`: mnemonic plus operands identifier.
- `<Resources units>`: Used resources associated with LLVM Opcode.
- `<instruction comment>`: reports comment if any from source assembly.

Level `full` can be used to better check scheduling info when TableGen is modified.
LLVM Opcode name help to find right instruction regexp to fix TableGen Scheduling Info.

-instruction-tables=full option is validated on AArch64/Neoverse/V1-sve-instructions.s

Follow up of MR #126703

---------

Co-authored-by: Julien Villette <julien.villette@sipearl.com>

1 parent 214fb43 commit f4bb9b5

8 files changed, +3172 -2534 lines changed


llvm/docs/CommandGuide/llvm-mca.rst

+19-1
@@ -197,14 +197,32 @@ option specifies "``-``", then the output will also be sent to standard output.
 
 Enable all the view.
 
-.. option:: -instruction-tables
+.. option:: -instruction-tables=<level>
 
 Prints resource pressure information based on the static information
 available from the processor model. This differs from the resource pressure
 view because it doesn't require that the code is simulated. It instead prints
 the theoretical uniform distribution of resource pressure for every
 instruction in sequence.
 
+The choice of `<level>` controls number of printed information.
+`<level>` may be `none` (default), `normal`, `full`.
+Note: If the option is used without `<label>`, default is `normal` (legacy).
+
+When `<level>` is `full`, additional information are:
+- `<Bypass Latency>`: Latency when a bypass is implemented between operands
+in pipelines (see SchedReadAdvance).
+- `<LLVM Opcode Name>`: mnemonic plus operands identifier.
+- `<Resources units>`: Used resources associated with LLVM Opcode.
+- `<instruction comment>`: reports comment if any from source assembly.
+
+`<Resources units>` syntax can be:
+- <Resource Name>: ReleaseAtCycle is 1.
+- <Resource Name>[<ReleaseAtCycle>]: ReleaseAtCycle is greater than 1
+and AcquireAtCycle is 0.
+- <Resource Name>[<AcquireAtCycle>,<ReleaseAtCycle>]: ReleaseAtCycle
+is greater than 1 and AcquireAtCycle is greater than 0.
+
 .. option:: -bottleneck-analysis
 
 Print information about bottlenecks that affect the throughput. This analysis
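
The three `<Resources units>` forms documented in the hunk above can be illustrated with a small standalone sketch. This is not the code llvm-mca uses to print the column: `formatResourceUnit` is a hypothetical helper, the resource names in the usage note are made up, and the precondition that AcquireAtCycle is smaller than ReleaseAtCycle is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LLVM): renders one resource usage in the
// three forms listed in the documentation hunk.
std::string formatResourceUnit(const std::string &Name,
                               unsigned AcquireAtCycle,
                               unsigned ReleaseAtCycle) {
  // Assumption of this sketch: a resource is released after it is acquired.
  assert(ReleaseAtCycle >= 1 && AcquireAtCycle < ReleaseAtCycle);

  // Form 1: ReleaseAtCycle is 1, only the name is printed.
  if (ReleaseAtCycle == 1)
    return Name;

  // Form 2: ReleaseAtCycle > 1 and AcquireAtCycle == 0.
  if (AcquireAtCycle == 0)
    return Name + "[" + std::to_string(ReleaseAtCycle) + "]";

  // Form 3: ReleaseAtCycle > 1 and AcquireAtCycle > 0.
  return Name + "[" + std::to_string(AcquireAtCycle) + "," +
         std::to_string(ReleaseAtCycle) + "]";
}
```

With made-up names, `formatResourceUnit("UnitA", 0, 1)` yields `UnitA`, `formatResourceUnit("UnitA", 0, 2)` yields `UnitA[2]`, and `formatResourceUnit("UnitB", 1, 3)` yields `UnitB[1,3]`.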

llvm/include/llvm/MC/MCSchedule.h

+4
@@ -402,6 +402,10 @@ struct MCSchedModel {
   static unsigned getForwardingDelayCycles(ArrayRef<MCReadAdvanceEntry> Entries,
                                            unsigned WriteResourceIdx = 0);
 
+  /// Returns the bypass delay cycle for the maximum latency write cycle
+  static unsigned getBypassDelayCycles(const MCSubtargetInfo &STI,
+                                       const MCSchedClassDesc &SCDesc);
+
   /// Returns the default initialized model.
   static const MCSchedModel Default;
 };

llvm/lib/MC/MCSchedule.cpp

+34
@@ -177,3 +177,37 @@ MCSchedModel::getForwardingDelayCycles(ArrayRef<MCReadAdvanceEntry> Entries,
 
   return std::abs(DelayCycles);
 }
+
+unsigned MCSchedModel::getBypassDelayCycles(const MCSubtargetInfo &STI,
+                                            const MCSchedClassDesc &SCDesc) {
+
+  ArrayRef<MCReadAdvanceEntry> Entries = STI.getReadAdvanceEntries(SCDesc);
+  if (Entries.empty())
+    return 0;
+
+  unsigned MaxLatency = 0;
+  unsigned WriteResourceID = 0;
+  unsigned DefEnd = SCDesc.NumWriteLatencyEntries;
+
+  for (unsigned DefIdx = 0; DefIdx != DefEnd; ++DefIdx) {
+    // Lookup the definition's write latency in SubtargetInfo.
+    const MCWriteLatencyEntry *WLEntry =
+        STI.getWriteLatencyEntry(&SCDesc, DefIdx);
+    unsigned Cycles = 0;
+    // If latency is Invalid (<0), consider 0 cycle latency
+    if (WLEntry->Cycles > 0)
+      Cycles = (unsigned)WLEntry->Cycles;
+    if (Cycles > MaxLatency) {
+      MaxLatency = Cycles;
+      WriteResourceID = WLEntry->WriteResourceID;
+    }
+  }
+
+  for (const MCReadAdvanceEntry &E : Entries) {
+    if (E.WriteResourceID == WriteResourceID)
+      return E.Cycles;
+  }
+
+  // Unable to find WriteResourceID in MCReadAdvanceEntry Entries
+  return 0;
+}
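
The selection logic of `getBypassDelayCycles` can be tried outside of LLVM with a simplified model. The sketch below is an approximation under stated assumptions: `WriteLatency`, `ReadAdvance` and `bypassDelayCycles` are hypothetical stand-ins for `MCWriteLatencyEntry`, `MCReadAdvanceEntry` and the new member function, and plain vectors replace the `MCSubtargetInfo` lookups.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for MCWriteLatencyEntry and MCReadAdvanceEntry,
// reduced to the fields the hunk above reads.
struct WriteLatency {
  int Cycles;
  unsigned WriteResourceID;
};
struct ReadAdvance {
  unsigned WriteResourceID;
  int Cycles;
};

// Mirrors the two passes of the hunk: pick the write with the largest
// positive latency, then return the read-advance cycles registered for it.
unsigned bypassDelayCycles(const std::vector<WriteLatency> &Writes,
                           const std::vector<ReadAdvance> &Reads) {
  if (Reads.empty())
    return 0;

  unsigned MaxLatency = 0;
  unsigned WriteResourceID = 0;
  for (const WriteLatency &W : Writes) {
    // Non-positive latencies are treated as 0 cycles.
    unsigned Cycles = W.Cycles > 0 ? static_cast<unsigned>(W.Cycles) : 0;
    if (Cycles > MaxLatency) {
      MaxLatency = Cycles;
      WriteResourceID = W.WriteResourceID;
    }
  }

  for (const ReadAdvance &R : Reads)
    if (R.WriteResourceID == WriteResourceID)
      return static_cast<unsigned>(R.Cycles);

  // No read-advance entry refers to the selected write.
  return 0;
}
```

For example, with writes `{4 cycles, ID 7}` and `{2 cycles, ID 3}` and read advances `{ID 3, 1 cycle}` and `{ID 7, 2 cycles}`, the write with ID 7 has the maximum latency, so the sketch returns 2. With no read-advance entries, or none matching the selected write, it returns 0.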

llvm/test/tools/llvm-mca/AArch64/Neoverse/V1-sve-instructions.s

+2,491-2,467
Large diffs are not rendered by default.
