@fifield fifield commented Nov 12, 2025

This PR proposes adding declarative event trace configuration to the AIE dialect.

The current implementation:

Input Code

aie.device(npu1_1col) {
  %tile_0_2 = aie.tile(0, 2)

  // Trace configuration for compute tile (0,2) - core events
  aie.trace @core_trace(%tile_0_2) {
    // Set trace mode (Event-Time captures timestamps)
    aie.trace.mode "Event-Time"

    // Configure packet routing (ID and type for packet-switched routing)
    aie.trace.packet id=1 type=core

    // Specify which events to capture (up to 8 events)
    aie.trace.event<"INSTR_EVENT_0">        // User event 0 (start marker)
    aie.trace.event<"INSTR_EVENT_1">        // User event 1 (end marker)
    aie.trace.event<"INSTR_VECTOR">         // Vector instructions
    aie.trace.event<"MEMORY_STALL">         // Memory access stalls
    aie.trace.event<"STREAM_STALL">         // Stream buffer stalls
    aie.trace.event<"LOCK_STALL">           // Lock acquisition stalls
    aie.trace.event<"PORT_RUNNING_0">       // Port slot 0 (DMA:0 S2MM) running
    aie.trace.event<"PORT_IDLE_1">          // Port slot 1 (DMA:0 MM2S) idle

    // Bind stream switch ports to the port slots used by the PORT_* events
    aie.trace.port<0> port=DMA channel=0 direction=S2MM
    aie.trace.port<1> port=DMA channel=0 direction=MM2S

    // Specify start/stop control (broadcast events)
    aie.trace.start event=<"BROADCAST_15">
    aie.trace.stop event=<"BROADCAST_14">
  }

  // Runtime sequence with trace invocation
  aiex.runtime_sequence @seq(%arg0: memref<32xi32>) {
    aie.trace.start_config @core_trace
    // ... other runtime operations
  }
}

Generated Output

  // Intermediate representation (after -aie-trace-to-config)
  aie.trace.config @core_trace_config(%tile_0_2) packet_type = core {
    aie.trace.reg register = "Trace_Control0" field = "Mode" value = 0 : i32 comment = "trace mode"
    aie.trace.reg register = "Trace_Control1" field = "ID" value = 1 : i32 comment = "packet ID"
    aie.trace.reg register = "Trace_Control1" field = "Packet_Type" value = 0 : i32 comment = "packet type"
    aie.trace.reg register = "Trace_Control0" field = "Trace_Start_Event" value = 122 : i32 comment = "start event"
    aie.trace.reg register = "Trace_Control0" field = "Trace_Stop_Event" value = 121 : i32 comment = "stop event"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_0_ID" value = "DMA:0" comment = "port 0 ID"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_0_Master_Slave" value = 1 : i32 comment = "port 0 master/slave"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_1_ID" value = "DMA:0" comment = "port 1 ID"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" field = "Port_1_Master_Slave" value = 0 : i32 comment = "port 1 master/slave"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event0" value = 33 : i32 comment = "event slot 0"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event1" value = 34 : i32 comment = "event slot 1"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event2" value = 37 : i32 comment = "event slot 2"
    aie.trace.reg register = "Trace_Event0" field = "Trace_Event3" value = 23 : i32 comment = "event slot 3"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event4" value = 24 : i32 comment = "event slot 4"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event5" value = 26 : i32 comment = "event slot 5"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event6" value = 79 : i32 comment = "event slot 6"
    aie.trace.reg register = "Trace_Event1" field = "Trace_Event7" value = 78 : i32 comment = "event slot 7"
  }

  // Intermediate representation (after -aie-trace-pack-reg-writes)
  aie.trace.config @core_trace_config(%tile_0_2) packet_type = core {
    aie.trace.reg register = "Trace_Control0" value = 2038038528 : i32 mask = 2139029507 comment = "trace mode + start event + stop event"
    aie.trace.reg register = "Trace_Control1" value = 1 : i32 mask = 28703 comment = "packet ID + packet type"
    aie.trace.reg register = "Stream_Switch_Event_Port_Selection_0" value = 289 : i32 mask = 16191 comment = "port 0 ID + port 0 master/slave + port 1 ID + port 1 master/slave"
    aie.trace.reg register = "Trace_Event0" value = 388309537 : i32 mask = 2139062143 comment = "event slot 0 + event slot 1 + event slot 2 + event slot 3"
    aie.trace.reg register = "Trace_Event1" value = 1313806872 : i32 mask = 2139062143 comment = "event slot 4 + event slot 5 + event slot 6 + event slot 7"
  }

  // Final output (after -aiex-inline-trace-config)
  aiex.runtime_sequence @seq(%arg0: memref<32xi32>) {
    aiex.npu.write32 {address = 213200 : ui32, column = 0 : i32, row = 2 : i32, value = 2038038528 : ui32}
    aiex.npu.write32 {address = 213204 : ui32, column = 0 : i32, row = 2 : i32, value = 1 : ui32}
    aiex.npu.write32 {address = 261888 : ui32, column = 0 : i32, row = 2 : i32, value = 289 : ui32}
    aiex.npu.write32 {address = 213216 : ui32, column = 0 : i32, row = 2 : i32, value = 388309537 : ui32}
    aiex.npu.write32 {address = 213220 : ui32, column = 0 : i32, row = 2 : i32, value = 1313806872 : ui32}
    // Additional npu.write32 for other registers...
  }
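The packing step shown above (per-field `aie.trace.reg` ops merged into one value/mask pair per register) can be reproduced with a few lines of bit arithmetic. The following is a minimal Python sketch, not the pass itself: the `FIELDS` table and the `pack` helper are hypothetical, and the bit positions are an assumption inferred from the masks printed in the packed IR (for example, mask 2139029507 is 0x7F7F0003), not read from the real register database.

```python
# Assumed field layout as (lsb, width), inferred from the masks in the packed IR.
FIELDS = {
    "Trace_Control0": {
        "Mode": (0, 2),
        "Trace_Start_Event": (16, 7),
        "Trace_Stop_Event": (24, 7),
    },
    "Trace_Event0": {
        "Trace_Event0": (0, 7),
        "Trace_Event1": (8, 7),
        "Trace_Event2": (16, 7),
        "Trace_Event3": (24, 7),
    },
}


def pack(register, writes):
    """Merge (field, value) pairs into a single (value, mask) pair."""
    value = 0
    mask = 0
    for field, field_value in writes:
        lsb, width = FIELDS[register][field]
        if field_value < 0 or field_value >> width:
            raise ValueError(f"{field}={field_value} does not fit in {width} bits")
        value |= field_value << lsb
        mask |= ((1 << width) - 1) << lsb
    return value, mask


# Field values taken from the -aie-trace-to-config output above.
ctrl0 = pack(
    "Trace_Control0",
    [("Mode", 0), ("Trace_Start_Event", 122), ("Trace_Stop_Event", 121)],
)
evt0 = pack(
    "Trace_Event0",
    [("Trace_Event0", 33), ("Trace_Event1", 34),
     ("Trace_Event2", 37), ("Trace_Event3", 23)],
)
print(ctrl0)  # (2038038528, 2139029507)
print(evt0)   # (388309537, 2139062143)
```

Both results match the `value` and `mask` attributes in the packed `aie.trace.config` above, which is a quick way to sanity-check a field layout against the generated IR.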

Combo Events:

aie.trace @my_trace(%tile02) {
  // Combo 0: lock stalled AND NOT DMA S2MM channel 0 stalled
  aie.trace.combo_event<0> "LOCK_STALL" AND_NOT "DMA_S2MM_0_STALLED"
  
  // Combo 1: instruction event OR vector operation
  aie.trace.combo_event<1> "INSTR_EVENT_0" OR "INSTR_VECTOR"
  
  // Combo 2: (combo0) AND (combo1)
  aie.trace.combo_event<2> "COMBO_EVENT_0" AND "COMBO_EVENT_1"
  
  // Trace the combo results
  aie.trace.event<"COMBO_EVENT_0">
  aie.trace.event<"COMBO_EVENT_1">
  aie.trace.event<"COMBO_EVENT_2">
  ...
}
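To make the intended semantics of the three combo slots concrete, here is a small software model in Python. It is only a sketch under assumptions: `OPS`, `COMBOS`, and `eval_combos` are hypothetical names, and the model evaluates one snapshot of asserted events rather than describing how the hardware samples them.

```python
# Hypothetical model of the combo logic; operator names follow the example above.
OPS = {
    "AND": lambda a, b: a and b,
    "AND_NOT": lambda a, b: a and not b,
    "OR": lambda a, b: a or b,
}

# (output event, left input, operator, right input), in evaluation order.
COMBOS = [
    ("COMBO_EVENT_0", "LOCK_STALL", "AND_NOT", "DMA_S2MM_0_STALLED"),
    ("COMBO_EVENT_1", "INSTR_EVENT_0", "OR", "INSTR_VECTOR"),
    ("COMBO_EVENT_2", "COMBO_EVENT_0", "AND", "COMBO_EVENT_1"),
]


def eval_combos(active):
    """Return the combo events that fire for a set of asserted raw events."""
    state = {name: True for name in active}
    for out, lhs, op, rhs in COMBOS:
        state[out] = OPS[op](state.get(lhs, False), state.get(rhs, False))
    return {out for out, _, _, _ in COMBOS if state[out]}


print(sorted(eval_combos({"LOCK_STALL", "INSTR_VECTOR"})))
# ['COMBO_EVENT_0', 'COMBO_EVENT_1', 'COMBO_EVENT_2']
print(sorted(eval_combos({"LOCK_STALL", "DMA_S2MM_0_STALLED"})))
# []
```

Combo 2 is evaluated last because it consumes the results of combo 0 and combo 1.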

Edge Events:

aie.trace @my_trace(%tile02) {
  // Edge detector 0: count lock stalls (rising edges)
  aie.trace.edge_event<0> event="LOCK_STALL" trigger=RISING
  
  // Edge detector 1: count transitions (both edges)
  aie.trace.edge_event<1> event="LOCK_STALL" trigger=BOTH
  
  // Trace the edge-detected events
  aie.trace.event<"EDGE_DETECTION_EVENT_0">
  aie.trace.event<"EDGE_DETECTION_EVENT_1">
  ...
}
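The difference between `trigger=RISING` and `trigger=BOTH` is easiest to see on a sampled signal. The Python sketch below is an illustration only: `edge_events` is a hypothetical helper, the per-cycle sample list is made up, and the `FALLING` trigger is an assumption that does not appear in the examples above.

```python
def edge_events(samples, trigger):
    """Return the sample indices at which an edge event would fire.

    samples: per-cycle 0/1 levels of the monitored event.
    trigger: "RISING", "FALLING" (assumed), or "BOTH".
    """
    fired = []
    for i in range(1, len(samples)):
        rising = samples[i - 1] == 0 and samples[i] == 1
        falling = samples[i - 1] == 1 and samples[i] == 0
        if rising and trigger in ("RISING", "BOTH"):
            fired.append(i)
        elif falling and trigger in ("FALLING", "BOTH"):
            fired.append(i)
    return fired


lock_stall = [0, 1, 1, 0, 0, 1, 0]
print(edge_events(lock_stall, "RISING"))  # [1, 5]
print(edge_events(lock_stall, "BOTH"))    # [1, 3, 5, 6]
```

With `RISING`, each stall episode produces one event regardless of its length; with `BOTH`, the start and end of each episode are both visible in the trace.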

Depends on #2712 and #2696

@fifield fifield force-pushed the events_proposal branch 3 times, most recently from 1819788 to 0a3ea70 on November 13, 2025 02:32
fifield and others added 2 commits November 14, 2025 14:27
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
fifield and others added 24 commits November 14, 2025 15:16
- Renamed trace_events_enum.py to trace_events package with per-architecture modules
- Changed from single file to package: aie.utils.trace_events.{aie,aie2,aie2p}
- Renamed PLEvent to ShimTileEvent to match expected usage
- Generate files during build into build/regdb/ directory instead of manually generating
- Copy generated files to Python package directory during build
- Install generated files along with checked-in __init__.py
- Update all imports to use new package structure
- Conditionally emit UCEvent enum only when events are defined (e.g. aie2ps in future)
- Emit unified events.json into build/regdb for compiler ingest (for future work)
- Map aieml architecture to aie2 for Python module naming

Fill enum value gaps with reserved placeholders

The generator now fills gaps in enum values with rsvd_XX placeholders
to ensure continuous numbering from min to max value. This matches the
behavior of the original manually-generated file and ensures that all
hardware event codes are represented in the enum, even if they are
reserved or undefined.

Example: If events 0-53 and 55-127 are defined, value 54 gets
placeholder rsvd_54 to maintain continuity.

Fixes comparison with original trace_events_enum.py file.
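The gap-filling rule described in this commit can be sketched in a few lines of Python. This is not the generator's code: `fill_gaps` is a hypothetical helper and the event names are placeholders, but the `rsvd_XX` naming follows the commit message.

```python
def fill_gaps(events):
    """Return (name, code) pairs covering every code from min to max.

    events: dict mapping event name -> numeric code.
    Codes with no defined event get an rsvd_<code> placeholder.
    """
    by_code = {code: name for name, code in events.items()}
    lo, hi = min(by_code), max(by_code)
    return [(by_code.get(code, f"rsvd_{code}"), code) for code in range(lo, hi + 1)]


# Placeholder names; only the numbering matters here.
filled = fill_gaps({"EVT_A": 52, "EVT_B": 53, "EVT_C": 55})
print(filled)
# [('EVT_A', 52), ('EVT_B', 53), ('rsvd_54', 54), ('EVT_C', 55)]
```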
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: fifield <800843+fifield@users.noreply.github.com>
Co-authored-by: Jeff Fifield <jeff.fifield@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

- Add generate_events_tablegen.py to create TableGen enum definitions
  from aie-rt event headers
- Update AIE dialect CMakeLists to generate AIEEvents*.td.inc files
  during build process
- Include generated event enums in AIEAttrs.td
- Add proper CMake dependencies to ensure enums are generated before
  they are needed by tablegen

This allows the AIE dialect to emit C++ enum classes for all event
types (CoreEvent, MemEvent, ShimTileEvent, MemTileEvent) across all
AIE architectures (AIE, AIE2, AIE2P).

- Created AIETraceAttrs.td with TraceModeAttr, TracePacketTypeAttr, TraceEventAttr
- Created AIETraceOps.td with trace operations:
  - aie.trace (symbol operation)
  - aie.trace.mode, aie.trace.event, aie.trace.packet
  - aie.trace.start, aie.trace.stop
  - aie.trace.config, aie.trace.reg (intermediate ops)
  - aie.trace.start_config (runtime invocation)
- Implemented basic C++ verifiers in AIETraceOps.cpp
- Updated AIEAttrs.td and AIEOps.td to include trace definitions
- Updated lib/Dialect/AIE/IR/CMakeLists.txt to build AIETraceOps.cpp
- Added parsing test that validates operations parse correctly
- Test passes: operations parse and print correctly

- Created test_trace_verify.mlir with negative tests:
  - Too many events (>8)
  - Packet ID out of range (0 and 32)
  - Start/stop events missing parameters
  - Start/stop events with conflicting parameters
- All verifiers working correctly
- Tests pass successfully
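The two range checks exercised by these negative tests can be summarized as a small Python sketch. The real verifiers are C++ in AIETraceOps.cpp; `verify_trace` is a hypothetical stand-in, and the valid packet ID range of 1 to 31 is an assumption inferred from 0 and 32 being the rejected test values.

```python
MAX_TRACE_EVENTS = 8             # "up to 8 events" per the PR description
VALID_PACKET_IDS = range(1, 32)  # assumed: 0 and 32 are the out-of-range test cases


def verify_trace(events, packet_id):
    """Return a list of diagnostics; an empty list means the checks pass."""
    errors = []
    if len(events) > MAX_TRACE_EVENTS:
        errors.append(f"too many events: {len(events)} > {MAX_TRACE_EVENTS}")
    if packet_id not in VALID_PACKET_IDS:
        errors.append(f"packet ID {packet_id} out of range 1-31")
    return errors


print(verify_trace(["INSTR_VECTOR"] * 8, 1))   # []
print(verify_trace(["INSTR_VECTOR"] * 9, 32))  # two diagnostics
```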
- Created AIERegisterDatabase.h with BitFieldInfo, RegisterInfo, EventInfo structs
- Implemented AIERegisterDatabase.cpp with JSON parsing for:
  - Register database (utils/aie_registers_aie2.json)
  - Event database (utils/events_database.json)
- Key features:
  - lookupRegister(name, module) - finds register by name and module
  - lookupEvent(name, module) - finds event code by name and module
  - encodeFieldValue(field, value) - encodes value into bitfield
- Uses module::name keys to handle duplicate register names across modules
- Compiles successfully and links with AIE dialect
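A rough Python model of the lookup scheme listed above may help when reading the C++ code. It is a sketch under assumptions: the class below keeps everything in memory instead of parsing the JSON files, the `add_register` and `add_event` helpers and the `core` module name are hypothetical, and the field position is inferred from the packed values shown in the PR description. The offset 0x340D0 and event code 122 are taken from the generated output in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BitFieldInfo:
    name: str
    lsb: int    # position of the least significant bit
    width: int  # field width in bits


class RegisterDatabase:
    """Toy stand-in for the C++ class of the same name."""

    def __init__(self):
        # Keys are "module::name" so equal names in different modules do not collide.
        self.registers = {}
        self.events = {}

    def add_register(self, module, name, offset, fields):
        self.registers[f"{module}::{name}"] = (offset, {f.name: f for f in fields})

    def add_event(self, module, name, code):
        self.events[f"{module}::{name}"] = code

    def lookup_register(self, name, module):
        return self.registers.get(f"{module}::{name}")

    def lookup_event(self, name, module):
        return self.events.get(f"{module}::{name}")

    @staticmethod
    def encode_field_value(field, value):
        if value < 0 or value >> field.width:
            raise ValueError(f"{value} does not fit in field {field.name}")
        return value << field.lsb


db = RegisterDatabase()
db.add_register("core", "Trace_Control0", 0x340D0,
                [BitFieldInfo("Trace_Start_Event", 16, 7)])
db.add_event("core", "BROADCAST_15", 122)

offset, fields = db.lookup_register("Trace_Control0", "core")
encoded = db.encode_field_value(fields["Trace_Start_Event"],
                                db.lookup_event("BROADCAST_15", "core"))
print(hex(offset), hex(encoded))  # 0x340d0 0x7a0000
```

A lookup with the wrong module, such as `db.lookup_event("BROADCAST_15", "memory")`, returns `None` in this model, which mirrors why the keys include the module name.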
- Added pass definitions to AIEPasses.td for all three trace passes
- Updated AIEPasses.h with pass creation function declarations
- Implemented AIETraceToConfig.cpp:
  - Converts aie.trace → aie.trace.config
  - Emits aie.trace.reg operations for each register field
  - Handles Trace_Control0 (mode, start/stop events)
  - Handles Trace_Control1 (packet ID, packet type)
  - Handles Trace_Event0/1 (event slots 0-7)
  - Updates trace.start_config symbol references
- Created stub implementations for AIEInlineTraceConfig and AIEConfigToNPU
- Added test_trace_to_config.mlir that validates transformation
- Test passes: trace ops correctly lowered to config ops

- Implemented AIEInlineTraceConfig.cpp:
  - Finds all aie.trace.start_config operations
  - Looks up referenced trace.config symbol
  - Clones all aie.trace.reg operations to call site
  - Removes trace.start_config invocation
- Relaxed parent constraint on aie.trace.reg to allow DeviceOp parent
  (needed for inlined reg ops)
- Added test_inline_trace_config.mlir
- Test passes: trace.reg ops successfully inlined

- Implemented AIEConfigToNPU.cpp with prototype stub
- Pass collects inlined trace.reg operations
- Placeholder for full implementation that would:
  - Load RegisterDatabase
  - Resolve register names to offsets
  - Encode bitfield values
  - Merge writes to same register
  - Generate aiex.npu.write32 operations
- Compiles successfully

- Created test_trace_end_to_end.mlir demonstrating complete pipeline
- Tests full transformation: aie.trace → aie.trace.config → inlined aie.trace.reg
- Validates:
  - High-level trace configuration with 4 events
  - Mode, packet routing, start/stop events
  - Correct lowering through both passes
  - Symbol references updated correctly
  - Register specifications generated for all fields
- Test passes: complete pipeline working end-to-end

- Enhanced aie.trace.reg to include optional tile operand
- Updated AIETraceToConfig to pass nullptr for tile (parent has it)
- Updated AIEInlineTraceConfig to pass tile reference when cloning
- Implemented full AIEConfigToNPU with:
  - RegisterDatabase loading and integration
  - Register name → offset resolution
  - Event name → event code resolution
  - Bitfield value encoding
  - Register write merging (multiple fields → single register)
  - Absolute address calculation
  - aiex.npu.write32 generation (when AIEX dialect available)
- Pass validates register/field lookups work correctly
- Demonstrates complete lowering pipeline infrastructure

Note: NPU write generation requires AIEX dialect to be pre-loaded.
This will be addressed in production integration.

PROBLEM FIXED:
- trace.reg with 'for %tile' lost col/row information
- Inlined trace.reg at device level was fragile

SOLUTION:
- Removed tile operand from trace.reg (now only in trace.config)
- Simplified parent constraint: trace.reg only in TraceConfigOp
- Moved RegisterDatabase integration from Pass 3 to Pass 2
- Pass 2 now generates npu.write32 directly with col/row from tile
- Pass 3 is now a no-op (kept for extensibility)

BENEFITS:
- Col/row extracted immediately during inlining (not lost)
- Cleaner IR (no intermediate trace.reg at device level)
- Two-pass pipeline instead of three
- npu.write32 has explicit column/row attributes

This fixes the architectural issue identified in code review.

PROBLEM: AIEX dialect couldn't be loaded during AIE pass execution

SOLUTION: Move NPU-generating passes to AIEX dialect where they belong
- Moved AIEInlineTraceConfig.cpp to lib/Dialect/AIEX/Transforms/
- Moved AIEConfigToNPU.cpp to lib/Dialect/AIEX/Transforms/
- Updated pass definitions in AIEXPasses.td
- Removed from AIEPasses.td
- Updated CMakeLists for both dialects
- Updated pass registration headers
- Fixed namespaces (AIEX, not AIE)

RESULT: npu.write32 generation now works!
- Pass renamed: aie-inline-trace-config → aiex-inline-trace-config
- Pass renamed: aie-config-to-npu → aiex-config-to-npu
- Col/row preserved in npu.write32 operations
- RegisterDatabase integration functional
- Bitfield merging working

Example output:
aiex.npu.write32 {address=0xB40D0, column=0, row=2, value=0x1E2E0001}

This is the correct architectural placement: AIEX depends on AIE.

- Moved test_inline_trace_config.mlir to test/Dialect/AIEX/trace/
- Moved test_trace_end_to_end.mlir to test/Dialect/AIEX/trace/
- Updated pass name: aie-inline-trace-config → aiex-inline-trace-config
- Added CHECK for aiex.npu.write32 operations
- Verified col/row attributes are preserved
- Both tests pass successfully

Test organization:
- AIE tests: parse, verify, trace-to-config (AIE dialect operations)
- AIEX tests: inline-trace-config, end-to-end (NPU generation)

This reflects the correct architectural separation.

- Added aiex.runtime_sequence wrapper to both tests
- Shows proper usage: trace.start_config inside runtime_sequence
- Tests validate npu.write32 generation within runtime context
- All tests passing with correct architectural pattern

This demonstrates the intended usage pattern where trace configuration
is invoked from within a runtime sequence.

Implement aie.trace.port operation for hardware stream switch port monitoring:

- Extend AIETargetModel with port mapping API (getStreamSwitchPortIndex, isValidStreamSwitchPort)
- Add AIE_TracePortOp to AIETraceOps.td with slot (0-7), port, channel, master attributes
- Implement TracePortOp::verify() with duplicate slot detection and port validation
- Extend RegisterDatabase with resolvePortValue() for PORT:CHANNEL string parsing
- Update AIETraceToConfig to process TracePortOp and generate register writes
- Update AIETracePackRegWrites to resolve PORT:CHANNEL to hardware indices
- Add comprehensive test suite: parse, verify, lowering, end-to-end tests

Target registers: Stream_Switch_Event_Port_Selection_0/1
Enables monitoring of up to 8 stream switch ports with PORT_RUNNING, PORT_IDLE, PORT_STALLED, PORT_TLAST events.

Note: Current implementation uses stub port mappings - actual hardware tables needed.

Use DMAChannelDir enum instead of boolean for stream switch port direction:
- S2MM (Stream-to-Memory-Mapped) = master port (value 1)
- MM2S (Memory-Mapped-to-Stream) = slave port (value 0)

This provides clearer semantics matching the hardware's DMA channel direction
naming convention and makes the port direction more explicit in the IR.

Updated:
- AIE_TracePortOp: Changed 'master' attribute from BoolAttr to DMAChannelDir 'direction'
- TracePortOp::verify(): Convert DMAChannelDir to bool for TargetModel API
- AIETraceToConfig Pass 1: Convert direction enum to master/slave value
- All test files: Updated syntax from 'master=true/false' to 'direction=S2MM/MM2S'
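The direction-to-bit mapping and the PORT:CHANNEL string format mentioned in these commits can be sketched as follows. This is a Python illustration, not the C++ implementation: the `DMAChannelDir` class here only mirrors the name of the MLIR enum, and `master_slave_bit` and `parse_port` are hypothetical helpers.

```python
from enum import Enum


class DMAChannelDir(Enum):
    S2MM = "S2MM"  # Stream-to-Memory-Mapped: master port
    MM2S = "MM2S"  # Memory-Mapped-to-Stream: slave port


def master_slave_bit(direction):
    """Map a DMA channel direction to the Port_N_Master_Slave field value."""
    return 1 if direction is DMAChannelDir.S2MM else 0


def parse_port(spec):
    """Split a PORT:CHANNEL string such as "DMA:0" into (port, channel)."""
    port, sep, channel = spec.partition(":")
    if not port or not sep or not channel.isdigit():
        raise ValueError(f"expected PORT:CHANNEL, got {spec!r}")
    return port, int(channel)


print(master_slave_bit(DMAChannelDir.S2MM))  # 1
print(master_slave_bit(DMAChannelDir.MM2S))  # 0
print(parse_port("DMA:0"))                   # ('DMA', 0)
```

The two bit values agree with the `Port_0_Master_Slave` and `Port_1_Master_Slave` fields in the generated output of the PR description (1 for the S2MM port, 0 for the MM2S port).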