[Refactor] New Halide-Like IR (Remove ISL, YAML deps and Polyhedral Compiler) #422

Draft: wants to merge 346 commits into base: main

@hikettei (Owner) commented Jan 30, 2025

Blueprint New Specifications

  • Merge this PR by the end of this week ...
  • GEMM >= 300 GFLOPS is a must
  • Goals
    • Bring back the ECL and CCL tests (which previously failed due to ISL)
    • Blueprint is a DAG; everything is a FastGraph
    • Remove weirdness in type inference (remove expr!)
    • Add X86/PTX Renderer
    • Refactor the entire codebase
    • Use defsimplifier
    • jit.lisp: tweak the view inference
    • Maybe remove ISL dependencies
    • Tile the range
    • Rename JIT_DEBUG -> DEBUG
    • Make Everything FastGraph
    • BEAM Search only
    • Test speed in CI! (examples/bench.lisp)
  • Refactor aasm.lisp
  • Various optimizations are applicable after lowering aref indexing
    • TileBands
    • Shared Memory Transfer
      • GROUP
    • Update the description
    • Bring Back Memory Planner
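
For the "Tile the range" item, the following is a minimal sketch of what tiling a one-dimensional range means, written in plain Common Lisp. The names `tile-bounds` and `map-tiled-range` are hypothetical and are not part of this PR; the sketch only illustrates splitting a loop over [start, end) into an outer loop over tiles and an inner loop within each tile, with the last tile clamped to `end`.

```lisp
;; Hypothetical helpers for illustration only; not the PR's IR or API.

(defun tile-bounds (start end tile-size)
  "Return a list of (LO HI) pairs that cover [START, END) in chunks of TILE-SIZE.
The final pair is clamped so that HI never exceeds END."
  (check-type tile-size (integer 1))
  (loop for lo from start below end by tile-size
        collect (list lo (min end (+ lo tile-size)))))

(defun map-tiled-range (fn start end tile-size)
  "Call FN on every integer in [START, END), visiting the range tile by tile."
  (loop for (lo hi) in (tile-bounds start end tile-size)
        do (loop for i from lo below hi
                 do (funcall fn i))))
```

Every index is still visited exactly once and in the original order; only the loop structure changes, which is what later makes per-tile optimizations possible.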
;; RenderGraph is a DAG
(print-ast
 (with-blueprint
   (%progn
    (%defglobal 'a)
    (%defglobal 'b)
    (%range
     'gid0
     (%progn
      (let ((a (%add (%iconst 1) (%iconst 2)))
            (b (%add (%iconst 2) (%iconst 3))))
        (%mul a b)))
     :start 0 :end (%iconst 'n)))))

[P=0, ID=0]:
   :PROGN {0}
   ├ :DEFINE-GLOBAL {N1}
   ├ :DEFINE-GLOBAL {N2}
   └ :RANGE {N3}
     ├ load(GID0)
     │ └ Allocate[:int64] NIL
     ├ load(N)
     │ └ Allocate[:int64] NIL
     └ :PROGN {N4}
       └ :MUL {N5}
         ├ :ADD {N6}
         │ ├ load(1)
         │ │ └ Allocate[:int64] NIL
         │ └ load(2)
         │   └ Allocate[:int64] NIL
         └ :ADD {N7}
           ├ load(2)
           │ └ Allocate[:int64] NIL
           └ load(3)
             └ Allocate[:int64] NIL
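
To make the layout of the dump above easier to read, here is a minimal sketch of a recursive printer that produces the same kind of branch drawing. It assumes a hypothetical `toy-node` struct with a label and a list of children; it is not the project's node type and not the implementation of `print-ast`.

```lisp
;; Hypothetical toy node; the real IR nodes in this PR are not shown here.
(defstruct toy-node
  (label "" :type string)
  (children nil :type list))

(defun print-toy-tree (node &optional (stream *standard-output*)
                                      (prefix "") (root-p t) (last-p t))
  "Print NODE and its descendants, one per line, with box-drawing branches."
  (if root-p
      (format stream "~a~%" (toy-node-label node))
      (format stream "~a~a ~a~%" prefix (if last-p "└" "├") (toy-node-label node)))
  ;; Children of a non-last node keep a vertical bar so the branch stays connected.
  (let ((child-prefix (cond (root-p prefix)
                            (last-p (concatenate 'string prefix "  "))
                            (t (concatenate 'string prefix "│ ")))))
    (loop for (child . rest) on (toy-node-children node)
          do (print-toy-tree child stream child-prefix nil (null rest)))))
```

Each operand is printed under the node that consumes it, so the `:MUL` node lists its two `:ADD` operands, and each `:ADD` lists its two loads.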
  • Implement a subclass of Graph:
    • RenderGraph
    • ScheduleGraph
    • (Normal) TensorGraph
  • Rename AASM -> IR
  • Move expr -> aasm
  • Render op definitions live in aasm
  • Implement Dequantize using with-blueprint
  • CI: Speed test vs PyTorch
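
For the GEMM throughput target and the CI speed test, a small helper can turn a measured time into GFLOPS. This is a sketch under the usual convention that an M×K by K×N matrix multiply performs 2·M·N·K floating-point operations; `gemm-gflops` and `meets-gemm-target-p` are hypothetical names, and how examples/bench.lisp actually measures time is not shown in this PR description.

```lisp
;; Hypothetical helpers for illustration only.

(defun gemm-gflops (m n k seconds)
  "GFLOPS achieved by an MxK by KxN matrix multiply that took SECONDS.
Counts 2*M*N*K floating-point operations (one multiply and one add per term)."
  (/ (* 2.0d0 m n k) seconds 1d9))

(defun meets-gemm-target-p (m n k seconds &key (target-gflops 300))
  "Return true when the measured run reaches TARGET-GFLOPS or more."
  (>= (gemm-gflops m n k seconds) target-gflops))
```

A CI job could time a fixed-size GEMM, call `meets-gemm-target-p` on the result, and fail the build when it returns false.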
