[Memhammer] Meta Schedule Rules #7

Closed
wants to merge 14 commits into from

Conversation

jinhongyii
Collaborator

Migrated from tlc-pack repo

spectrometerHBH and others added 14 commits December 30, 2021 01:06
[Meta Schedule][M3c] Schedule Rules, Mutator & Postprocs (apache#485)

[Meta Schedule][M3c] PostOrderApply (apache#486)

Fix Post Order Apply (apache#490)

[MetaSchedule] Relay Integration (apache#489)

[M3c][Meta Schedule] Add Trace Correctness Test for PostOrderApply (apache#492)

Fix replay trace. (apache#493)

[M3c][Meta Schedule] Implement the Replay Func class. (apache#495)

[PR] Test script for meta-schedule task extraction. Interface to load… (apache#494)

[Meta Schedule Refactor] Get child blocks (apache#500)

Read-at && Write-at (apache#497)

[M3c][Meta Schedule] Measure Callbacks (apache#498)

[Bug] Fix Infinite Loop Caused When Calling Methods Not Overrided In PyClass (apache#496)

[MetaSchedule] Sample-Perfect-Tile (apache#501)

[MetaSchedule] TE Workloads (apache#502)

[TensorIR] GetProducer, GetConsumer (apache#506)

[MetaScheduleRefactor] Annotate&Unannotate (apache#505)

[MetaSchedule] Multi-Level-Tiling & Auto-Inline (apache#503)

[Tests] Add unittests for auto-inline and multi-level-tiling (apache#508)

[Meta Schedule] Minor Fixes (apache#507)

[MetaSchedule] Rewrite Cooperative-Fetching / Unbound-Block / Reduction-Block (apache#509)

[MetaSchedule] Rewrite Parallel-Vectorize-Unroll / Verify-GPU / Disallow-Dynamic-Loops (apache#499)

[Meta Schedule] Add Helper Function & Minor Modification (apache#512)

[MetaSchedule] Test for Rewrite Parallel-Vectorize-Unroll  (apache#513)

[Meta Schedule] Feature Extractor & Cost Model (apache#510)

Blockize & Tensorize (apache#514)

Layout Rewriting: Suggest-Index-Map (apache#520)

[MetaSchedule] Parallel-Vectorize-Unroll & Random-Compute-Location (apache#516)

[Meta Schedule] Per-Store-Feature (apache#521)

Add traced schedule for blockize & tensorize (apache#526)

[Meta Schedule] Add XGBoost Model & Random Model (apache#519)

User-Interface: Tune-TIR (apache#525)

User-Interface: Tune-TE (apache#527)

[Minor] More logging on python (apache#528)

Get CUDA tuning working (apache#529)

[MetaSchedule] TensorRT BYOC (apache#518)

[BugFix] LocalBuilder API (apache#531)

[Meta Schedule] Add Cost Model Update Measure Callback (apache#530)

[Bugfix] BuilderInput with default params (apache#532)

[MetaSchedule] Mutator-Tile-Size, Mutate-Parallel, Mutate-Unroll (apache#534)

[Meta Schedule] Evolutionary Search (apache#522)

[BugFix] Remove duplicated definition of MakeMultinomialSampler (apache#535)

[Meta Schedule] Fix some bugs (apache#537)

Initiate Experiments for CPU Performance Alignment with Ansor (apache#538)

[Meta Schedule] Tweak experiment scripts (apache#539)

[Meta Schedule] Initiate experiments on CUDA (apache#540)

[TIR][Schedule] Buffer transform (apache#523)

Auto Tensor Core (apache#524)

Working on Evo Search (apache#542)

[Meta Schedule] Add Replay Tuning Interface (apache#543)

Evolutionary Search on CPU (apache#544)

Misc improvement over the error message (apache#545)

[TIR][Schedule] Software pipelining (apache#533)

[Meta Schedule Refactor] fixing unit tests (apache#547)

[MetaSchedule] Mutator-Compute-Location (apache#548)

Misc Improvement of Evolutionary Search (apache#549)

Hotfix for software pipeline (apache#552)

Misc Improvement (apache#550)

[Cherry-Pick][TensorIR] Primitive "SetScope" (apache#9738) (apache#555)

Rule RFactor (apache#551)

[MemHammer] Rewrite Rules (apache#554)

[MetaSchedule] Schedule Rule: Cross-Thread Reduction (apache#556)

[MetaSchedule] Performance Alignment - NRM and SFM (CUDA) (apache#559)

[MetaSchedule] Perf Alignment - NRM on CUDA (apache#560)

[TIR] Reorder the block iters of the blocks generated by RFactor (apache#561)
Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Bohan Hou <32121147+spectrometerHBH@users.noreply.github.com>
Co-authored-by: Hongyi Jin <3231950289@qq.com>
Co-authored-by: Ruihang Lai <lairuihangdongdong@qq.com>
Co-authored-by: Junru Shao <junrushao1994@gmail.com>
Co-authored-by: Wuwei Lin <wuwei@apache.org>
Co-authored-by: Sunghyun Park <49998730+sunggg@users.noreply.github.com>
Co-authored-by: Xiyou Zhou <xiyou@octoml.ai>
* format

new auto padding algorithm

address comment

revert black

address comment

address comment

format

finally over

rename

auto padding

tmp

make gemm work

minor

auto padder + mutator (undone)

* add new line

* address comment
* meta schedule perf align: misc improvement for search space

* fix unittest

* remove a log(info)

* code review

* update member name

* init_max_fail_count to init_min_unmeasured

Co-authored-by: Junru Shao <junrushao1994@gmail.com>
@@ -134,7 +134,7 @@ struct ReadWriteAtImpl {
impl.MakeLoopAndBlock<is_read>(src->name + "_" + storage_scope);
StmtSRef result_block_sref =
impl.ReplaceScopeBlock(new_loop_block.first.get(), new_loop_block.second->block.get());
-impl.UpdateBlockInfo(result_block_sref);
+impl.UpdateBlockInfo(result_block_sref, !new_loop_block.second->iter_values.empty());
Collaborator

why is affine_binding == !iter_values.empty()?

@@ -617,10 +615,17 @@ class IterMapRewriter : public ExprMutator {
PrimExpr expected_extra_base = 0;
PrimExpr expected_scale = base_scale.value();
for (size_t i = 0; i < expr->args.size();) {
if (is_one(expr->args[i]->source->extent)) {
Collaborator

can we have some test cases for iter map?

return false;
}

void FallbackRule(const For& loop, Array<Integer>* stage, Array<Integer>* order) {
Collaborator

Suggested change
-void FallbackRule(const For& loop, Array<Integer>* stage, Array<Integer>* order) {
+void FallbackSoftwarePipelineRule(const For& loop, Array<Integer>* stage, Array<Integer>* order) {

/*!
* \brief whether the loop's body has the pattern: 2 cache read shared followed by a nested
* software pipeline
* \param the loop
Collaborator

Suggested change
-* \param the loop
+* \param loop the loop

Comment on lines +333 to +335
if (tir::IsCacheReadSharedPattern(loop)) {
stage = {0, 0, 0, 0, 0, 1, 1};
order = {0, 3, 1, 4, 5, 2, 6};
Collaborator

I'd prefer doing it with at least some basic analysis
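
As one possible starting point for the "basic analysis" requested above, the hard-coded annotation could at least be sanity-checked before use. The sketch below is self-contained C++ and does not use any TVM API. `PipelineAnnotation`, `IsValidPipelineAnnotation`, and `CacheReadSharedFallback` are hypothetical names, and the rule that `order` must be a permutation of the statement indices is an assumption about what a well-formed annotation looks like, not something stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the two annotation arrays from the diff.
struct PipelineAnnotation {
  std::vector<int> stage;
  std::vector<int> order;
};

// Returns true when both arrays have one entry per statement, every stage is
// non-negative, and `order` is a permutation of 0..n-1 (assumed requirement).
bool IsValidPipelineAnnotation(const std::vector<int>& stage,
                               const std::vector<int>& order) {
  if (stage.size() != order.size()) return false;
  std::vector<bool> seen(order.size(), false);
  for (std::size_t i = 0; i < order.size(); ++i) {
    if (stage[i] < 0) return false;
    if (order[i] < 0 || static_cast<std::size_t>(order[i]) >= order.size()) return false;
    std::size_t pos = static_cast<std::size_t>(order[i]);
    if (seen[pos]) return false;  // the same position was assigned twice
    seen[pos] = true;
  }
  return true;
}

// The hard-coded values from the diff, guarded by the check above.
PipelineAnnotation CacheReadSharedFallback() {
  PipelineAnnotation ann{{0, 0, 0, 0, 0, 1, 1}, {0, 3, 1, 4, 5, 2, 6}};
  assert(IsValidPipelineAnnotation(ann.stage, ann.order));
  return ann;
}
```

A check like this only catches malformed annotations. It does not decide whether the chosen stages respect data dependencies, which is presumably what a fuller analysis would cover.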

@vinx13
Collaborator

vinx13 commented Feb 16, 2022

@jinhongyii can you rebase so we can get this in?

@junrushao junrushao force-pushed the meta-schedule branch 6 times, most recently from fd5719a to 88cf664 Compare February 21, 2022 17:42
@junrushao
Owner

I'm going to close this PR given it looks like it's been stale for a while. @jinhongyii @vinx13

@junrushao junrushao closed this Jun 1, 2022
junrushao pushed a commit that referenced this pull request Oct 18, 2022
* Shape and type deduction.

* Fix header.

* Add call attrs to the deduce signature.

* Address comments.

* Add DiagnosticContext to IRBuilder and inference signature.

* Fix nits.
junrushao pushed a commit that referenced this pull request Feb 8, 2023
* Shape and type deduction.

* Fix header.

* Add call attrs to the deduce signature.

* Address comments.

* Add DiagnosticContext to IRBuilder and inference signature.

* Fix nits.
junrushao pushed a commit that referenced this pull request Feb 8, 2023
* [IR] Introduce StructInfo

* StructInfoFunctor and Analysis Support

* [TVMScript] Parse type/shape annotation with StructInfo

* remove runtime type assign

* Remove type/shape during parsing (#2)

* Normalizer prep: simple checks and legacy function renaming.

* Struct info deduction in BlockBuilder.

* Two TODOs

* StructInfo Normalizer Fixes (#3)

* StructInfo AST Fix

* Fix Extern Func Deduction and shape mutator.

* Update VoidStructInfo & globalvar (#4)

* Fix passes and proper sinfo propagation.

* Refactor EraseToWellDefined to Enable Remapping

* [WIP] First stab at symbolic param tracking

* Update EraseToWellDefined to support symbolic shape return (#5)

* fix R.shape with ndim (#6)

* Remove update shape/type

* Address review comment, AnnotateTypeShape=>AnnotateStructInfo

* Update include/tvm/script/ir_builder/relax/frame.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

* Address comments

* Update printer to use structinfo (#7)

* Update Error mechanism to prep for obj loc based reporting

* Symbolic shape aware function call return value derivation.

The main flow works as follows:
- Match and populate shape_var_map and var_map by visit each pair of
  param and call arguments.
- Call EraseToWellDefined to map the ret parameter to new result.
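
The two-step flow above can be illustrated with a minimal, self-contained C++17 sketch. It does not use the real TVM/Relax types: `Dim`, `Shape`, `MatchShape`, and `DeriveReturnShape` are hypothetical names, and returning `std::nullopt` for an unbound symbolic variable is only a simplified stand-in for whatever `EraseToWellDefined` actually does, which this commit message does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// A dimension is either a concrete extent or a named symbolic variable.
using Dim = std::variant<int64_t, std::string>;
using Shape = std::vector<Dim>;

// Step 1: bind each symbolic dimension of a parameter to the matching
// dimension of the call argument. Fails on rank mismatch or conflicts.
bool MatchShape(const Shape& param, const Shape& arg,
                std::map<std::string, Dim>* shape_var_map) {
  assert(shape_var_map != nullptr);
  if (param.size() != arg.size()) return false;
  for (std::size_t i = 0; i < param.size(); ++i) {
    if (const auto* var = std::get_if<std::string>(&param[i])) {
      auto [it, inserted] = shape_var_map->emplace(*var, arg[i]);
      if (!inserted && it->second != arg[i]) return false;  // conflicting binding
    } else if (param[i] != arg[i]) {
      return false;  // concrete extents must agree in this sketch
    }
  }
  return true;
}

// Step 2: rewrite the declared return shape using the collected bindings.
// An unbound variable yields std::nullopt (shape treated as unknown).
std::optional<Shape> DeriveReturnShape(const std::vector<Shape>& params,
                                       const std::vector<Shape>& args,
                                       const Shape& ret) {
  if (params.size() != args.size()) return std::nullopt;
  std::map<std::string, Dim> shape_var_map;
  for (std::size_t i = 0; i < params.size(); ++i) {
    if (!MatchShape(params[i], args[i], &shape_var_map)) return std::nullopt;
  }
  Shape result;
  for (const Dim& d : ret) {
    if (const auto* var = std::get_if<std::string>(&d)) {
      auto it = shape_var_map.find(*var);
      if (it == shape_var_map.end()) return std::nullopt;
      result.push_back(it->second);
    } else {
      result.push_back(d);
    }
  }
  return result;
}
```

For example, a function declared with parameter shape `(n, m)` and return shape `(m, n)`, called with an argument of shape `(4, 8)`, would derive the return shape `(8, 4)` under this sketch.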

* [ANALYSIS] Refactor well-form to only look at struct info.

* Update comments according to reviews.

* Update include/tvm/relax/struct_info.h

Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>

Co-authored-by: Siyuan Feng <Hzfengsy@sjtu.edu.cn>
Co-authored-by: Tianqi Chen <tqchen>
Co-authored-by: Ruihang Lai <ruihangl@cs.cmu.edu>
6 participants