Please leave any comments or edit this issue directly to adjust the release notes! Also see the rc0 vote thread in #12103.
Introduction
The TVM community has worked since the v0.8 release to deliver many exciting features and improvements. v0.9.0 is the first release on the new quarterly release schedule and includes many highlights, such as:
New tvm.relay.build parameters: runtime= and executor=
AOT: support for the C++ runtime (with llvm and c targets only) and support for host-driven AOT in the C runtime
Hexagon RPC support
Testing via Hexagon SDK simulator and on device via Snapdragon-based HDK boards and phones
AOT and USMP support
Threading
Initial op support
MLF: support for multiple modules in a single MLF artifact
Several TIR schedule primitives and transforms including (abridged):
schedule.transform_layout - Applies a layout transformation to a buffer as specified by an IndexMap.
schedule.transform_block_layout - Applies a schedule transformation to a block as specified by an IndexMap.
schedule.set_axis_separators - Sets axis separators in a buffer to lower to multi-dimensional memory (e.g. texture memory).
transform.InjectSoftwarePipeline - Transforms annotated loop nest into a pipeline prologue, body and epilogue where producers and consumers are overlapped.
transform.CommonSubexprElimTIR - Implements common-subexpression elimination for TIR.
transform.InjectPTXAsyncCopy - Rewrites global to shared memory copies in CUDA with async copy when annotated tir::attr::async_scope.
transform.LowerCrossThreadReduction - Enables support for reductions across threads on GPUs.
And many more! See the list of RFCs and PRs included in v0.9.0 for a complete list, as well as the full change list.
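To make the idea behind an index-map layout transformation (as in schedule.transform_layout above) more concrete, here is a minimal conceptual sketch in plain Python. It does not use the TVM API: `transform_layout_sketch`, its arguments, and the example mapping are hypothetical names chosen for illustration, and the real primitive rewrites buffer accesses in TIR rather than copying data.

```python
from itertools import product


def flatten_index(idx, shape):
    """Row-major offset of a multi-dimensional index within `shape`."""
    offset = 0
    for i, extent in zip(idx, shape):
        offset = offset * extent + i
    return offset


def transform_layout_sketch(data, shape, index_map, new_shape):
    """Return a copy of `data` (flat, row-major, logical `shape`) stored
    in the layout described by `index_map` and `new_shape`.

    Assumes `index_map` is a bijection from `shape` onto `new_shape`.
    """
    out = [None] * len(data)
    for idx in product(*(range(extent) for extent in shape)):
        new_idx = index_map(*idx)
        out[flatten_index(new_idx, new_shape)] = data[flatten_index(idx, shape)]
    return out


# Hypothetical example: tile the inner axis of a 2x8 buffer into blocks of 4,
# mapping logical index (i, j) to physical index (j // 4, i, j % 4).
data = list(range(16))
packed = transform_layout_sketch(
    data,
    shape=(2, 8),
    index_map=lambda i, j: (j // 4, i, j % 4),
    new_shape=(2, 2, 4),
)
```

In this sketch the elements of each 4-wide tile end up contiguous in memory, which is the kind of physical layout change an index map expresses while the logical indexing of the computation stays the same.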
RFCs
These RFCs have been merged in apache/tvm-rfcs since the last release.
48d47c5
cfcf114
87ff1fa
f47c6ad
6990e13
a518000
70293c7
7aed0ca
ac15f2a
de4fe97
4203bd2
b9e246f
23250f5
540c1f8
b675ef8
d9dd6eb
9b6203a
41e5ba0
PackedFunc into TVM Object System (#51) (2e0de6c)
f5ef65f
f9fa824
1b14456
a3a7d2c
263335f
1a3d4f1
67c39d2
What's Changed
Note that this list is not comprehensive of all PRs and discussions since v0.8. Please visit the full listing of commits for a complete view: v0.8.0...v0.9.0.rc0.
AOT
BYOC
cudnn.conv.output_shape_from_cudnn #9948
ldmatrix builtin to accelerate copying data from shared memory to warp memory #10855
[PTX] Support mma.sp to use Sparse Tensor Cores and refactor mma codegen #10339
[PTX-MMA] Add full PTX MMA code generation support #9909
CI
last-successful branch #10056
Add bot to ping reviewers after no activity #9973
Add Action to add cc'ed people as reviewers #9934
Frontends
Hexagon
MetaSchedule
MicroTVM
Relay
Add conv2d_backward_weight op (without topi) #9954
Runtime
[PackedFunc] Bring PackedFunc into TVM Object System #10032
[GraphExecutor] Add API get_input_info to graph_executor #9889
TE
TIR
[Software pipeline] Fix hardcoded index in access_ptr rewriting, add a GPU test with depth 4 #11495
TOPI
TVMScript
USMP
microNPU
microTVM
Misc
make docs and doc building instructions #9534
Tutorial for running TVM on Arm(R) Cortex(R)-M55 CPU and Ethos(TM)-U55 NPU #9307
Document Project API server. #9654
Update NEWS to include v0.8 change log #9580
--config argument for config files #11012
[TVMC] Allow output module name to be passed as a command line argument #10962
[TVMC] Support compiling and running with VM #10722
[TVMC] Add configuration tir.add_lower_pass to option --pass-config #9817
[TVMC] Split common tvmc file into more specific files #9529
[TVMC][microTVM] Add new micro context #9229