- Fixed the dependency issue on PolyTOPS during the build. !873 (All)
- Fixed the int64 data type issue in GPU code generation. !868 (GPU)
Thanks go to these wonderful people:
liuchao, yiyanzhi, hujiahui8, xuhui, yangshuo, nelson.lossing
- Support auto-generation of CPU pool2d (avg/max/global) operators; see the sketch after this list. !763 (CPU)
- Integrate the Symbolic Tiling algorithm into AKG. !764 (Ascend)
- Optimize AutoTiling on the GPU backend. !784 !785 !783 (GPU)
- Optimize the Transpose operator. !811 (CPU)
- Fix a read-write dependency analysis error in the multi-filter case. !778 (All)
- Fix a bug in axis recognition for batch_matmul. !807 (CPU)
- Fix a bug that occurs when the gcc version is lower than 7.3. !817 (All)
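For reference, a minimal NumPy sketch of the pool2d semantics that the generated CPU kernels implement (global pooling is the case where the window covers the whole input); the function and parameter names here are illustrative, not AKG's API:

```python
import numpy as np

def pool2d(x, ksize, stride, mode="max"):
    """2-D pooling over an (H, W) array; mode is 'max' or 'avg'."""
    H, W = x.shape
    oh = (H - ksize) // stride + 1
    ow = (W - ksize) // stride + 1
    reduce_fn = np.max if mode == "max" else np.mean
    out = np.empty((oh, ow), dtype=x.dtype)
    for i in range(oh):
        for j in range(ow):
            # Reduce one ksize x ksize window per output element.
            out[i, j] = reduce_fn(
                x[i * stride:i * stride + ksize, j * stride:j * stride + ksize])
    return out

x = np.arange(16, dtype=np.float32).reshape(4, 4)
print(pool2d(x, ksize=2, stride=2, mode="max"))  # [[ 5.  7.] [13. 15.]]
print(pool2d(x, ksize=4, stride=4, mode="avg"))  # global average: [[7.5]]
```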
Thanks go to these wonderful people:
yangsijia, polyhedral, zhangrenwei, yiyanzhi, hujiahui8, zhengzuohe, zhangzhaochuang, xuhui, liuchao, xiaruijie, DeshiChen, nelson.lossing, zhenzhang, chenlei_autodiff, wYann, hanhuifeng, gaoxiong, zichun_ye, chengbin, wangrao124, huangmengxi, harenome, huoxinyou, yanglinfeng, Etienne
- Integrate the CPU compilation process for CustomOp. !612 (CPU)
- Optimize high-dimensional CSR operators. !664 (GPU)
- Update the restart process of the Ascend backend. !701 (Ascend)
- Support auto-generation of the CSRMM operator; see the sketch after this list. !709 (GPU)
- Integrate intrinsic directives into the new scheduler. !715 (All backends)
- Fix a reshape-elimination bug in the fused-operator compilation pipeline. !707 (All)
- Fix a bug in the to_three_address pass. !728 (Ascend)
- Fix a bug in the operator type matching rules. !730 (All)
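As a reference for what CSRMM computes: a minimal NumPy sketch of a CSR sparse matrix multiplied by a dense matrix, assuming the standard CSR layout (indptr/indices/data). The generated GPU kernel parallelizes this loop nest; the names here are illustrative.

```python
import numpy as np

def csrmm(indptr, indices, data, B):
    """CSR sparse matrix times dense matrix: C = A @ B."""
    n_rows = len(indptr) - 1
    C = np.zeros((n_rows, B.shape[1]), dtype=data.dtype)
    for i in range(n_rows):
        # Nonzeros of row i live in data[indptr[i]:indptr[i+1]].
        for k in range(indptr[i], indptr[i + 1]):
            C[i, :] += data[k] * B[indices[k], :]
    return C

# 3x3 matrix [[1,0,2],[0,3,0],[4,0,5]] in CSR form
indptr  = np.array([0, 2, 3, 5])
indices = np.array([0, 2, 1, 0, 2])
data    = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
B       = np.eye(3) * 2
print(csrmm(indptr, indices, data, B))  # equals the dense A scaled by 2
```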
Thanks go to these wonderful people:
yangsijia, polyhedral, zhangrenwei, yiyanzhi, hujiahui8, zhengzuohe, zhangzhaochuang, xuhui, liuchao, xiaruijie, DeshiChen, nelson.lossing, zhenzhang,
chenlei_autodiff, wYann, hanhuifeng, gaoxiong, zichun_ye, chengbin, wangrao124, huangmengxi, harenome, huoxinyou, yanglinfeng, Etienne
- Support a new loop intrinsic for MindSpore HybridDSL. !560 (All backends)
- Update the shared/local memory promotion strategy on the GPU backend. !556 (GPU)
- Use the new interface provided by isl to reconstruct the reschedule pass that runs after the Pluto scheduling algorithm on the Ascend backend. !512 (Ascend)
- Fix a repeated-tiling bug in the GEMM/convolution operators. !582 (GPU)
- Fix a bug in the reduce operator when the Y direction is reduced and X is small. !559 (CPU)
Thanks go to these wonderful people:
yangsijia, polyhedral, zhangrenwei, yiyanzhi, hujiahui8, zhengzuohe, zhangzhaochuang, xuhui, liuchao, xiaruijie, DeshiChen, nelson.lossing, zhenzhang,
chenlei_autodiff, lingyunli63, wYann, hanhuifeng, gaoxiong, zichun_ye, chengbin, wangrao124, huangmengxi, harenome, lear, huoxinyou, yanglinfeng, Etienne, Trump
- [STABLE] CPU backend support: AKG now supports automatic generation of high-performance code for CPUs with different architectures. !413 (CPU)
- [STABLE] CSR operator support: develop CSR operators (csrmv/csr_mul/csr_reduce_sum) and provide an optimization strategy for CSR operators with a dynamic upper bound; see the sketch after this list. !407 (GPU)
- [STABLE] Scheduler optimization: the outermost band node is now split into multiple filter nodes when there is no dependency between statements. !460 (GPU)
- [STABLE] Replace GMP with imath in isl. !455 (All)
- [STABLE] Apply Autodiff to Custom Op. !464 (All)
- Fix bugs in the CPU tuning process: add a "need_warm_up" attribute to avoid excessive warm-up runs during CPU profiling. !495 (CPU)
- Fix an auto-tiling output memory-flow bug: modify the GEMM auto-tiling strategy to match the actual target constraint. !504 (Ascend)
- Fix a multi-output bug in the emitter: real inputs should not be converted when emitting multiple outputs. !506 (All)
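A minimal NumPy sketch of the csrmv computation, assuming the standard CSR layout; the per-row trip count is where the dynamic upper bound mentioned above comes in. Names are illustrative, not AKG's API.

```python
import numpy as np

def csrmv(indptr, indices, data, x):
    """CSR sparse matrix-vector product: y = A @ x."""
    n_rows = len(indptr) - 1
    y = np.zeros(n_rows, dtype=data.dtype)
    for i in range(n_rows):
        # The trip count indptr[i+1] - indptr[i] differs per row,
        # so the inner loop has a dynamic upper bound.
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y

# 3x4 matrix [[5,0,0,2],[0,0,3,0],[1,4,0,0]] in CSR form
indptr  = np.array([0, 2, 3, 5])
indices = np.array([0, 3, 2, 0, 1])
data    = np.array([5.0, 2.0, 3.0, 1.0, 4.0])
x       = np.array([1.0, 2.0, 3.0, 4.0])
print(csrmv(indptr, indices, data, x))  # [13.  9.  9.]
```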
Thanks go to these wonderful people:
yangsijia, polyhedral, zhangrenwei, yiyanzhi, hujiahui8, zhengzuohe, zhangzhaochuang, xuhui, liuchao, xiaruijie, DeshiChen, nelson.lossing, zhenzhang, chenlei_autodiff, lingyunli63, wYann, hanhuifeng, gaoxiong, zichun_ye, chengbin, wangrao124, huangmengxi, harenome, lear, huoxinyou, yanglinfeng, Etienne, Trump
Contributions of any kind are welcome!
- [STABLE] New operator development: tensor-of-tensor operators (Gather/GatherNd/TensorScatterAdd/UnsortedSegmentSum), which can be used to support GNN networks; see the sketch after this list. !323 (GPU)
- [STABLE] Add a TOPI UserDefine op to AKG, which can be compiled from func_source_string or op_imply_path. !319 (GPU)
- [STABLE] Refactor the lower interface: add StageLower for the staged lowering case. !310 (GPU)
- [STABLE] Add a profiling suite for the new runtime process. !306 (Ascend)
- Fixed a memory promotion bug: sort the clusters before merging. !338 (Ascend)
- Fixed an irregular-reduce bug: replace shfl.down with a shared-memory reduction. !332 (GPU)
- Fixed a foldDimension bug that built a wrong axis relation. !302 (GPU)
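UnsortedSegmentSum is the scatter-style reduction that underlies GNN message aggregation; here is a minimal NumPy sketch of its semantics (a GPU implementation would typically use atomic adds, since several rows can target the same segment):

```python
import numpy as np

def unsorted_segment_sum(data, segment_ids, num_segments):
    """Sum rows of `data` into buckets selected by `segment_ids`."""
    out = np.zeros((num_segments,) + data.shape[1:], dtype=data.dtype)
    for i, seg in enumerate(segment_ids):
        out[seg] += data[i]  # scatter-add: ids may repeat in any order
    return out

# Aggregate four edge features onto three nodes
feat = np.array([[1., 2.], [3., 4.], [5., 6.], [7., 8.]])
ids  = np.array([0, 2, 0, 1])
print(unsorted_segment_sum(feat, ids, 3))
# [[ 6.  8.]
#  [ 7.  8.]
#  [ 3.  4.]]
```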
Thanks go to these wonderful people:
yangsijia, xxxxxxw, polyhedral, zhangrenwei, yiyanzhi, xixixian, hujiahui8, zhengzuohe, lishanni, zhangzhaochuang, xuhui, liuchao, gengzhen, xiaruijie, chenlei_autodiff, lingyunli63, wYann, lvwenyuan, peiwenfang, hanhuifeng, gaoxiong, chengyun
Contributions of any kind are welcome!
- [STABLE] Support optimizing GEMM and Conv with polyhedral scheduling plus TensorCore, and provide an akg::fragment_add/sub/mul/div library for GEMM op fusions. !156 (GPU)
- [STABLE] Optimize layout-related operators (Transpose and pad/unpad) by adjusting the autotiling strategy and resolving bank conflicts for these ops. !152 (GPU)
- [STABLE] Add the Kahan summation algorithm for reduction operators; see the sketch after this list. !107 (GPU)
- [STABLE] Support the transdata + matmul prefusion pattern in AKG. !103 (Ascend)
- Fixed a stitch_fusion bug when the store is shared but the load is not. !109 (GPU)
- Fixed a reshape bug: the tensor shape should be updated when a reshape is added. !140 (GPU)
- Fixed an autofuse bug: broadcasts should be recorded when setting the autofuse config. !111 (GPU)
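Kahan (compensated) summation carries the rounding error lost at each addition into the next one, keeping long reductions accurate in low precision. A minimal NumPy sketch of the technique for a float32 reduction follows; it is illustrative only, not the generated GPU code:

```python
import numpy as np

def kahan_sum_f32(values):
    """Compensated summation in float32."""
    total = np.float32(0.0)
    comp = np.float32(0.0)      # running compensation for lost low-order bits
    for v in values:
        y = v - comp            # apply the correction from the previous step
        t = total + y           # low-order bits of y may be rounded away here
        comp = (t - total) - y  # recover exactly what was lost
        total = t
    return total

data = np.full(10**5, np.float32(0.1))
naive = np.float32(0.0)
for v in data:                  # plain sequential float32 accumulation
    naive = naive + v
print(naive)                    # drifts away from 10000
print(kahan_sum_f32(data))      # stays close to 10000.0
```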
Thanks go to these wonderful people:
yangsijia, xxxxxxw, polyhedral, zhangrenwei, yiyanzhi, xixixian, hujiahui8, zhengzuohe, lishanni, zhangzhaochuang, xuhui, liuchao, gengzhen, xiaruijie, chenlei_autodiff, lingyunli63, wYann, lvwenyuan, peiwenfang, hanhuifeng, gaoxiong, chengyun
Contributions of any kind are welcome!
- [STABLE] Rebuild the AKG repository to provide a new way to support the Ascend backend by linking a static library that contains all the Ascend passes. (Ascend)
- [STABLE] Optimize the reduction add operation on the Ascend backend. (Ascend)
- [STABLE] Add support for tuning elemwise and reduction operators. (GPU)
- Fixed a problem where data prefetch could not be enabled via attributes in the DSL.
- Fixed bugs in the autotiling algorithms on the Ascend platform (tiles too small, failure to adapt to matmul+bias, etc.).
- Fixed local memory promotion for large thread counts. (2980!)
- Fixed a reduce binding dimension issue on the GPU platform. (ff38!)
Thanks go to these wonderful people:
yangsijia, xxxxxxw, polyhedral, zhangrenwei, yiyanzhi, xixixian, hujiahui8, zhengzuohe, lishanni, zhangzhaochuang, xuhui, liuchao, gengzhen, xiaruijie, chenlei_autodiff, lingyunli63, wYann, lvwenyuan, peiwenfang, hanhuifeng, gaoxiong, chengyun
Contributions of any kind are welcome!
- Upload the initial framework.
- Basic support for the Ascend 910 platform and GPU V100.
- Integration with the GraphKernel fusion of MindSpore.