release memory of task node on the fly #4735

Open
wants to merge 4,485 commits into
base: master
4485 commits
1e129b6
[one::OpBuilder] op interpreter core (#4407)
hjchen2 Mar 24, 2021
da24db6
Refine label check of PR (#4486)
jackalcooper Mar 24, 2021
8fcb985
Add XLA CI back (#4485)
jackalcooper Mar 24, 2021
76ecef1
interface_op support parallel_distribution (#4479)
guo-ran Mar 24, 2021
de2d579
Check xla is enable in script (#4490)
jackalcooper Mar 24, 2021
2b26de2
Refactor builder instr pb list to instr msg list (#4487)
poohRui Mar 24, 2021
7f3a525
Remove PackedBlob and packed_id (#4492)
chengtbf Mar 24, 2021
3d4b3d8
Feat autograd mode (#4496)
wyg1997 Mar 24, 2021
0ae13ad
Use local rpc backend by default if single process (#4493)
jackalcooper Mar 24, 2021
acd52f3
fixbug: mem reuse ctrl safe guard when block has multi-sink (#4497)
chengtbf Mar 24, 2021
2a65ed1
add some quantization ops conversion in onnx (#4465)
mosout Mar 24, 2021
308fa02
user_kernel support parallel_distribution (#4489)
guo-ran Mar 24, 2021
6ae8b0e
construct pure intr msg list without pb (#4494)
poohRui Mar 25, 2021
d202344
Remove RegstDesc::Lock() and TaskNode::LockRegst() (#4506)
chengtbf Mar 25, 2021
ce985ea
Always enable local rpc backend (#4505)
jackalcooper Mar 25, 2021
57ec3bd
Remove CtrlRegstDesc::returned_regst_num and reliant_regst_desc_id (#…
chengtbf Mar 25, 2021
7d44e93
CreateOpKernelState return nullptr when parallel_num = 1 (#4511)
guo-ran Mar 25, 2021
e7739d3
add sequantial callback instruction (#4503)
lixinqi Mar 25, 2021
3c08cb6
Hierarchical boxing sub graph (#4500)
guo-ran Mar 25, 2021
2a404cd
handle ctrl msg from other rank (#4491)
clackhan Mar 25, 2021
60c55b4
[one::OpBuilder] Contruct variable and user op from Python (#4458)
hjchen2 Mar 26, 2021
3205750
operator infer parallel_distribution (#4519)
guo-ran Mar 26, 2021
0b75935
Sequantial instruction (#4521)
lixinqi Mar 26, 2021
8e8d2ad
fix tensor construtor (#4522)
poohRui Mar 26, 2021
71c0337
indexed_slices_optimizer_rewrite_pass parallel_cast to hierarchical_p…
guo-ran Mar 26, 2021
aa67c5f
add fake_quantization op conversion in onnx (#4512)
mosout Mar 28, 2021
9eadf06
Fix compiler bug of multi node with rank info bootstrap (#4524)
clackhan Mar 29, 2021
4516679
ONEFLOW_PROFILER_KERNEL_PROFILE_KERNEL_FORWARD_RANGE (#4525)
liujuncheng Mar 29, 2021
b65f9e4
Feat autograd engine patch (#4529)
wyg1997 Mar 29, 2021
ee2f97d
oneflow.Model for function style api (#4246)
strint Mar 29, 2021
b117fde
add blob object id and eager blob object interface for eager mirrored…
poohRui Mar 29, 2021
ae3e899
fix clip op bug (#4509)
BBuf Mar 29, 2021
c39e79e
PhyInstrOperand (#4531)
lixinqi Mar 29, 2021
b5ed719
fix_multi_node_dead_lock_bug (#4537)
clackhan Mar 29, 2021
1a1ece3
Dev replace attr value proto to cfg (#4387)
ouyangyu Mar 30, 2021
059fa19
Remove BlobDescProto.header_is_opaque (#4541)
chengtbf Mar 30, 2021
afed501
optimize sparse softmax cross entropy gpu (#4532)
guo-ran Mar 30, 2021
9d92fc4
Fix autograd engine bugs (#4542)
wyg1997 Mar 30, 2021
cc6f0bd
CheckpointingPass add fake op as kBackwardPass (#4549)
chengtbf Mar 30, 2021
18c6828
module with op builder (#4546)
daquexian Mar 31, 2021
cae0072
Feat: NCCL use compute stream support 2D SBP (#4533)
chengtbf Mar 31, 2021
dc2d942
add common methods for module implementation (#4551)
daquexian Mar 31, 2021
09c55d0
enable sharing shape between BlobDescs (#4557)
lixinqi Mar 31, 2021
ff0b384
Refactor eager op interpreter. (#4555)
hjchen2 Mar 31, 2021
9b374f8
refine (#4558)
jackalcooper Mar 31, 2021
97d3023
Dry run & compile simulation (#4508)
jackalcooper Mar 31, 2021
09d7d7b
Use pip's local version identifiers (#4498)
jackalcooper Mar 31, 2021
45abd08
fix_partial_memory_leak (#4562)
clackhan Apr 1, 2021
0041be8
Refine nightly (#4561)
jackalcooper Apr 1, 2021
1a07571
Refactor eager mirrored tensor (#4560)
lixinqi Apr 1, 2021
33362de
Optimize compile time memory usage (#4538)
jackalcooper Apr 1, 2021
cc8d6c0
GPT Data Loader (#4527)
leaves-zwx Apr 1, 2021
358352d
fix softmax sbp (#4568)
guo-ran Apr 1, 2021
8f8bde6
fix blob_desc ctor (#4564)
lixinqi Apr 1, 2021
362dcee
Refactor tensor desc as abstract class (#4530)
poohRui Apr 1, 2021
d586051
Support link shared lib Protobuf (#4567)
jackalcooper Apr 1, 2021
58bf291
fuse tril scale softmax dropout (#4566)
guo-ran Apr 2, 2021
c4dbd4a
allow more streams using scheduling thread (#4572)
lixinqi Apr 2, 2021
57c0d84
Fix iteration variable name (#4573)
leaves-zwx Apr 3, 2021
99a8de4
sparse_softmax_cross_entropy support 2d model_parallel (#4576)
guo-ran Apr 5, 2021
5d10c96
try static switch (#4577)
lixinqi Apr 5, 2021
596908c
Local RPC supports TryLockResult::kDoing (#4581)
jackalcooper Apr 6, 2021
4f2cac6
change nccl logical debug log from console to LOG.INFO (#4584)
chengtbf Apr 6, 2021
5950abe
Nccl support s1 to B and P to s1 (#4579)
guo-ran Apr 6, 2021
50156d0
dev_fix_pad_op_sbp (#4569)
Ldpe2G Apr 6, 2021
14fa476
make eager execution async by default (#4499)
daquexian Apr 6, 2021
1f4ac2a
Fix DeviceTick time shape (#4587)
liujuncheng Apr 7, 2021
81da3f5
CheckpointingPass ignore Repeat/Unpack (#4588)
liujuncheng Apr 7, 2021
0b33b4f
clip_by_global_norm and count_not_finite add cast to P (#4580)
guo-ran Apr 7, 2021
d448705
Fix include cuda header (#4590)
liujuncheng Apr 7, 2021
8d6a5b4
NCCL logical op support [*, S(1)] -> [*, B] (#4593)
chengtbf Apr 7, 2021
ddea109
Remove option CUDA_SEPARABLE_COMPILATION (#4592)
liujuncheng Apr 7, 2021
0ce322e
Remove outdate PrintSbpLog in test_ops.cpp (#4585)
chengtbf Apr 7, 2021
96afb78
Refine create NcclCollectiveBoxingExecutorBackend (#4583)
liujuncheng Apr 7, 2021
a3cdcbb
Fix hob device type lifetime (#4594)
jackalcooper Apr 8, 2021
fb8b406
fuse bias add dropout gelu ops (#4595)
guo-ran Apr 8, 2021
9c3f5e3
Don't ssh copy packages if custom op test fails (#4597)
jackalcooper Apr 8, 2021
1e684d7
acc_kernel add half, acc_tick add GetSbp (#4596)
guo-ran Apr 8, 2021
70c5453
Add GradientAccumulationRewritePass (#3734)
liujuncheng Apr 8, 2021
12b950c
PhyInstrOperand::ForEachXXXMirroredObject (#4599)
lixinqi Apr 8, 2021
693253d
Make infer data type away from tensor desc (#4536)
poohRui Apr 8, 2021
9ae08cb
Matmul add alpha Attr (#4603)
guo-ran Apr 8, 2021
a2afc2a
half reduce_sum_like use gemm (#4600)
guo-ran Apr 8, 2021
e9747e9
fix empty initializer (#4598)
poohRui Apr 8, 2021
1a4728a
do_parallel_cast_before_widening_type_cast_pass use hierarchical_para…
guo-ran Apr 8, 2021
2c7c4e9
Add seperated matrix for eager (#4601)
jackalcooper Apr 9, 2021
edde49a
add module common part (#4610)
mosout Apr 9, 2021
6fc7818
model inherit new module (#4602)
strint Apr 9, 2021
8914430
Add static check for user op dtype register (#4611)
poohRui Apr 9, 2021
7489f6f
Optimize repeat placement (#4604)
liujuncheng Apr 9, 2021
bd0ce7b
Check and complete op_conf.scope_symbol_id in JobPass (#4606)
chengtbf Apr 9, 2021
6f2e855
Dry run output gpu mem usage (#4582)
jackalcooper Apr 9, 2021
8c161af
nccl logical 1D op support ParallelDistribution (#4614)
chengtbf Apr 9, 2021
733e840
fix count_not_finite and clip_by_global add's parallel_conf (#4613)
guo-ran Apr 9, 2021
fd01fac
Fix typo: Sender -> Receiver (#4617)
daquexian Apr 11, 2021
5c8708c
[one::OpBuilder] Dev user op gradient (#4515)
hjchen2 Apr 12, 2021
5244008
Fix ctrl_in_op restriction (#4622)
liujuncheng Apr 12, 2021
05ff4c3
VmLocalDepObject (#4608)
lixinqi Apr 13, 2021
50713f3
Add privilege for CI container for gdb (#4634)
jackalcooper Apr 13, 2021
fded90e
Xfjiang/dev save lr to csv (#4633)
ScXfjiang Apr 13, 2021
b70bcbe
Remove Pod and header in BlobDesc (#4607)
chengtbf Apr 13, 2021
8482f8f
Fix Memory Leak (#4624)
clackhan Apr 13, 2021
76edb34
Dev AttrValueMap and support dynamic attributes. (#4628)
hjchen2 Apr 13, 2021
df53a79
Remove RecordBlob and Blob::record_num (#4638)
chengtbf Apr 14, 2021
6fa241e
fix a bug in fake_quantization (#4642)
mosout Apr 14, 2021
26cba8e
Support oneflow convert tools (#4640)
BBuf Apr 14, 2021
c4ab77b
Write blob callback instruction (#4627)
poohRui Apr 14, 2021
3bce9ab
Add cudnn conf in resource (#4646)
clackhan Apr 14, 2021
d97b14c
refactor PhyInstrOperand (#4645)
lixinqi Apr 14, 2021
570a837
Update readme, remove CentOS info, add Ubuntu 20.04 info (#4649)
jackalcooper Apr 15, 2021
05ebd38
fix (#4651)
ScXfjiang Apr 15, 2021
bb8187f
Use vector instead of set for memory reduce (#4653)
chengtbf Apr 16, 2021
c4a2f06
Use cmake type release by default (#4658)
jackalcooper Apr 16, 2021
bd4aee4
pure stateful opkernel (#4648)
daquexian Apr 16, 2021
912b9f7
Add nvtx start end op (#4652)
guo-ran Apr 16, 2021
d20e2ba
refine (#4641)
jackalcooper Apr 17, 2021
1446c40
of_protoobj and of_cfgobj only link protobuf (#4650)
jackalcooper Apr 17, 2021
035d044
Dev autograd interpreter and rewrite batch gather grad function. (#4654)
hjchen2 Apr 19, 2021
7a75bdd
tensor with requires grad setter (#4552)
poohRui Apr 19, 2021
26972b5
Add OF_PROFILER_LOG_HOST_MEMORY_USAGE (#4664)
liujuncheng Apr 19, 2021
69c907c
prune hierarchical_parallel_cast (#4665)
guo-ran Apr 19, 2021
6e050aa
Refactor Global CommNetwork struct and remove deprecated (#4643)
chengtbf Apr 19, 2021
dd1eddc
Axpy use half2 (#4667)
guo-ran Apr 19, 2021
badcc1e
Remove RtBlobDesc (#4644)
chengtbf Apr 19, 2021
1b6ed70
nvtx start end add to clear_list (#4671)
guo-ran Apr 19, 2021
2983103
fused_bias_add_dropout use bias_add when rate is 0 or predict (#4668)
guo-ran Apr 19, 2021
2c269e5
Feat: nccl_use_compute_stream support batch accumulation (#4618)
chengtbf Apr 19, 2021
a84f1d8
Add expand op implementation (#4164)
Ldpe2G Apr 20, 2021
24f3392
CMake creates a source file in build dir (#4676)
jackalcooper Apr 20, 2021
0a6b7dd
refactor tensor/module/op register function (#4677)
poohRui Apr 20, 2021
49dff58
add memory detect info
levi1993 Apr 20, 2021
fa9ab9e
feat(no_grad): add no_grad scope in cpp and python (#4681)
wyg1997 Apr 20, 2021
15aac43
Refactor op expr. (#4673)
hjchen2 Apr 20, 2021
7985af9
acc 0 use memcpy instead of memset axpy (#4682)
guo-ran Apr 20, 2021
f9ab506
add error info (#4685)
guo-ran Apr 20, 2021
ecb00cd
Add more grad functions (#4672)
hjchen2 Apr 20, 2021
47028cc
small fix in opattrref optimize
levi1993 Apr 21, 2021
8b17626
use bitset
jackalcooper Apr 21, 2021
6166733
small fix in OpAttributeRef optimize (#4690)
levi1993 Apr 21, 2021
5c91bbb
run FuseAddToOutputPass again to fuse ops created in the first run (#…
guo-ran Apr 21, 2021
32a89f8
Merge branch 'master' of https://github.com/Oneflow-Inc/oneflow into …
jackalcooper Apr 21, 2021
19c1dd0
prune amp_white_identity (#4683)
guo-ran Apr 21, 2021
cdeac21
Merge branch 'master' into bitset_reachable
jackalcooper Apr 21, 2021
1b618b5
refactor using vector
jackalcooper Apr 21, 2021
16a7b0a
refine
jackalcooper Apr 21, 2021
4890265
refine
jackalcooper Apr 21, 2021
fb72d0e
rename
jackalcooper Apr 21, 2021
e1530eb
refine
jackalcooper Apr 21, 2021
08d805c
Broadcast matmul (#4688)
leaves-zwx Apr 21, 2021
bed1be4
address review
jackalcooper Apr 22, 2021
7988a35
address review
jackalcooper Apr 22, 2021
e1b180f
refine
jackalcooper Apr 22, 2021
22e871d
refine
jackalcooper Apr 22, 2021
74f7855
address review
jackalcooper Apr 22, 2021
4380a0b
Reserve space to avoid resize. (#4699)
hjchen2 Apr 22, 2021
b13d2c9
smaller BITSET_SIZE
jackalcooper Apr 22, 2021
72ab184
refine
jackalcooper Apr 22, 2021
6b9dd35
refine
jackalcooper Apr 22, 2021
80942e5
refine
jackalcooper Apr 22, 2021
5f24817
Remove oneflow_api module (#4697)
jackalcooper Apr 22, 2021
2bfcc32
refine nameing
jackalcooper Apr 22, 2021
c525a19
refine
jackalcooper Apr 22, 2021
ab91007
Fix transpose op (#4695)
leaves-zwx Apr 22, 2021
36190cd
Reserve space to avoid resize. (#4701)
hjchen2 Apr 22, 2021
5855033
refine
jackalcooper Apr 22, 2021
d616db1
Merge branch 'master' of https://github.com/Oneflow-Inc/oneflow into …
jackalcooper Apr 22, 2021
bc9e9f8
refine
jackalcooper Apr 22, 2021
761f23c
Remove user op conf in kernel init ctx (#4659)
clackhan Apr 22, 2021
5939950
Fix memory leaks caused by circle reference. (#4707)
hjchen2 Apr 22, 2021
c652aa8
flow.config.enable_mem_chain_merge(val=True) (#4704)
chengtbf Apr 22, 2021
6dc4d58
Relax tolerance for broadcast matmul test (#4700)
leaves-zwx Apr 22, 2021
6e9c8de
Merge branch 'master' into bitset_reachable
oneflow-ci-bot Apr 22, 2021
37c7d91
Support python3 -m oneflow --doctor (#4692)
jackalcooper Apr 23, 2021
1642063
Merge branch 'master' into bitset_reachable
oneflow-ci-bot Apr 23, 2021
1fa89a2
update
levi1993 Apr 23, 2021
f26ba11
Feat autograd interface and bug fix (#4691)
wyg1997 Apr 23, 2021
c93e3f5
fix sbp (#4709)
leaves-zwx Apr 23, 2021
a1eeb72
Fix spelling error (#4706)
chengtbf Apr 23, 2021
20a18fa
Remove outdate use_memory_allocation_algorithm_v2 (#4703)
chengtbf Apr 23, 2021
2c7712b
merge bitset
levi1993 Apr 23, 2021
9a8a153
delete swp file
levi1993 Apr 23, 2021
e0fd538
add broadcast_matmul to amp white list (#4714)
leaves-zwx Apr 23, 2021
2a8244b
rm useless reshape_op_util (#4702)
guo-ran Apr 23, 2021
0ebc2c8
fix docs export bug (#4715)
BBuf Apr 23, 2021
d45ab83
Fix test threshold in TripletMarginLoss (#4710)
MARD1NO Apr 23, 2021
62c70a3
Use bitset in MakePredicatorIsReachable to reduce memory usage (#4693)
jackalcooper Apr 23, 2021
8295059
support stashing job after passes (#4662)
mosout Apr 23, 2021
5feeb48
Remove MemoryCaseUtil::MergeThrdMemZoneId (#4705)
chengtbf Apr 23, 2021
ac37bd4
RepeatOp regst num hacking (#4684)
liujuncheng Apr 23, 2021
8a43b31
eager local stateful kernel (#4559)
daquexian Apr 23, 2021
e04c629
Add instruction to release tensor (#4717)
poohRui Apr 23, 2021
f3b4d6e
Fix bugs encountered by RNN (#4719)
hjchen2 Apr 24, 2021
6f42eb2
reshape ops infer parallel_distribution (#4716)
guo-ran Apr 24, 2021
078798e
Fused SelfAttention query multipy key and value op (#4660)
leaves-zwx Apr 24, 2021
9ed56bb
fix expand op docs bug (#4722)
BBuf Apr 25, 2021
98725b4
Add diag op (#4382)
hengzi Apr 25, 2021
ffa435a
small update
levi1993 Apr 25, 2021
ceed828
format fix
levi1993 Apr 25, 2021
dfcd6f7
Merge remote-tracking branch 'origin/master' into lml/mem_optimize
levi1993 Apr 25, 2021
99fec55
format modify
levi1993 Apr 25, 2021
f1e2c44
format modify
levi1993 Apr 25, 2021
90b8acc
Update compiler.cpp
levi1993 Apr 26, 2021
9da06ed
Update reshape_user_op_util.cpp
levi1993 Apr 26, 2021
57cd081
eager mirrored tensor refactor (#4711)
daquexian Apr 26, 2021
0d1ce5b
fix mirrored eager tensor with empty parallel desc (#4670)
poohRui Apr 26, 2021
fe153b8
Change get op attributes to get interface op attributes (#4733)
clackhan Apr 26, 2021
0e5cea2
Move tasks when pushing plan (#4708)
jackalcooper Apr 26, 2021
548ff6e
release memory of tas node on the fly
levi1993 Apr 26, 2021
d2cdcf9
fuse cast scalar_mul scalar_mul_by_tensor (#4730)
guo-ran Apr 26, 2021
1f9fbca
fix a bug
levi1993 Apr 26, 2021
d482157
modify comments
levi1993 Apr 26, 2021
15d28d7
dev empty op (#4720)
doombeaker Apr 26, 2021
6708aac
Update task_graph.h
levi1993 Apr 26, 2021
df7ff88
clear edges for tash_gph and use ClearEdges instead of DeleteNode
levi1993 Apr 26, 2021
044f0d1
Update task_graph.cpp
levi1993 Apr 26, 2021
59f1487
release tensor by ReleaseTensor instruction (#4737)
daquexian Apr 26, 2021
7007768
import oneflow in custom_exit (#4736)
jackalcooper Apr 26, 2021
605b74f
remove SourceIntruction, add ResettingIdToObjectMap, fix bug in ForEa…
daquexian Apr 27, 2021
7ff7af2
Refactor physical run (#4713)
lixinqi Apr 27, 2021
e62d63b
make parameter local for now (#4746)
daquexian Apr 27, 2021
dd38d7e
Remove GetPtrOrThrow in the core directory (#4743)
hengzi Apr 27, 2021
670762b
Rename InferXXXFn as XXXInferFn (#4739)
wanghongsheng01 Apr 27, 2021
e556837
support add_to_output fusion (#4749)
leaves-zwx Apr 27, 2021
13411eb
Refactor namespace eager (#4727)
hengzi Apr 27, 2021
3075ef5
stateful local kernel supports dynamic attrs (#4745)
daquexian Apr 27, 2021
339cf5f
Add resnet linear module (#4742)
BBuf Apr 27, 2021
77e0344
New eager test for interface 1.0 (#4741)
jackalcooper Apr 28, 2021
214c3cb
remove useless register func in tensor (#4750)
poohRui Apr 28, 2021
1d7c9ee
fix summary_graph bug (#4712)
hsj0429 Apr 28, 2021
5434a49
fix tmp_buffer in stateful local kernel (#4757)
daquexian Apr 28, 2021
0beac81
Add nn.AvgPool2d Module (#4623)
doombeaker Apr 28, 2021
6c942cc
add exp_tanh_gelu module (#4751)
BBuf Apr 28, 2021
6f3fc5a
Lml/mem optimize (#4725)
levi1993 Apr 28, 2021
b1907da
Merge branch 'master' into lml/release_memory_of_task_node_on_the_fly
levi1993 Apr 28, 2021
346c697
add default value for oneflow range (#4771)
MARD1NO Apr 28, 2021
a63b53e
b21 boxing add ctrl_edge (#4770)
guo-ran Apr 28, 2021
efa8dc7
model_io_v2 process multi variable (#4762)
guo-ran Apr 28, 2021
c936361
add greater_less_argmax module (#4756)
BBuf Apr 28, 2021
f2b28e3
Fix NcclLogicalS2S kernel comm BUG. (#4774)
chengtbf Apr 28, 2021
87e83a6
Add flatten module and unit test (#4759)
Ldpe2G Apr 28, 2021
ce04eff
NCCL logical op support S1 to B (#4772)
chengtbf Apr 29, 2021
d24969f
make ClearNodes and ClearEdges protected
levi1993 Apr 29, 2021
b03a77e
Merge branch 'master' into lml/release_memory_of_task_node_on_the_fly
levi1993 Apr 29, 2021
1266c94
fix bug about pos_weight (#4768)
MARD1NO Apr 29, 2021
9361227
rm prefix Protected
levi1993 Apr 29, 2021
022a2eb
Merge branch 'lml/release_memory_of_task_node_on_the_fly' of github.c…
levi1993 Apr 29, 2021
c0bcdc7
small fix
levi1993 Apr 29, 2021
ba18214
Create default log under log dir (#4766)
poohRui Apr 29, 2021
4bc260a
Dev refactor attr value map (#4755)
hjchen2 Apr 29, 2021
feb5d18
Merge branch 'master' into lml/release_memory_of_task_node_on_the_fly
levi1993 Apr 29, 2021
10 changes: 4 additions & 6 deletions .clang-format
@@ -19,7 +19,7 @@ AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
BraceWrapping:
AfterClass: true
AfterControlStatement: false
AfterEnum: false
@@ -37,19 +37,18 @@ BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 80
ColumnLimit: 100
CommentPragmas: '^ IWYU pragma:'
BreakBeforeInheritanceComma: false
ConstructorInitializerAllOnOneLineOrOnePerLine: true
ConstructorInitializerIndentWidth: 4
ContinuationIndentWidth: 4
Cpp11BracedListStyle: true
DerivePointerAlignment: true
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
ForEachMacros: [ foreach, Q_FOREACH, BOOST_FOREACH ]
IncludeCategories:
IncludeCategories:
- Regex: '^<.*\.h>'
Priority: 1
- Regex: '^<.*'
@@ -78,7 +77,7 @@ PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Left
ReflowComments: true
SortIncludes: true
SortIncludes: false
SpaceAfterCStyleCast: false
SpaceAfterTemplateKeyword: false
SpaceBeforeAssignmentOperators: true
@@ -94,4 +93,3 @@ Standard: Cpp11
TabWidth: 8
UseTab: Never
...

35 changes: 35 additions & 0 deletions .dockerignore
@@ -0,0 +1,35 @@
**/.git
/build
/build-*
/docs/build
/cmake-build-*
/third_party
/examples/**/oneflow
/benchmark/**/oneflow
/.vscode
/.idea
/.clangd
/dist
/wheelhouse*
/.DS_Store
/tmp_wheel
/manylinux*

**/__pycache__
**/*.pyc
**/log
**/.ipynb_checkpoints
**/core.0*
**/core.1*
**/core.2*
**/core.3*
**/core.4*
**/core.5*
**/core.6*
**/core.7*
**/core.8*
**/core.9*
/.cache
/oneflow-src.zip
/distributed-tmp
/serving-tmp
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
@@ -1 +1,3 @@
* @willzhang4a58
*test.cpp @lixinqi
/oneflow/core/kernel/opkernel_test_case.* @lixinqi
11 changes: 11 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE/general_template.md
@@ -0,0 +1,11 @@
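## Overview


## PR Checklist
- [ ] The PR title reads smoothly, clearly states what the PR does, and can be used directly as a changelog entry for a new release
- [ ] Code is formatted
- [ ] Compiles locally
- [ ] Changes have been tested locally
- [ ] A type label has been added: (fill in the type label name, e.g. `bug, enhancement, purge, feature, documentation`)
- [ ] A component label has been added: (fill in the component label name, e.g. `op, system, eager, build, xla, python, ci, test, tooling, onnx`)
- [ ] A review was requested before converting the draft into a regular PR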
69 changes: 69 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE/op_template.md
@@ -0,0 +1,69 @@
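## Overview
Describe the op's functionality, formulas, and so on. If the interface of another framework was used as a reference, include hyperlinks to it.

## Feature CheckList
**Note**: All feature checkboxes are optional. If one is left unchecked, just state the reason. For example: the Op is composed from Python interfaces, so it has no `SetBatchAxisInferFn` Op registration; or: the Op has no inputs, so it has no `SetInputArgModifyFn`.

Checkboxes that come with the template may be left empty but must not be deleted. Add further checkbox items as the situation requires.

### Op
- [ ] Op SetBatchAxisInferFn
- [ ] Op SetGetSbpFn
- [ ] Op SetInputArgModifyFn
- [ ] Op backward gradient registration

### Kernel
- [ ] CPU in:float32
- [ ] CPU in:float64
- [ ] CPU in:int32
- [ ] CPU in:int64
- [ ] CPU in:int8

- [ ] GPU in:float32
- [ ] GPU in:float64
- [ ] GPU in:int32
- [ ] GPU in:int64
- [ ] GPU in:float16
- [ ] GPU in:int8


### Python Wrapper
- [ ] Python API argument checking and error messages
- [ ] Interface docstring
- [ ] Example

### Tests
- [ ] Single-node single-device CPU Test Case
- [ ] Single-node single-device GPU Test Case
- [ ] Single-node multi-device CPU Test Case
- [ ] Single-node multi-device GPU Test Case
- [ ] Distributed CPU Test Case
- [ ] Distributed GPU Test Case

## GPU effective bandwidth
For Ops with a GPU implementation, measure the effective bandwidth following https://github.com/Oneflow-Inc/OneTeam/issues/167 and attach the test report.
Below is a sample report:

Theoretical bandwidth: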
```text
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 250798.5
```

Measured bandwidth:
```
PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2 elapsed(ms): 0.196064 memory_size(Byte): 50331648 bandwidth(GB/s): 239.08
PROFILER::KERNEL::CUDA_MEMORY_BANDWIDTH op_name: sqrt_2_grad elapsed(ms): 0.29072 memory_size(Byte): 75497472 bandwidth(GB/s): 241.856
```


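## PR Checklist
- [ ] The PR title reads smoothly, clearly states what the PR does, and can be used directly as a changelog entry for a new release
- [ ] Code is formatted
- [ ] Compiles locally
- [ ] Changes have been tested locally
- [ ] A type label has been added: (fill in the type label name, e.g. `bug, enhancement, purge, feature, documentation`)
- [ ] A component label has been added: (fill in the component label name, e.g. `op, system, eager, build, xla, python, ci, test, tooling, onnx`)
- [ ] A review was requested before converting the draft into a regular PR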
47 changes: 47 additions & 0 deletions .github/actions/mac-build/action.yml
@@ -0,0 +1,47 @@
name: "Build OneFlow on macOS"
description: ""
runs:
  using: "composite"
  steps:
    - name: Install dependencies
      run: |
        brew install nasm
      shell: bash
    - name: Set environment variables
      run: |
        set -x
        cmake_flags=""
        cmake_flags+=" -DPython3_EXECUTABLE=$(which python3)"
        cmake_flags+=" -DRPC_BACKEND=LOCAL"
        cmake_flags+=" -DCMAKE_BUILD_TYPE=Release"
        cmake_flags+=" -DBUILD_CUDA=OFF"
        echo "cmake_flags=${cmake_flags}" >> $GITHUB_ENV
      shell: bash
    - name: Build (third party)
      run: |
        mkdir -p build
        cd build
        cmake .. $cmake_flags -DTHIRD_PARTY=ON -DONEFLOW=OFF
        make -j $(nproc)
      shell: bash
    - name: Build (of_ccobj)
      run: |
        mkdir -p build
        cd build
        cmake .. $cmake_flags -DTHIRD_PARTY=OFF -DONEFLOW=ON
        make -j 2 of_ccobj
      shell: bash
    - name: Build (oneflow_internal)
      run: |
        mkdir -p build
        cd build
        cmake .. $cmake_flags -DTHIRD_PARTY=OFF -DONEFLOW=ON
        make -j 2 oneflow_internal
      shell: bash
    - name: Build (generate_api)
      run: |
        mkdir -p build
        cd build
        cmake .. $cmake_flags -DTHIRD_PARTY=OFF -DONEFLOW=ON
        make -j 2 generate_api
      shell: bash
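
The third-party build step above calls `make -j $(nproc)`. `nproc` comes from GNU coreutils and may not be present on a macOS runner unless coreutils is installed, so a more portable way to pick the job count can be sketched as follows. The helper name `cpu_count` is hypothetical and not part of this PR; `sysctl -n hw.ncpu` is the usual macOS query and `getconf _NPROCESSORS_ONLN` is a widely supported fallback.

```shell
# Hypothetical helper (not part of this PR): print the number of CPUs,
# trying nproc first, then the macOS sysctl key, then getconf.
cpu_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  elif sysctl -n hw.ncpu >/dev/null 2>&1; then
    sysctl -n hw.ncpu
  else
    # Last resort: ask getconf, and fall back to a single job.
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

# Example use inside a build step:
#   make -j "$(cpu_count)"
```

Keeping the lookup in one function means every build step can share the same job count instead of mixing `$(nproc)` with hard-coded values.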
14 changes: 14 additions & 0 deletions .github/actions/setup/action.yml
@@ -0,0 +1,14 @@
inputs:
  name:
    description: 'Placeholder'
    default: 'Placeholder'
runs:
  using: "composite"
  steps:
    - run: |
        echo $HOSTNAME
        rm -rf build/third_party
        bash ci/setup_submodule.sh
        auth_header="$(git config --local --get http.https://github.com/.extraheader)"
        git -c "http.extraheader=$auth_header" -c protocol.version=2 submodule update --init --recursive
      shell: bash
31 changes: 31 additions & 0 deletions .github/actions/upload_oss/action.yml
@@ -0,0 +1,31 @@
inputs:
  src_path:
    required: true
  oss_dst_path:
    required: true
  oss_access_key_id:
    required: true
  oss_access_key_secret:
    required: true
runs:
  using: "composite"
  steps:
    - run: |
        if [ -z "$OSS_ACCESS_KEY_ID" ]
        then
          exit 0
        fi
        if [ ! -f "$HOME/ossutil64" ]; then
          curl http://gosspublic.alicdn.com/ossutil/1.6.19/ossutil64 -o $HOME/ossutil64
        fi
        chmod 755 $HOME/ossutil64
        $HOME/ossutil64 config -e oss-cn-beijing.aliyuncs.com -i ${{ inputs.oss_access_key_id }} -k ${{ inputs.oss_access_key_secret }} -L EN -c $HOME/.ossutilconfig
        dir_arg=""
        if [ -d "${{ inputs.src_path }}" ]; then
          dir_arg="--recursive"
        fi
        $HOME/ossutil64 cp --update ${dir_arg} ${{ inputs.src_path }} ${{ inputs.oss_dst_path }}
      shell: bash
      env:
        OSS_ACCESS_KEY_ID: ${{ inputs.oss_access_key_id }}
        OSS_ACCESS_KEY_SECRET: ${{ inputs.oss_access_key_secret }}
26 changes: 26 additions & 0 deletions .github/actions/upload_ssh/action.yml
@@ -0,0 +1,26 @@
name: "Upload via ssh"
description: ""
inputs:
  src_path:
    required: true
    description: ""
  dst_host:
    required: true
    description: ""
  dst_path:
    required: true
    description: ""
runs:
  using: "composite"
  steps:
    - run: |
        set -x
        dir_arg=""
        if [ -d "${{ inputs.src_path }}" ]; then
          dir_arg="-r"
        fi
        parent_dir=$(dirname ${{ inputs.dst_path }})
        ssh -o StrictHostKeyChecking=no ${{ inputs.dst_host }} mkdir -p $parent_dir
        ssh ${{ inputs.dst_host }} rm -rf ${{ inputs.dst_path }}
        scp ${dir_arg} ${{ inputs.src_path }} ${{ inputs.dst_host }}:${{ inputs.dst_path }}
      shell: bash
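
The upload step above makes three remote calls in order: create the parent directory, remove any stale destination, then copy with `-r` only when the source is a directory. A dry-run sketch of that sequence, which prints the commands instead of running them, could look like this. The function names `copy_flag_for` and `plan_upload` are hypothetical and not part of the action. Like the original script, the printed commands leave paths unquoted, so paths containing spaces would need extra care.

```shell
# Hypothetical helper: print "-r" when the path is a directory.
copy_flag_for() {
  if [ -d "$1" ]; then
    printf '%s' "-r"
  fi
}

# Hypothetical dry run: print the commands the action would issue.
plan_upload() {
  src=$1
  host=$2
  dst=$3
  flag=$(copy_flag_for "$src")
  parent=$(dirname "$dst")
  echo "ssh -o StrictHostKeyChecking=no $host mkdir -p $parent"
  echo "ssh $host rm -rf $dst"
  # ${flag:+ $flag} expands to " -r" for directories and to nothing for files.
  echo "scp${flag:+ $flag} $src $host:$dst"
}
```

Running `plan_upload ./wheelhouse some-host /tmp/pkg/wheelhouse` (host and paths made up for illustration) lists the three commands, which makes it easy to review the destructive `rm -rf` target before anything is executed.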
36 changes: 36 additions & 0 deletions .github/actions/whl/action.yml
@@ -0,0 +1,36 @@
inputs:
  tmp_dir:
    description: "tmp dir"
    required: true
  cuda_version:
    description: "cuda_version"
    default: "10.2"
  python_version:
    description: "python_version"
    default: "3.6"
  extra_flags:
    description: "flags like --xla"
    default: ""
  extra_docker_args:
    description: ""
    default: ""
runs:
  using: "composite"
  steps:
    - run: |
        set -x
        src_dir=${PWD}
        tmp_dir="${{ inputs.tmp_dir }}"
        mkdir -p ${tmp_dir}
        cd ${tmp_dir}
        docker run --rm -v $PWD:/p -w $PWD:/p busybox rm -rf /p/wheelhouse
        python3 ${src_dir}/docker/package/manylinux/build_wheel.py \
            --cuda_version=${{ inputs.cuda_version }} \
            --python_version=${{ inputs.python_version }} \
            --use_tuna --use_system_proxy --use_aliyun_mirror \
            --wheel_house_dir=${tmp_dir}/wheelhouse \
            --oneflow_src_dir=${src_dir} ${{ inputs.extra_flags }} \
            --retry=1 \
            --skip_img \
            --extra_docker_args "${extra_docker_args}"
      shell: bash
19 changes: 19 additions & 0 deletions .github/workflows/pr.yml
@@ -0,0 +1,19 @@
name: Check PR

on:
  pull_request:
    types: [opened, labeled, unlabeled, synchronize]

jobs:
  check_labels:
    runs-on: ubuntu-18.04
    name: Labels
    steps:
      - name: Check type labels 'bug, enhancement, purge, feature, documentation'
        if: (contains(github.event.pull_request.labels.*.name, 'bug') || contains(github.event.pull_request.labels.*.name, 'enhancement') || contains(github.event.pull_request.labels.*.name, 'purge') || contains(github.event.pull_request.labels.*.name, 'feature') || contains(github.event.pull_request.labels.*.name, 'documentation')) == false
        run: |
          exit 1
      - name: Check component labels 'op, system, eager, build, xla, python, ci, test, tooling, onnx'
        if: (contains(github.event.pull_request.labels.*.name, 'op') || contains(github.event.pull_request.labels.*.name, 'system') || contains(github.event.pull_request.labels.*.name, 'eager') || contains(github.event.pull_request.labels.*.name, 'build') || contains(github.event.pull_request.labels.*.name, 'xla') || contains(github.event.pull_request.labels.*.name, 'python') || contains(github.event.pull_request.labels.*.name, 'ci') || contains(github.event.pull_request.labels.*.name, 'test') || contains(github.event.pull_request.labels.*.name, 'tooling') || contains(github.event.pull_request.labels.*.name, 'onnx')) == false
        run: |
          exit 2
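
The two steps above fail the job when the PR carries no type label (exit code 1) or no component label (exit code 2). The same rule can be sketched as a local shell check, for instance to try out label sets before pushing. The function names `has_any_label` and `check_pr_labels` are hypothetical and not part of this workflow. Labels are passed as a newline-separated string, and matching here is exact and case-sensitive, which may be stricter than the `contains()` expression used by the workflow.

```shell
# Hypothetical helper: succeed if any line of $1 (newline-separated labels)
# exactly equals one of the remaining arguments.
has_any_label() {
  labels=$1
  shift
  for wanted in "$@"; do
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    if printf '%s\n' "$labels" | grep -qxF -- "$wanted"; then
      return 0
    fi
  done
  return 1
}

# Return 0 when both label groups are covered, 1 when the type label is
# missing, 2 when only the component label is missing.
check_pr_labels() {
  labels=$1
  has_any_label "$labels" bug enhancement purge feature documentation || return 1
  has_any_label "$labels" op system eager build xla python ci test tooling onnx || return 2
  return 0
}
```

For example, `check_pr_labels "$(printf 'enhancement\nsystem')"` succeeds, while a PR labelled only `system` would return 1 because the type check runs first, just as the first workflow step runs before the second.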