Sync ORTModule branch with master and fix tests #6526

Merged
Changes from 1 commit
Commits (202)
64709b1
Deprecate Python global configuration functions [Part 1] (#5923)
edgchen1 Dec 15, 2020
297c824
remove dnnl_dll_path from post build copy (#6142)
jywu-msft Dec 15, 2020
980a93c
Model Fusion For Bart (#6105)
liuziyue Dec 15, 2020
ac62cf8
Unify IExecutionProvider and IExecutionProviderFactory interfaces (#6…
RyanUnderhill Dec 16, 2020
939cc9b
Enable running the mnist_training sample without cuda (#6085)
georgen117 Dec 16, 2020
b648bf6
nnapi add min max support (#6117)
guoyu-wang Dec 16, 2020
0978d2b
Fix CUDA test hang: (#6138)
toothache Dec 16, 2020
aa49e47
Fix TensorRT kernel conflict issue for subgraphs of control flow oper…
stevenlix Dec 16, 2020
8fd0858
Add gradient registration for Abs. (#6139)
Dec 16, 2020
8269048
Partition initial optimizer state for Zero-1 (#6093)
ashbhandare Dec 16, 2020
7250562
Fix edge case in BFCArena where allocation failures could lead to an …
skottmckay Dec 16, 2020
344a2a8
Revert "work around of the build break in mac (#6069)" (#6150)
snnn Dec 16, 2020
0fa04bd
Fix clean_docker_image_cache.py detection of image pushes. (#6151)
edgchen1 Dec 17, 2020
503b61d
MLAS: add NEON version of int8 depthwise convolution (#6152)
tracysh Dec 17, 2020
36c03b3
Using a map of of ops to stages as input of partition function. (#5940)
Dec 17, 2020
efa1b0d
Minor fix to satisfy c++14 (#6162)
pranavsharma Dec 17, 2020
32c67c2
Deprecating Horovod and refactored Adasum computations (#5468)
Dec 18, 2020
dec703b
Update TensorRT-ExecutionProvider.md (#6161)
jayrodge Dec 18, 2020
34725ae
Bugfix for topk cuda kernel (#6164)
duli2012 Dec 18, 2020
98d8a3e
Revert "Fuse MatMulIntegerToFloat only when scales are scalar (#6008)…
yufenglee Dec 18, 2020
c339bb2
Remove ignored build warnings for pybind on Mac (#6165)
guoyu-wang Dec 18, 2020
adc2071
save_checkpoint, load_checkpoint and aggregate_checkpoints (#6136)
baijumeswani Dec 18, 2020
824ef9a
Don't try to bind unused inputs in the Training frontend (#6166)
Dec 18, 2020
86493e6
Update documentation for contributing a PR and add deprecation notice…
pranavsharma Dec 18, 2020
39aedbc
aggregate model states only for the case when mixed precision was tru…
baijumeswani Dec 18, 2020
bbb52e9
[NNAPI EP] Enable per-channel quantization for QlinearConv (#6155)
guoyu-wang Dec 19, 2020
11b0a54
Fix typo in BERT pretraining script (#6175)
Dec 19, 2020
cd3a5ac
Update get_docker_image.py to enable use without image cache containe…
edgchen1 Dec 19, 2020
2da8060
Helper for compiling EP to generate deterministic unique ids for use …
skottmckay Dec 21, 2020
f874260
Backend APIs for checkpointing (#5803)
jingyanwangms Dec 21, 2020
201d0db
Android coverage dashboard (#6163)
satyajandhyala Dec 21, 2020
ea9cfa5
Add usage details of unified MCR container image (#6182)
smkarlap Dec 21, 2020
53307a5
improve perf for softmax (#6128)
weixingzhang Dec 21, 2020
67ac6ae
Tune fast Gelu to use exp(x) instead of tanh(x) on Rocm platform (#6174)
Dec 22, 2020
234e94b
Add Status.csv to EP Perf Tool (#6167)
oliviajain Dec 22, 2020
945fae8
Lochi/quantization tool for trt (#6103)
chilo-ms Dec 22, 2020
fc27074
Implement ScatterND for CUDA EP (#6184)
hariharans29 Dec 22, 2020
04b3e0e
Condition fix in Resize operator (#6193)
hariharans29 Dec 22, 2020
a8b4826
Clean up checkpoint tests to use the new checkpoint functions (#6188)
baijumeswani Dec 22, 2020
21395f8
Implement comparing outputs that are sequence of maps of strings to f…
Dec 22, 2020
c562952
Dockerfile to build onnxruntime with ROCm 4.0
jessebenson Dec 21, 2020
0494a0f
Add ability to skip GPU tests based on GPU adapter name (#6198)
Dec 22, 2020
7347996
Openvino ep 2021.2 (#6196)
sfatimar Dec 23, 2020
1fc7f92
Fix a memory leak in test_inference.cc (#6201)
snnn Dec 25, 2020
52228a7
Use TArray in AMD element-wise kernels, rather than manually copying …
jessebenson Dec 22, 2020
7ccdfed
Remove most ROCm-specific element-wise code and reuse CUDA element-wi…
jessebenson Dec 22, 2020
8a0f5c5
Minor change to improve performance for operator Pad. (#5537)
xadupre Dec 28, 2020
2d09db6
Support double for operators Log, Reciprocal, Sum (CPU) (#6032)
xadupre Dec 28, 2020
111ac29
Support double for operators Where, LpNormalisation (#6034)
xadupre Dec 28, 2020
df7e2f3
Support double for operators Relu, Tanh, Sigmoid (#6221)
xadupre Dec 29, 2020
bbb6b41
Fix ImportError in build.py (#6231)
mgoin Dec 30, 2020
5c584b2
Removed executor todo that looks dead. (#6234)
michaelgiba Dec 31, 2020
1b23b28
Remove MKLML/openblas/jemalloc build config (#6212)
snnn Dec 31, 2020
3911105
Remove python 3.5
snnn Dec 31, 2020
c15a858
Update the readme file
snnn Dec 31, 2020
39a988c
Upgrade build.py to assert for python 3.6+
WilliamTambellini Dec 1, 2020
4cc2ffe
Support MLFloat16 type in Pow opset-12 CUDA kernel (#6233)
hariharans29 Dec 31, 2020
ecb2e11
MLAS: handle MlasGemm(M/N/K==0) cases (#6238)
tracysh Dec 31, 2020
70e2f96
Support double for operator TopK + fix one bug in TopK implementation…
xadupre Dec 31, 2020
5968a91
Support double for operator Gemm + fix bug in gemm implementation for…
xadupre Dec 31, 2020
84addcd
Support double for operator ReduceMean, ReduceLogSumExp (#6217)
xadupre Dec 31, 2020
cd14c1a
Support double for operator ArgMin (#6222)
xadupre Dec 31, 2020
d5cb17c
Update BUILD.md
snnn Dec 31, 2020
1685167
Update manylinux docker image to the latest (#6242)
snnn Jan 1, 2021
ffb4b62
Fix allocator issue for TensorRT IOBinding (#6240)
HectorSVC Jan 1, 2021
46e0e4e
Tune BiasGeluGradDx kernel in approximation mode to avoid tanh(...) o…
Jan 2, 2021
c8de3f3
Refactor EP Perf Tool (#6202)
oliviajain Jan 4, 2021
93bf7c4
Documentation for distributed CI tests pipeline (#6140)
baijumeswani Jan 4, 2021
6fd9d34
Remove a debug log in provider_test_utils.cc (#6200)
snnn Jan 4, 2021
493bf93
Add the Concat Slice Elimination transform, fix constant_folding tran…
ashbhandare Jan 5, 2021
ce6161c
Add MakeStringLite which uses current locale, update some MakeString …
edgchen1 Jan 5, 2021
addb4b8
Liqun/speech model loop to scan (#6070)
liqunfu Jan 5, 2021
eea3806
model parallel refinement (#6244)
pengwa Jan 6, 2021
d42399e
Allow querying a GraphProto's doc_string as part of ModelMetadata (#6…
hariharans29 Jan 6, 2021
2347de4
Fix Linux/Mac error message on input type mismatch (#6256)
hariharans29 Jan 6, 2021
431604e
add bfloat16 to gathergrad type constrains (#6267)
souptc Jan 6, 2021
bbc9ed9
Fix VS 2017 build break (#6276)
hariharans29 Jan 7, 2021
d761571
Deprecate Python global configuration functions [Part 2] (#6171)
edgchen1 Jan 7, 2021
481a2cd
Add script to preprocess python documentation before publishing (#6129)
xadupre Jan 7, 2021
b80e8ce
rename past to past_key_values for GPT-2 (#6269)
tianleiwu Jan 7, 2021
c109486
Rename MakeString and ParseString functions. (#6272)
edgchen1 Jan 7, 2021
04287ec
Increase timeout for Linux GPU CUDA11 build. (#6280)
edgchen1 Jan 7, 2021
a72fcbd
Add helper to compare model with different precision (#6270)
wangyems Jan 8, 2021
7fc827a
Fix Min/Max CPU kernels for float16 type (#6205)
hariharans29 Jan 8, 2021
ac5ca2b
fix data_ptr assertion error for past_sequence_length=0 in GPT-2 (#6284)
tianleiwu Jan 8, 2021
da952a9
A list of changes in transformers tool (#6224)
wangyems Jan 8, 2021
1059bfa
Workaround for static_cast<double>(half)
jessebenson Jan 8, 2021
fa851bf
Add workaround to remove ROCm-specific binary-elementwise files.
jessebenson Jan 8, 2021
5084ce0
Update nuget build (#6297)
snnn Jan 11, 2021
84024bd
Enable ONNX backend test of SequenceProto input/output (#6043)
jcwchen Jan 11, 2021
938e65d
add --sequence_lengths option (#6285)
tianleiwu Jan 11, 2021
ac5b5e5
more dtype for Equal CUDA kernel (#6288)
centwang Jan 12, 2021
c43ca45
Force reinstall onnx python package on Windows (#6309)
snnn Jan 12, 2021
a038924
update transformers required package versions (#6315)
tianleiwu Jan 12, 2021
3b3e698
Remove abs in LpPool (#6303)
luyaor Jan 12, 2021
a825766
Support 1D input for Conv + Mul/Add fusion optimizer with test (#6295)
zhanghuanrong Jan 12, 2021
ec81e29
Add longformer to python package (#6314)
tianleiwu Jan 12, 2021
b491d7c
Avoid false sharing on thread pool data structures (#6298)
tlh20 Jan 12, 2021
0ed56d4
fix opset imports for function body (#6287)
askhade Jan 12, 2021
aacc8db
Remove false positive prefast warning from threadpool (#6324)
tlh20 Jan 12, 2021
6b73bae
Java: add Semmle to Java publishing pipelines (#6326)
yuslepukhin Jan 12, 2021
f77ff1b
Quantization support for split operator with its NHWC support (#6107)
zhanghuanrong Jan 13, 2021
aeca96c
Liqun/enable pipeline parallel test (#6331)
liqunfu Jan 13, 2021
5623cc6
Use onnxruntime_USE_FULL_PROTOBUF=OFF for the cuda execution provider…
alberto-magni Jan 13, 2021
87ec1f6
MLAS: add fallback implementation for quantized GEMM (#6335)
tracysh Jan 13, 2021
56ab216
Delete float16.py (#6336)
oliviajain Jan 13, 2021
62e4045
Enable add + softmax fusion for Rocm platform (#6259)
Jan 13, 2021
f7034b9
add external data support to tensor proto utils (#6257)
askhade Jan 13, 2021
d367941
changed wording. (#6337)
Jan 13, 2021
cfd6f10
Remove OpSchema dummy definition. Only needed for Function now, and w…
skottmckay Jan 13, 2021
fcd9fc9
remove gemmlowp submodule (#6341)
tracysh Jan 13, 2021
b220fee
[NNAPI] Add pow support (#6310)
guoyu-wang Jan 14, 2021
042053c
Add support for running Android emulator from build.py on Windows. (#…
edgchen1 Jan 14, 2021
e35db19
fix the pipeline failure (#6346)
guoyu-wang Jan 14, 2021
4df356d
Train BERT Using BFloat16 on A100 (#6090)
centwang Jan 14, 2021
5b9d993
Fix DerefNullPtr issues raised by SDLNativeRules. (#6348)
pranavsharma Jan 14, 2021
c24f295
update quantize to support basic optimization and e2e example for ima…
yufenglee Jan 14, 2021
fd21c84
Enable graph save for orttrainer (#6333)
ashbhandare Jan 14, 2021
ea6789b
Add PREfast to python packaging pipeline (#6343)
snnn Jan 14, 2021
5d9552c
fix longformer benchmark io_binding output_buffers (#6345)
wangyems Jan 14, 2021
e54e2f9
Use readelf for minimal build binary size checks. (#6338)
skottmckay Jan 14, 2021
6d0fb3e
Java: Set C language warnings to W4 and adjust JNI code (#6347)
yuslepukhin Jan 14, 2021
8ce252c
Pipeline Parallel Experimental Python API (#5815)
wschin Jan 15, 2021
961bb62
Add create session to WinML telemetry to track WinML Usage (#6356)
Jan 15, 2021
c8e37e3
Fix one more SDL warning (#6359)
pranavsharma Jan 15, 2021
f5a4f7f
fix -Wdangling-gsl (#6357)
askhade Jan 15, 2021
eab164e
Add python example of TensorRT INT8 inference on ResNet model (#6255)
stevenlix Jan 15, 2021
4db4982
This added telemetry isn't needed (#6363)
Jan 16, 2021
5b6753c
Wezuo/memory analysis (#5658)
wezuo Jan 19, 2021
baac7c9
Support MLFloat16 in CumSum Cuda op for Opset 14 (#6355)
tianleiwu Jan 19, 2021
ac36596
fix convert_common version retrival (#6382)
wangyems Jan 19, 2021
d7bdd96
Refine auto_pad based pad computation in ConvTranspose (#6305)
hariharans29 Jan 20, 2021
a1b5bfc
Fix SDL warning (#6390)
hariharans29 Jan 20, 2021
453431f
Add max_norm for gradient clipping. (#6289)
pengwa Jan 20, 2021
69af044
Add the custom op project information (#6334)
wenbingl Jan 20, 2021
33f60a0
Dont use default string marshalling in C# (#6219)
hariharans29 Jan 21, 2021
d9e4795
Fix Windows x86 compiler warnings in the optimizers project (#6377)
hariharans29 Jan 21, 2021
8574854
[Perf] Optimize Tile CPU and CUDA kernels for a corner case (#6376)
hariharans29 Jan 21, 2021
eb946c4
Unblock Android CI code coverage failure (#6393)
guoyu-wang Jan 21, 2021
99a38f4
fix build on cuda11 (#6394)
centwang Jan 21, 2021
98cc7b5
Load the model path correctly (#6369)
MartinMoon Jan 21, 2021
bba185a
Fix some compile warnings (#6316)
snnn Jan 22, 2021
4442d94
OpenVino docker file changes to bypass privileged mode
smkarlap Jan 22, 2021
60c772e
Megatron checkpointing (#6293)
ashbhandare Jan 22, 2021
61ecf52
Fix generate_submodule_cgmanifest.py Windows issues. (#6404)
edgchen1 Jan 22, 2021
3c3d363
Continue memory planning when unknown shape tensor is encountered. (#…
codemzs Jan 22, 2021
6507b4f
Reintroduce experimental api changes and fix remote build break (#6385)
Jan 22, 2021
e1dc268
Add support for custom ops to minimal build. (#6228)
skottmckay Jan 25, 2021
c20965f
enable pipeline to run quantization tests (#6416)
yufenglee Jan 25, 2021
24f1bd6
Minor cmake change (#6431)
hariharans29 Jan 25, 2021
6ed1240
Liqun/liqun/enable pipeline parallel test2 (#6399)
liqunfu Jan 25, 2021
f3a0344
Farewell TrainableDropout (#5793)
codemzs Jan 26, 2021
7e42840
fix null dereference warning (#6437)
yufenglee Jan 26, 2021
76dbd88
Expose graph ModelPath to TensorRT shared library (#6353)
stevenlix Jan 26, 2021
afd7b8b
add tool for generating test data for longformer (#6415)
tianleiwu Jan 27, 2021
0d20104
only build experimental api in redist (#6465)
smk2007 Jan 27, 2021
9835b46
Add an option to save the training graph after optimization (#6410)
ryotatomioka Jan 27, 2021
b5d1a49
Share allocator between CUDA EP & TRT EP. (#6332)
HectorSVC Jan 27, 2021
fd43806
fix max norm clipping test in python packaging pipeline test (#6468)
pengwa Jan 27, 2021
c05adb1
Initial version of CoreML EP (#6392)
guoyu-wang Jan 27, 2021
d5f51c4
Bug 31463811: Servicing: Redist (Nuget) conflicts with Microsoft.AI.M…
smk2007 Jan 27, 2021
f68eb35
dequantize 1st input of lstm back if it is quantized (#6444)
yufenglee Jan 27, 2021
0100f33
[java] Adds support for OrtEnvironment thread pools (#6406)
Craigacp Jan 27, 2021
1ce1a51
fix SDL native rule warning #6246 (#6461)
fs-eire Jan 27, 2021
ed1ebd2
fix SDL rule (#6464)
fs-eire Jan 27, 2021
b6ac35f
use tickcount64 (#6447)
Jan 27, 2021
7a0ab9c
Update pypi package metadata (#6354)
faxu Jan 28, 2021
91b19b8
Delete nuget extra configs (#6477)
snnn Jan 28, 2021
d850fa6
Op kernel type reduction infrastructure. (#6466)
edgchen1 Jan 28, 2021
77d0eb3
Fixing a leak in OnnxSequences with String keys or values. (#6473)
Craigacp Jan 28, 2021
2e228d7
Increase the distributes tests pipeline timeout to 120 minutes (#6479)
baijumeswani Jan 28, 2021
752627c
[CoreML EP] Add CI for CoreML EP (macOS) and add coreml_flags for EP …
guoyu-wang Jan 28, 2021
c84bb9d
Add ability to track per operator types in reduced build config. (#6428)
skottmckay Jan 28, 2021
00afd00
merge e2e with distributed pipeline (#6443)
liqunfu Jan 28, 2021
ea2b560
Fix test breaks in Windows ingestion pipeline (#6476)
smk2007 Jan 28, 2021
3f60b27
Speed up the Mac CI runs (#6483)
guoyu-wang Jan 28, 2021
ce46f37
expose learningmodelpixelrange property (#5877)
zhangxiang1993 Jan 28, 2021
d4e1f5a
Fix of support api version bug for [de]quantize (#6492)
guoyu-wang Jan 29, 2021
21b4842
SDL fixes: add proper casts/format specifiers (#6446)
Jan 29, 2021
3b1227c
SDL annotation fixes (#6448)
Jan 29, 2021
1a5b75a
[OpenVINO-EP] Remove support for OpenVINO 2020.2 (#6493)
suryasidd Jan 29, 2021
7abb5b6
Support pad operator in quantization and quantized nhwc transformer. …
zhanghuanrong Jan 29, 2021
066520f
Improve work distribution for Expand operator, and sharded LoopCounte…
tlh20 Jan 29, 2021
d3203ad
Update document of transformer optimization (#6487)
tianleiwu Jan 29, 2021
71389ff
nuphar test to avoid test data download to improve passing rate (#6467)
liqunfu Jan 29, 2021
a19c48f
Fuse cuda conv with activation (#6351)
RandySheriffH Jan 29, 2021
06a6c63
[CoreML EP] Add support for some activations/Transpose, move some sha…
guoyu-wang Jan 29, 2021
8306150
Refine transformers profiler output (#6502)
tianleiwu Jan 29, 2021
8c6d76a
Update to match new test setup. (#6496)
skottmckay Jan 29, 2021
76bc0e4
Enable dense sequence optimized version of Pytorch exported BERT-L on…
Jan 29, 2021
7f57317
Optimize GatherGrad for AMD GPU (#6381)
weixingzhang Jan 29, 2021
76f5d9e
add explicit barriers for buffer overread and overrwrite (#6484)
Jan 29, 2021
531eb06
fix sdl bugs for uninitialized variables and returns (#6450)
Jan 29, 2021
3a30ad7
handle hr error conditions (#6449)
Jan 29, 2021
a36f627
Dnnl training (#6045)
georgen117 Jan 30, 2021
7c5bfba
Lochi/refactor yolov3 quantization (#6290)
chilo-ms Jan 30, 2021
f2872ff
Print a warning message for using newer c_api header on old binary (#…
guoyu-wang Jan 30, 2021
e5cbcec
Fix issues with ArmNN build setup (#6495)
skottmckay Jan 30, 2021
5b69cbe
Fix Windows CI builds by updating test scripts to work with numpy 1.2…
skottmckay Feb 1, 2021
891181d
Fix ORTModule branch for orttraining-* pipelines
Jan 29, 2021
6b890c2
Merge remote-tracking branch 'origin/master' into thiagofc/fix-orttra…
Feb 1, 2021
0432fa7
Update pytorch nightly version dependency
Feb 1, 2021
add external data support to tensor proto utils (#6257)
* update unpack tensor utilities to support loading external data

* more updates

* fix test

* fix nuphar build

* minor build fix

* add tests

* fix Android CI

* fix warning

* fix DML build failure and some warnings

* more updates

* more updates

* plus few updates

* plus some refactoring

* changes per review

* plus some change

* remove temp code

* plus updates to safeint usage

* build fix

* fix for safeint
askhade authored Jan 13, 2021
commit f7034b9bca705b31fa69f31486416795e0eccbd9
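
Taken together, the changes below thread the owning model's Path through the tensor-proto utilities so that initializers whose data_location is EXTERNAL can be resolved relative to the model file. As a minimal sketch of the resulting call pattern — the surrounding setup (a loaded Graph named graph and one of its initializers init) is hypothetical, not part of this diff:

#include <memory>

#include "core/framework/tensorprotoutils.h"
#include "core/graph/graph.h"

namespace onnxruntime {

// Hypothetical caller: unpack an initializer that may reference external data.
// The model path tells the utility where to resolve relative external files;
// an empty path falls back to the current directory.
common::Status ReadInitializer(const Graph& graph,
                               const ONNX_NAMESPACE::TensorProto& init) {
  std::unique_ptr<uint8_t[]> data;
  size_t byte_size = 0;
  ORT_RETURN_IF_ERROR(
      utils::UnpackInitializerData(init, graph.ModelPath(), data, byte_size));
  // data.get() now holds byte_size bytes of the tensor's raw data.
  return common::Status::OK();
}

}  // namespace onnxruntime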
4 changes: 4 additions & 0 deletions include/onnxruntime/core/graph/graph.h
@@ -108,6 +108,9 @@ class Node {
/** Gets the domain of the OperatorSet that specifies the operator returned by #OpType. */
const std::string& Domain() const noexcept { return domain_; }

/** Gets the path of the owning model if any. */
const Path& ModelPath() const noexcept;

/** Gets the Node's execution priority.
@remarks Lower value means higher priority */
int Priority() const noexcept { return priority_; };
@@ -149,6 +152,7 @@ class Node {

/** Gets the function body if applicable otherwise nullptr. */
const Function* GetFunctionBody() const noexcept { return func_body_; }

#endif

/**
3 changes: 3 additions & 0 deletions include/onnxruntime/core/graph/graph_viewer.h
@@ -42,6 +42,9 @@ class GraphViewer {
/** Gets the Graph description. */
const std::string& Description() const noexcept;

/** Gets the path of the owning model if any **/
const Path& ModelPath() const noexcept { return graph_->ModelPath(); }

/**
Gets a tensor created from an initializer.
@param tensor_name The tensor name
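
GraphViewer::ModelPath gives execution providers, which only see a read-only view of the graph, a way to locate the model on disk and resolve externally stored weights. A hedged sketch of EP-side usage — the graph_viewer variable and the surrounding provider code are assumptions, not part of this diff; IsEmpty, ParentPath, and ToPathString are the Path accessors used elsewhere in the change:

// Sketch: find the directory against which external tensors are resolved.
// An empty path means no model file location is known, and external data
// lookups fall back to the current directory.
const Path& model_path = graph_viewer.ModelPath();
if (!model_path.IsEmpty()) {
  auto external_data_dir = model_path.ParentPath().ToPathString();
}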
201 changes: 164 additions & 37 deletions onnxruntime/core/framework/tensorprotoutils.cc

Large diffs are not rendered by default.

69 changes: 59 additions & 10 deletions onnxruntime/core/framework/tensorprotoutils.h
@@ -7,7 +7,9 @@
#include <type_traits>

#include "core/common/common.h"
#include "core/common/path.h"
#include "core/common/status.h"
#include "core/framework/endian_utils.h"
#include "core/framework/allocator.h"
#include "core/framework/ml_value.h"
#include "core/framework/mem_buffer.h"
@@ -58,22 +60,33 @@ ONNXTensorElementDataType GetTensorElementType(const ONNX_NAMESPACE::TensorProto
template <size_t alignment>
common::Status GetSizeInBytesFromTensorProto(const ONNX_NAMESPACE::TensorProto& tensor_proto, size_t* out);

template <typename T>
Status UnpackTensor(const ONNX_NAMESPACE::TensorProto& tensor, const void* raw_data, size_t raw_data_len,
/*out*/ T* p_data, size_t expected_size);

// Convert the NodeProto from a Constant node into a TensorProto that can be used as an initializer
// Convert the AttributeProto from a Constant node into a TensorProto that can be used as an initializer
// If AttributeProto contains a TensorProto, this tensor proto is converted as is, including the case when
// the data location is external; i.e. it does not load the external data.
// However if AttributeProto contains SparseTensorProto then it converts the data into dense tensor proto
// (including loading external data when applicable).
// model_path is used for constructing the full path for external_data
common::Status ConstantNodeProtoToTensorProto(const ONNX_NAMESPACE::NodeProto& node,
const Path& model_path,
ONNX_NAMESPACE::TensorProto& tensor);

// Convert a SparseTensorProto to a dense TensorProto
// If the SparseTensorProto contains external data then it loads the data and converts to dense tensor proto
// The resulting TensorProto will contain the data as raw data.
// model_path is used for constructing the full path for external_data
common::Status SparseTensorProtoToDenseTensorProto(const ONNX_NAMESPACE::SparseTensorProto& sparse,
const Path& model_path,
ONNX_NAMESPACE::TensorProto& dense);

#if !defined(ORT_MINIMAL_BUILD)
// Convert a TensorProto to a SparseTensorProto
// If the tensorproto contains external data then it loads the data and converts to sparse tensor
// The resulting SparseTensorProto will contain the data as raw data
// model_path is used for constructing the full path for external_data
common::Status DenseTensorToSparseTensorProto(const ONNX_NAMESPACE::TensorProto& dense,
const Path& model_path,
ONNX_NAMESPACE::SparseTensorProto& sparse);
#endif  // !ORT_MINIMAL_BUILD

inline bool HasDimValue(const ONNX_NAMESPACE::TensorShapeProto_Dimension& dim) {
return dim.value_case() == ONNX_NAMESPACE::TensorShapeProto_Dimension::kDimValue;
@@ -109,6 +122,13 @@ inline bool HasRawData(const ONNX_NAMESPACE::TensorProto& ten_proto) {
ten_proto.has_raw_data(); // XXX: Figure out how to do in proto3
}

inline bool HasExternalData(const ONNX_NAMESPACE::TensorProto& ten_proto) {
// Cannot be UNDEFINED and cannot be STRING, but the test for STRING is usually performed separately
// to return an error
return ten_proto.data_type() != ONNX_NAMESPACE::TensorProto::UNDEFINED &&
ten_proto.data_location() == ONNX_NAMESPACE::TensorProto_DataLocation_EXTERNAL;
}

inline bool HasDataType(const ONNX_NAMESPACE::TensorProto& ten_proto) {
return ten_proto.data_type() != ONNX_NAMESPACE::TensorProto::UNDEFINED;
}
@@ -126,10 +146,9 @@ inline bool HasElemType(const ONNX_NAMESPACE::TypeProto_SparseTensor& ten_proto)
}

inline bool HasName(const ONNX_NAMESPACE::SparseTensorProto& ten_proto) {
return ten_proto.values().has_name();  // XXX
}


inline bool HasKeyType(const ONNX_NAMESPACE::TypeProto_Map& map_proto) {
return map_proto.key_type() != ONNX_NAMESPACE::TensorProto::UNDEFINED;
}
@@ -219,9 +238,37 @@ inline bool HasName(const ONNX_NAMESPACE::NodeProto& node_proto) {
return node_proto.has_name();
}

// UnpackTensor from either raw data or the type specific data field.
#if !defined(ORT_MINIMAL_BUILD)
// Unpack tensor which contains external data. Uses the tensor_proto_dir to construct the full path for external data.
// If tensor_proto_dir == nullptr then uses the current directory instead.
// This function does not unpack string_data of a tensor
template <typename T>
Status UnpackTensorWithExternalData(const ONNX_NAMESPACE::TensorProto& tensor,
const ORTCHAR_T* tensor_proto_dir, size_t expected_size,
/*out*/ T* p_data);
#endif // !defined(ORT_MINIMAL_BUILD)

// UnpackTensor from raw data or the type specific data field. Does not handle external data.
// If the tensor does not contain raw data then raw_data should be nullptr and raw_data_len should be 0.
template <typename T>
Status UnpackTensor(const ONNX_NAMESPACE::TensorProto& tensor, /*out*/ T* p_data, size_t expected_size) {
Status UnpackTensor(const ONNX_NAMESPACE::TensorProto& tensor, const void* raw_data, size_t raw_data_len,
/*out*/ T* p_data, size_t expected_size);

// UnpackTensor from raw data, external data or the type specific data field.
// Uses the model path to construct the full path for loading external data. When model_path is empty,
// the current directory is used.
template <typename T>
Status UnpackTensor(const ONNX_NAMESPACE::TensorProto& tensor, const Path& model_path, /*out*/ T* p_data, size_t expected_size) {
#if !defined(ORT_MINIMAL_BUILD)
if (HasExternalData(tensor)) {
auto tensor_proto_path = model_path.IsEmpty() ? nullptr : model_path.ParentPath().ToPathString().c_str();
return UnpackTensorWithExternalData(tensor, tensor_proto_path, expected_size, p_data);
}
#else
ORT_UNUSED_PARAMETER(model_path);
ORT_RETURN_IF(HasExternalData(tensor), "TensorProto with external data is not supported in ORT minimal build.");
#endif

return HasRawData(tensor)
? UnpackTensor(tensor, tensor.raw_data().data(), tensor.raw_data().size(), p_data, expected_size)
: UnpackTensor(tensor, nullptr, 0, p_data, expected_size);
@@ -231,11 +278,13 @@ Status UnpackTensor(const ONNX_NAMESPACE::TensorProto& tensor, /*out*/ T* p_data
* Unpack the data from an initializer tensor
* Please note, this function does not unpack string_data of an initializer tensor
* @param initializer given initializer tensor
* @param model_path the model path, used to construct the external data dir path. When this is empty, the current dir is used.
* @param unpacked_tensor the data from the initializer in uint8_t* form
* @param tensor_byte_size the byte size of the unpacked_tensor
* @returns Status::OK() if data is unpacked successfully
*/
common::Status UnpackInitializerData(const ONNX_NAMESPACE::TensorProto& initializer,
const Path& model_path,
std::unique_ptr<uint8_t[]>& unpacked_tensor,
size_t& tensor_byte_size) ORT_MUST_USE_RESULT;

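
To make the dispatch above concrete, here is a hedged sketch of the Path-aware UnpackTensor overload in use; tensor, model_path, and element_count are assumed to come from a loaded model. In a full build an EXTERNAL tensor is routed through UnpackTensorWithExternalData, while a minimal build returns an error:

// Sketch: unpack a float tensor that may carry inline, raw, or external data.
std::vector<float> values(element_count);
ORT_RETURN_IF_ERROR(
    utils::UnpackTensor(tensor, model_path, values.data(), values.size()));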
30 changes: 22 additions & 8 deletions onnxruntime/core/graph/graph.cc
@@ -430,6 +430,10 @@ void Node::SetPriority(int priority) noexcept {
priority_ = priority;
}

const Path& Node::ModelPath() const noexcept {
return graph_->ModelPath();
}

#if !defined(ORT_MINIMAL_BUILD)

const Function* Node::GetFunctionBody(bool try_init_func_body) {
@@ -966,6 +970,7 @@ Graph::Graph(const Model& owning_model,
is_loaded_from_model_file_(GraphLoadedFromModelFile(graph_proto_)) {
ORT_ENFORCE(graph_proto != nullptr, "graph_proto cannot be null");
ArgNameToTypeMap name_to_type_map;
const auto& model_path = ModelPath();

// Process 'Constant' nodes
// Put the 'TensorProto' stored in the 'Constant' node's attribute into the graph's initializer list
@@ -975,7 +980,7 @@
}

const gsl::not_null<TensorProto*> tensor{graph_proto_->add_initializer()};
auto status = utils::ConstantNodeProtoToTensorProto(node, *tensor);
auto status = utils::ConstantNodeProtoToTensorProto(node, model_path, *tensor);
ORT_ENFORCE(status.IsOK(), status.ToString());
if (node.attribute(0).type() == AttributeProto_AttributeType_SPARSE_TENSOR) {
auto p = sparse_tensor_names_.emplace(tensor->name());
@@ -1000,7 +1005,7 @@
for (const auto& sparse_tensor : graph_proto_->sparse_initializer()) {
ORT_ENFORCE(utils::HasName(sparse_tensor), "Sparse initializer must have a name. This model is invalid");
const gsl::not_null<TensorProto*> tensor{graph_proto_->add_initializer()};
auto status = utils::SparseTensorProtoToDenseTensorProto(sparse_tensor, *tensor);
auto status = utils::SparseTensorProtoToDenseTensorProto(sparse_tensor, model_path, *tensor);
ORT_ENFORCE(status.IsOK(), status.ToString());
auto p = sparse_tensor_names_.emplace(tensor->name());
ORT_ENFORCE(p.second, "Duplicate sparse_tensor_initializer: '", tensor->name(), "' Model is invalid.");
@@ -2810,18 +2815,20 @@ common::Status Graph::SaveToOrtFormat(flatbuffers::FlatBufferBuilder& builder,
std::vector<flatbuffers::Offset<fbs::Tensor>> initializers_data;
assert(sparse_tensor_names_.size() <= name_to_initial_tensor_.size());
initializers_data.reserve(name_to_initial_tensor_.size() - sparse_tensor_names_.size());
const auto& model_path = ModelPath();

for (const auto& pair : name_to_initial_tensor_) {
if (sparse_tensor_names_.find(pair.first) == sparse_end) {
flatbuffers::Offset<fbs::Tensor> fbs_tensor;
ORT_RETURN_IF_ERROR(
experimental::utils::SaveInitializerOrtFormat(builder, *pair.second, fbs_tensor));
experimental::utils::SaveInitializerOrtFormat(builder, *pair.second, model_path, fbs_tensor));
initializers_data.push_back(fbs_tensor);
} else {
SparseTensorProto sparse_initializer;
ORT_RETURN_IF_ERROR(utils::DenseTensorToSparseTensorProto(*pair.second, sparse_initializer));
ORT_RETURN_IF_ERROR(utils::DenseTensorToSparseTensorProto(*pair.second, model_path, sparse_initializer));
flatbuffers::Offset<fbs::SparseTensor> fbs_sparse_tensor;
ORT_RETURN_IF_ERROR(
experimental::utils::SaveSparseInitializerOrtFormat(builder, sparse_initializer, fbs_sparse_tensor));
experimental::utils::SaveSparseInitializerOrtFormat(builder, sparse_initializer, model_path, fbs_sparse_tensor));
sparse_initializers_data.push_back(fbs_sparse_tensor);
}
}
@@ -2995,6 +3002,10 @@ ONNX_NAMESPACE::GraphProto Graph::ToGraphProto() const {

GraphProto result;
ToGraphProtoInternal(result);
// Path of the owning model
// This is used for constructing full path for external data
// if it exists
const auto& model_path = ModelPath();

// We want to make sure that sparse initializers do not appear
// as dense duplicates within the initializers list.
@@ -3006,7 +3017,7 @@
*mutable_initializer->Add() = initializer;
} else {
auto& sparse_initializer = *result.add_sparse_initializer();
auto status = utils::DenseTensorToSparseTensorProto(initializer, sparse_initializer);
auto status = utils::DenseTensorToSparseTensorProto(initializer, model_path, sparse_initializer);
ORT_ENFORCE(status.IsOK(), "Failed to convert dense initializer to sparse");
}
}
@@ -3495,13 +3506,14 @@ Status Graph::InlineFunction(Node& node) {
}

RemoveNode(node.Index());
const auto& model_path = ModelPath();
for (const auto& subgraph_node : subgraph.Nodes()) {
if (subgraph_node.OpType() == kConstant) {
// Copy constant nodes _value to name_to_initial_tensor_
ONNX_NAMESPACE::NodeProto subgraph_node_proto{};
subgraph_node.ToProto(subgraph_node_proto);
const gsl::not_null<TensorProto*> tensor{graph_proto_->add_initializer()};
ORT_RETURN_IF_ERROR(utils::ConstantNodeProtoToTensorProto(subgraph_node_proto, *tensor));
ORT_RETURN_IF_ERROR(utils::ConstantNodeProtoToTensorProto(subgraph_node_proto, model_path, *tensor));
name_to_initial_tensor_[tensor->name()] = tensor;
} else {
std::vector<NodeArg*> inputs, outputs;
@@ -3697,12 +3709,14 @@ common::Status Graph::LoadFromOrtFormat(const onnxruntime::experimental::fbs::Gr

if (fbs_sparse_initializers) {
sparse_tensor_names_.reserve(fbs_sparse_initializers->size());
const auto& model_path = ModelPath();

for (const auto* fbs_sparse_tensor : *fbs_sparse_initializers) {
ORT_RETURN_IF(nullptr == fbs_sparse_tensor, "Sparse Initializer tensor is missing. Invalid ORT format model.");
SparseTensorProto sparse_initializer;
ORT_RETURN_IF_ERROR(experimental::utils::LoadSparseInitializerOrtFormat(*fbs_sparse_tensor, sparse_initializer));
TensorProto& initializer = *deserialized_proto_data_.add_initializer();
ORT_RETURN_IF_ERROR(utils::SparseTensorProtoToDenseTensorProto(sparse_initializer, initializer));
ORT_RETURN_IF_ERROR(utils::SparseTensorProtoToDenseTensorProto(sparse_initializer, model_path, initializer));
auto p = name_to_initial_tensor_.emplace(initializer.name(), &initializer);
if (!p.second) {
LOGS(logger_, WARNING) << "Duplicate initializer (dense, sparse or ConstantNode): '" << initializer.name()
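
The graph-level changes above all follow one pattern: fetch ModelPath() once per scope, then hand it to every conversion that may touch external data. A minimal round-trip sketch with the two sparse/dense converters — sparse and model_path are assumed, and the dense-to-sparse direction is only compiled outside ORT_MINIMAL_BUILD:

// Sketch: sparse -> dense loads external data if present; dense -> sparse
// re-packs it, using model_path to build full paths in both directions.
ONNX_NAMESPACE::TensorProto dense;
ORT_RETURN_IF_ERROR(utils::SparseTensorProtoToDenseTensorProto(sparse, model_path, dense));

ONNX_NAMESPACE::SparseTensorProto round_tripped;
ORT_RETURN_IF_ERROR(utils::DenseTensorToSparseTensorProto(dense, model_path, round_tripped));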
12 changes: 7 additions & 5 deletions onnxruntime/core/graph/graph_flatbuffers_utils.cc
@@ -28,6 +28,7 @@ SaveDims(flatbuffers::FlatBufferBuilder& builder, const DimsFieldType& dims) {

Status SaveInitializerOrtFormat(flatbuffers::FlatBufferBuilder& builder,
const TensorProto& initializer,
const Path& model_path,
flatbuffers::Offset<fbs::Tensor>& fbs_tensor) {
auto name = SaveStringToOrtFormat(builder, initializer.has_name(), initializer.name());
auto doc_string = SaveStringToOrtFormat(builder, initializer.has_doc_string(), initializer.doc_string());
@@ -46,7 +47,7 @@ Status SaveInitializerOrtFormat(flatbuffers::FlatBufferBuilder& builder,
std::unique_ptr<uint8_t[]> unpacked_tensor;
size_t tensor_byte_size = 0;
ORT_RETURN_IF_ERROR(
onnxruntime::utils::UnpackInitializerData(initializer, unpacked_tensor, tensor_byte_size));
onnxruntime::utils::UnpackInitializerData(initializer, model_path, unpacked_tensor, tensor_byte_size));
raw_data = builder.CreateVector(unpacked_tensor.get(), tensor_byte_size);
}

@@ -65,16 +66,17 @@

Status SaveSparseInitializerOrtFormat(flatbuffers::FlatBufferBuilder& builder,
const ONNX_NAMESPACE::SparseTensorProto& initializer,
const Path& model_path,
flatbuffers::Offset<fbs::SparseTensor>& fbs_sparse_tensor) {
// values
const auto& values = initializer.values();
flatbuffers::Offset<fbs::Tensor> values_off;
ORT_RETURN_IF_ERROR(SaveInitializerOrtFormat(builder, values, values_off));
ORT_RETURN_IF_ERROR(SaveInitializerOrtFormat(builder, values, model_path, values_off));

// Indices
const auto& indicies = initializer.indices();
flatbuffers::Offset<fbs::Tensor> indicies_off;
ORT_RETURN_IF_ERROR(SaveInitializerOrtFormat(builder, indicies, indicies_off));
ORT_RETURN_IF_ERROR(SaveInitializerOrtFormat(builder, indicies, model_path, indicies_off));

// Shape
auto shape = SaveDims(builder, initializer.dims());
@@ -122,7 +124,7 @@ Status SaveAttributeOrtFormat(flatbuffers::FlatBufferBuilder& builder,
case fbs::AttributeType::TENSOR: {
flatbuffers::Offset<fbs::Tensor> fbs_tensor;
ORT_RETURN_IF_ERROR(
experimental::utils::SaveInitializerOrtFormat(builder, attr_proto.t(), fbs_tensor));
experimental::utils::SaveInitializerOrtFormat(builder, attr_proto.t(), graph->ModelPath(), fbs_tensor));
GET_FBS_ATTR(builder, type, t, fbs_tensor);
} break;
case fbs::AttributeType::GRAPH: {
@@ -152,7 +154,7 @@ Status SaveAttributeOrtFormat(flatbuffers::FlatBufferBuilder& builder,
for (const auto& tensor : attr_proto.tensors()) {
flatbuffers::Offset<fbs::Tensor> fbs_tensor;
ORT_RETURN_IF_ERROR(
experimental::utils::SaveInitializerOrtFormat(builder, tensor, fbs_tensor));
experimental::utils::SaveInitializerOrtFormat(builder, tensor, graph->ModelPath(), fbs_tensor));
fbs_tensors_vec.push_back(fbs_tensor);
}
auto tensors = builder.CreateVector(fbs_tensors_vec);
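
For the ORT-format serializers the model path rides along for the same reason: SaveInitializerOrtFormat unpacks the initializer (loading external data when needed) before embedding it as raw bytes in the flatbuffer. A hedged sketch of the updated call — builder setup is standard flatbuffers usage, and the initializer and graph variables are assumptions:

flatbuffers::FlatBufferBuilder builder;
flatbuffers::Offset<fbs::Tensor> fbs_tensor;
ORT_RETURN_IF_ERROR(experimental::utils::SaveInitializerOrtFormat(
    builder, initializer, graph.ModelPath(), fbs_tensor));
// fbs_tensor now embeds the tensor's data, external or not, as raw bytes.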
7 changes: 4 additions & 3 deletions onnxruntime/core/graph/graph_flatbuffers_utils.h
@@ -19,9 +19,10 @@ namespace onnxruntime {

class Graph;
class Node;
class Path;

namespace logging {
class Logger;
}

namespace experimental {
@@ -36,11 +37,11 @@ namespace utils {
// TODO, add ORT_MUST_USE_RESULT when it is moved to a different header
onnxruntime::common::Status SaveInitializerOrtFormat(
flatbuffers::FlatBufferBuilder& builder, const ONNX_NAMESPACE::TensorProto& initializer,
flatbuffers::Offset<fbs::Tensor>& fbs_tensor);
const Path& model_path, flatbuffers::Offset<fbs::Tensor>& fbs_tensor);

onnxruntime::common::Status SaveSparseInitializerOrtFormat(
flatbuffers::FlatBufferBuilder& builder, const ONNX_NAMESPACE::SparseTensorProto& initializer,
flatbuffers::Offset<fbs::SparseTensor>& fbs_sparse_tensor);
const Path& model_path, flatbuffers::Offset<fbs::SparseTensor>& fbs_sparse_tensor);

// Convert a given AttributeProto into fbs::Attribute
// Note: we currently do not support graphs and sparse_tensor(s)
6 changes: 3 additions & 3 deletions onnxruntime/core/optimizer/matmul_scale_fusion.cc
@@ -17,9 +17,9 @@ namespace onnxruntime {
namespace {
template <typename T>
struct ExtractScalarAsFloatDispatchTarget {
Status operator()(const ONNX_NAMESPACE::TensorProto& tensor_proto, float& scalar_float) {
Status operator()(const ONNX_NAMESPACE::TensorProto& tensor_proto, const Path& model_path, float& scalar_float) {
T scalar;
ORT_RETURN_IF_ERROR(utils::UnpackTensor(tensor_proto, &scalar, 1));
ORT_RETURN_IF_ERROR(utils::UnpackTensor(tensor_proto, model_path, &scalar, 1));
scalar_float = static_cast<float>(scalar);
return Status::OK();
}
@@ -48,7 +48,7 @@ optional<float> GetScalarConstantInitializer(const Graph& graph, const NodeArg&
Status, ExtractScalarAsFloatDispatchTarget,
uint32_t, uint64_t, int32_t, int64_t, MLFloat16, float, double, BFloat16>
dispatcher{initializer->data_type()};
ORT_THROW_IF_ERROR(dispatcher.Invoke(*initializer, scalar));
ORT_THROW_IF_ERROR(dispatcher.Invoke(*initializer, graph.ModelPath(), scalar));

return {scalar};
}