
disclaimer #3376

Closed
wants to merge 270 commits into from

Conversation

mergennachin
Contributor

cccclai and others added 30 commits April 9, 2024 13:28
Summary:
Pull Request resolved: #2944

Need change from D55354487 to get mutable buffer + pt2e working

Reviewed By: JacobSzwejbka

Differential Revision: D55922254

fbshipit-source-id: 5ea4471eb0e22149a0dbb4e921fe447cceb13bf1
Summary:
Pull Request resolved: #2883

## Summary (cases handled)

We introduce support for the convolution cases covered by ATen-VK's transpose implementation. This is achieved by
- reusing the existing [`conv_transpose2d.glsl`](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/glsl/conv_transpose2d.glsl), and
- [moving special weights prepacking from CPU](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/ops/Convolution.cpp#L134-L235) to the GPU in `conv_transpose2d_prepack_weights.glsl`.

We also include resizing support for dynamic shapes. Note that only height and width of the input can vary.

## Cases not handled

The implementation is on-par with ATen-VK's Transpose. This means the following cases are missing:
1. **Groups G > 1.**
2. **Batch (input) N > 1.**
3. **Dilation > 1.**
ghstack-source-id: 221721754
exported-using-ghexport
bypass-github-export-checks

Reviewed By: copyrightly, SS-JIA

Differential Revision: D55667336

fbshipit-source-id: 3b7b7c912ef947610624e2e1c5b753de393234a0
Summary:
Pull Request resolved: #2884

## Summary
We introduce support for the convolution cases covered by [ATen-VK's default Depthwise implementation](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/ops/Convolution.cpp#L68). This is achieved by
- reusing the [existing `conv2d_dw.glsl`](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/glsl/conv2d_dw.glsl), and
- [moving special weights prepacking from CPU](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/ops/Convolution.cpp#L80-L132) to the GPU in `conv2d_dw_prepack_weights.glsl`.

The implementation is on-par with ATen-VK's Depthwise. This means it only covers:
- `in_channels == groups`, `out_channels == groups`

A full implementation would cover, for any positive integer K:
- `in_channels == groups`, `out_channels == groups * K`
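The supported case above can be expressed as a simple predicate. A minimal sketch (the helper name is hypothetical, not the actual ET-VK code):

```cpp
#include <cassert>

// Hypothetical helper, for illustration only: the ported ATen-VK
// depthwise path supports only in_channels == groups == out_channels.
// A full implementation would also accept out_channels == groups * K
// for any positive integer K.
bool is_supported_depthwise(int in_channels, int out_channels, int groups) {
  return groups > 0 && in_channels == groups && out_channels == groups;
}
```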
ghstack-source-id: 221721752
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D55813511

fbshipit-source-id: c0726798bd36cc5ff2326836c28a5f7d23494f5e
Summary:
Pull Request resolved: #2854

## Context

Currently, when executing a `ComputeGraph` with prepacked tensors with [Vulkan Validation Layers](https://github.com/KhronosGroup/Vulkan-ValidationLayers) turned on, the following Validation Errors can be observed. Note that Validation Layers can be turned on by running Vulkan binaries on Mac with the `vkconfig` app opened.

```
UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout(ERROR / SPEC): msgNum: 1303270965 - Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x7fb76dbbf988, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x7fb76dbbf988[] expects VkImage 0xd79c8a0000000f09[] (subresource: aspectMask 0x1 array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_GENERAL--instead, current layout is VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL.
    Objects: 1
        [0] 0x7fb76dbbf988, type: 6, name: NULL
```

The reason for this is that prepacked textures are written to with `WRITE` memory access during packing, which means they will be in the `VK_IMAGE_LAYOUT_GENERAL` layout. However, they will subsequently be read from during `graph.execute()`, meaning the texture will have transitioned to `VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL`, but will be bound using the `VK_IMAGE_LAYOUT_GENERAL` layout. Subsequent calls to `execute()` will therefore see that the prepacked texture has been bound with the wrong layout, since after the first graph execution the texture will have the `VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL` layout.

The solution is to submit a no-op shader dispatch during prepacking to trigger a transition to the `READ_ONLY_OPTIMAL` layout.
ghstack-source-id: 221871426

bypass-github-pytorch-ci-checks

Reviewed By: jorgep31415

Differential Revision: D55772003

fbshipit-source-id: f9c69e6e571ca0d0d28a6c25716766af98e82d41
…ut`s (#2948)

Summary:
Pull Request resolved: #2948

## Context

Introduce the following convenience `constexpr`:

* `api::kBuffer`, `api::kTexture3D`, and `api::kTexture2D`
* `api::kWidthPacked`, `api::kHeightPacked`, and `api::kChannelsPacked`

Also remove the `api::StorageType::UNKNOWN` enum entry as it doesn't really serve any purpose.
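The pattern here is short `constexpr` aliases for frequently used enum values. A minimal sketch, with enum entries assumed for illustration rather than copied from the ET-VK headers:

```cpp
// Illustrative enums standing in for the real api:: definitions.
enum class StorageType { BUFFER, TEXTURE_3D, TEXTURE_2D };
enum class GPUMemoryLayout { WIDTH_PACKED, HEIGHT_PACKED, CHANNELS_PACKED };

// Convenience constexpr aliases, mirroring the pattern described above.
constexpr StorageType kBuffer = StorageType::BUFFER;
constexpr StorageType kTexture3D = StorageType::TEXTURE_3D;
constexpr StorageType kTexture2D = StorageType::TEXTURE_2D;

constexpr GPUMemoryLayout kWidthPacked = GPUMemoryLayout::WIDTH_PACKED;
constexpr GPUMemoryLayout kHeightPacked = GPUMemoryLayout::HEIGHT_PACKED;
constexpr GPUMemoryLayout kChannelsPacked = GPUMemoryLayout::CHANNELS_PACKED;
```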
ghstack-source-id: 221871428

bypass-github-pytorch-ci-checks

Reviewed By: copyrightly, jorgep31415

Differential Revision: D55811278

fbshipit-source-id: 26dc1706ac2605c13f247d08a21863ff3ef94488
Summary:
Pull Request resolved: #2949

It is supposed to be unlikely for assert/check conditions to fail; let's tell the compiler about that.
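A common way to tell the compiler this is `__builtin_expect` wrapped in a macro. A minimal sketch (the macro name is hypothetical, not the actual ExecuTorch macro):

```cpp
#include <cassert>

// Hypothetical macro illustrating the technique: mark the failure
// branch as cold so the compiler lays out the hot path first.
#if defined(__GNUC__) || defined(__clang__)
#define ET_UNLIKELY(x) __builtin_expect(static_cast<bool>(x), 0)
#else
#define ET_UNLIKELY(x) (x)
#endif

int checked_divide(int a, int b) {
  if (ET_UNLIKELY(b == 0)) {
    return 0; // error path, expected to be rarely taken
  }
  return a / b;
}
```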

Reviewed By: mergennachin

Differential Revision: D55929730

fbshipit-source-id: 5677c19cd8342cbd77a9c0b973059ed3d5ee800b
Summary:
Pull Request resolved: #2772

Just a spelling mistake.

Reviewed By: JacobSzwejbka

Differential Revision: D55542731

fbshipit-source-id: c12bcab53661561bf0d8223d5cae9ed92b39e599
Summary:
Pull Request resolved: #2773

Noticed this page didn't line up right. Now it does.

Reviewed By: mergennachin, kirklandsign

Differential Revision: D55542836

fbshipit-source-id: a25a376ce9e77f3bc360e9ab6cf15c9ae9ecc7bf
Summary:
Pull Request resolved: #2885

We port an optimization from ATen-VK for specific weight sizes: [`conv2d_dw_output_tile.glsl`](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/glsl/conv2d_dw_output_tile.glsl)
ghstack-source-id: 221887576
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D55814588

fbshipit-source-id: 86a85d122abbcebfed41466bc0a4907a6ddc80f9
Summary:
Pull Request resolved: #2886

We port an optimization from ATen-VK for specific weight sizes: [`conv2d_pw.glsl`](https://github.com/pytorch/pytorch/blob/09c72eaa3f69f90402c86a30abf4fc621298578c/aten/src/ATen/native/vulkan/glsl/conv2d_pw.glsl)
ghstack-source-id: 221887670
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D55814587

fbshipit-source-id: 419d82ddcf2dce59b2d1ec5abf313356fce074e6
Summary:
Minor updates to the prerequisite section of the LLM getting started guide. Passing -s to pyenv install prevents a prompt if python 3.10 is already installed (it will just silently continue in this case when the flag is passed). Additionally, under pyenv, we should be using python, not python3. I also added a little bit of wording on env management.

Pull Request resolved: #2940

Test Plan: Ran LLM guide prerequisite section on an m1 mac with pyenv-virtualenv.

Reviewed By: byjlw

Differential Revision: D55913382

Pulled By: GregoryComer

fbshipit-source-id: 7f04262b025db83b8621c972c90d3cdc3f029377
Summary:
Version hash reported by
https://github.com/facebook/buck2/releases/download/2024-02-15/buck2-x86_64-apple-darwin.zst

Pull Request resolved: #2868

Reviewed By: Olivia-liu

Differential Revision: D55914146

Pulled By: GregoryComer

fbshipit-source-id: b9882900acfd4cb6f74eda90a7c99bdb119ec122
Summary:
Pull Request resolved: #2825

Add capability to print the node list with arguments to allow better debugging.

Reviewed By: SS-JIA

Differential Revision: D55510335

fbshipit-source-id: 151e3a6f249417dfe644172c1b5f0e83a3b110dd
Summary:
Pull Request resolved: #2887

The final touches to get ET-VK convolution on-par with ATen-VK's convolution.

## Idea
In our shaders, we add the bias to our sum.
```
${VEC4_T[DTYPE]} sum = texelFetch(bias_in, ivec2(pos.z, 0), 0);
```
To keep our shaders as is, we implement having no bias by allocating a buffer of zeros. Then, our shader adds zero to our sum.

## Issue
If `Bias=False`, the dummy buffer of zeros is not serialized with the graph. The bias ValueRef is deserialized in the runtime as `TypeTag::NONE`, not `TypeTag::TENSORREF`.

## Solution
If `TypeTag::NONE` is given, (1) create the `vTensor` using the `out_channels` value from the weights, (2) allocate a StagingBuffer of that size, and (3) `memset` its data to zero. Failure to do (3) will result in undefined behavior.
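Step (3) matters because a raw allocation holds indeterminate values. A minimal host-side sketch of the zero-bias fallback, with hypothetical names (the real code uses a StagingBuffer, not a heap array):

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Illustrative sketch: allocate a bias-sized buffer and memset it to
// zero so the shader adds exactly zero. new float[n] leaves elements
// uninitialized, mirroring a raw StagingBuffer allocation; skipping the
// memset would feed garbage into the bias add.
std::unique_ptr<float[]> make_zero_bias_staging(size_t out_channels) {
  std::unique_ptr<float[]> staging(new float[out_channels]);
  std::memset(staging.get(), 0, out_channels * sizeof(float));
  return staging;
}
```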

ghstack-source-id: 221926167
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D55814589

fbshipit-source-id: ce7b82c31bb11540ed2d98ab14131841fcee93e4
Summary:
Pull Request resolved: #2920

TSIA
ghstack-source-id: 221926168
exported-using-ghexport
bypass-github-export-checks

Reviewed By: SS-JIA

Differential Revision: D55829466

fbshipit-source-id: 48b4f15c41141093dd061c43e6b769eb4c25c81b
Summary:
Pull Request resolved: #2807

The operator `aten.sum.dim_IntList` could take an empty list as the parameter for `dims`. We modify `vulkan_graph_builder.py` to accommodate the empty list.

Moreover, the op `aten.sum.default` is implemented as a [decomposition](https://www.internalfb.com/code/fbsource/[96e496f9db8f92967b4394bd4f60e39ab916740b]/xplat/caffe2/torch/_decomp/decompositions.py?lines=4676) into `aten.sum.dim_IntList` with empty `dims`. So we will support `aten.sum.default` with the changes.

Context: `torch.sum(x, ())` and `torch.sum(x)` are two ways to compute the sum of all elements in tensor `x`.
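The normalization this implies can be sketched outside of PyTorch. An illustrative helper (hypothetical, not the actual `vulkan_graph_builder.py` logic, which is Python):

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Illustrative sketch: an empty `dims` list for a sum reduction is
// normalized to "all dimensions", matching how aten.sum.default
// decomposes into aten.sum.dim_IntList with empty dims.
std::vector<int> normalize_sum_dims(std::vector<int> dims, int rank) {
  if (dims.empty()) {
    dims.resize(rank);
    std::iota(dims.begin(), dims.end(), 0); // {0, 1, ..., rank-1}
  }
  return dims;
}
```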

Reviewed By: SS-JIA, jorgep31415

Differential Revision: D55630993

fbshipit-source-id: 923d276118e893ff6885b92eb7b4c7cb7a95b374
Summary:
Pull Request resolved: #2961

Fix these 3 CI job failures caused by #2934 (D55907752):

* Apple / build-frameworks-ios / macos-job
* trunk / test-arm-backend-delegation / linux-job
* trunk / test-coreml-delegate / macos-job

Reviewed By: kirklandsign

Differential Revision: D55950023

fbshipit-source-id: 6166d9112e6d971d042df1400442395d8044c3b3
Summary:
Pull Request resolved: #2964

## Context

Some research into efficient string concatenation suggests that streams in C++ are not quite efficient. The best way to concatenate strings seems to be creating a `std::string` and reserving sufficient capacity for the `std::string`. This diff deprecates the usage of `std::stringstream` when constructing kernel names in favor of using `std::string` directly.
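The reserve-then-append pattern described above looks roughly like this; the function name and arguments are illustrative, not the actual ET-VK kernel-name builder:

```cpp
#include <cassert>
#include <string>

// Illustrative sketch: reserve the final size up front so appending
// does a single allocation, avoiding std::stringstream's buffering.
std::string make_kernel_name(const std::string& base,
                             const std::string& dtype) {
  std::string name;
  name.reserve(base.size() + 1 + dtype.size()); // one allocation
  name += base;
  name += '_';
  name += dtype;
  return name;
}
```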

Reviewed By: copyrightly

Differential Revision: D55951475

fbshipit-source-id: a1a584669e80984b85d11b7d6d4f7593290e562b
)

Summary:
Pull Request resolved: #2952

* Some auto-formatting by my VSCode (remove extra spaces)
* Remove imports that have been imported in previous part of the doc
* Other minor changes to keep consistency across the doc
* Link a screenshot instead of using the raw table because the original table is illegible:
 {F1482781056}

Reviewed By: GregoryComer

Differential Revision: D55938344

fbshipit-source-id: 699abb9ebe1196ab73d90a3d08d60be7aa0d8688
Summary:
* Update tutorial due to recent changes.
* Clean up setup.sh for app helper lib build.

Pull Request resolved: #2962

Reviewed By: cccclai

Differential Revision: D55951189

Pulled By: kirklandsign

fbshipit-source-id: 2c95e8580145b039f503e7cd99a4003867f8dbb0
Summary:
- Add per channel weight quantization for linear op
- Bias quantization for per channel weight Linear op is not supported yet

Pull Request resolved: #2822

Reviewed By: kirklandsign

Differential Revision: D55731629

Pulled By: cccclai

fbshipit-source-id: 831a47c897b34e1a749325df56a8bbd0acda80e1
Summary:
Pull Request resolved: #2936

Fix the way we use `at::from_blob()` and add a proper namespace to `CompileTimeFunctionPointer` so it is not confused with `at::CompileTimeFunctionPointer`.

bypass-github-pytorch-ci-checks
bypass-export-ci-checks

Reviewed By: lucylq

Differential Revision: D55907751

fbshipit-source-id: ad793e30ec72f48e7300d75820209035d42cae6c
Summary:
Pull Request resolved: #2935

Currently `EXECUTORCH_BUILD_CUSTOM` is not being respected properly.

If this option is false, we should not build `llama2/custom_ops` anywhere.

If this option is true, we should build `llama2/custom_ops` in both llama runner binary and pybind.

This PR consolidates it.

bypass-github-pytorch-ci-checks
bypass-export-ci-checks

Reviewed By: lucylq

Differential Revision: D55907750

fbshipit-source-id: 03a7a8cbd499c734060de385d6edb193cf35470d
Summary:
Pull Request resolved: #2954

Change the tokenizer APIs to:

```
Result<std::vector<uint64_t>> encode(const std::string& input, int8_t bos, int8_t eos);
Result<std::string> decode(uint64_t prev_token, uint64_t token);
```

Note that we use `uint64_t` for token ids just to be safe, and the `encode()` API returns a `std::vector` of tokens.
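A toy illustration of the API shape above; the `Result` type and the token ids are stand-ins, not the real ExecuTorch tokenizer implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the real Result type, for illustration only.
template <typename T>
struct Result {
  T value;
  bool ok;
};

// Toy byte-level encoder with the described signature: `bos`/`eos`
// count how many BOS/EOS tokens to prepend/append. Token ids are
// hypothetical.
Result<std::vector<uint64_t>> encode(const std::string& input,
                                     int8_t bos,
                                     int8_t eos) {
  const uint64_t kBosId = 1, kEosId = 2, kByteOffset = 3;
  std::vector<uint64_t> tokens;
  for (int8_t i = 0; i < bos; ++i) tokens.push_back(kBosId);
  for (unsigned char c : input) tokens.push_back(kByteOffset + c);
  for (int8_t i = 0; i < eos; ++i) tokens.push_back(kEosId);
  return {tokens, true};
}
```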

Reviewed By: lucylq

Differential Revision: D55944780

fbshipit-source-id: 9b44437e7061424526f4e0b049a3449129f0ba53
Summary:
Pull Request resolved: #2033

Update the OSS Xtensa repo with more up to date compiler and quantizer things. Introduce a test folder and a conv1d test.

Reviewed By: tarun292, cccclai

Differential Revision: D54034581

fbshipit-source-id: c2bf0c43897a2ef7dff291698370d2583433a6ba
…LLM (#2977)

Summary:
Pull Request resolved: #2977

As titled

Reviewed By: Gasoonjia

Differential Revision: D55992093

fbshipit-source-id: 7864c330bd86af5d4127cacfd47e96f1e6666bfb
Summary:
Pull Request resolved: #2981

As titled, a quick follow up of D55907750

Reviewed By: lucylq

Differential Revision: D55996735

fbshipit-source-id: f535b013b7b900c5a2c2ed79f6b6738dcf1f91ec
Summary:
After pytorch/test-infra#5086, the working directory is now set correctly, so `pushd` isn't needed anymore.  More importantly, trying to change the directory ends up failing all macOS CI jobs because that subdirectory doesn't exist.

Pull Request resolved: #2980

Reviewed By: larryliu0820

Differential Revision: D55996299

Pulled By: huydhn

fbshipit-source-id: 05758603d7628cc0a01fd577a49202d45c84e6c5
Summary:
I'm trying to set up a simple perf test when running llama2 on Android. It naively sends a prompt and records the tokens per second (TPS). Open for comments about the test here before setting this up on CI.

### Testing

Copy the exported model and the tokenizer as usual, then cd to the app and run `./gradlew :app:connectAndroidTest`. The test will fail if the model fails to load or if the TPS is lower than 7, as measured by https://github.com/pytorch/executorch/tree/main/examples/models/llama2
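The pass/fail criterion boils down to a throughput computation. A sketch, where only the 7-TPS threshold comes from the description above and everything else is illustrative:

```cpp
#include <cassert>

// Illustrative sketch of the throughput check: tokens generated divided
// by wall-clock seconds, compared against the 7-TPS floor from the PR
// description.
double tokens_per_second(int generated_tokens, double elapsed_seconds) {
  return generated_tokens / elapsed_seconds;
}

bool meets_tps_threshold(int generated_tokens, double elapsed_seconds) {
  return tokens_per_second(generated_tokens, elapsed_seconds) >= 7.0;
}
```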

Pull Request resolved: #2963

Reviewed By: kirklandsign

Differential Revision: D55951637

Pulled By: huydhn

fbshipit-source-id: 34c189aefd7e31514fcf49103352ef3cf8e5b2c9
Summary:
It was a workaround to skip `aten.index_put` op in Core ML delegation, at the cost of partitioning the Llama model into 13 pieces.

For better performance, we prefer to delegate the whole model to Core ML. Since Core ML has added the [necessary support](apple/coremltools#2190), it is time to revert this workaround.

Pull Request resolved: #2975

Reviewed By: kirklandsign

Differential Revision: D56002979

Pulled By: cccclai

fbshipit-source-id: e7a7c8c43706cb57eba3e6f720b3d713bec5065b
dbort and others added 27 commits April 24, 2024 09:32
Summary:
Use relative markdown links instead of full URLs. This way, the docs will always point to a consistent branch.

Pull Request resolved: #3244

Test Plan: Clicked on all modified links in the rendered docs preview: https://docs-preview.pytorch.org/pytorch/executorch/3244/llm/getting-started.html

Reviewed By: Gasoonjia

Differential Revision: D56479234

Pulled By: dbort

fbshipit-source-id: 45fb25f017c73df8606c3fb861acafbdd82fec8c
Summary:
- Added Llama 3 8B
- Added llm_manual in the list
- changed name from Extensa to Cadence

Pull Request resolved: #3275

Reviewed By: Gasoonjia

Differential Revision: D56524960

Pulled By: iseeyuan

fbshipit-source-id: 2b4464028fe3cdf3c2b524d233fa3e87b2561dda
Differential Revision:
D56480274

Original commit changeset: 451116a0f907

Original Phabricator Diff: D56480274

fbshipit-source-id: e9603e5076113560b1224a56432abf321f82e284
Summary:
Pull Request resolved: #3300

This diff solves part of Ali's comments in our tracer sheet (https://docs.google.com/spreadsheets/d/1PoJt7P9qMkFSaMmS9f9j8dVcTFhOmNHotQYpwBySydI/edit#gid=0). Specifically speaking:

"NanoGPT" -> "nanoGPT"
"CoreML" -> "Core ML"
"ExecuTorch Codebase" -> "ExecuTorch codebase"
"Android Phone" -> "Android phone"
"How to build Mobile Apps" -> "How to Build Mobile Apps"

Also, shorten the following two column names to avoid overlapping:
"occurrences_in_delegated_graphs" -> "# in_delegated_graphs"
"occurrences_in_non_delegated_graphs" -> "# in_non_delegated_graphs"

Reviewed By: Jack-Khuu

Differential Revision: D56513601

fbshipit-source-id: 7015c2c5b94b79bc6c57c533ee812c9e58ab9d56
Summary: .

Reviewed By: cccclai

Differential Revision: D56532283

fbshipit-source-id: 62d7c9e8583fdb5c9a1b2e781e80799c06682aae
Summary: As titled

Reviewed By: lucylq, Gasoonjia, guangy10

Differential Revision: D56532035

fbshipit-source-id: ddf4f3864db0f200b97e67673a7086dac790eb82
Summary:
- add note for embedding quantize, for llama3
- re-order export args to be the same as llama2, group_size missing `--`

Pull Request resolved: #3315

Reviewed By: cccclai

Differential Revision: D56528535

Pulled By: lucylq

fbshipit-source-id: 4453070339ebdb3d782b45f96fe43d28c7006092
Summary: Pull Request resolved: #3298

Reviewed By: Olivia-liu

Differential Revision: D56509749

Pulled By: tarun292

fbshipit-source-id: 36b56e7cc039144105d64431697a16a793029af8
Summary: .

Reviewed By: cccclai

Differential Revision: D56535633

fbshipit-source-id: 070a3b0af9dea234f8ae4be01c37c03b4e0a56e6
…ner (#3324)

Summary:
**Summary of changes**:
- Update MPS documentation to reflect all changes since previous release
- Add helper script to build `mps_executor_runner`

**Testing**:
- Verified that mps_executor_runner builds correctly:
```
./examples/apple/mps/scripts/build_mps_executor_runner.sh
./examples/apple/mps/scripts/build_mps_executor_runner.sh --Debug
```
Verified that the docs are building correctly:
```
cd docs
make html
```

cc shoumikhin, cccclai

Pull Request resolved: #3324

Reviewed By: shoumikhin

Differential Revision: D56535774

Pulled By: cccclai

fbshipit-source-id: 5974795732dbe1089e3d63cd1b618cadf7a2573e
… as Custom Metal kernels are not yet enabled) (#3328)

Summary:
Remove the sorting of the nodes from partitioning (not needed for now, as Custom Metal kernels are not yet enabled)

**Testing**:
Verified that tracing works correctly with release branch:  `python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3"`

cc shoumikhin , cccclai

Pull Request resolved: #3328

Reviewed By: shoumikhin

Differential Revision: D56540389

Pulled By: cccclai

fbshipit-source-id: e8a53f624b58ac4d2348c87e08acd5f2fb3de5b2
Summary:
Pull Request resolved: #3299

1. Introduce a `CopyNode` for generic copy-with-offset operations.
2. `aten.repeat` on all dimensions.
2.1. Use `CopyNode` where possible.
2.2. Specialized `repeat_channel` shader to handle packings.
3. Update codegen to support `Methods`-variant-only operations. Need a new route to trigger the dispatch.
ghstack-source-id: 223812048

Reviewed By: copyrightly

Differential Revision: D56499329

fbshipit-source-id: 72936e621940588ce398dd62669ec9aa637e98ba
Summary: bring buck2 installation back, and scrub any "-DBUCK2=buck2" in our docs, to unblock users from using buck2

Reviewed By: guangy10

Differential Revision: D56540769

fbshipit-source-id: 363e592c17dd2747a693e59d8d6b6d20f43c8451
Summary:
- We register `select`, `unsqueeze` and `view` in `vulkan_partitioner.py` in order to run vulkan_delegate test (Python e2e test). The latter two might be used to implement `bmm` and `addmm`, so I want to make sure they work.
- We register `reshape` in `View.cpp` explicitly. `reshape` is implemented through `_reshape_alias` (see [this](https://www.internalfb.com/code/fbsource/[a3dd6401f00d73f09bbdea63887fef54ea2c6dd2]/fbcode/caffe2/aten/src/ATen/native/native_functions.yaml?lines=4872-4881)) which is [decomposed as `view`](https://www.internalfb.com/code/fbsource/[bbb783ae1cff98b3b549da3edd845dde946d3da8]/xplat/caffe2/torch/_decomp/decompositions.py?lines=3669-3672). For codegen test, we still need to register the op, otherwise there is error
```
C++ exception with description "Exception raised from get_op_fn at xplat/executorch/backends/vulkan/runtime/graph/ops/OperatorRegistry.cpp:20: (it != table_.end()) is false! Could not find operator with name aten.reshape.default" thrown in the test body.
```

Reviewed By: yipjustin, liuk22

Differential Revision: D56454941

fbshipit-source-id: c83e6fb97d9cf9019cc6e786508f353a22236931
Summary: Pull Request resolved: #3340

Reviewed By: orionr, kimishpatel, cccclai

Differential Revision: D56553088

Pulled By: mergennachin

fbshipit-source-id: 2994dd3ab2692c5b972316af1617bd06d647af96
Summary:
Right now we are not building it and it is causing missing ops in torchchat.

This PR adds it into pybinding.

Pull Request resolved: #3263

Reviewed By: lucylq

Differential Revision: D56500693

Pulled By: larryliu0820

fbshipit-source-id: 0ed0e28fcccb6002ef48e6a38b60e92d8af4def6
Summary: Pull Request resolved: #3345

Reviewed By: dbort

Differential Revision: D56557091

Pulled By: svekars

fbshipit-source-id: 4300ca86d01ec110fc6934588cd691c12661a730
Summary: Pull Request resolved: #2695

Reviewed By: mergennachin

Differential Revision: D55091814

fbshipit-source-id: 04b2a888c6bbdaa195cb916c6564aa93daca2514
Summary: In executorch we will dtype-specialize the kernels and also run on a single device with export. Therefore _to_copy is not needed in edge dialect.

Reviewed By: tugsbayasgalan

Differential Revision: D56579169

fbshipit-source-id: 5a2e3cd453a11bd2ad009b439587b0fc589f7fe4
Summary:
For executorch users, we see a common pattern that they have to:

```bash
bash install_requirements.sh --pybind xnnpack

cmake -S . -Bcmake-out ...

cmake --build ...
```

This is repeating cmake build twice, the first one is inside setup.py.

Here I'm adding a way to allow setup.py to install the libraries separately, by passing `CMAKE_ARGS` and `CMAKE_BUILD_ARGS` into setup.py through `install_requirements.sh`.

After this change, user can do:

```bash
export CMAKE_ARGS="-DCMAKE_INSTALL_PREFIX=<install dir> \
  -DEXECUTORCH_BUILD_OPTIMIZED=ON \
  ..."

export CMAKE_BUILD_ARGS="--target install"

bash install_requirements.sh --pybind xnnpack
```

Then we should be able to find `libxnnpack.a` `liboptimized_ops_lib.a` etc under install dir.

Pull Request resolved: #3349

Reviewed By: mikekgfb

Differential Revision: D56560786

Pulled By: larryliu0820

fbshipit-source-id: fb6cd230df2317067f07ae0f1e72d0596b7b454b
Reviewed By: cccclai

Differential Revision: D56543186

fbshipit-source-id: 4fed6b9b3ede3cdcb67a9a52150e3f22cc02b180
Summary:
Currently, we always build two copies of the flatcc targets, just in case we happen to be cross-compiling. But because the flatcc project puts its binaries in the source directory, those two copies can interfere with each other.

We don't need to build two copies when not cross-compiling, so add a new option to avoid the second "host" build.

Eventually we should only enable this when cross-compiling, but for now disable it when building the pip package (which is never cross-compiled).

Pull Request resolved: #3356

Test Plan: `rm -rf pip-out && ./install_requirements.sh` succeeded. Looking in the `pip-out/temp.*/cmake-out` directory, there is no `_host_build` directory, but the etdump headers were successfully generated under `pip-out/temp.*/cmake-out/sdk/include/executorch/sdk/etdump/`.

Reviewed By: malfet, larryliu0820

Differential Revision: D56582507

Pulled By: dbort

fbshipit-source-id: 4ce6c680657bc57cfcf016826364a3f46c4c953e
Summary: Pull Request resolved: #3358

Reviewed By: dbort

Differential Revision: D56584847

Pulled By: svekars

fbshipit-source-id: 77c4105edf15503bf1b29c1f120111a73b973c4c
Summary:
`libextension_data_loader.a` is not installed properly. This PR removes the prefix so that it can be properly installed

Pull Request resolved: #3355

Test Plan: See `libextension_data_loader.a` showing up under executorch/cmake-out/lib.

Reviewed By: lucylq, mikekgfb

Differential Revision: D56580943

Pulled By: larryliu0820

fbshipit-source-id: b771192d03799fd576e8591ec7c45fae23f20762
Summary: Pull Request resolved: #3364

Test Plan: https://docs-preview.pytorch.org/pytorch/executorch/3364/index.html

Reviewed By: svekars

Differential Revision: D56596949

Pulled By: dbort

fbshipit-source-id: f6c71e072bcefbb7d04354d1ef78d780c14facb5
Summary: The cpp op schema does not match the registered one. Fix that.

Reviewed By: tarun292, cccclai

Differential Revision: D56594373

fbshipit-source-id: cb4853030715245e7a0177c0f193c4558f19584d

pytorch-bot bot commented Apr 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3376

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label on Apr 26, 2024 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).