Fixed QNN data format config issue. #480

Open: wants to merge 4 commits into base `main`.
```diff
@@ -166,8 +166,8 @@ def generate_data_format_config(
     for output in onnx_model.graph.output:
         if "past_key" in output.name or "past_value" in output.name:
             kv_nodes.append(output.name)
-            kv_overrides = {}
+    kv_overrides = {}
     kv_overrides["graphs"] = [
         {
             "graph_name": model_dlc_name + "_configuration_1",
```
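The change above moves the `kv_overrides` initialization out of the loop, so the dict is no longer re-created for every matching output. A minimal sketch of the fixed flow (the helper name and the fake graph objects below are illustrative, not part of the PR):

```python
from types import SimpleNamespace


def collect_kv_overrides(graph_outputs, model_dlc_name):
    """Sketch of the fixed logic: gather KV-cache output names first,
    then build the overrides dict once, outside the loop."""
    kv_nodes = []
    for output in graph_outputs:
        if "past_key" in output.name or "past_value" in output.name:
            kv_nodes.append(output.name)

    kv_overrides = {}  # created once, after the loop (the fix in this PR)
    kv_overrides["graphs"] = [
        {
            "graph_name": model_dlc_name + "_configuration_1",
            # remaining per-graph settings are elided in the diff
        }
    ]
    return kv_nodes, kv_overrides


# Usage with stand-in graph outputs:
outputs = [SimpleNamespace(name=n)
           for n in ("logits", "past_key.0", "past_value.0")]
nodes, overrides = collect_kv_overrides(outputs, "gpt2")
print(nodes)  # ['past_key.0', 'past_value.0']
```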
docs/source/quick_start.md: 23 additions & 1 deletion
@@ -94,7 +94,7 @@ python -m QEfficient.cloud.execute --model_name gpt2 --qpc_path qeff_models/gpt2
You can run finetuning with a set of predefined datasets on QAIC using the eager pipeline:

```bash
python -m QEfficient.cloud.finetune --device qaic:0 --use-peft --output_dir ./meta-sam --num_epochs 2 --context_length 256
```
For more details on finetuning, check out the subsection.

@@ -138,6 +138,28 @@ Users can compile a model with the QNN SDK by following the steps below:
* Enable QNN by passing the `enable_qnn` flag: add `--enable_qnn` to the CLI command.
* An optional config file can be passed to override the default parameters.

**Default Parameters**

QNN Converter Stage:

"--float_bias_bitwidth 32 --float_bitwidth 16 --preserve_io_datatype --onnx_skip_simplification --target_backend AIC"

QNN Context Binary Stage:

LOG_LEVEL = "error"
COMPILER_COMPILATION_TARGET = "hardware"
COMPILER_CONVERT_TO_FP16 = True
COMPILER_DO_DDR_TO_MULTICAST = True
COMPILER_HARDWARE_VERSION = "2.0"
COMPILER_PERF_WARNINGS = False
COMPILER_PRINT_DDR_STATS = False
COMPILER_PRINT_PERF_METRICS = False
COMPILER_RETAINED_STATE = True
COMPILER_STAT_LEVEL = 10
COMPILER_STATS_BATCH_SIZE = 1
COMPILER_TIME_PASSES = False

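The optional config file mentioned above can override these defaults. The exact schema is not shown in this excerpt, so the fragment below is only an illustrative sketch: the keys simply mirror the parameter names listed above and may differ from the real file format.

```json
{
  "LOG_LEVEL": "debug",
  "COMPILER_STAT_LEVEL": 40,
  "COMPILER_PRINT_PERF_METRICS": true
}
```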

**CLI Inference Command**

Without QNN Config