MobileNet: Update QNN EP example (#1563)
## Describe your changes
The MobileNet QNN EP example was outdated. QNN EP no longer requires an x86
environment for development and an ARM environment for inference. There is
now a single Windows x86 package that can do both.
- Removed mobilenet_qnn_ep.py launcher script since a workflow json is
enough.
- Added dynamic -> static shape step as a pass instead of during model
download. This makes the static shape requirement clear to the user.
- Simplified data config.
- `download_files.py` now passes the command to subprocess as a list
instead of a string. Otherwise, there are potential issues with paths
being split incorrectly.
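
The path-splitting issue from the last bullet can be illustrated with a small sketch. This is not the code from `download_files.py`: the directory and repository URL below are hypothetical, and `shlex.split` stands in for how a command string might be tokenized before it is executed.

```python
import shlex
import subprocess
import sys

# Hypothetical destination path that contains a space.
stage_dir = "C:/Users/Jane Doe/stage"

# String form: whitespace tokenization breaks the path into two arguments.
as_string = f"git clone https://example.com/repo.git {stage_dir}"
string_tokens = shlex.split(as_string)
print(string_tokens)

# List form: every element reaches the child process as exactly one argument.
as_list = ["git", "clone", "https://example.com/repo.git", stage_dir]
print(as_list)

# Harmless child process that reports how many arguments it received.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(len(sys.argv) - 1)", stage_dir],
    check=True,
    capture_output=True,
    text=True,
)
print(result.stdout.strip())
```

With the list form the path arrives in the child process as one argument, while the tokenized string yields one extra token for the part after the space.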

## Checklist before requesting a review
- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this
change to be included in the release notes.
- [ ] Is this PR including examples changes? If yes, please remember to
update [example
documentation](https://github.com/microsoft/Olive/blob/main/docs/source/examples.md)
in a follow-up PR.

## (Optional) Issue link
jambayk authored Jan 22, 2025
1 parent e510074 commit da3b967
Showing 10 changed files with 59 additions and 200 deletions.
60 changes: 13 additions & 47 deletions examples/mobilenet/README.md
@@ -1,17 +1,17 @@
# MobileNet optimization with QDQ Quantization on Qualcomm NPU
This folder contains a sample use case of Olive to optimize a MobileNet model for Qualcomm NPU (QNN Execution Provider)
using static QDQ quantization.
This folder contains a sample use case of Olive to optimize a MobileNet model for Qualcomm NPU (QNN Execution Provider) using static QDQ quantization.

## Prerequisites for Quantization
### Clone the repository and install Olive (x86 python)
This example requires an x86 python environment on a Windows ARM machine.

Refer to the instructions in the [examples README](../README.md) to clone the repository and install Olive.

### Install onnxruntime (x86 python)
This example requires onnxruntime>=1.17.0. Please install the latest version of onnxruntime:
## Prerequisites
### Clone the repository and install Olive

Refer to the instructions in the [examples README](../README.md) to clone the repository and install Olive.

### Install onnxruntime-qnn
```bash
python -m pip install "onnxruntime>=1.17.0"
python -m pip install onnxruntime-qnn
```

### Pip requirements
@@ -20,50 +20,16 @@ Install the necessary python packages:
python -m pip install -r requirements.txt
```

## Prerequisites for Evaluation

### Download and unzip QNN SDK
Download the Qualcomm AI Engine Direct SDK (QNN SDK) from https://qpm.qualcomm.com/main/tools/details/qualcomm_ai_engine_direct.

Complete the steps to configure the QNN SDK for QNN EP as described in the [QNN EP Documentation](https://onnxruntime.ai/docs/execution-providers/QNN-ExecutionProvider.html#running-a-quantized-model-on-windows-arm64).

Set the environment variable `QNN_LIB_PATH` as `QNN_SDK\lib\aarch64-windows-msvc`.

### Install onnxruntime-qnn (ARM64 python)
If you want to evaluate the quantized model on the NPU, you will need to install the onnxruntime-qnn package. This package is only available for Windows ARM64 python so you will need a separate ARM64 python installation to install it.

Using an ARM64 python installation, create a virtual environment and install the onnxruntime-qnn package:
```bash
python -m venv qnn-ep-env
qnn-ep-env\Scripts\activate
python -m pip install ort-nightly-qnn --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/ORT-Nightly/pypi/simple/
deactivate
### Download data and model
To download the necessary data and model files:
```

Set the environment variable `QNN_ENV_PATH` to the directory where the python executable is located:
```bash
set QNN_ENV_PATH=C:\path\to\qnn-ep-env\Scripts
python download_files.py
```

**Note:** Using a virtual environment is optional but recommended to better manage the dependencies.

## Run the sample

### Quantize the model
Run the following command to quantize the model:
```bash
python mobilenet_qnn_ep.py
```

### Quantize and evaluate the model
Run the following command to quantize the model and evaluate it on the NPU:
```bash
python mobilenet_qnn_ep.py --evaluate
olive run --config mobilenet_qnn_ep.json
```

**NOTE:** You can also only dump the workflow configuration file by adding the `--config_only` flag to the command.

The configuration file will be saved as `mobilenet_qnn_ep_{eval|no_eval}.json` in the current directory and can be run using the Olive CLI.
```bash
olive run --config mobilenet_qnn_ep_{eval|no_eval}.json
```
**NOTE:** The model optimization part of the workflow can also be done on a Linux machine with a different onnxruntime package installed. Remove the `"evaluators"` and `"evaluator"` sections from the `mobilenet_qnn_ep.json` configuration file to skip the evaluation step.
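
Removing the two evaluation sections can also be scripted. The following is a minimal sketch, assuming both keys sit at the top level of the workflow JSON. The `strip_evaluation` helper and the abbreviated config text are hypothetical stand-ins, not part of Olive or of this commit.

```python
import json

# Abbreviated, hypothetical stand-in for mobilenet_qnn_ep.json.
config_text = """
{
  "evaluators": {"common_evaluator": {"metrics": []}},
  "passes": {"qnn_preprocess": {"type": "QNNPreprocess"}},
  "target": "local_system",
  "evaluator": "common_evaluator",
  "output_dir": "models/mobilenet_qnn_ep"
}
"""


def strip_evaluation(config: dict) -> dict:
    """Return a copy of the config without the evaluation sections."""
    skipped = ("evaluators", "evaluator")
    return {key: value for key, value in config.items() if key not in skipped}


optimize_only = strip_evaluation(json.loads(config_text))
print(json.dumps(optimize_only, indent=2))
```

The result could be written to a separate file and passed to `olive run --config`, which leaves the original configuration untouched.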
2 changes: 1 addition & 1 deletion examples/mobilenet/README_QNN_SDK.md
@@ -17,7 +17,7 @@ olive configure-qualcomm-sdk --py_version 3.8 --sdk qnn

### Prepare workflow config json
```
python prepare_config.py --use_raw_qnn_sdk
python prepare_config.py
```

### Pip requirements
7 changes: 2 additions & 5 deletions examples/mobilenet/download_files.py
@@ -54,10 +54,7 @@ def download_model():
tar_ref.extractall(stage_dir) # lgtm
original_model_path = stage_dir / mobilenet_name / f"{mobilenet_name}.onnx"
model_path = models_dir / f"{mobilenet_name}.onnx"
run_subprocess(
f"python -m onnxruntime.tools.make_dynamic_shape_fixed {original_model_path} {model_path} --dim_param"
" batch_size --dim_value 1"
)
shutil.copy(original_model_path, model_path)


def download_eval_data():
@@ -69,7 +66,7 @@ def download_eval_data():

# download evaluation data
github_source = "https://github.com/EliSchwartz/imagenet-sample-images.git"
run_subprocess(f"git clone {github_source} {stage_dir}")
run_subprocess(["git", "clone", github_source, stage_dir], check=True)

# sort jpegs
jpegs = list(stage_dir.glob("*.JPEG"))
@@ -4,27 +4,15 @@
"local_system": {
"type": "LocalSystem",
"accelerators": [ { "execution_providers": [ "QNNExecutionProvider" ] } ]
},
"qnn_ep_env": {
"type": "IsolatedORT",
"python_environment_path": "<qnn_env_path>",
"accelerators": [ { "execution_providers": [ "QNNExecutionProvider" ] } ],
"preprend_to_path": [ "<qnn_lib_path>" ]
}
},
"data_configs": [
{
"name": "metric_data_config",
"name": "mobilenet_data_config",
"user_script": "user_script.py",
"load_dataset_config": { "type": "qnn_evaluation_dataset", "data_dir": "data/eval" },
"post_process_data_config": { "type": "qnn_post_process" },
"load_dataset_config": { "type": "mobilenet_dataset", "data_dir": "data/eval" },
"post_process_data_config": { "type": "mobilenet_post_process" },
"dataloader_config": { "batch_size": 1 }
},
{
"name": "quant_data_config",
"user_script": "user_script.py",
"load_dataset_config": { "type": "simple_dataset" },
"dataloader_config": { "type": "mobilenet_calibration_reader", "data_dir": "data/eval" }
}
],
"evaluators": {
@@ -33,7 +21,7 @@
{
"name": "accuracy",
"type": "accuracy",
"data_config": "metric_data_config",
"data_config": "mobilenet_data_config",
"sub_types": [
{
"name": "accuracy_score",
@@ -45,25 +33,26 @@
{
"name": "latency",
"type": "latency",
"data_config": "metric_data_config",
"data_config": "mobilenet_data_config",
"sub_types": [ { "name": "avg", "priority": 2 } ]
}
]
}
},
"passes": {
"dynamic_shape_to_fixed": { "type": "DynamicToFixedShape", "dim_param": [ "batch_size" ], "dim_value": [ 1 ] },
"qnn_preprocess": { "type": "QNNPreprocess" },
"quantization": {
"type": "OnnxStaticQuantization",
"data_config": "quant_data_config",
"data_config": "mobilenet_data_config",
"activation_type": "QUInt16",
"weight_type": "QUInt8"
}
},
"target": "<target>",
"evaluator": "<evaluator>",
"host": "local_system",
"target": "local_system",
"evaluator": "common_evaluator",
"evaluate_input_model": false,
"cache_dir": "cache",
"clean_cache": true,
"output_dir": "models/mobilenet_qnn_ep"
}
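
Because the data configs were renamed and merged in this file, every `data_config` reference must still point at a defined entry. The check below is a sketch only: `undefined_data_configs` is a hypothetical helper, and it assumes metrics are listed under a `"metrics"` key inside each evaluator, which the hunks above do not show.

```python
def undefined_data_configs(config: dict) -> set:
    """Return data_config names that are referenced but never defined."""
    defined = {entry["name"] for entry in config.get("data_configs", [])}
    referenced = set()
    # Metrics inside each evaluator may reference a data config.
    for evaluator in config.get("evaluators", {}).values():
        for metric in evaluator.get("metrics", []):
            if "data_config" in metric:
                referenced.add(metric["data_config"])
    # Passes such as quantization may reference one as well.
    for pass_config in config.get("passes", {}).values():
        if "data_config" in pass_config:
            referenced.add(pass_config["data_config"])
    return referenced - defined


# Abbreviated config with one stale reference left over from the old names.
stale = {
    "data_configs": [{"name": "mobilenet_data_config"}],
    "evaluators": {
        "common_evaluator": {
            "metrics": [{"name": "accuracy", "data_config": "mobilenet_data_config"}]
        }
    },
    "passes": {
        "quantization": {"type": "OnnxStaticQuantization", "data_config": "quant_data_config"}
    },
}
print(undefined_data_configs(stale))

# Pointing the pass at the merged data config resolves the reference.
stale["passes"]["quantization"]["data_config"] = "mobilenet_data_config"
print(undefined_data_configs(stale))
```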
89 changes: 0 additions & 89 deletions examples/mobilenet/mobilenet_qnn_ep.py

This file was deleted.

11 changes: 1 addition & 10 deletions examples/mobilenet/prepare_config.py
@@ -2,7 +2,6 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# --------------------------------------------------------------------------
import argparse
import json
import platform
from pathlib import Path
@@ -41,12 +40,4 @@ def raw_qnn_config():


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--use_raw_qnn_sdk",
action="store_true",
help="If set, use the raw qnn sdk instead of the qnn EP",
)
args = parser.parse_args()
if args.use_raw_qnn_sdk:
raw_qnn_config()
raw_qnn_config()
1 change: 1 addition & 0 deletions examples/mobilenet/raw_qnn_sdk_template.json
@@ -43,6 +43,7 @@
}
},
"passes": {
"dynamic_shape_to_fixed": { "type": "DynamicToFixedShape", "dim_param": [ "batch_size" ], "dim_value": [ 1 ] },
"converter": { "type": "QNNConversion" },
"quantization": { "type": "QNNConversion", "extra_args": "--input_list <input_list.txt>" },
"build_model_lib": { "type": "QNNModelLibGenerator", "lib_targets": "x86_64-linux-clang" }
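
The `dynamic_shape_to_fixed` pass added above replaces the symbolic `batch_size` dimension with the value 1. The sketch below illustrates that idea on a plain shape list. It is not how Olive or onnxruntime implement the pass, `fix_dim_params` is a hypothetical helper, and the `[batch_size, 3, 224, 224]` input shape is an assumption about the model.

```python
def fix_dim_params(shape, dim_param, dim_value):
    """Replace named (symbolic) dimensions in `shape` with fixed values."""
    mapping = dict(zip(dim_param, dim_value))
    # Integer dimensions are already fixed; only named ones are looked up.
    return [mapping.get(dim, dim) if isinstance(dim, str) else dim for dim in shape]


# Same arguments as the pass config: dim_param ["batch_size"], dim_value [1].
dynamic_shape = ["batch_size", 3, 224, 224]
static_shape = fix_dim_params(dynamic_shape, ["batch_size"], [1])
print(static_shape)
```

Symbolic dimensions that are not listed in `dim_param` are left unchanged by this sketch, so a remaining string in the result signals a dimension that is still dynamic.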
36 changes: 18 additions & 18 deletions examples/mobilenet/user_script.py
@@ -6,11 +6,13 @@

import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset
from torch.utils.data import Dataset

from olive.data.registry import Registry
from olive.platform_sdk.qualcomm.utils.data_loader import FileListProcessedDataLoader

# QNN EP dataset and post-process functions


class MobileNetDataset(Dataset):
def __init__(self, data_dir: str):
@@ -28,7 +30,21 @@ def __len__(self):
def __getitem__(self, idx):
data = torch.unsqueeze(self.data[idx], dim=0)
label = self.labels[idx] if self.labels is not None else -1
return {"input": data}, label
# need to remove the batch dimension, will be added by the dataloader
return {"input": data.squeeze(0)}, label


@Registry.register_dataset()
def mobilenet_dataset(data_dir, **kwargs):
return MobileNetDataset(data_dir)


@Registry.register_post_process()
def mobilenet_post_process(output):
return output.argmax(axis=1)


# QNN SDK dataloader and post-process functions


@Registry.register_dataloader()
@@ -40,22 +56,6 @@ def qnn_dataloader(dataset, data_dir: str, batch_size: int, **kwargs):
)


@Registry.register_dataset()
def qnn_evaluation_dataset(data_dir, **kwargs):
return MobileNetDataset(data_dir)


@Registry.register_post_process()
def qnn_post_process(output):
return output.argmax(axis=1)


@Registry.register_post_process()
def qnn_sdk_post_process(output):
return np.array([output.argmax(axis=-1)])


@Registry.register_dataloader()
def mobilenet_calibration_reader(dataset, batch_size, data_dir, **kwargs):
dataset = MobileNetDataset(data_dir)
return DataLoader(dataset, batch_size=batch_size, shuffle=False)
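
The `squeeze(0)` added to `__getitem__` matters because the dataloader adds its own leading batch axis when it stacks samples. A plain-Python sketch of the shape arithmetic follows. `batched_shape` is a hypothetical helper, and the `3 x 224 x 224` image shape is an assumption rather than something stated in this commit.

```python
def batched_shape(sample_shape, batch_size):
    """Shape after a loader stacks `batch_size` samples along a new leading axis."""
    return (batch_size, *sample_shape)


# Before the change: each sample already carried a batch dimension of 1.
with_batch_dim = batched_shape((1, 3, 224, 224), batch_size=1)
print(with_batch_dim)

# After the change: the sample is squeezed, so the loader adds the only batch axis.
squeezed = batched_shape((3, 224, 224), batch_size=1)
print(squeezed)
```

Only the squeezed variant produces the four-dimensional input that matches a model whose batch dimension has been fixed to 1.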