looking for qfd360_sl_model.pt for facedetlite model.py #146
Comments
The Hugging Face repo hosts the three target formats (QNN, ONNX, LiteRT) and not the TorchScript model / weights. Please let us know if you hit any issues when running the export scripts.
Thank you for writing.
Google Gemini tells me that export.py will "compile" the specified AI model for inference on the selected Snapdragon hardware (see below). But it is not clear that it will be possible to send arbitrary inputs to the compiled model on the remote Snapdragon hardware and receive outputs: export.py "Submits an inference job to run the compiled model on sample inputs and collect output data", i.e., inference only on "sample inputs". It is also not clear that I will be able to combine the operations of two or more AI Hub Models and use arbitrary inputs.
So, for development, I wanted to run the AI Hub models/weights on a local Windows PC (not Snapdragon). It seems that there is a Python script for each model, e.g., a ShuffleNetV2 model.py, that can in some manner be run on my local PC. I will probably be able to use Gemini to build a standalone ShuffleNetV2 script and inter-operate two or more AI Hub models on a local PC for development, along the lines of the sketch below.
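For reference, a minimal sketch of what running one of these models locally on CPU might look like. It assumes the ShuffleNetV2 wrapper in qai_hub_models exposes a from_pretrained() constructor and behaves as a plain torch.nn.Module, as other models in the repo do; the 224x224 input shape is also an assumption, not verified against the model card.

```python
# Minimal sketch: run an AI Hub model locally on CPU (no Snapdragon device needed).
# Assumptions: the ShuffleNetV2 wrapper exposes from_pretrained() and a standard
# torch.nn.Module forward(); the 1x3x224x224 input shape is assumed, not verified.
import torch

from qai_hub_models.models.shufflenet_v2.model import ShuffleNetV2

model = ShuffleNetV2.from_pretrained()  # downloads the torch weights locally
model.eval()

x = torch.rand(1, 3, 224, 224)  # arbitrary input, not just AI Hub "sample inputs"
with torch.no_grad():
    logits = model(x)
print(logits.shape)
# A second model's forward() could be chained on these outputs the same way,
# which is the "inter-operate two or more models" use case described above.
```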
Google Gemini tells me:
The code you provided defines a script for exporting a ShuffleNetV2 model,
optimizing it for on-device inference, and optionally profiling and
inferencing it on a target device using the Qualcomm AI Hub. Let's break
down what it exports:
*Key Exports:*

1. *Compiled Model:*
   - The primary output of this script is a compiled version of the ShuffleNetV2 model.
   - This compiled model is optimized for a specific target device and runtime (e.g., TFLite, QNN).
   - It's the essential artifact needed to deploy the model for on-device inference.
   - The compiled model is saved to the specified output_dir (or build/shufflenet_v2 by default).

2. *Profiling Data (Optional):*
   - If skip_profiling is False, the script submits a profiling job to Qualcomm AI Hub.
   - The job runs the compiled model on a real device and collects performance metrics (latency, memory usage, etc.).
   - This data can be downloaded and analyzed to understand the model's performance characteristics on the target hardware.
   - The print_profile_metrics_from_job function displays a summary of this data.

3. *Inference Results (Optional):*
   - If skip_inferencing is False, the script submits an inference job to Qualcomm AI Hub.
   - The job runs the compiled model on the target device using sample input data.
   - The output of the inference is downloaded.
   - The print_inference_metrics function compares the on-device inference results with the original PyTorch model's output to assess accuracy.

4. *Metadata and Reports:*
   - The script generates metadata about the compile, profile, and inference jobs, stored in hub.client.CompileJob, hub.client.ProfileJob, and hub.client.InferenceJob objects, respectively. These are part of the ExportResult returned by export_model (see the sketch after this list).
   - It also prints a command-line example (using print_on_target_demo_cmd) that demonstrates how to run the compiled model on the target device using the qai_hub_model_tools package, a locally available set of tools. This helps users get started with deploying and testing their exported models.
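Since item 4 references the job objects, here is a minimal sketch of how those artifacts might be retrieved after export_model returns. The attribute names on ExportResult (compile_job, profile_job, inference_job) are assumptions based on the description above; the job methods themselves are from the public qai_hub client API.

```python
# Minimal sketch: pull artifacts off the job objects returned by export_model.
# Assumption: ExportResult exposes compile_job / profile_job / inference_job
# attributes; verify against the actual export.py before relying on this.
from qai_hub_models.models.shufflenet_v2.export import export_model

result = export_model(device="Samsung Galaxy S23")  # device name is illustrative

compiled = result.compile_job.get_target_model()     # hub.client.CompileJob
compiled.download("build/shufflenet_v2/model.tflite")

if result.profile_job is not None:                   # hub.client.ProfileJob
    profile = result.profile_job.download_profile()  # latency, memory, etc.

if result.inference_job is not None:                 # hub.client.InferenceJob
    outputs = result.inference_job.download_output_data()
```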
*In essence, the script exports the following:*

- *Tangible Artifacts:*
  - The compiled model file, ready for on-device deployment.
  - Profiling data (if requested).
  - Inference output data (if requested).
- *Intangible Outputs:*
  - Job metadata for tracking and management.
  - Printed summaries of profiling and inference results.
  - A command-line example for on-device execution.
*How it Works (Simplified):*

1. *Model Preparation:* Loads the ShuffleNetV2 model from qai_hub_models, traces it using torch.jit.trace, and prepares it for compilation.
2. *Compilation:* Submits a compile job to Qualcomm AI Hub to convert the traced model into a device-optimized format.
3. *Profiling (Optional):* Submits a profile job to run the compiled model on a real device and collect performance metrics.
4. *Inference (Optional):* Submits an inference job to run the compiled model on sample inputs and collect output data.
5. *Download & Summary:* Downloads the compiled model and optionally the profiling/inference results, then prints summaries and instructions. (A rough code sketch of this flow follows below.)
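To make steps 1–4 concrete, here is a condensed sketch of the flow using the public qai_hub client API. This is not the actual export.py; the device name and input shape are illustrative examples.

```python
# Condensed sketch of the export flow (not the actual export.py).
import qai_hub as hub
import torch
from torchvision.models import shufflenet_v2_x1_0

# 1. Model preparation: load and trace the PyTorch model.
model = shufflenet_v2_x1_0(weights="DEFAULT").eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

device = hub.Device("Samsung Galaxy S23")  # illustrative device name

# 2. Compilation: convert the traced model to a device-optimized format.
compile_job = hub.submit_compile_job(
    model=traced, device=device, input_specs=dict(image=(1, 3, 224, 224))
)
target_model = compile_job.get_target_model()

# 3. Profiling: run on a real device and collect performance metrics.
profile_job = hub.submit_profile_job(model=target_model, device=device)

# 4. Inference: run the compiled model on sample inputs.
inference_job = hub.submit_inference_job(
    model=target_model, device=device, inputs=dict(image=[example.numpy()])
)
on_device_output = inference_job.download_output_data()
```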
*In Summary:*
The script's main purpose is to export a ready-to-deploy, optimized version
of the ShuffleNetV2 model for a specified target device using the Qualcomm
AI Hub platform. It also provides tools and information to help users
profile, test, and deploy their models effectively.
On Tue, Jan 7, 2025 at 3:25 PM Shreya Jain wrote:
When you run export.py for this model, the weights used for the model are downloaded to your local machine. Then, the model is loaded with the downloaded weights and a traced TorchScript model is created. This model is then uploaded to AI Hub to be compiled to run on device. Similarly, for quantized models, you can run the export script to get the model files.
The Hugging Face repo hosts the three target formats (QNN, ONNX, LiteRT) and not the TorchScript model / weights.
The example model.py at https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/face_det_lite/model.py
and
https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized
reference a parameter checkpoint file named qfd360_sl_model.pt:
DEFAULT_WEIGHTS = "qfd360_sl_model.pt"
But this checkpoint file is not provided in the adjacent https://github.com/quic/ai-hub-models/tree/main/qai_hub_models/models/face_det_lite
At this other location, https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized/tree/main, there are "quantized" model weights associated with qualcomm Lightweight-Face-Detection-Quantized.
So there is a file mismatch between model.py (looking for qfd360_sl_model.pt) and the pretrained model parameters available elsewhere. Therefore: 1) please explain how to convert model.py to load parameters from the files available at https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized/tree/main,
and 2) please provide the referenced qfd360_sl_model.pt at https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/face_det_lite/
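In the meantime, for item 1, a minimal sketch of running the ONNX artifact from the Hugging Face repo locally on a Windows PC with onnxruntime. The filename and input shape below are assumptions, not taken from the repo; check the actual files and model card for the real values.

```python
# Minimal sketch: run a downloaded ONNX artifact locally on CPU with onnxruntime.
# The filename and 1x3x480x640 input shape are assumptions; a quantized model may
# also expect a different dtype, so inspect the session inputs first.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("Lightweight-Face-Detection-Quantized.onnx")
inp = session.get_inputs()[0]
print(inp.name, inp.shape, inp.type)  # inspect the real input spec

dummy = np.random.rand(1, 3, 480, 640).astype(np.float32)  # assumed shape/dtype
outputs = session.run(None, {inp.name: dummy})
for out in outputs:
    print(out.shape)
```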