looking for qfd360_sl_model.pt for facedetlite model.py #146
Comments
The Hugging Face repo hosts the three target formats (QNN, ONNX, LiteRT) and not the TorchScript model / weights. Please let us know if you hit any issues when running the export scripts.
Thank you for writing.
Google Gemini tells me that export.py will "compile" the specified AI model for inference on the selected Snapdragon hardware (see below). But it is not clear that it will be possible to send arbitrary inputs to the compiled model on the remote Snapdragon hardware and receive outputs: export.py "Submits an inference job to run the compiled model on sample inputs and collect output data", i.e., inference only on "sample inputs". It is also not clear that I will be able to combine the operations of two or more AI Hub Models and use arbitrary inputs.
So, for development, I wanted to run the AI Hub models/weights on a local Windows PC (not Snapdragon). It seems that there is a Python script for each model, e.g., a ShuffleNetV2 model.py, that can in some manner be run on my local PC. I will probably be able to use Gemini to build a standalone ShuffleNetV2 script and inter-operate two or more AI Hub models on a local PC for development, along the lines of the sketch below.
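For reference, a minimal sketch of what running one of these models locally on CPU might look like. It assumes the ShuffleNetV2 wrapper in qai_hub_models exposes a from_pretrained() constructor and behaves as a plain torch.nn.Module, as other models in the repo do; the 224x224 input shape is also an assumption, not verified against the model card.

```python
# Minimal sketch: run an AI Hub model locally on CPU (no Snapdragon device needed).
# Assumptions: the ShuffleNetV2 wrapper exposes from_pretrained() and a standard
# torch.nn.Module forward(); the 1x3x224x224 input shape is assumed, not verified.
import torch

from qai_hub_models.models.shufflenet_v2.model import ShuffleNetV2

model = ShuffleNetV2.from_pretrained()  # downloads the torch weights locally
model.eval()

x = torch.rand(1, 3, 224, 224)  # arbitrary input, not just AI Hub "sample inputs"
with torch.no_grad():
    logits = model(x)
print(logits.shape)
# A second model's forward() could be chained on these outputs the same way,
# which is the "inter-operate two or more models" use case described above.
```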
Google Gemini tells me:
The code you provided defines a script for exporting a ShuffleNetV2 model,
optimizing it for on-device inference, and optionally profiling and
inferencing it on a target device using the Qualcomm AI Hub. Let's break
down what it exports:
*Key Exports:*

1. *Compiled Model:*
   - The primary output of this script is a compiled version of the ShuffleNetV2 model.
   - This compiled model is optimized for a specific target device and runtime (e.g., TFLite, QNN).
   - It's the essential artifact needed to deploy the model for on-device inference.
   - The compiled model is saved to the specified output_dir (or build/shufflenet_v2 by default).

2. *Profiling Data (Optional):*
   - If skip_profiling is False, the script submits a profiling job to Qualcomm AI Hub.
   - The job runs the compiled model on a real device and collects performance metrics (latency, memory usage, etc.).
   - This data can be downloaded and analyzed to understand the model's performance characteristics on the target hardware.
   - The print_profile_metrics_from_job function displays a summary of this data.

3. *Inference Results (Optional):*
   - If skip_inferencing is False, the script submits an inference job to Qualcomm AI Hub.
   - The job runs the compiled model on the target device using sample input data.
   - The output of the inference is downloaded.
   - The print_inference_metrics function compares the on-device inference results with the original PyTorch model's output to assess accuracy.

4. *Metadata and Reports:*
   - The script generates metadata about the compile, profile, and inference jobs, stored in hub.client.CompileJob, hub.client.ProfileJob, and hub.client.InferenceJob objects, respectively. These are part of the ExportResult returned by export_model (see the sketch after this list).
   - It also prints a command-line example (using print_on_target_demo_cmd) that demonstrates how to run the compiled model on the target device using the qai_hub_model_tools package, a locally available set of tools. This helps users get started with deploying and testing their exported models.
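Since item 4 references the job objects, here is a minimal sketch of how those artifacts might be retrieved after export_model returns. The attribute names on ExportResult (compile_job, profile_job, inference_job) are assumptions based on the description above; the job methods themselves are from the public qai_hub client API.

```python
# Minimal sketch: pull artifacts off the job objects returned by export_model.
# Assumption: ExportResult exposes compile_job / profile_job / inference_job
# attributes; verify against the actual export.py before relying on this.
from qai_hub_models.models.shufflenet_v2.export import export_model

result = export_model(device="Samsung Galaxy S23")  # device name is illustrative

compiled = result.compile_job.get_target_model()     # hub.client.CompileJob
compiled.download("build/shufflenet_v2/model.tflite")

if result.profile_job is not None:                   # hub.client.ProfileJob
    profile = result.profile_job.download_profile()  # latency, memory, etc.

if result.inference_job is not None:                 # hub.client.InferenceJob
    outputs = result.inference_job.download_output_data()
```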
*In essence, the script exports the following:*

- *Tangible Artifacts:*
  - The compiled model file, ready for on-device deployment.
  - Profiling data (if requested).
  - Inference output data (if requested).
- *Intangible Outputs:*
  - Job metadata for tracking and management.
  - Printed summaries of profiling and inference results.
  - A command-line example for on-device execution.
*How it Works (Simplified):*

1. *Model Preparation:* Loads the ShuffleNetV2 model from qai_hub_models, traces it using torch.jit.trace, and prepares it for compilation.
2. *Compilation:* Submits a compile job to Qualcomm AI Hub to convert the traced model into a device-optimized format.
3. *Profiling (Optional):* Submits a profile job to run the compiled model on a real device and collect performance metrics.
4. *Inference (Optional):* Submits an inference job to run the compiled model on sample inputs and collect output data.
5. *Download & Summary:* Downloads the compiled model and optionally the profiling/inference results, then prints summaries and instructions. (A rough code sketch of this flow follows below.)
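To make steps 1–4 concrete, here is a condensed sketch of the flow using the public qai_hub client API. This is not the actual export.py; the device name and input shape are illustrative examples.

```python
# Condensed sketch of the export flow (not the actual export.py).
import qai_hub as hub
import torch
from torchvision.models import shufflenet_v2_x1_0

# 1. Model preparation: load and trace the PyTorch model.
model = shufflenet_v2_x1_0(weights="DEFAULT").eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

device = hub.Device("Samsung Galaxy S23")  # illustrative device name

# 2. Compilation: convert the traced model to a device-optimized format.
compile_job = hub.submit_compile_job(
    model=traced, device=device, input_specs=dict(image=(1, 3, 224, 224))
)
target_model = compile_job.get_target_model()

# 3. Profiling: run on a real device and collect performance metrics.
profile_job = hub.submit_profile_job(model=target_model, device=device)

# 4. Inference: run the compiled model on sample inputs.
inference_job = hub.submit_inference_job(
    model=target_model, device=device, inputs=dict(image=[example.numpy()])
)
on_device_output = inference_job.download_output_data()
```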
*In Summary:*
The script's main purpose is to export a ready-to-deploy, optimized version
of the ShuffleNetV2 model for a specified target device using the Qualcomm
AI Hub platform. It also provides tools and information to help users
profile, test, and deploy their models effectively.
On Tue, Jan 7, 2025 at 3:25 PM Shreya Jain wrote:
When you run export.py for this model, the weights used for the model are downloaded to your local machine. Then, the model is loaded with the downloaded weights and a traced TorchScript model is created. This model is then uploaded to AI Hub to be compiled to run on device. Similarly, for quantized models, you can run the export script to get the model files.
The Hugging Face repo hosts the three target formats (QNN, ONNX, LiteRT) and not the TorchScript model / weights.
The example model.py at https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/face_det_lite/model.py
and
https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized
reference a parameter checkpoint file named qfd360_sl_model.pt:
DEFAULT_WEIGHTS = "qfd360_sl_model.pt"
But this checkpoint file is not provided in the adjacent https://github.com/quic/ai-hub-models/tree/main/qai_hub_models/models/face_det_lite
At this other location, https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized/tree/main, there are "quantized" model weights associated with qualcomm Lightweight-Face-Detection-Quantized.
So there is a file mismatch between model.py (looking for qfd360_sl_model.pt) and the pretrained model parameters available elsewhere. Therefore: 1) please explain how to convert model.py to load parameters from the files available at https://huggingface.co/qualcomm/Lightweight-Face-Detection-Quantized/tree/main,
and 2) please provide the referenced qfd360_sl_model.pt at https://github.com/quic/ai-hub-models/blob/main/qai_hub_models/models/face_det_lite/
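In the meantime, for item 1, a minimal sketch of running the ONNX artifact from the Hugging Face repo locally on a Windows PC with onnxruntime. The filename and input shape below are assumptions, not taken from the repo; check the actual files and model card for the real values.

```python
# Minimal sketch: run a downloaded ONNX artifact locally on CPU with onnxruntime.
# The filename and 1x3x480x640 input shape are assumptions; a quantized model may
# also expect a different dtype, so inspect the session inputs first.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("Lightweight-Face-Detection-Quantized.onnx")
inp = session.get_inputs()[0]
print(inp.name, inp.shape, inp.type)  # inspect the real input spec

dummy = np.random.rand(1, 3, 480, 640).astype(np.float32)  # assumed shape/dtype
outputs = session.run(None, {inp.name: dummy})
for out in outputs:
    print(out.shape)
```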