Ai benchmark improvements #26

Open
wants to merge 8 commits into
base: master

@rairatne rairatne commented Jun 1, 2023

Modified nn-hal to improve memory utilization and scores on the Ai_Benchmark app.
Improvements include:
- better memory usage during parallel inference
- more operations enabled/added with float 16 support
- offload to remote infer if available
- offload to remote only if the model is a non-quant type
- for now, remote infer is only supported if nn-api calls execute synchronously
- enable parallel remote inference
- support dynamic input shapes and data types for remote infer

Tracked-On: OAM-109729

- loadmodel rpc call added after sending IR files
- included data_type parameter for remote input data
- add check for remote output length

Tracked-On: OAM-110555
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
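
The offload conditions listed above (remote service available, non-quant model, synchronous execution) can be read as one predicate. The following is a minimal sketch under that reading, not the code from this PR; `ExecutionPath`, `ModelTraits`, and `shouldUseRemoteInfer` are hypothetical names chosen for illustration.

```cpp
// Hypothetical types and names; none of them are taken from the nn-hal sources.
enum class ExecutionPath { Synchronous, Asynchronous, Fenced };

struct ModelTraits {
    bool isQuantized;  // true when the model uses quantized operand types
};

// Returns true only when all three conditions from the description hold:
// the remote service is reachable, the model is non-quant, and the
// request arrived through the synchronous execute path.
bool shouldUseRemoteInfer(bool remoteAvailable,
                          const ModelTraits& model,
                          ExecutionPath path) {
    if (!remoteAvailable) return false;
    if (model.isQuantized) return false;
    return path == ExecutionPath::Synchronous;
}
```

Keeping the decision in a single function would make it easy to relax one condition later, for example once remote inference is supported on the asynchronous path.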
rairatne and others added 7 commits June 2, 2023 14:24
- increases grpc message size limit to INT_MAX
- increases deadline for remote model load time to 3 minutes
- modified mRemoteCheck from global to class member
- improved remote checks
- increase chunk size from 1 MB to 10 MB

Tracked-On: OAM-110557
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
Signed-off-by: Anoob Anto K <[email protected]>
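
To illustrate the chunk size change in the commit above, here is a small sketch of splitting a payload into fixed-size chunks. It is an assumption-laden illustration, not the PR's transfer code: `kChunkSize` and `splitIntoChunks` are hypothetical names, and 10 MB is taken to mean 10 * 1024 * 1024 bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Assumed chunk size: 10 MB interpreted as binary megabytes.
constexpr std::size_t kChunkSize = 10 * 1024 * 1024;

// Returns (offset, length) pairs that cover [0, totalSize) in order.
// Every chunk has length chunkSize except possibly the last one.
std::vector<std::pair<std::size_t, std::size_t>> splitIntoChunks(
        std::size_t totalSize, std::size_t chunkSize = kChunkSize) {
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    if (chunkSize == 0) return chunks;  // guard against an endless loop
    for (std::size_t offset = 0; offset < totalSize; offset += chunkSize) {
        chunks.emplace_back(offset, std::min(chunkSize, totalSize - offset));
    }
    return chunks;
}
```

With this layout a 25 MB file would be sent as three chunks instead of twenty-five, which reduces the number of round trips at the cost of larger individual messages.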
- Added Hard Swish
- Enabled Resize Bilinear for float 16 and Quant
- Enabled Resize Nearest Neighbor for float 16 and Quant
- Resolved quant type conversion for Quant Asymm and
Signed for Split

Tracked-On: OAM-110564
Signed-off-by: Anoob Anto K <[email protected]>
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
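
For reference, Hard Swish is commonly defined as x * ReLU6(x + 3) / 6, which is also how the NNAPI HARD_SWISH operation is documented. The scalar sketch below shows only that formula; it is not the plugin code added in this commit, and `hardSwish` is a hypothetical helper name.

```cpp
#include <algorithm>

// Scalar reference for Hard Swish: x * clamp(x + 3, 0, 6) / 6.
float hardSwish(float x) {
    const float relu6 = std::min(std::max(x + 3.0f, 0.0f), 6.0f);
    return x * relu6 / 6.0f;
}
```

The function is zero for inputs at or below -3 and equals the input for inputs at or above 3, with a smooth transition in between.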
This fixes the following errors:

- Upcasting non-compliant model
- Upcasting non-compliant operand type TENSOR_QUANT8_ASYMM_SIGNED from V1_3::OperandType to V1_2::OperandType

Tracked-On: OAM-110572
Signed-off-by: Anoob Anto K <[email protected]>
- changed mDetectionClient from a global to a class member
- added tokens to identify a specific model request over
grpc
- added release rpc call to do cleanup on the remote side
- [To be fixed] removed remote infer for asyncExecute and
fencedExecute, as the implementation was not proper

Tracked-On: OAM-110559
Signed-off-by: Anoob Anto K <[email protected]>
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
Remove static variables
- Separate ModelInfo objects for each operation
- unmap runtime memory pool at end of each execute call
- optimised network graph creator so that it can be released once
the graph is created and loaded

Tracked-On: OAM-110558
Signed-off-by: Anoob Anto K <[email protected]>
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
- if remote infer fails, disable parallel attempts for
remote inference

- disable remote infer for quant type models

Tracked-On: OAM-110563
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
Signed-off-by: Anoob Anto K <[email protected]>
- Split previous loadNetwork into two parts
    - create Network: loads the generated graph Network
    and dumps it as xml and bin
    - loadNetwork: now reads the xml and bin and
    creates the infer request

- fallback to native inference if remote infer fails.

Note: on fallback, load network triggers a second load for
native infer, which increases infer time in the fallback
scenario. With native infer only (no remote infer),
compile_model is called twice, resulting in a longer model
load time.
Sub-Task JIRA: OAM-110562

Tracked-On: OAM-109729
Signed-off-by: Ratnesh Kumar Rai <[email protected]>
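
The last two commits describe two related behaviours: falling back to native inference when remote inference fails, and disabling further remote attempts after a failure. A minimal sketch of that control flow follows, assuming both backends can be modelled as callables that return success or failure. `InferenceDispatcher` and its members are hypothetical and do not mirror the actual nn-hal classes.

```cpp
#include <atomic>
#include <functional>
#include <utility>

// Hypothetical dispatcher: tries remote inference first, falls back to
// native inference, and stops trying remote after the first failure.
class InferenceDispatcher {
public:
    InferenceDispatcher(std::function<bool()> remoteInfer,
                        std::function<bool()> nativeInfer)
        : mRemoteInfer(std::move(remoteInfer)),
          mNativeInfer(std::move(nativeInfer)) {}

    // Returns true if either backend produced a result.
    bool run() {
        if (mRemoteEnabled.load()) {
            if (mRemoteInfer()) return true;
            // Remote failed: disable it so parallel and later requests
            // go straight to the native path.
            mRemoteEnabled.store(false);
        }
        return mNativeInfer();
    }

    bool remoteEnabled() const { return mRemoteEnabled.load(); }

private:
    std::function<bool()> mRemoteInfer;
    std::function<bool()> mNativeInfer;
    std::atomic<bool> mRemoteEnabled{true};
};
```

An atomic flag is used here because the description mentions parallel inference, so several threads may call `run()` at once. The sketch leaves out the extra model load that the note above attributes to the fallback path.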
rairatne force-pushed the ai_benchmark_improvements branch from 6ead7e3 to ac80d67 on June 2, 2023 09:14
sysopenci added the Stale label (stale label for inactive open PRs) on Sep 5, 2024
Labels: Stale (stale label for inactive open PRs), Valid commit message
3 participants