forked from microsoft/onnxruntime
Add migx ep fp8 int4 #78
Open
TedThemistokleous wants to merge 10 commits into rocm6.3_internal_testing from add_migx_ep_fp8_int4
TedThemistokleous force-pushed the add_migx_ep_fp8_int4 branch from 9b44fcc to 7076f74 (December 4, 2024 23:05)
TedThemistokleous force-pushed the add_migx_ep_fp8_int4 branch from 4850dfd to 11ff644 (December 18, 2024 21:26)
streamhsa force-pushed the add_migx_ep_fp8_int4 branch from c19fa76 to d1a2609 (December 25, 2024 03:52)
Map int4 to int8 for now, as we don't explicitly set an int4 input type and pack/unpack int4 operands
Mirror the calibration code we use for int8 and change only which quantize function we call through the MIGraphX API
- Add additional flags for fp8 that are shared with int8
- Add a lockout warning message when int8 and fp8 are used at the same time
Previous runs using session options failed because we were not pulling in inputs from the Python interface. That fix, plus additional logging, let me track which options were invoked via environment variables and which were added at the start of an inference session
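The first commit message above mentions holding int4 data in int8 and packing/unpacking int4 operands. As a rough illustration only, and not the MIGraphX or ONNX Runtime implementation, the sketch below packs two signed 4-bit values into each byte and unpacks them again. The function names `pack_int4` and `unpack_int4` and the low-nibble-first order are assumptions made for this example.

```python
from typing import List


def pack_int4(values: List[int]) -> bytes:
    """Pack signed 4-bit integers (-8..7) two per byte, low nibble first.

    Hypothetical helper for illustration; the nibble order is an assumption.
    """
    if any(v < -8 or v > 7 for v in values):
        raise ValueError("int4 values must be in the range -8..7")
    padded = list(values)
    if len(padded) % 2:
        padded.append(0)  # pad odd-length input with a zero nibble
    out = bytearray()
    for low, high in zip(padded[0::2], padded[1::2]):
        # Masking with 0x0F keeps the two's-complement low 4 bits.
        out.append((low & 0x0F) | ((high & 0x0F) << 4))
    return bytes(out)


def unpack_int4(packed: bytes, count: int) -> List[int]:
    """Unpack `count` signed 4-bit integers, sign-extending each nibble."""
    if count > 2 * len(packed):
        raise ValueError("count exceeds the number of packed nibbles")
    values = []
    for byte in packed:
        for nibble in (byte & 0x0F, byte >> 4):
            # Nibbles 8..15 represent the negative values -8..-1.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values[:count]
```

Calling `unpack_int4(pack_int4(vals), len(vals))` returns the original list, which is the property a widened int8 representation has to preserve.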
TedThemistokleous force-pushed the add_migx_ep_fp8_int4 branch from 87f1f91 to a314c7f (January 22, 2025 22:13)
Rebasing off ort_value changes used for the llama_V2 pipe to verify end to end with an int4 model.
Need this so the user knows whether any of the environment variables are active in the background, to ensure consistency between runs.
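The commit messages above describe precision flags that can come from environment variables or session options, a warning when int8 and fp8 are requested together, and reporting which source set each option. A minimal sketch of that idea follows. The names `EXAMPLE_INT8_ENABLE` and `EXAMPLE_FP8_ENABLE`, the `resolve_precision_flags` helper, and the choice to let session options override the environment are all hypothetical; the EP's real option names and conflict handling live in the PR's code, which is not shown here.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Hypothetical option and variable names, used only for this illustration.
FLAG_ENV_VARS = {
    "int8_enable": "EXAMPLE_INT8_ENABLE",
    "fp8_enable": "EXAMPLE_FP8_ENABLE",
}


def resolve_precision_flags(
    session_options: Mapping[str, bool],
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[Dict[str, bool], Dict[str, str], List[str]]:
    """Return (flags, sources, warnings) for the precision options.

    Assumption: a session option, when present, overrides the environment.
    """
    environ = os.environ if environ is None else environ
    flags: Dict[str, bool] = {}
    sources: Dict[str, str] = {}
    warnings: List[str] = []
    for name, env_var in FLAG_ENV_VARS.items():
        if name in session_options:
            flags[name] = bool(session_options[name])
            sources[name] = "session option"
        elif env_var in environ:
            flags[name] = environ[env_var].strip() == "1"
            sources[name] = f"environment ({env_var})"
        else:
            flags[name] = False
            sources[name] = "default"
    # Report the conflict rather than silently picking one mode.
    if flags["int8_enable"] and flags["fp8_enable"]:
        warnings.append("int8 and fp8 quantization were both requested")
    return flags, sources, warnings
```

Recording the source next to each value lets a run log show whether a setting came from the shell environment or from the session itself, which is the consistency concern raised above.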
TedThemistokleous force-pushed the add_migx_ep_fp8_int4 branch from a314c7f to 56246ac (January 24, 2025 04:54)
Description
Data type support for int4 and fp8 (all formats)
Motivation and Context
Allows operators that use these data types to be handled by the ONNX Runtime MIGraphX EP