int8 input not supported for average pooling in MLIR_TRT #457

Open
farazkh80 opened this issue Dec 18, 2024 · 7 comments
@farazkh80
Collaborator

This happened when running test_dtype_constraints[avgpool-valid:T1-int8]:

summary = 'MTRTException: InternalError: failed to run compilation on module with symbol name: ins_t9521_outs_t9522_988\n\nAddit...s._api.MTRTException: InternalError: failed to run compilation on module with symbol name: ins_t9521_outs_t9522_988\n.'
details = ["IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Node [tensorrt.pooling] (t9522)cannot be quantized b...s=[1, 1, 1, 1], padding=[(0, 0), (0, 0), (0, 0), (0, 0)])\n      | ", '\n', '\nThis operation was introduced to ', ...]

    def raise_error(summary: str, details: List[Any] = []):
        """
        Raises a Tripy exception with a formatted message.
    
        Args:
            summary: A summary of the error message. This will be displayed before any other details.
            details: Details on the error. This function handles objects in this list as follows:
                - If they include a `stack_info` member, then information on the first user frame is displayed,
                    including file/line information as well as the line of code.
    
                    IMPORTANT: Any stack frames from the function registry are not displayed since
                    the function registry is an implementation detail used to dispatch to the real functions
                    we care about. Additionally, any code defined in the functions listed in ``EXCLUDE_FUNCTIONS``
                    is omitted.
    
                - In all other cases, the object is just converted to a string.
    
        Raises:
            TripyException
        """
    
        pre_summary = ""
        stack_info = utils.get_stack_info()
        user_frame_index = stack_info.get_first_user_frame_index()
        if user_frame_index is not None:
            stack_info.fetch_source_code()
            pre_summary = str_from_source_info(stack_info[user_frame_index])
    
        detail_msg = ""
        for detail in details:
            stack_info_message = None
            if hasattr(detail, "stack_info"):
                stack_info_message = str_from_stack_info(detail.stack_info)
            elif isinstance(detail, utils.StackInfo):
                stack_info_message = str_from_stack_info(detail)
    
            if stack_info_message is not None:
                detail_msg += stack_info_message
            else:
                detail_msg += str(detail)
    
        msg = f"{pre_summary}{summary}\n" + indent(detail_msg, " " * 4)
        # We use `from None` to suppress output from previous exceptions, since we want to handle them internally.
>       raise TripyException(msg) from None
E       tripy.common.exception.TripyException: 
E       
E       --> /tripy/tests/wrappers/test_interface.py:221 in _run_dtype_constraints_subtest()
E             |
E         221 |     ret_val.eval()
E             | 
E       
E       MTRTException: InternalError: failed to run compilation on module with symbol name: ins_t9521_outs_t9522_988
E       
E       Additional context:
E       Traceback (most recent call last):
E         File "/tripy/tripy/backend/mlir/compiler.py", line 86, in compile
E           executable = compiler.compiler_stablehlo_to_executable(
E       mlir_tensorrt.runtime._mlir_libs._api.MTRTException: InternalError: failed to run compilation on module with symbol name: ins_t9521_outs_t9522_988
E       .
E           IBuilder::buildSerializedNetwork: Error Code 1: Internal Error (Node [tensorrt.pooling] (t9522)cannot be quantized by arg0. You might want to add a DQ node before [tensorrt.pooling] (t9522).
E           )
E           (t9522)error: failed to translate function 'tensorrt_cluster' to a TensorRT engine
E       
E           This error occured while trying to compile the following FlatIR expression:
E                 |
E                 | t_inter2: [rank=(4), shape=((-1, -1, -1, -1)), dtype=(int8), loc=(gpu:0)] = ReduceWindowOp(t9521, t_inter3, reduce_mode='avg', window_dims=[1, 1, 2, 2], window_strides=[1, 1, 1, 1], padding=[(0, 0), (0, 0), (0, 0), (0, 0)])
E                 | 
E       
E           This operation was introduced to create the output of reduce `avg` operation..
E       
E           Note: This originated from the following expression:
E       
E           --> <string>:7 in <module>()
E       
E           Input 0:
E       
E           --> /tripy/tests/wrappers/object_builders.py:35 in tensor_builder()
E                 |
E              35 |         out = tp.cast(out, dtype=namespace[dtype])
E                 |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@pranavm-nvidia
Collaborator

@farazkh80 could you post the trace if you still have it?

@yizhuoz004
Collaborator

How can I reproduce this error? The test passes locally for me.

@pranavm-nvidia
Collaborator

@yizhuoz004 the error only happens when the pooling layer operates on a runtime input. I suspect that in other cases the pooling is being constant-folded. It will probably reproduce if you tp.compile a function that calls tp.avgpool and pass it an int8 input.

@farazkh80
Collaborator Author

Here is the trace:

inputs:
    t35: [shape=([1, 1, 8, 8]), dtype=(int8), loc=(gpu:0)]
t36 = pooling(t35, kind=Kind.AVG, kernel_dims=[2, 2], stride=[1, 1], padding=[(0, 0), (0, 0)])
outputs:
    t36: [shape=([-1, -1, -1, -1]), dtype=(int8), loc=(gpu:0)]
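
For reference, the static output shape implied by this trace can be worked out by hand. The sketch below is plain Python, not Tripy or TensorRT code. `pooled_extent` and `pooled_shape` are hypothetical helper names, and it assumes the usual floor-based pooling formula with the leading (batch, channel) dimensions left unpooled. Under those assumptions the [1, 1, 8, 8] input with a 2x2 kernel and stride 1 gives [1, 1, 7, 7], which matches the tensor<1x1x7x7xi8> result type in the MLIR posted further down this thread.

```python
from typing import List, Sequence, Tuple


def pooled_extent(size: int, kernel: int, stride: int, pad: Tuple[int, int] = (0, 0)) -> int:
    """Output extent of one spatial dimension (assumed floor-based formula)."""
    padded = size + pad[0] + pad[1]
    if kernel > padded:
        raise ValueError("kernel is larger than the padded input")
    return (padded - kernel) // stride + 1


def pooled_shape(
    shape: Sequence[int],
    kernel_dims: Sequence[int],
    stride: Sequence[int],
    padding: Sequence[Tuple[int, int]],
) -> List[int]:
    """Pool only the trailing len(kernel_dims) dimensions; keep the leading ones."""
    num_spatial = len(kernel_dims)
    lead = list(shape[: len(shape) - num_spatial])
    spatial = shape[len(shape) - num_spatial:]
    pooled = [
        pooled_extent(size, kernel, step, pad)
        for size, kernel, step, pad in zip(spatial, kernel_dims, stride, padding)
    ]
    return lead + pooled


# Values taken from the trace above.
print(pooled_shape([1, 1, 8, 8], kernel_dims=[2, 2], stride=[1, 1], padding=[(0, 0), (0, 0)]))
# prints [1, 1, 7, 7]
```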

@yizhuoz004
Collaborator

I can reproduce this by making the int8 tensor an input. This is most likely a TRT constraint; I will file a bug. We can waive the test for now. Also, torch average pooling does not support int8, so this should be a rare use case.

@yizhuoz004
Collaborator

There are issues in both TRT and MLIR-TRT.
TRT: this error is not expected when there is no explicit Q/DQ node.
MLIR-TRT: the stablehlo -> tensorrt translation contains unnecessary elementwise layers. For a single avg pooling layer, it produces:

  tensorrt.module @trt_engines {
    func.func @tensorrt_cluster(%arg0: tensor<1x1x8x8xi8>) -> (tensor<1x1x7x7xi8> {tensorrt.shape_profile = #profile}) attributes {cluster.tensorrt} {
      %cst_i8 = tensorrt.constant dense<4> : tensor<1x1x1x1xi8>
      %cst_i8_0 = tensorrt.constant dense<4> : tensor<1x1x7x7xi8>
      %0 = tensorrt.pooling {averageCountExcludesPadding = true, poolingType = #tensorrt.pooling_type<kAVERAGE>, postPadding = array<i64: 0, 0>, prePadding = array<i64: 0, 0>, stride = array<i64: 1, 1>, windowSize = array<i64: 2, 2>} ins(%arg0 : tensor<1x1x8x8xi8>) -> tensor<1x1x7x7xi8>
      %1 = tensorrt.element_wise <kPROD>(%0, %cst_i8 : tensor<1x1x7x7xi8>, tensor<1x1x1x1xi8>) -> tensor<1x1x7x7xi8>
      %2 = tensorrt.element_wise <kDIV>(%1, %cst_i8_0 : tensor<1x1x7x7xi8>, tensor<1x1x7x7xi8>) -> tensor<1x1x7x7xi8>
      return %2 : tensor<1x1x7x7xi8>
    }
  }
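
To illustrate why the two element_wise layers are redundant: the kPROD multiplies the pooled result by 4 and the kDIV divides it by 4 again, so mathematically they cancel. The plain-Python sketch below models that pair on scalar int8 values. It assumes two's-complement wraparound on overflow, which is an assumption made for illustration only and not a statement about how TensorRT evaluates int8 arithmetic. `wrap_int8` and `prod_then_div` are hypothetical names.

```python
def wrap_int8(value: int) -> int:
    """Map an integer into [-128, 127] using two's-complement wraparound (assumed model)."""
    return (value + 128) % 256 - 128


def prod_then_div(pooled: int, window_count: int = 4) -> int:
    """Model the kPROD-by-4 followed by kDIV-by-4 pair from the dump above."""
    product = wrap_int8(pooled * window_count)
    # The wrapped product is still a multiple of 4, so the division is exact
    # and the rounding mode of the integer division does not matter here.
    return product // window_count


# While the product fits in int8, the pair is an identity and the layers add nothing.
print(all(prod_then_div(v) == v for v in range(-32, 32)))  # prints True

# Outside that range the intermediate product overflows in this model.
print(prod_then_div(100))  # prints -28
```

In this model the extra layers are at best a no-op and at worst change the pooled value, which is consistent with calling them unnecessary.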

shelkesagar29 self-assigned this Jan 15, 2025
shelkesagar29 added the mlir-tensorrt label (Pull request for the mlir-tensorrt project) Jan 15, 2025
@shelkesagar29
Collaborator

Fixed and pushed internally. The fix should be available in OSS soon.
