Bug Description
I tried installing the repo with docker by running:
sudo DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=10.7.0 -f docker/Dockerfile -t torch_tensorrt:latest .
On my first attempt, the build printed a warning:
INFO: pip is looking at multiple versions of torch
then downloaded many torch versions without installing any of them, and the process got stuck indefinitely.
After some investigation I added
RUN pip install --upgrade pip
into the Dockerfile, right under the line of [...] and right above the line of [...].
Now the build can finish, but once I go inside the container with
sudo docker run --rm --runtime=nvidia --gpus all -it --shm-size=8gb --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --name=torch_tensorrt --ipc=host --net=host torch_tensorrt:latest
and try to import torch_tensorrt inside python, it gives the following error:
OSError: /root/.pyenv/versions/3.10.16/lib/python3.10/site-packages/torch_tensorrt/lib/libtorchtrt.so: undefined symbol: _ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs
I have also tried building from source, which ended with exactly the same undefined symbol error as in docker.
Could there be an issue with my OS? I am using Ubuntu 22.04 with CUDA 12.6 installed.
To Reproduce
Steps to reproduce the behavior:
git clone https://github.com/pytorch/TensorRT.git
Add
RUN pip install --upgrade pip
into the Dockerfile as described above.
sudo DOCKER_BUILDKIT=1 docker build --build-arg TENSORRT_VERSION=10.7.0 -f docker/Dockerfile -t torch_tensorrt:latest .
sudo docker run --rm --runtime=nvidia --gpus all -it --shm-size=8gb --env="DISPLAY" --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" --name=torch_tensorrt --ipc=host --net=host torch_tensorrt:latest
python
import torch_tensorrt
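For what it's worth, the missing symbol from the error above can be demangled to see what it expects (a quick check on the host, assuming c++filt from binutils is available):
echo '_ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs' | c++filt
If I read the demangled signature correctly, the last argument is a std::string const& in the pre-C++11 string ABI, which would be consistent with an ABI mismatch between libtorchtrt.so and the libtorch it is loaded against.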
Expected behavior
torch_tensorrt should be imported successfully.
Environment
Build information about Torch-TensorRT can be found by turning on debug messages
Torch-TensorRT Version (e.g. 1.0.0): 2.6.0a0 (since I directly cloned the main branch)
PyTorch Version (e.g. 1.0): N/A (installed within docker)
CPU Architecture: X86-64 (Intel I9-13900K)
OS (e.g., Linux): Ubuntu 22.04 Desktop
How you installed PyTorch (conda, pip, libtorch, source): Managed by Dockerfile
Build command you used (if compiling from source): See the above Steps to reproduce
Are you using local sources or building from archives: N/A
Python version: 3.10
CUDA version: 12.6 on the host; inside the container it appears to be 12.4 (see the check after this list)
GPU models and configuration: RTX4090
Any other relevant information: N/A
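For reference, a simple way to confirm the in-container version (this prints the CUDA version the installed torch was built against, which can differ from the host toolkit):
python -c "import torch; print(torch.__version__, torch.version.cuda)"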
Additional context
The INFO: pip is looking at multiple versions of torch issue did not occur during the very first few attempts of my docker build; I wonder whether this is a cache conflict or caused by something else.
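If it were a cache issue, one way to rule that out (just a suggestion, not something I have verified) would be to rebuild without the build cache:
sudo DOCKER_BUILDKIT=1 docker build --no-cache --build-arg TENSORRT_VERSION=10.7.0 -f docker/Dockerfile -t torch_tensorrt:latest .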
To my knowledge, you would get the undefined symbol error in two cases: 1) a mismatch between the torch version and the libtorch version, which you can find in MODULE.bazel; 2) not building with --use-cxx11-abi. For CUDA 12.6, if you want to build torch-trt from source, you need to run something like python setup.py develop --use-cxx11-abi
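A quick way to check which ABI your installed torch was built with:
python -c "import torch; print(torch.compiled_with_cxx11_abi())"
If that prints True while libtorchtrt.so was built without --use-cxx11-abi, the undefined std::string symbol above is roughly what you would expect to see.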
Thanks for the tip! After a painful 3-4 days of trying, I finally gave up and used the NVIDIA PyTorch docker image instead. It was a mess for me to sort out the libraries, especially with multiple torch packages installed on my system (both in conda and in the local Python env). Indeed I did not use --use-cxx11-abi; I'm not sure whether that was the real reason behind the issue. Since I have found another solution, I will close this issue.
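For anyone who lands here, what worked for me was roughly the following (the tag is just an example; pick a recent one from NGC, and as far as I can tell torch_tensorrt comes preinstalled in those images):
docker pull nvcr.io/nvidia/pytorch:24.12-py3
docker run --rm --runtime=nvidia --gpus all -it --ipc=host nvcr.io/nvidia/pytorch:24.12-py3 python -c "import torch_tensorrt; print(torch_tensorrt.__version__)"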