ROCm support #293

fxmarty · 2024-06-19T11:36:08Z

As per the title: support AMD GPUs through the TEI backend.

For now, only embedding models with CLS or mean pooling have been tried, and proper tests still need to be implemented.
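For reference, the two pooling modes mentioned above can be sketched in plain Python as follows. This is only an illustration of what CLS and mean pooling compute, not the code from this PR; the function names `cls_pooling` and `mean_pooling` and the list-based representation are assumptions made for the example.

```python
from typing import List


def cls_pooling(token_embeddings: List[List[float]]) -> List[float]:
    """Return the embedding of the first token (the [CLS] position)."""
    return list(token_embeddings[0])


def mean_pooling(
    token_embeddings: List[List[float]], attention_mask: List[int]
) -> List[float]:
    """Average the embeddings of the tokens kept by the attention mask."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:  # skip padding tokens (mask == 0)
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [s / count for s in summed]


# One sequence of three tokens, the last one being padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(cls_pooling(tokens))         # [1.0, 2.0]
print(mean_pooling(tokens, mask))  # [2.0, 3.0]
```

Masking matters for mean pooling: without it, the padding vector would be averaged into the sentence embedding.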

MI210/MI250/MI300 can dispatch to the CK (Composable Kernel) Flash Attention 2 implementation, but other GPUs will fall back to a manual attention implementation (or SDPA). Only BERT appears to be supported in the Python backend.
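The dispatch rule described above could look roughly like the sketch below. It is a hypothetical illustration, not the logic in this PR: the function name `select_attention_backend`, the returned labels, matching on the device name string, and preferring SDPA over the manual path when it is available are all assumptions.

```python
# GPU models that, per the description above, can use CK Flash Attention 2.
CK_FLASH_ATTENTION_GPUS = ("MI210", "MI250", "MI300")


def select_attention_backend(device_name: str, sdpa_available: bool = True) -> str:
    """Pick an attention implementation label from a GPU device name.

    Hypothetical helper: the labels and the SDPA-before-manual ordering
    are assumptions made for this example.
    """
    name = device_name.upper()
    if any(model in name for model in CK_FLASH_ATTENTION_GPUS):
        return "ck_flash_attention_2"
    # Any other GPU falls back to SDPA if present, else manual attention.
    return "sdpa" if sdpa_available else "manual"


print(select_attention_backend("AMD Instinct MI250X"))     # ck_flash_attention_2
print(select_attention_backend("AMD Radeon RX 7900 XTX"))  # sdpa
```

In a real backend the device name would come from the runtime (for example the name reported by the GPU framework) rather than being passed in as a literal.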

fxmarty · 2024-06-19T13:50:23Z

moved to #295

fxmarty added 7 commits June 17, 2024 14:35

add dockerfile

2fc644c

working cls pooling

37d2931

add layers

8584b6d

Merge branch 'main' into rocm-support

6cdd454

support mean pooling in python backend

2a2993a

fix dockerfile and install

36b3a72

add tests

a8c02db

fxmarty mentioned this pull request Jun 19, 2024

Support TEI on AMD GPUs #108

Open

fxmarty added 3 commits June 19, 2024 12:02

tests instructions

35cc5b8

add rocm image builder

309d255

add reference tensors

ae3da10

fxmarty closed this Jun 19, 2024
