Just use huggingface #6
Sure, use whatever works. This repo is intended to serve as a point of communication about llama, and also as an extra mirror. Note that Facebook has been issuing takedown requests against huggingface llama repositories, so those may get knocked offline.
It's worth noting that those model files have been converted for use with the HF library, so if we take the 7B model files here, according to the authors the model has in fact been converted.
So, supposing we want to use the model files for C++ inference here, I'm not sure it would work.
@loretoparisi Yeah, I'm thinking along the same lines and trying to make sense of it here. There are 8-bit and 4-bit quantized versions, the original weights, and the huggingface versions... I think the C++ inference uses the original weights and converts them. Can this be confirmed? Also, I am currently downloading over IPFS and it is slow. Any thoughts on the model formats with C++, or a way to download the weights faster?
Yes, confirmed. You first convert the weights to ggml FP16 or FP32, then quantize to 4-bit and run inference (CPU only).
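For reference, here is a minimal sketch of that convert → quantize → infer pipeline, driven from Python. The script and binary names (`convert-pth-to-ggml.py`, `quantize`, `main`) and the numeric type codes are assumptions based on early llama.cpp and may differ in newer versions; check the README of your checkout.

```python
# Hypothetical sketch of the pipeline described above, run from the llama.cpp directory.
import subprocess

MODEL_DIR = "./models/7B"  # original released weights (consolidated.*.pth + params.json)
F16_MODEL = f"{MODEL_DIR}/ggml-model-f16.bin"
Q4_MODEL = f"{MODEL_DIR}/ggml-model-q4_0.bin"

# 1. Convert the original PyTorch checkpoints to ggml FP16 ("1" selects f16, "0" f32).
subprocess.run(["python3", "convert-pth-to-ggml.py", MODEL_DIR, "1"], check=True)

# 2. Quantize the FP16 ggml file down to 4-bit (type code "2" = q4_0 in early builds).
subprocess.run(["./quantize", F16_MODEL, Q4_MODEL, "2"], check=True)

# 3. Run CPU-only inference with the quantized model.
subprocess.run(["./main", "-m", Q4_MODEL, "-p", "Hello, my name is", "-n", "64"], check=True)
```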
Ah ok, so you're supposed to get the original released weights and the C++ code converts them? Also, I found a torrent link for the original weights and it's going extremely fast, ETA 3 hours for 235 GB.
Yes, this is exactly what I did with the download here.
You can also use https://huggingface.co/huggyllama; it works with llama.cpp.
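As a rough sketch, you could pull such a mirror locally with `huggingface_hub` and then point llama.cpp's converter at the downloaded folder. The repo id `huggyllama/llama-7b` is an assumption based on the link above; pick whichever size you need.

```python
# Minimal sketch: download a LLaMA checkpoint mirror from the Hugging Face Hub.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="huggyllama/llama-7b")
print("Model files downloaded to:", local_dir)
# From here, run llama.cpp's conversion script against local_dir,
# then quantize as in the sketch above.
```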
All of the models are on huggingface already: https://huggingface.co/decapoda-research
There's even an open, working PR to add support to the transformers lib.
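If that PR is merged into the transformers version you have installed, loading one of the converted checkpoints would look roughly like the sketch below. The class names and the repo id `decapoda-research/llama-7b-hf` are assumptions; the decapoda-research mirrors in particular may be renamed or taken down.

```python
# Rough sketch of loading an HF-converted LLaMA checkpoint with transformers,
# assuming the LLaMA support PR mentioned above has landed in your install.
from transformers import LlamaForCausalLM, LlamaTokenizer

model_id = "decapoda-research/llama-7b-hf"  # assumed repo id, based on the link above
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note this loads the HF-format weights directly in Python; for llama.cpp you would still convert and quantize as described earlier in the thread.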