Created some models #10

Merged: 11 commits merged from models into main on Sep 27, 2023
Conversation

AragonerUA (Collaborator)

No description provided.

F47-503 (Collaborator) left a comment

The main request is to move the models to GPU before inference.
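
A minimal sketch (not from the PR) of the requested change, assuming a CUDA device is available and falling back to CPU otherwise:

    import torch
    from transformers import AutoModel

    # Hypothetical: load the checkpoint, then move it to the GPU before inference.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = AutoModel.from_pretrained("daryl149/llama-2-7b-chat-hf")
    model.to(device)
    model.eval()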

model = AutoModel.from_pretrained("daryl149/llama-2-7b-chat-hf")


# tokenizer = LlamaTokenizer.from_pretrained("/output/path")
Better not to leave commented-out, unused code like this.

models/huggingface_models.py: 3 review threads marked outdated and resolved
pipeline = transformers.pipeline(
"text-generation",
model=model,
torch_dtype=torch.float16,
bzz (Member) commented Sep 14, 2023

Let's use torch.bfloat16, but even then the model would be ~12 GB.

I've used int4 quantisation instead:

    import torch
    import transformers
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # NF4 4-bit quantisation via bitsandbytes, computing in bfloat16
    nf4_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model_nf4 = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=nf4_config)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    pipeline = transformers.pipeline(
        "text-generation",
        model=model_nf4,
        torch_dtype=torch.bfloat16,
        tokenizer=tokenizer,
    )
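
A minimal usage sketch (not from the PR), assuming the pipeline above was built successfully; the prompt and generation parameters here are illustrative:

    # Hypothetical call to the text-generation pipeline defined above.
    outputs = pipeline(
        "Explain NF4 quantisation in one sentence.",
        max_new_tokens=64,
        do_sample=True,
        temperature=0.7,
    )
    print(outputs[0]["generated_text"])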

AragonerUA and others added 5 commits September 14, 2023 19:04
The reason I cannot test locally is that I am on Apple silicon (Metal). The most common error is that int4 quantisation is not usable on Metal, and since the model implementation uses it, I simply cannot exercise that part of the code.
BitsAndBytes, without local testing
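
A hedged sketch (not from the PR) of a guard that would let the loading code degrade gracefully off CUDA, since bitsandbytes 4-bit loading requires a CUDA GPU; nf4_config and model_name are as in the review comment above:

    import torch
    from transformers import AutoModelForCausalLM

    # Hypothetical fallback: bitsandbytes int4 needs CUDA, so load in plain
    # bfloat16 on Apple silicon (MPS) or CPU instead.
    if torch.cuda.is_available():
        model_kwargs = {"quantization_config": nf4_config}
    else:
        model_kwargs = {"torch_dtype": torch.bfloat16}

    model = AutoModelForCausalLM.from_pretrained(model_name, **model_kwargs)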
bzz (Member) commented Sep 15, 2023

ATM this cannot be merged until the conflicts are resolved.

bzz (Member) left a comment

LGTM, thanks!

bzz merged commit a428958 into main on Sep 27, 2023 (3 of 4 checks passed).
bzz deleted the models branch on September 27, 2023 at 07:54.