support for https://huggingface.co/nvidia/Nemotron-4-340B-Instruct ? #196

mahald · 2024-10-28T00:03:36Z

Can you add support for this bad boy: https://huggingface.co/nvidia/Nemotron-4-340B-Instruct ?

werruww · 2024-11-03T02:55:01Z

from airllm import AutoModel
import torch

MAX_LENGTH = 15

# could use hugging face model repo id:

model = AutoModel.from_pretrained("unsloth/Llama-3.1-Nemotron-70B-Instruct-bnb-4bit", delete_original=True)

input_text = [
    'What is the capital of United States?',
]

input_tokens = model.tokenizer(input_text,
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=MAX_LENGTH,
    padding=False)

generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=2,
    use_cache=True,
    return_dict_in_generate=True)

output = model.tokenizer.decode(generation_output.sequences[0])

print(output)

This fails with:

AssertionError: Torch not compiled with CUDA enabled

I am running it on a Colab TPU runtime.
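
For context on the error: PyTorch raises this AssertionError when `.cuda()` is called on a build without CUDA support, which is what a TPU or CPU-only runtime provides. A minimal sketch of a guard follows, assuming the goal is only to avoid the hard-coded `.cuda()` call. The helper name `pick_device` is hypothetical, and whether AirLLM itself can run the model on a non-CUDA device is not confirmed anywhere in this thread.

```python
def pick_device() -> str:
    """Return "cuda" only when a CUDA-enabled torch build can see a GPU.

    Hypothetical helper for illustration; falls back to "cpu" otherwise.
    """
    try:
        import torch
    except ImportError:
        # torch is not installed at all, so there is no CUDA device either.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")

# In the snippet above, the hard-coded call
#     input_tokens['input_ids'].cuda()
# could then become
#     input_tokens['input_ids'].to(device)
# Assumption: the model was also loaded onto the same device.
```

On a Colab TPU runtime this would select "cpu", so the tensor move no longer fails, but the model still needs a backend that AirLLM supports.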

wajeehulhassanvii · 2024-11-07T06:25:09Z

I have an RTX 3090. Any idea how much disk space would be required to run Nemotron? Also, how can I load the model from a different directory?
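
A rough lower bound on disk space can be worked out from the parameter count and the bits stored per weight. The sketch below is only arithmetic under stated assumptions: parameter counts are taken from the model names (340B and 70B), the 340B checkpoint is assumed to be stored in 16-bit weights, and `estimate_weight_size_gb` is a hypothetical helper. Real checkpoints add some overhead (quantization metadata, tokenizer files), and any layer-wise copy a loader writes next to the original download would add to the total.

```python
def estimate_weight_size_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# (label, parameter count, bits per weight); all values are assumptions
# derived from the model names, not measured file sizes.
candidates = [
    ("Nemotron-4-340B-Instruct, 16-bit", 340e9, 16),
    ("Nemotron-4-340B-Instruct, 4-bit", 340e9, 4),
    ("Llama-3.1-Nemotron-70B-Instruct, 4-bit", 70e9, 4),
]

for label, params, bits in candidates:
    size_gb = estimate_weight_size_gb(params, bits)
    print(f"{label}: about {size_gb:.0f} GB of weights")
```

Under these assumptions the 340B model needs roughly 680 GB at 16-bit or 170 GB at 4-bit, and the 70B 4-bit model roughly 35 GB. On the directory question, Hugging Face downloads go to the cache directory controlled by the `HF_HOME` environment variable, so pointing it at a larger disk before downloading is one option; how AirLLM handles its own layer-wise output location is not covered in this thread.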
