Would a 12GB 3060 work?
Yes! Try this model: https://huggingface.co/arcee-ai/Virtuoso-Small-v2
Or the 14B thinking model: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
But for speed and coherence, instead of Ollama, I’d recommend running it through Aphrodite or TabbyAPI as a backend, depending on whether you prioritize speed or long inputs. Both expose generic OpenAI-compatible endpoints.
I’ll even step you through it and upload a quantization for your card, if you want, since it looks like there isn’t a good-sized exl2 on Hugging Face.
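Since both backends speak the OpenAI API, any standard client can talk to them once the server is running. Here’s a minimal sketch with the official `openai` Python package; the port, API key, and model name are assumptions, so swap in whatever your backend actually reports:

```python
# Minimal sketch: query a local Aphrodite/TabbyAPI server through its
# OpenAI-compatible endpoint. The base_url/port, api_key, and model name
# below are placeholders -- adjust them to match your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:5000/v1",  # assumed local port; check your backend's config
    api_key="not-needed-locally",         # placeholder; some setups still expect a key
)

response = client.chat.completions.create(
    model="Virtuoso-Small-v2",            # whatever model name the server exposes
    messages=[{"role": "user", "content": "Explain exl2 quantization in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```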