Can i use 65b model and how? #149
To run this model, I modified chat.cpp, changing the LLAMA_N_PARTS entry { 8192, 1 } to { 8192, 8 }, then recompiled and ran with -m ggml-model-q4_0.bin. Worked smoothly; ~42 GB RAM used.
Yes, it works for me too, in the Windows chat.exe build. Thank you!
I would like to experiment with this. I have an ASRock board with 128 GB of RAM; as far as I know, ASRock is the only manufacturer still announcing new consumer motherboards with multiple slots for huge amounts of RAM (the other option is Micro-Star (MSI) server boards). I kind of anticipated all this back in 2020, but usable AIs have only surfaced now.
Have you posted your compiled file anywhere? I'm using Win 10 x64.
No, I didn't. I built the binary using the standard cmake-based Windows instructions.
I've spent several days on unsuccessful attempts to compile chat.cpp. Could you please share a ready-made executable for launching the 65B model?
I downloaded these files:
ggml-model-q4_0.bin
ggml-model-q4_0.bin.1
ggml-model-q4_0.bin.2
ggml-model-q4_0.bin.3
ggml-model-q4_0.bin.4
ggml-model-q4_0.bin.5
ggml-model-q4_0.bin.6
ggml-model-q4_0.bin.7
Can I run a command like ./chat -m ggml-model-q4_0.bin to load all of these parts and use the full 65B power?