llama.cpp qwen2

Author: 香菜香菜我是折耳根 | Published: 2024-08-18 20:16

Development environment

1. Prepare the model
brew install git-lfs
git lfs install
git clone https://www.modelscope.cn/qwen/Qwen-7B-Chat.git
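A common failure at this step is a clone that finishes without Git LFS fetching the weights, which leaves small text stubs in place of the real files. The sketch below is one possible way to detect that before conversion. It relies on the fact that a Git LFS pointer file starts with the line "version https://git-lfs.github.com/spec/v1"; the function names are hypothetical helpers, not part of llama.cpp or ModelScope.

```python
from pathlib import Path

# First line of every Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file is an LFS pointer stub rather than real data."""
    with open(path, "rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(model_dir: str) -> list[str]:
    """List files under model_dir (relative paths) that are still LFS stubs."""
    root = Path(model_dir).expanduser()
    found = []
    for p in root.rglob("*"):
        rel = p.relative_to(root)
        # Skip Git's own bookkeeping directory.
        if ".git" in rel.parts or not p.is_file():
            continue
        if is_lfs_pointer(p):
            found.append(str(rel))
    return sorted(found)
```

Calling find_lfs_pointers on the cloned model directory should return an empty list; any file it reports was not downloaded, and running git lfs pull inside the clone usually fetches it.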

2. Prepare llama.cpp
brew install ccache
git clone git@github.com:ggerganov/llama.cpp.git
cd llama.cpp
make
Note: llama.cpp checkouts from mid-2024 onward rename the tools used in the steps below: convert-hf-to-gguf.py became convert_hf_to_gguf.py, quantize became llama-quantize, and main became llama-cli. If a command is not found, try the newer name.

conda create -n llama-cpp python=3.10
conda activate llama-cpp
pip install -r requirements.txt

pip install tiktoken

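3. Convert the model
Convert the downloaded Qwen model to the GGUF file format.

A separate article could be written here to introduce GGUF and how the Qwen model is represented.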

python convert-hf-to-gguf.py ~/workspaces/ai/Qwen-7B-Chat/
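To confirm that the conversion produced a valid file, the fixed-size header at the start of the GGUF file can be inspected. The following is a minimal sketch, assuming a little-endian GGUF file of version 2 or later, where the header is the 4-byte magic "GGUF", a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The function name read_gguf_header is a hypothetical helper, not a llama.cpp API.

```python
import struct


def read_gguf_header(path: str) -> dict:
    """Read the fixed-size GGUF header (assumes little-endian, version >= 2)."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file, magic bytes are {magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            # Version 1 used 32-bit counts, which this sketch does not handle.
            raise ValueError(f"unsupported GGUF version {version}")
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Running it on the converted file, for example ~/workspaces/ai/Qwen-7B-Chat/ggml-model-f16.gguf after expanding the home directory, should report a non-zero tensor count. A ValueError here suggests the conversion did not complete.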

4. Quantize the model
./quantize ~/workspaces/ai/Qwen-7B-Chat/ggml-model-f16.gguf ./models/qwen-chat-ggml-model-Q4_K_M.gguf Q4_K_M
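The reason for quantizing is file size and memory use, and a rough estimate is just parameters times bits per weight. The sketch below illustrates the arithmetic only. The parameter count (a nominal 7 billion for a "7B" model) and the Q4_K_M figure of about 4.85 bits per weight are assumptions for illustration; the real average depends on the model, because Q4_K_M mixes quantization types across tensors.

```python
def estimate_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB, ignoring metadata and the tokenizer."""
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


# Assumed inputs, for illustration only.
N_PARAMS = 7e9          # nominal size of a "7B" model
F16_BITS = 16.0         # exact for f16 weights
Q4_K_M_BITS = 4.85      # approximate average, varies by model

f16_gib = estimate_size_gib(N_PARAMS, F16_BITS)
q4_gib = estimate_size_gib(N_PARAMS, Q4_K_M_BITS)
print(f"f16: {f16_gib:.1f} GiB, Q4_K_M: {q4_gib:.1f} GiB, "
      f"ratio: {f16_gib / q4_gib:.1f}x")
```

Comparing these estimates against the actual sizes of ggml-model-f16.gguf and qwen-chat-ggml-model-Q4_K_M.gguf on disk is a quick check that quantization ran as intended.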

5. Test
./main -m models/qwen-chat-ggml-model-Q4_K_M.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e
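In the command above, -m selects the model file, -p sets the prompt, -n limits the number of tokens to generate, and -e tells llama.cpp to process escape sequences such as \n inside the prompt. For repeated tests with different prompts, the same invocation can be driven from Python. This is a minimal sketch: the helper names are hypothetical, and the binary path ./main is taken from the command above (newer checkouts may use a different binary name).

```python
import subprocess


def build_main_command(model_path: str, prompt: str,
                       n_predict: int = 400, binary: str = "./main") -> list[str]:
    """Build the argument list for the llama.cpp command shown above."""
    return [binary, "-m", model_path, "-p", prompt, "-n", str(n_predict), "-e"]


def run_prompt(model_path: str, prompt: str, n_predict: int = 400) -> str:
    """Run the model once and return the text it writes to stdout."""
    cmd = build_main_command(model_path, prompt, n_predict)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

Because -e makes llama.cpp itself interpret the escapes, the prompt should contain a literal backslash followed by n. In Python a raw string keeps it intact, for example run_prompt("models/qwen-chat-ggml-model-Q4_K_M.gguf", r"Building a website can be done in 10 simple steps:\nStep 1:").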

Ascend NPU
