Installing the NVIDIA Driver and Ollama on Debian 12

Last updated in the early hours of September 23, 2024

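According to a colleague, newer NVIDIA driver versions have compatibility problems, so NVIDIA driver 525.147.05 is the one to install. The kernel may need to be upgraded along the way.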

Installing the NVIDIA driver

Check which graphics card Debian sees.

lspci -nn | egrep -i "3d|display|vga"  
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)
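
The last bracketed pair on the lspci line is the PCI vendor:device ID (10de is NVIDIA's vendor ID). As a small sketch, the helper below extracts that ID from `lspci -nn` output so it can be checked against a driver's supported-device list. The name pci_ids is hypothetical and not part of lspci.

```shell
# pci_ids: hypothetical helper, not part of pciutils.
# Reads `lspci -nn` lines on stdin and prints each [vendor:device] ID, one per line.
pci_ids() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]'
}

# Example with the line shown above:
echo '01:00.0 VGA compatible controller [0300]: NVIDIA Corporation AD102 [GeForce RTX 4090] [10de:2684] (rev a1)' | pci_ids
```

On a live system the same filter would be used as `lspci -nn | grep -Ei "3d|display|vga" | pci_ids`.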

Check which driver is currently loaded.

lsmod | grep nouveau  
nouveau              2433024  0
mxm_wmi                16384  1 nouveau
i2c_algo_bit           16384  1 nouveau
drm_display_helper    184320  1 nouveau
drm_ttm_helper         16384  1 nouveau
ttm                    94208  2 drm_ttm_helper,nouveau
drm_kms_helper        204800  2 drm_display_helper,nouveau
drm                   614400  5 drm_kms_helper,drm_display_helper,drm_ttm_helper,ttm,nouveau
video                  65536  2 asus_wmi,nouveau
wmi                    36864  5 video,asus_wmi,wmi_bmof,mxm_wmi,nouveau
button                 24576  1 nouveau

The open-source nouveau driver is loaded, so it has to be disabled first.

echo "blacklist nouveau" | sudo tee /etc/modprobe.d/nouveau-blacklist.conf
sudo update-initramfs -u
sudo update-grub
sudo reboot

After the reboot, lsmod | grep nouveau returns nothing, so nouveau has been disabled successfully.
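
A plain grep for nouveau also matches lines where the name only appears in the "used by" column of another module. A slightly stricter check, sketched below, compares only the first column of the lsmod output. The helper name module_loaded is hypothetical.

```shell
# module_loaded: hypothetical helper.
# Reads lsmod-style text on stdin and exits 0 only if a module whose
# name (first column) is exactly "$1" is listed.
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   if lsmod | module_loaded nouveau; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is not loaded"
#   fi
```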

Installing the NVIDIA proprietary driver with sudo apt install nvidia-driver firmware-misc-nonfree fails with the following error.

Consult /var/lib/dkms/nvidia-current/525.147.05/build/make.log for more information.  
dpkg: error processing package nvidia-kernel-dkms (--configure):
installed nvidia-kernel-dkms package post-installation script subprocess returned error exit status 10
dpkg: dependency problems prevent configuration of nvidia-driver:
nvidia-driver depends on nvidia-kernel-dkms (= 525.147.05-4~deb12u1) | nvidia-kernel-525.147.05 | nvidia-open-kernel-525.147.05 | nvidia-open-kernel-525.147.05; however:
 Package nvidia-kernel-dkms is not configured yet.
 Package nvidia-kernel-525.147.05 is not installed.
 Package nvidia-kernel-dkms which provides nvidia-kernel-525.147.05 is not configured yet.
 Package nvidia-open-kernel-525.147.05 is not installed.
 Package nvidia-open-kernel-525.147.05 is not installed.

dpkg: error processing package nvidia-driver (--configure):
dependency problems - leaving unconfigured
Processing triggers for libc-bin (2.36-9+deb12u4) ...
Processing triggers for initramfs-tools (0.142) ...
update-initramfs: Generating /boot/initrd.img-6.1.0-18-amd64
Processing triggers for update-glx (1.2.2) ...
Processing triggers for glx-alternative-nvidia (1.2.2) ...
update-alternatives: using /usr/lib/nvidia to provide /usr/lib/glx (glx) in auto mode
Processing triggers for glx-alternative-mesa (1.2.2) ...
Processing triggers for libc-bin (2.36-9+deb12u4) ...
Processing triggers for initramfs-tools (0.142) ...
update-initramfs: Generating /boot/initrd.img-6.1.0-18-amd64
Errors were encountered while processing:
nvidia-kernel-dkms
nvidia-driver
E: Sub-process /usr/bin/dpkg returned an error code (1)
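
The first line of this output names the DKMS build log, which is where the underlying compiler error is recorded. A minimal sketch for pulling that path out of saved apt output follows. The helper name dkms_log_path and the file apt-output.txt are hypothetical.

```shell
# dkms_log_path: hypothetical helper.
# Reads apt/dpkg output on stdin and prints the make.log path from the
# "Consult ... for more information." line, if present.
dkms_log_path() {
    sed -n 's/^Consult \(.*make\.log\) for more information\..*$/\1/p'
}

# Usage (assumes the apt output was saved to apt-output.txt):
#   sudo tail -n 40 "$(dkms_log_path < apt-output.txt)"
```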

Confirm the Debian version with lsb_release -a.

No LSB modules are available.  
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm

According to an answer on Stack Exchange [3], the safe way to upgrade the Debian kernel is to install it from backports.

echo "deb http://deb.debian.org/debian bookworm-backports main" | sudo tee /etc/apt/sources.list.d/debian-backports.list
sudo apt update
sudo apt install -t bookworm-backports linux-image-amd64
sudo reboot

After the reboot, uname -a shows that the kernel has been upgraded.

uname -a                                                                          
Linux debian 6.7.12+bpo-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.7.12-1~bpo12+1 (2024-05-06) x86_64 GNU/Linux
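
The +bpo marker in the release string above is what distinguishes the backports kernel from the stock 6.1.0-18-amd64 one. As a sketch, a script could test for it like this. The helper name is_backports_kernel is hypothetical, and the check assumes backports kernels keep carrying that marker.

```shell
# is_backports_kernel: hypothetical helper.
# Succeeds if the kernel release string in "$1" contains the "+bpo" marker.
is_backports_kernel() {
    case "$1" in
        *+bpo*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage on a live system:
#   is_backports_kernel "$(uname -r)" && echo "running a backports kernel"
```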

Install the NVIDIA proprietary driver again with sudo apt install nvidia-driver firmware-misc-nonfree. This time there is no error.

nvidia-smi    

NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0

The NVIDIA proprietary driver 525 seems to have a problem: after running for a while the machine rebooted on its own, and dmesg showed the error ACPI BIOS Error (bug).

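Searching for the error online, some people report that the 525 driver is the cause (unconfirmed). Debian had a newer NVIDIA driver available, and running apt upgrade moved the system to 535.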

# nvidia-smi

NVIDIA-SMI 535.183.01             Driver Version: 535.183.01   CUDA Version: 12.2

Since upgrading the NVIDIA driver to 535, the reboots have not recurred so far.
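
To confirm which driver branch is active without reading the whole nvidia-smi table, the header line can be parsed. This is a sketch that assumes the "Driver Version: X.Y.Z" header format shown above. The helper name driver_major is hypothetical.

```shell
# driver_major: hypothetical helper.
# Reads nvidia-smi output on stdin and prints the major driver version
# taken from the "Driver Version: X.Y.Z" header.
driver_major() {
    sed -n 's/.*Driver Version: \([0-9]*\)\..*/\1/p' | head -n 1
}

# Usage on a live system:
#   [ "$(nvidia-smi | driver_major)" = "535" ] && echo "535 branch is active"
```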

Installing Ollama

Run the following command to install Ollama.

curl -fsSL https://ollama.com/install.sh | sh

The download was very slow, so I routed it through a proxy.

export https_proxy=http://127.0.0.1:7890
export http_proxy=http://127.0.0.1:7890
curl -fsSL https://ollama.com/install.sh | sh

With the proxy in place, Ollama installed quickly.

>>> Downloading ollama...  
######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.
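
The export lines used for the installation stay in effect for the rest of the shell session. If the proxy should apply to a single command only, a wrapper along these lines is one option. The helper name with_proxy is hypothetical, and the proxy address is the one from the commands above.

```shell
# with_proxy: hypothetical helper.
# Runs a command with http_proxy/https_proxy set to "$1" for that command only;
# the subshell body keeps the calling shell's environment unchanged.
with_proxy() (
    proxy_url=$1
    shift
    env https_proxy="$proxy_url" http_proxy="$proxy_url" "$@"
)

# Usage (both the script download and the downloads the script performs
# run inside the same proxied sh):
#   with_proxy http://127.0.0.1:7890 sh -c 'curl -fsSL https://ollama.com/install.sh | sh'
```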

Pull the llama3 8b and qwen2 7b models with the following commands:

ollama pull llama3
ollama pull qwen2:7b

Test the llama3 model. It runs fine.

ollama run llama3  
>>> hi
Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
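
Besides the interactive prompt, the same model can be queried through Ollama's local HTTP API on port 11434 via the /api/generate endpoint. The sketch below only builds the request body. The helper name generate_payload is hypothetical, and it does no JSON escaping, so it is only suitable for simple prompts without quotes or backslashes.

```shell
# generate_payload: hypothetical helper.
# Prints a JSON request body for Ollama's /api/generate endpoint.
# No JSON escaping is done: "$1" (model) and "$2" (prompt) must not
# contain double quotes or backslashes.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}\n' "$1" "$2"
}

# Usage against a running Ollama service:
#   curl -s http://127.0.0.1:11434/api/generate -d "$(generate_payload llama3 hi)"
```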

Upgrading Ollama

Ollama 0.3.0 supports tool calling with llama3.1, so the upgrade is worthwhile. See [4].

sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/local/bin/ollama
sudo chmod +x /usr/local/bin/ollama

Once the upgrade is done, restart the Ollama service.

sudo systemctl daemon-reload
sudo systemctl restart ollama
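
After the restart it can be worth checking that the installed binary really is 0.3.0 or newer. The comparison below is a sketch built on GNU sort -V. The helper name version_ge is hypothetical, and the usage line assumes that the first line of `ollama --version` ends with the version number.

```shell
# version_ge: hypothetical helper.
# Succeeds if version "$1" is greater than or equal to version "$2",
# using GNU sort's version ordering (-V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage (assumption: the first line of `ollama --version` ends with the version):
#   version_ge "$(ollama --version | awk 'NR == 1 { print $NF }')" 0.3.0 \
#       && echo "Ollama is 0.3.0 or newer"
```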

Configuring Ollama

Calling the Ollama API from a browser extension (Immersive Translate, for example) involves cross-origin access, so the Ollama configuration has to be changed.

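The official documentation describes the relevant setting [5]. Edit /etc/systemd/system/ollama.service directly with vim and add the following line under the [Service] section. OLLAMA_ORIGINS is the variable that controls which origins may make cross-origin requests; OLLAMA_HOST only sets the address the server binds to.

Environment="OLLAMA_ORIGINS=*"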

Restart the Ollama service.

sudo systemctl daemon-reload
sudo systemctl restart ollama
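
Editing the unit file in place works, but the file may be overwritten if the install script is run again. A systemd drop-in is one alternative that keeps the setting in a separate file. The sketch below only prints the drop-in content. The helper name service_env_dropin and the file name override.conf are hypothetical choices; OLLAMA_ORIGINS is the variable the Ollama FAQ lists for allowed origins.

```shell
# service_env_dropin: hypothetical helper.
# Prints a systemd drop-in that sets environment variable "$1" to "$2".
service_env_dropin() {
    printf '[Service]\nEnvironment="%s=%s"\n' "$1" "$2"
}

# Usage:
#   sudo mkdir -p /etc/systemd/system/ollama.service.d
#   service_env_dropin OLLAMA_ORIGINS '*' \
#       | sudo tee /etc/systemd/system/ollama.service.d/override.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart ollama
```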

On the remote host, check which address and port Ollama is listening on.

sudo apt install net-tools
sudo netstat -antp | grep -i ollama

By default, Ollama listens on 127.0.0.1:11434.

tcp        0      0 127.0.0.1:11434         0.0.0.0:*               LISTEN      50508/ollama
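
For scripting, the listening address can be taken from the fourth column of that netstat line. This is a sketch that assumes the column layout shown above. The helper name listen_addr is hypothetical.

```shell
# listen_addr: hypothetical helper.
# Reads `netstat -antp` output on stdin and prints the local address of
# every LISTEN socket owned by a process named "ollama".
listen_addr() {
    awk '$6 == "LISTEN" && $7 ~ /\/ollama$/ { print $4 }'
}

# Usage on a live system:
#   sudo netstat -antp | listen_addr
```

An address of 127.0.0.1:11434 means the API is reachable only from the host itself, which is why the SSH port forward is needed for remote access.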

Use SSH to forward port 11434, on which Ollama listens on the remote host, to 127.0.0.1:11434 on the local machine.

ssh -N -g -L 127.0.0.1:11434:127.0.0.1:11434 root@1.1.1.1  # replace 1.1.1.1 with your server's IP

Uninstalling Ollama

Stop and remove the Ollama service.

sudo systemctl stop ollama
sudo systemctl disable ollama
sudo rm /etc/systemd/system/ollama.service

Remove the binary.

sudo rm $(which ollama)

Remove the Ollama data directory, user, and group.

sudo rm -r /usr/share/ollama
sudo userdel ollama
sudo groupdel ollama

Configuring OpenAI Translator

Enter the following settings in OpenAI Translator and it works as expected.

Default service provider : OpenAI
API Key: ollama
API Model: Custom
Custom Model Name: Your Model Name
API URL: http://127.0.0.1:11434
API URL Path: /v1/chat/completions

References

[0] NvidiaGraphicsDrivers
https://wiki.debian.org/NvidiaGraphicsDrivers

[1] Debian backports Instructions
https://backports.debian.org/Instructions/

[2] Ollama on Linux
https://github.com/ollama/ollama/blob/main/docs/linux.md

[3] Is it possible & safe to use latest kernel with Debian?
https://unix.stackexchange.com/questions/725783/is-it-possible-safe-to-use-latest-kernel-with-debian

[4] Ollama v0.3.0 release note
https://github.com/ollama/ollama/releases/tag/v0.3.0

[5] Ollama FAQ
https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server


Installing the NVIDIA Driver and Ollama on Debian 12
https://usmacd.com/cn/Debian_Nvidia_Ollama/
Author: henices
Published: July 26, 2024