lsmod | grep nouveau
nouveau              2433024  0
mxm_wmi                16384  1 nouveau
i2c_algo_bit           16384  1 nouveau
drm_display_helper    184320  1 nouveau
drm_ttm_helper         16384  1 nouveau
ttm                    94208  2 drm_ttm_helper,nouveau
drm_kms_helper        204800  2 drm_display_helper,nouveau
drm                   614400  5 drm_kms_helper,drm_display_helper,drm_ttm_helper,ttm,nouveau
video                  65536  2 asus_wmi,nouveau
wmi                    36864  5 video,asus_wmi,wmi_bmof,mxm_wmi,nouveau
button                 24576  1 nouveau
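Note that `grep nouveau` matches every line that mentions nouveau, including modules that nouveau merely uses. A stricter check could look only at the first column of the `lsmod` output. The following is a minimal sketch; `module_loaded` is a hypothetical helper name, not part of any existing tool.

```shell
# module_loaded NAME: read lsmod-style text on stdin and succeed only if
# NAME appears in the first column (the module name itself).
# Hypothetical helper for illustration.
module_loaded() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   lsmod | module_loaded nouveau && echo "nouveau is loaded"
```

Because the function reads from stdin, it can also be run against saved output such as the listing above.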
Consult /var/lib/dkms/nvidia-current/525.147.05/build/make.log for more information.
dpkg: error processing package nvidia-kernel-dkms (--configure):
 installed nvidia-kernel-dkms package post-installation script subprocess returned error exit status 10
dpkg: dependency problems prevent configuration of nvidia-driver:
 nvidia-driver depends on nvidia-kernel-dkms (= 525.147.05-4~deb12u1) | nvidia-kernel-525.147.05 | nvidia-open-kernel-525.147.05 | nvidia-open-kernel-525.147.05; however:
  Package nvidia-kernel-dkms is not configured yet.
  Package nvidia-kernel-525.147.05 is not installed.
  Package nvidia-kernel-dkms which provides nvidia-kernel-525.147.05 is not configured yet.
  Package nvidia-open-kernel-525.147.05 is not installed.
  Package nvidia-open-kernel-525.147.05 is not installed.
dpkg: error processing package nvidia-driver (--configure):
 dependency problems - leaving unconfigured
Processing triggers for libc-bin (2.36-9+deb12u4) ...
Processing triggers for initramfs-tools (0.142) ...
update-initramfs: Generating /boot/initrd.img-6.1.0-18-amd64
Processing triggers for update-glx (1.2.2) ...
Processing triggers for glx-alternative-nvidia (1.2.2) ...
update-alternatives: using /usr/lib/nvidia to provide /usr/lib/glx (glx) in auto mode
Processing triggers for glx-alternative-mesa (1.2.2) ...
Processing triggers for libc-bin (2.36-9+deb12u4) ...
Processing triggers for initramfs-tools (0.142) ...
update-initramfs: Generating /boot/initrd.img-6.1.0-18-amd64
Errors were encountered while processing:
 nvidia-kernel-dkms
 nvidia-driver
E: Sub-process /usr/bin/dpkg returned an error code (1)
Check the Debian version:
lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm
func AsMap() map[string]EnvVar {
	ret := map[string]EnvVar{
		"OLLAMA_DEBUG":             {"OLLAMA_DEBUG", Debug(), "Show additional debug information (e.g. OLLAMA_DEBUG=1)"},
		"OLLAMA_FLASH_ATTENTION":   {"OLLAMA_FLASH_ATTENTION", FlashAttention(), "Enabled flash attention"},
		"OLLAMA_HOST":              {"OLLAMA_HOST", Host(), "IP Address for the ollama server (default 127.0.0.1:11434)"},
		"OLLAMA_KEEP_ALIVE":        {"OLLAMA_KEEP_ALIVE", KeepAlive(), "The duration that models stay loaded in memory (default \"5m\")"},
		"OLLAMA_LLM_LIBRARY":       {"OLLAMA_LLM_LIBRARY", LLMLibrary(), "Set LLM library to bypass autodetection"},
		"OLLAMA_MAX_LOADED_MODELS": {"OLLAMA_MAX_LOADED_MODELS", MaxRunners(), "Maximum number of loaded models per GPU"},
		"OLLAMA_MAX_QUEUE":         {"OLLAMA_MAX_QUEUE", MaxQueue(), "Maximum number of queued requests"},
		"OLLAMA_MODELS":            {"OLLAMA_MODELS", Models(), "The path to the models directory"},
		"OLLAMA_NOHISTORY":         {"OLLAMA_NOHISTORY", NoHistory(), "Do not preserve readline history"},
		"OLLAMA_NOPRUNE":           {"OLLAMA_NOPRUNE", NoPrune(), "Do not prune model blobs on startup"},
		"OLLAMA_NUM_PARALLEL":      {"OLLAMA_NUM_PARALLEL", NumParallel(), "Maximum number of parallel requests"},
		"OLLAMA_ORIGINS":           {"OLLAMA_ORIGINS", Origins(), "A comma separated list of allowed origins"},
		"OLLAMA_RUNNERS_DIR":       {"OLLAMA_RUNNERS_DIR", RunnersDir(), "Location for runners"},
		"OLLAMA_SCHED_SPREAD":      {"OLLAMA_SCHED_SPREAD", SchedSpread(), "Always schedule model across all GPUs"},
		"OLLAMA_TMPDIR":            {"OLLAMA_TMPDIR", TmpDir(), "Location for temporary files"},
	}
	if runtime.GOOS != "darwin" {
		ret["CUDA_VISIBLE_DEVICES"] = EnvVar{"CUDA_VISIBLE_DEVICES", CudaVisibleDevices(), "Set which NVIDIA devices are visible"}
		ret["HIP_VISIBLE_DEVICES"] = EnvVar{"HIP_VISIBLE_DEVICES", HipVisibleDevices(), "Set which AMD devices are visible"}
		ret["ROCR_VISIBLE_DEVICES"] = EnvVar{"ROCR_VISIBLE_DEVICES", RocrVisibleDevices(), "Set which AMD devices are visible"}
		ret["GPU_DEVICE_ORDINAL"] = EnvVar{"GPU_DEVICE_ORDINAL", GpuDeviceOrdinal(), "Set which AMD devices are visible"}
		ret["HSA_OVERRIDE_GFX_VERSION"] = EnvVar{"HSA_OVERRIDE_GFX_VERSION", HsaOverrideGfxVersion(), "Override the gfx used for all detected AMD GPUs"}
		ret["OLLAMA_INTEL_GPU"] = EnvVar{"OLLAMA_INTEL_GPU", IntelGPU(), "Enable experimental Intel GPU detection"}
	}
	return ret
}
Environment Variables:
      OLLAMA_DEBUG               Show additional debug information (e.g. OLLAMA_DEBUG=1)
      OLLAMA_HOST                IP Address for the ollama server (default 127.0.0.1:11434)
      OLLAMA_KEEP_ALIVE          The duration that models stay loaded in memory (default "5m")
      OLLAMA_MAX_LOADED_MODELS   Maximum number of loaded models per GPU
      OLLAMA_MAX_QUEUE           Maximum number of queued requests
      OLLAMA_MODELS              The path to the models directory
      OLLAMA_NUM_PARALLEL        Maximum number of parallel requests
      OLLAMA_NOPRUNE             Do not prune model blobs on startup
      OLLAMA_ORIGINS             A comma separated list of allowed origins
      OLLAMA_SCHED_SPREAD        Always schedule model across all GPUs
      OLLAMA_FLASH_ATTENTION     Enabled flash attention
      OLLAMA_KV_CACHE_TYPE       Quantization type for the K/V cache (default: f16)
      OLLAMA_LLM_LIBRARY         Set LLM library to bypass autodetection
      OLLAMA_GPU_OVERHEAD        Reserve a portion of VRAM per GPU (bytes)
      OLLAMA_LOAD_TIMEOUT        How long to allow model loads to stall before giving up (default "5m")
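These settings are ordinary environment variables, so they can be set for the shell or for the single command that starts the server. The following is a minimal sketch that prints the values a process started from the current shell would see, falling back to the defaults quoted in the help text. `ollama_effective_env` is a hypothetical helper name, and only variables whose defaults are stated in the help text are covered.

```shell
# Print the effective value of a few Ollama settings, using the defaults
# quoted in the help text when a variable is unset or empty.
# ollama_effective_env is a hypothetical helper, not an ollama command.
ollama_effective_env() {
    printf 'OLLAMA_HOST=%s\n' "${OLLAMA_HOST:-127.0.0.1:11434}"
    printf 'OLLAMA_KEEP_ALIVE=%s\n' "${OLLAMA_KEEP_ALIVE:-5m}"
    printf 'OLLAMA_KV_CACHE_TYPE=%s\n' "${OLLAMA_KV_CACHE_TYPE:-f16}"
    printf 'OLLAMA_LOAD_TIMEOUT=%s\n' "${OLLAMA_LOAD_TIMEOUT:-5m}"
}

# Example: override two settings for a single command only.
#   OLLAMA_DEBUG=1 OLLAMA_KEEP_ALIVE=10m ollama serve
```

Setting the variables on the command line, as in the commented example, keeps the override local to that one invocation instead of changing the whole shell session.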
root@pm-65c50001:~# ollama run llama3:8b-instruct-fp16
>>> hi
Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
time=2024-08-29T10:50:24.720+08:00 level=INFO source=sched.go:445 msg="loaded runners" count=1
time=2024-08-29T10:50:24.720+08:00 level=INFO source=server.go:593 msg="waiting for llama runner to start responding"
time=2024-08-29T10:50:24.720+08:00 level=INFO source=server.go:627 msg="waiting for server to become available" status="llm server error"
WARNING: /proc/sys/kernel/numa_balancing is enabled, this has been observed to impair performance
The issue above notes that NUMA balancing has an effect here. Disabling it manually and using Ollama 0.3.8 resolves the problem.
echo 0 > /proc/sys/kernel/numa_balancing
# or
sysctl -w kernel.numa_balancing=0
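Either command changes the setting only until the next reboot. The following is a minimal sketch for checking the current state and, optionally, persisting the change. The helper name `numa_balancing_state` and the file name `99-numa-balancing.conf` are assumptions chosen for illustration.

```shell
# numa_balancing_state [FILE]: print "enabled" or "disabled" based on the
# value stored in FILE (defaults to the kernel's procfs entry).
# Prints "unavailable" and returns 1 if the file cannot be read.
# Hypothetical helper for illustration.
numa_balancing_state() {
    file="${1:-/proc/sys/kernel/numa_balancing}"
    if [ ! -r "$file" ]; then
        echo unavailable
        return 1
    fi
    if [ "$(cat "$file")" = "0" ]; then
        echo disabled
    else
        echo enabled
    fi
}

# To keep the setting across reboots (run as root; the file name is an assumption):
#   echo 'kernel.numa_balancing = 0' > /etc/sysctl.d/99-numa-balancing.conf
#   sysctl --system
```

Running `numa_balancing_state` with no argument before and after the change confirms that the write took effect.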