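Installing CUDA 11.6+ on Ubuntu
These two blog posts are useful references:
Installing multiple CUDA versions on Ubuntu 22.04 and switching between them quickly (CUDA 11.1 and 11.8)
[Linux] Installing multiple CUDA versions on one machine (switching CUDA versions)
Installing CUDA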
https://developer.nvidia.com/cuda-toolkit-archive
Find CUDA 11.8 in the archive.
Select, in order: Linux, x86_64, Ubuntu, 22.04, runfile (local).
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run
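Run the two commands above.
In the installer menu, press Enter on Driver to uncheck it, then use the arrow keys to move to the bottom entry (Install) and press Enter.
When prompted, choose no.
Once the installer prints its summary without errors, CUDA has been installed successfully.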
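Since the installer summary is not reproduced here, a quick check of the install directory can stand in for it. The following is a minimal sketch, assuming the toolkit was installed to the default location /usr/local/cuda-11.8; check_cuda_install is a hypothetical helper name, not part of the CUDA toolkit.

```shell
# Hypothetical helper: report whether a CUDA toolkit directory looks complete.
# Usage: check_cuda_install /usr/local/cuda-11.8
check_cuda_install() {
    cuda_dir="$1"
    # The toolkit places the compiler at <dir>/bin/nvcc
    # and the runtime libraries under <dir>/lib64.
    if [ -x "${cuda_dir}/bin/nvcc" ] && [ -d "${cuda_dir}/lib64" ]; then
        echo "OK: ${cuda_dir}"
        return 0
    fi
    echo "MISSING: ${cuda_dir}"
    return 1
}

# Assumed default install location; adjust if you chose another path.
check_cuda_install /usr/local/cuda-11.8 || true
```

This only confirms that the files are in place. It does not change PATH, so the existing CUDA environment is left alone.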
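For now we leave the environment variables unchanged, so that the existing CUDA environment is not affected. A script will handle switching later.
Installing cuDNN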
https://developer.nvidia.com/rdp/cudnn-archive
Download cuDNN from this page.
Then run:
cd /usr/local/
ls
There are now three CUDA versions installed here.
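To list only the versioned CUDA directories rather than everything in /usr/local, a small helper can be used. This is a sketch that assumes the installer's default cuda-X.Y directory naming; list_cuda_versions is a hypothetical name.

```shell
# Hypothetical helper: list CUDA versions installed under a prefix
# (defaults to /usr/local), one version number per line.
list_cuda_versions() {
    prefix="${1:-/usr/local}"
    for dir in "${prefix}"/cuda-*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "${dir}" ] || continue
        # Strip everything up to and including "cuda-".
        echo "${dir##*/cuda-}"
    done
}

list_cuda_versions /usr/local
```

The plain /usr/local/cuda symlink, if present, is deliberately not matched, because it only points at one of the versioned directories.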
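Extract the downloaded cuDNN archive, then, from inside the extracted folder, copy its files into the matching CUDA directory and make them readable. cuDNN 8 ships several cudnn*.h headers, so copy all of them rather than cudnn.h alone, and use cp -P so that the library symlinks are preserved:
sudo cp include/cudnn*.h /usr/local/cuda-11.8/include
sudo cp -P lib/libcudnn* /usr/local/cuda-11.8/lib64
sudo chmod a+r /usr/local/cuda-11.8/include/cudnn*.h
sudo chmod a+r /usr/local/cuda-11.8/lib64/libcudnn*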
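To confirm which cuDNN version ended up in the CUDA directory, the version macros can be read from the headers. This is a sketch that assumes cuDNN 8 or later, where the macros live in cudnn_version.h, and assumes that header was copied along with the others; cudnn_version is a hypothetical helper name.

```shell
# Hypothetical helper: print the cuDNN version found in an include directory.
# Assumes cuDNN 8 or later, where the version macros live in cudnn_version.h.
cudnn_version() {
    header="${1:-/usr/local/cuda-11.8/include}/cudnn_version.h"
    [ -f "${header}" ] || { echo "not found: ${header}"; return 1; }
    # Pick the number from lines such as "#define CUDNN_MAJOR 8".
    major="$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "${header}")"
    minor="$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "${header}")"
    patch="$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "${header}")"
    echo "${major}.${minor}.${patch}"
}

cudnn_version /usr/local/cuda-11.8/include || true
```

If the helper reports that the header is missing, only part of the cuDNN header set was copied.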
Switching CUDA versions
Use the following command to create a script named switch-cuda.sh in /usr/local/:
sudo vim /usr/local/switch-cuda.sh
Paste the following code into it:
#!/usr/bin/env bash

# Copyright (c) 2018 Patrick Hohenecker
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# author:   Patrick Hohenecker <mail@paho.at>
# version:  2018.1
# date:     May 15, 2018

set -e

# ensure that the script has been sourced rather than just executed
if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
    echo "Please use 'source' to execute switch-cuda.sh!"
    exit 1
fi

INSTALL_FOLDER="/usr/local"  # the location to look for CUDA installations at
TARGET_VERSION=${1}          # the target CUDA version to switch to (if provided)

# if no version to switch to has been provided, then just print all available CUDA installations
if [[ -z ${TARGET_VERSION} ]]; then
    echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
    ls -l "${INSTALL_FOLDER}" | egrep -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
        echo "* ${line}"
    done
    set +e
    return
# otherwise, check whether there is an installation of the requested CUDA version
elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
    echo "No installation of CUDA ${TARGET_VERSION} has been found!"
    set +e
    return
fi

# the path of the installation to use
cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"

# filter out those CUDA entries from the PATH that are not needed anymore
path_elements=(${PATH//:/ })
new_path="${cuda_path}/bin"
for p in "${path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_path="${new_path}:${p}"
    fi
done

# filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
ld_path_elements=(${LD_LIBRARY_PATH//:/ })
new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
for p in "${ld_path_elements[@]}"; do
    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
        new_ld_path="${new_ld_path}:${p}"
    fi
done

# update environment variables
export CUDA_HOME="${cuda_path}"
export CUDA_ROOT="${cuda_path}"
export LD_LIBRARY_PATH="${new_ld_path}"
export PATH="${new_path}"

echo "Switched to CUDA ${TARGET_VERSION}."

set +e
return
Press Esc, type :x, and press Enter to save and exit.
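The core of the script is its filtering step: every entry that points into an old CUDA installation is dropped, and the bin directory of the selected version is placed at the front. The sketch below reproduces that idea on a plain string so it can be tried without touching the real PATH. It is a simplified illustration under the same /usr/local layout, not a copy of the script, and replace_cuda_in_path is a hypothetical name.

```shell
# Sketch of the filtering step: drop every entry under <install_folder>/cuda*
# from a colon-separated list and put the new CUDA bin directory first.
# The function body runs in a subshell, so the IFS change stays local.
replace_cuda_in_path() (
    old_path="$1"
    cuda_path="$2"
    install_folder="${3:-/usr/local}"
    new_path="${cuda_path}/bin"
    IFS=":"
    for p in ${old_path}; do
        case "${p}" in
            "${install_folder}"/cuda*) ;;        # stale CUDA entry: skip it
            *) new_path="${new_path}:${p}" ;;   # anything else: keep it
        esac
    done
    echo "${new_path}"
)

replace_cuda_in_path "/usr/local/cuda-11.1/bin:/usr/bin:/bin" "/usr/local/cuda-11.8"
# prints /usr/local/cuda-11.8/bin:/usr/bin:/bin
```

Because the old entries are removed rather than merely shadowed, switching back and forth repeatedly does not make PATH grow.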
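Then go back to /usr/local/ and source the script without arguments:
cd /usr/local/
source switch-cuda.sh
It lists the CUDA versions that are currently installed.
Next, run the following command to switch to the 11.8 version installed above: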
source switch-cuda.sh 11.8
The script reports that it has switched to CUDA 11.8. To confirm, check again by running:
nvcc -V
The output shows that the version has been switched.
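When this check has to run inside a script, the release number can be pulled out of the nvcc -V output. This is a minimal sketch that assumes the usual "release X.Y" wording in that output; nvcc_release is a hypothetical helper name.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the release, e.g. "11.8".
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Only query nvcc when it is actually on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -V | nvcc_release
fi
```

Comparing the printed value with the version passed to switch-cuda.sh is a quick way to catch a shell in which the switch has not been applied.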
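Installing Mamba
Reference tutorial:
[Mamba installation] 99% of people get this wrong! A step-by-step guide to resolving the selective_scan_cuda conflict
Environment requirements
GitHub - state-spaces/s4: Structured state space sequence models
These are the environment requirements of the original Mamba.
As shown there, it requires:
- Python 3.9+
- PyTorch 1.10+
- CUDA 11.6+
These are the deployment requirements of Vim (Vision Mamba).
We follow Vim's requirements here, because Vim depends on Mamba:
- Python 3.10
- PyTorch 2.1.1
- CUDA 11.8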
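Checking an installed version against a minimum such as "3.9+" or "11.6+" is easy to get wrong with plain string or decimal comparison, since 3.10 would then rank below 3.9. The sketch below compares version strings component by component. It assumes sort -V from GNU coreutils, which is available on Ubuntu; version_ge is a hypothetical helper name.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    lowest="$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)"
    [ "${lowest}" = "$2" ]
}

version_ge 11.8 11.6 && echo "CUDA 11.8 satisfies the 11.6+ requirement"
version_ge 3.10 3.9 && echo "Python 3.10 satisfies the 3.9+ requirement"
```

Vim's pinned versions (Python 3.10, PyTorch 2.1.1, CUDA 11.8) all pass the original Mamba minimums under this comparison.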
Creating a virtual environment
Create a virtual environment named Vim and activate it:
conda create -n Vim python=3.10
conda activate Vim
Look up the matching PyTorch install command here:
https://pytorch.org/
Find the command for PyTorch 2.1.1:
conda install pytorch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 pytorch-cuda=11.8 -c pytorch -c nvidia
Install packaging:
conda install packaging
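Installing causal-conv1d and mamba-ssm (the key step!)
First, some earlier write-ups and pitfall reports worth reading:
[Best practice] Installing CUDA inside a conda environment and installing Mamba
causal_conv1d and mamba_ssm cannot be installed directly with pip install when running a Mamba project
In short, the usual approach is to install directly with pip.
The commands are:
pip install causal_conv1d
python setup.py install
However, this very often fails with errors such as: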
Building wheel for causal-conv1d (setup.py) ... error
error: command '/usr/bin/gcc' failed with exit code 1
RuntimeError: Error compiling objects for extension
ERROR: Could not build wheels for causal-conv1d, which is required to install pyproject.toml-based projects
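One solution is to git clone the source and compile it.
Another is to download a prebuilt .whl and install it locally, or to download the source and build it directly.
Releases · Dao-AILab/causal-conv1d
From there I downloaded the v1.2.0 wheel of causal_conv1d,
and from
GitHub - state-spaces/mamba: Mamba SSM architecture
I downloaded the source code of mamba-ssm 1.2.0.
Place both in the virtual environment's folder and open the mamba-1.2.0 folder.
Open setup.py
and edit it.
First, comment out these three lines.
Then add these three lines.
Next, modify line 264,
changing it to the following
so that the build does not use ninja.
Save and exit.
Then install causal_conv1d: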
pip install causal_conv1d-1.2.0.post2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
The installation reports success.
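The wheel file name encodes the environment it was built for, which is why it has to match the setup above: cu118 for CUDA 11.8, torch2.1 for PyTorch 2.1, and cp310 for CPython 3.10. The sketch below pulls those tags out of a file name so they can be compared before installing. It assumes the naming pattern seen in this particular wheel; wheel_tags is a hypothetical helper name.

```shell
# Hypothetical helper: print the build tags encoded in a prebuilt wheel file name.
wheel_tags() {
    name="$1"
    echo "cuda=$(echo "${name}" | sed -n 's/.*+cu\([0-9]*\)torch.*/\1/p')"
    echo "torch=$(echo "${name}" | sed -n 's/.*torch\([0-9.]*\)cxx11abi.*/\1/p')"
    echo "cxx11abi=$(echo "${name}" | sed -n 's/.*cxx11abi\([A-Z]*\)-.*/\1/p')"
    echo "python=$(echo "${name}" | sed -n 's/.*-\(cp[0-9]*\)-cp[0-9]*-.*/\1/p')"
}

wheel_tags "causal_conv1d-1.2.0.post2+cu118torch2.1cxx11abiFALSE-cp310-cp310-linux_x86_64.whl"
```

If any of the tags differs from the active environment, for example a cp39 wheel in a Python 3.10 environment, pip will typically refuse the file or the import will fail later, so a different release asset should be picked.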
To verify that it worked, import the package in Python:
import causal_conv1d
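This appears to work, but many blog posts report errors at this point. If it fails, download the causal_conv1d source and compile it.
For compiling, go to
Anaconda3/envs/Vim/lib/python3.10/site-packages/torch/utils
find cpp_extension.py, and edit it,
adding 11.8 manually.
After this change,
build mamba-ssm.
The install command is: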
pip install . --no-cache-dir --verbose
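This failed with an error
indicating a CUDA version problem.
Running nvcc -V inside the Vim environment showed CUDA 11.1,
which was unexpected: the switch to 11.8 made earlier was not in effect in this shell.
So, still in the Vim environment, run cd /usr/local/
and then source switch-cuda.sh 11.8, that is, the version-switching script written earlier.
Running nvcc -V again now shows CUDA 11.8.
Rebuilding after that succeeds.
The build completed.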
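A plausible explanation for the version mismatch is that variables exported by switch-cuda.sh exist only in the shell that sourced it, so a new terminal or a separate process starts from the old values. The experiment below shows this per-shell scope with a throwaway script. It is an illustrative sketch, and DEMO_CUDA_HOME is a made-up variable that does not touch the real CUDA settings.

```shell
# Demonstration with a throwaway script: exports made in a child process are lost,
# while exports made by a sourced script stay in the current shell.
demo="$(mktemp)"
echo 'export DEMO_CUDA_HOME=/usr/local/cuda-11.8' > "${demo}"

unset DEMO_CUDA_HOME
sh "${demo}"                              # runs in a child process
after_exec="${DEMO_CUDA_HOME:-unset}"

. "${demo}"                               # "." is the portable spelling of "source"
after_source="${DEMO_CUDA_HOME:-unset}"

echo "after executing: ${after_exec}"
echo "after sourcing:  ${after_source}"
rm -f "${demo}"
```

In practice this means that source switch-cuda.sh 11.8 has to be run in the same shell session that later runs pip install, ideally after conda activate Vim.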