Installing OpenCV with CUDA 12.5 on Windows 11 + Visual Studio 2022: Notes and Pitfalls

Author
筋斗云

I spent the weekend at home tinkering with this build and wrote the process down.

Prerequisites

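  1. Visual Studio C++ build tools
  2. A CUDA version compatible with the local GPU
  3. A cuDNN version that matches the CUDA version
  4. Python 3
  5. NumPy
  6. The OpenCV source code and the matching version of the OpenCV-contrib module source
  7. CMake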

Visual Studio

Download Visual Studio (I use VS2022 on this machine) and, through the Visual Studio Installer, install the C++ toolset (or the C++ workload). For detailed installation steps, see here.

CUDA and cuDNN

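Download and install the latest CUDA Toolkit that is compatible with the local GPU, or check the local path to see whether a CUDA Toolkit is already installed. On my machine, CUDA 12.5 is installed in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5. Likewise, sign in with an NVIDIA account to download cuDNN, then copy the contents of the cuDNN package (for example from C:\Program Files\NVIDIA\CUDNN\vX.X) into the matching bin, include, and lib/x64 folders of the CUDA Toolkit directory. My machine uses cuDNN 9.2.1.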
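One way to sanity-check the copy step is to look for cuDNN files under the CUDA Toolkit root. The sketch below is only an illustration: it assumes the toolkit root is available in the CUDA_PATH environment variable (normally set by the CUDA installer on Windows) and that cuDNN files follow a cudnn* naming pattern. The helper name find_cudnn_files is hypothetical.

```python
import os
from pathlib import Path


def find_cudnn_files(cuda_root):
    """Return cuDNN-looking files found under bin, include and lib/x64.

    Assumption: cuDNN files are named cudnn*.dll / cudnn*.h / cudnn*.lib.
    """
    root = Path(cuda_root)
    patterns = {
        "bin": "cudnn*.dll",
        "include": "cudnn*.h",
        "lib/x64": "cudnn*.lib",
    }
    found = {}
    for subdir, pattern in patterns.items():
        folder = root / subdir
        # A missing folder simply yields an empty list.
        if folder.is_dir():
            found[subdir] = sorted(p.name for p in folder.glob(pattern))
        else:
            found[subdir] = []
    return found


if __name__ == "__main__":
    # CUDA_PATH is normally set by the CUDA Toolkit installer on Windows.
    cuda_root = os.environ.get("CUDA_PATH")
    if cuda_root is None:
        print("CUDA_PATH is not set; pass the toolkit directory explicitly.")
    else:
        for subdir, names in find_cudnn_files(cuda_root).items():
            print(f"{subdir}: {names or 'no cuDNN files found'}")
```

An empty list for any of the three folders suggests the corresponding part of the cuDNN package was not copied.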

Python, NumPy, and pip

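Install a Python 3.x release. Because NumPy arrays stand in for cv::Mat in the Python bindings, NumPy is also required: make sure it is installed (pip install numpy), and make sure every existing OpenCV package, including opencv-python and opencv-contrib-python, has been completely uninstalled.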

pip uninstall opencv-python

pip uninstall opencv-contrib-python

Delete the cv2 directory: YOUR_PYTHON_PATH/Lib/site-packages/cv2
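To confirm that no pip-installed OpenCV wheel is left behind, the installed distributions can be listed from Python. This is a minimal sketch using the standard importlib.metadata module; the helper name leftover_opencv_packages is hypothetical, and the prefix match is an assumption intended to also catch variants such as opencv-python-headless.

```python
from importlib import metadata


def leftover_opencv_packages(installed_names):
    """Filter distribution names that look like pip-installed OpenCV wheels."""
    prefixes = ("opencv-python", "opencv-contrib-python")
    return sorted(
        name for name in installed_names
        # Normalise case and underscores before comparing; skip broken entries.
        if name and name.lower().replace("_", "-").startswith(prefixes)
    )


if __name__ == "__main__":
    names = [dist.metadata["Name"] for dist in metadata.distributions()]
    leftovers = leftover_opencv_packages(names)
    if leftovers:
        print("Still installed, uninstall with pip:", ", ".join(leftovers))
    else:
        print("No pip-installed OpenCV packages found.")
```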

OpenCV

Download the sources from the GitHub repositories, or clone them locally. You need both OpenCV and the opencv-contrib release that matches its version.

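CMake configuration

Create a build directory for opencv and opencv-contrib, then configure CMake. For the available options, see the official reference: OpenCV: OpenCV configuration options reference

This takes a long time. Along the way CMake downloads the third-party content referenced from the 3rdparty folder, and some of those downloads may fail and have to be fetched manually.

In this example we also enable Python: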

General configuration for OpenCV 4.10.0 =====================================
  Version control: unknown

  Platform:
    Timestamp: 2024-07-20T06:31:04Z
    Host: Windows 10.0.22631 AMD64
    CMake: 3.29.0
    CMake generator: Visual Studio 17 2022
    CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/MSBuild/Current/Bin/amd64/MSBuild.exe
    MSVC: 1940
    Configuration: Debug Release

  CPU/HW features:
    Baseline: SSE SSE2 SSE3
      requested: SSE3
    Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (18 files): + SSSE3 SSE4_1
      SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (9 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (38 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (8 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX

  C/C++:
    Built as dynamic libs?: YES
    C++ standard: 11
    C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe (ver 19.40.33812.0)
    C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG
    C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1
    C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe
    C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /O2 /Ob2 /DNDEBUG
    C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /Zi /Ob0 /Od /RTC1
    Linker flags (Release): /machine:x64 /INCREMENTAL:NO
    Linker flags (Debug): /machine:x64 /debug /INCREMENTAL
    ccache: NO
    Precompiled headers: YES
    Extra dependencies:
    3rdparty dependencies:

  OpenCV modules:
    To be built: calib3d core dnn features2d flann gapi highgui imgcodecs imgproc java ml objdetect photo python3 stitching ts video videoio
    Disabled: world
    Disabled by dependency: -
    Unavailable: python2
    Applications: tests perf_tests apps
    Documentation: NO
    Non-free algorithms: NO

  Windows RT support: NO

  GUI: WIN32UI
    Win32 UI: YES
    VTK support: NO

  Media I/O:
    ZLib: build (ver 1.3.1)
    JPEG: build-libjpeg-turbo (ver 3.0.3-70)
      SIMD Support Request: YES
      SIMD Support: NO
    WEBP: build (ver encoder: 0x020f)
    PNG: build (ver 1.6.43)
      SIMD Support Request: YES
      SIMD Support: YES (Intel SSE)
    TIFF: build (ver 42 - 4.6.0)
    JPEG 2000: build (ver 2.5.0)
    OpenEXR: build (ver 2.3.0)
    HDR: YES
    SUNRASTER: YES
    PXM: YES
    PFM: YES

  Video I/O:
    DC1394: NO
    FFMPEG: YES (prebuilt binaries)
      avcodec: YES (58.134.100)
      avformat: YES (58.76.100)
      avutil: YES (56.70.100)
      swscale: YES (5.9.100)
      avresample: YES (4.0.0)
    GStreamer: NO
    DirectShow: YES
    Media Foundation: YES
      DXVA: YES

  Parallel framework: Concurrency

  Trace: YES (with Intel ITT)

  Other third-party libraries:
    Intel IPP: 2021.11.0 [2021.11.0]
      at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW: sources (2021.11.0)
      at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/iw
    Lapack: NO
    Eigen: NO
    Custom HAL: NO
    Protobuf: build (3.19.1)
    Flatbuffers: builtin/3rdparty (23.5.9)

  OpenCL: YES (NVD3D11)
    Include path: D:/Data/source/collection/OpenCV/4100/opencv-4.10.0/3rdparty/include/opencl/1.2
    Link libraries: Dynamic load

  Python 3:
    Interpreter: D:/miniconda3/python.exe (ver 3.11.7)
    Libraries: D:/miniconda3/libs/python311.lib (ver 3.11.7)
    Limited API: NO
    numpy: D:/miniconda3/Lib/site-packages/numpy/core/include (ver 1.26.1)
    install path: D:/miniconda3/Lib/site-packages/cv2/python-3.11

  Python (for build): D:/miniconda3/python.exe

  Java:
    ant: NO
    Java: YES (ver 1.8.0.371)
    JNI: D:/Program Files/jdk-1.8/include D:/Program Files/jdk-1.8/include/win32 D:/Program Files/jdk-1.8/include
    Java wrappers: YES (JAVA)
    Java tests: NO

  Install to: D:/Data/source/collection/OpenCV/4100/build/install
-----------------------------------------------------------------

Configuring done (136.2s)

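Once configuration has finished, change the following two parameters. With current versions, CUDA_ARCH_BIN is filled in automatically once the CUDA Toolkit has been found.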
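For readers who prefer the command line to the CMake GUI, the configure step can be sketched as a scripted cmake invocation. The two parameters changed in the GUI are not recoverable from the text, so the choice of WITH_CUDA and OPENCV_EXTRA_MODULES_PATH below is an assumption based on what a CUDA build with contrib modules usually needs. The helper name build_cmake_command is hypothetical, and the value 8.6 for CUDA_ARCH_BIN is the compute capability reported for the RTX 3070 Ti later in this post.

```python
def build_cmake_command(source_dir, build_dir, contrib_modules_dir, arch_bin):
    """Assemble a configure command roughly equivalent to the GUI steps."""
    options = {
        "WITH_CUDA": "ON",                                 # enable CUDA support
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules_dir,  # opencv_contrib/modules
        "CUDA_ARCH_BIN": arch_bin,                         # GPU compute capability
        "BUILD_opencv_python3": "ON",                      # build Python bindings
    }
    command = [
        "cmake", "-S", source_dir, "-B", build_dir,
        "-G", "Visual Studio 17 2022", "-A", "x64",
    ]
    command += [f"-D{key}={value}" for key, value in options.items()]
    return command


if __name__ == "__main__":
    # Paths follow the layout visible in the configuration log of this post.
    base = "D:/Data/source/collection/OpenCV/4100"
    command = build_cmake_command(
        f"{base}/opencv-4.10.0",
        f"{base}/build",
        f"{base}/opencv_contrib-4.10.0/modules",
        "8.6",
    )
    # Print one argument per line; run it yourself once the paths look right.
    print("\n".join(command))
```

The script only prints the command so that paths and options can be reviewed before anything is executed.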

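I had also built VTK earlier, so I added the VTK path (this is the directory VTK's own CMake build generated):

Click the "Generate" button to generate the VS solution.

As the figure shows, the binding projects for both Java and Python are present.

Building in Visual Studio

Build the ALL_BUILD project to compile everything, then build INSTALL to install.

Building and installing takes a long time...

Note: to install directly into the Python environment, open VS as administrator and then build the INSTALL project.

Installation results and usage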

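#pragma comment(lib, "opencv_core4100.lib")

#include <iostream>
#include <opencv2/core/cuda.hpp>

int main()
{
    // Number of CUDA-capable devices visible to OpenCV.
    int deviceCount = cv::cuda::getCudaEnabledDeviceCount();
    std::cout << "CUDA Device Number: " << deviceCount << std::endl;

    // Only query device 0 when at least one device is present.
    if (deviceCount > 0)
        cv::cuda::printCudaDeviceInfo(0);
    return 0;
}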

Python 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.__version__)
4.10.0
>>> print(cv2.cuda.getCudaEnabledDeviceCount())
1
>>> cv2.cuda.printCudaDeviceInfo(0)
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***

Device count: 1

Device 0: "NVIDIA GeForce RTX 3070 Ti Laptop GPU"
  CUDA Driver Version / Runtime Version          12.50 / 12.50
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                 8192 MBytes (8589410304 bytes)
  GPU Clock Speed:                               1.41 GHz
  Max Texture Dimension Size (x,y,z)             1D=(131072), 2D=(131072,65536), 3D=(16384,16384,16384)
  Max Layered Texture Size (dim) x layers        1D=(32768) x 2048, 2D=(32768,32768) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and execution:                 Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Concurrent kernel execution:                   Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support enabled:                No
  Device is using TCC driver mode:               No
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 12.50, CUDA Runtime Version = 12.50, NumDevs = 1
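Beyond printing device information, a quick functional check is to round-trip a small array through GPU memory. The sketch below is an illustration rather than part of the original build steps: it uses cv2.cuda_GpuMat with upload and download, and returns None instead of failing when cv2 or NumPy is missing or when no CUDA device is visible. The function name cuda_roundtrip_ok is hypothetical.

```python
def cuda_roundtrip_ok():
    """Upload an array to the GPU and download it again.

    Returns None when cv2/numpy is missing or no CUDA device is visible,
    otherwise True/False depending on whether the data survived the trip.
    """
    try:
        import cv2
        import numpy as np
    except ImportError:
        return None

    cuda = getattr(cv2, "cuda", None)
    if cuda is None or cuda.getCudaEnabledDeviceCount() < 1:
        return None

    host = np.arange(12, dtype=np.uint8).reshape(3, 4)
    gpu = cv2.cuda_GpuMat()
    gpu.upload(host)                      # host -> device copy
    return bool(np.array_equal(gpu.download(), host))  # device -> host copy


if __name__ == "__main__":
    print("CUDA round trip:", cuda_roundtrip_ok())
```

A result of None on a machine that should have CUDA support usually means the interpreter picked up a different cv2 than the one just built.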

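Problems encountered

  1. Interference from a Python version installed by Visual Studio

When I installed VS2022 I had also installed the Python development workload (Python 3.9), and CMake locked onto that environment: even with the interpreter pointing at the Python in conda, Libraries still pointed at 3.9. Uninstalling the Python component from VS fixes this.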
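If uninstalling the VS Python component is not an option, another approach is to pass the interpreter, include directory, and library to CMake explicitly. The sketch below collects candidate values for OpenCV's PYTHON3_EXECUTABLE, PYTHON3_INCLUDE_DIR, PYTHON3_LIBRARY, and PYTHON3_NUMPY_INCLUDE_DIRS variables from the interpreter that runs it. The helper name python_cmake_hints is hypothetical, and the library path assumes the usual Windows layout of <prefix>/libs/pythonXY.lib.

```python
import sys
import sysconfig
from pathlib import Path


def python_cmake_hints():
    """Collect values for OpenCV's PYTHON3_* CMake variables."""
    version = f"{sys.version_info.major}{sys.version_info.minor}"
    hints = {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": sysconfig.get_paths()["include"],
        # Assumption: standard Windows layout, <prefix>/libs/pythonXY.lib
        "PYTHON3_LIBRARY": str(
            Path(sys.base_prefix) / "libs" / f"python{version}.lib"
        ),
    }
    try:
        import numpy
        hints["PYTHON3_NUMPY_INCLUDE_DIRS"] = numpy.get_include()
    except ImportError:
        pass  # NumPy is not installed: leave the variable out
    return hints


if __name__ == "__main__":
    # Run this with the Python that should receive the cv2 bindings.
    for key, value in python_cmake_hints().items():
        print(f"-D{key}={value}")
```

Comparing the printed values with the "Python 3" section of the CMake summary shows quickly whether CMake picked a different installation.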

  2. Warnings caused by a missing Nvidia Video Codec SDK

CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:26 (message):
cudacodec::VideoReader requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVID=OFF

CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:30 (message):
cudacodec::VideoWriter requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVENC=OFF

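Download the Nvidia Video Codec SDK and copy its lib files and header files (the Interface directory) into the CUDA Toolkit's lib/x64 and include directories respectively. The warnings then go away.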
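The copy step can be scripted so it is easy to repeat after a toolkit upgrade. This is a sketch under stated assumptions: the extracted SDK is assumed to contain an Interface folder with headers and a Lib/x64 folder with import libraries, and the helper name copy_codec_sdk is hypothetical. Writing under C:\Program Files normally requires an elevated prompt.

```python
import shutil
from pathlib import Path


def copy_codec_sdk(sdk_root, cuda_root):
    """Copy Video Codec SDK headers and import libraries into the CUDA tree."""
    sdk, cuda = Path(sdk_root), Path(cuda_root)
    # Assumed SDK layout: <sdk>/Interface/*.h and <sdk>/Lib/x64/*.lib
    jobs = [
        (sdk / "Interface", "*.h", cuda / "include"),
        (sdk / "Lib" / "x64", "*.lib", cuda / "lib" / "x64"),
    ]
    copied = []
    for source_dir, pattern, target_dir in jobs:
        target_dir.mkdir(parents=True, exist_ok=True)
        for item in sorted(source_dir.glob(pattern)):
            shutil.copy2(item, target_dir / item.name)
            copied.append(str(target_dir / item.name))
    return copied
```

Calling copy_codec_sdk with the extracted SDK folder and the CUDA Toolkit directory returns the list of files written, which is a convenient record of what to remove later.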

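  3. Error caused by the CUDA version: CMake Error at cmake/OpenCVDetectCUDAUtils.cmake:297 (list): list GET given empty list

My Visual Studio is version 17.10.4, and building against CUDA 12.2 triggers this error because, according to the official documentation, CUDA Toolkit 12.2 only supports Visual Studio up to 17.0, as shown below: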

CUDA Installation Guide Microsoft Windows (nvidia.com)

Switching to CUDA 12.5 resolves the problem:

https://docs.nvidia.com/cuda/archive/12.5.0/cuda-installation-guide-microsoft-windows/index.html

  4. Missing-DLL error when importing cv2 in Python

ImportError: DLL load failed while importing cv2: The specified module could not be found.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 181, in <module>
    bootstrap()
  File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniconda3\Lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)

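The message says a DLL is missing. Running Process Monitor with a filter on python.exe and reproducing the error tracks down the cause:

It turned out to be my self-built VTK. Having built it myself, I had to see it through, so I added its path to cv2's config.py. That produced a different error, which was my own fault: I had put the debug and release builds of VTK in the same place. Extracting only the release DLLs and then adding them to the environment variables or to cv2's config.py, or copying them directly into site-packages/cv2/python-3.11, solved the problem.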
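A further option, not used in the steps above, is to register the DLL folder from Python before importing cv2. The sketch below relies on os.add_dll_directory, which exists only on Windows with Python 3.8 or later. The helper name register_dll_dirs and any folder passed to it are hypothetical.

```python
import os


def register_dll_dirs(candidates):
    """Register existing directories for DLL lookup; return the ones used."""
    used = []
    for folder in candidates:
        if not os.path.isdir(folder):
            continue  # skip paths that do not exist on this machine
        absolute = os.path.abspath(folder)
        # os.add_dll_directory exists only on Windows (Python 3.8+).
        if hasattr(os, "add_dll_directory"):
            os.add_dll_directory(absolute)
        used.append(absolute)
    return used
```

Call register_dll_dirs with the folder that holds only the release VTK DLLs, then run import cv2 in the same process. An empty return value means none of the given folders exist.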

References

Quick and Easy OpenCV Python Installation with Cuda GPU in Under 10 Minutes (youtube.com)

GitHub - chrismeunier/OpenCV-CUDA-installation: Saving the process to install OpenCV for Python 3 with CUDA bindings

Unable to enable Cudacodec VideoReader · Issue #11220 · opencv/opencv · GitHub

OpenCV: OpenCV configuration options reference

CUDA Installation Guide for Microsoft Windows (nvidia.com)

windows10+VS2022编译安装opencv-python_重新编译opencv-python-CSDN博客
