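I spent the weekend at home tinkering with this build, so here is a record of the process.
Prerequisites
- Visual Studio C++ build tools
- A CUDA version compatible with the local GPU
- A cuDNN version that matches CUDA
- Python 3
- NumPy
- OpenCV source code and the source of the opencv-contrib modules for the same version
- CMake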
Visual Studio
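Download Visual Studio (I use VS2022) and use the Visual Studio Installer to install the C++ toolset (or the C++ workload). For the detailed installation steps, see here.
CUDA and cuDNN
Download and install the latest CUDA Toolkit, making sure it is compatible with the local GPU, or check the local path to see whether the toolkit is already installed. On my machine, CUDA 12.5 is installed in C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5. Likewise, sign in with an NVIDIA account to download cuDNN, then copy the contents of the cuDNN package (for example C:\Program Files\NVIDIA\CUDNN\vX.X) into the bin, include and lib/x64 folders of the CUDA Toolkit directory. My cuDNN version is 9.2.1.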
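To confirm which cuDNN headers actually ended up in the CUDA Toolkit folder, one option is to read the version macros from cudnn_version.h. This is a minimal sketch under two assumptions: the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL (as recent cuDNN releases do), and the CUDA_PATH environment variable points at the toolkit. The function name parse_cudnn_version is illustrative only.

```python
import os
import re
from pathlib import Path


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn_version.h text, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None  # macro missing: not a header we understand
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    # Assumption: the CUDA installer has set CUDA_PATH on this machine.
    cuda_root = os.environ.get("CUDA_PATH", "")
    header = Path(cuda_root) / "include" / "cudnn_version.h"
    if cuda_root and header.is_file():
        text = header.read_text(errors="ignore")
        print("cuDNN version:", parse_cudnn_version(text))
    else:
        print("cudnn_version.h not found under",
              cuda_root or "<CUDA_PATH not set>")
```

If the script reports that the header is missing, the cuDNN files were probably copied somewhere CMake will not look.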
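Python, NumPy and pip
Install a Python 3.x release. Because the Python bindings use NumPy arrays in place of cv::Mat, NumPy must be installed as well (pip install numpy). Also make sure that any existing OpenCV packages, including opencv-python and opencv-contrib-python, are completely uninstalled: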
pip uninstall opencv-python
pip uninstall opencv-contrib-python
Then delete the cv2 directory: YOUR_PYTHON_PATH/Lib/site-packages/cv2
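A quick way to verify that the environment really is clean before building is sketched below. It only uses the standard library; the list of package names is an assumption covering the common PyPI wheels, and the helper names are illustrative.

```python
import importlib.util
from importlib import metadata

# Assumption: these are the pip packages that could shadow a custom build.
OPENCV_WHEELS = (
    "opencv-python",
    "opencv-contrib-python",
    "opencv-python-headless",
    "opencv-contrib-python-headless",
)


def installed_opencv_wheels(names=OPENCV_WHEELS):
    """Return {name: version} for each listed pip package still installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed, which is what we want
    return found


def cv2_location():
    """Return where `import cv2` would load from, or None if not importable."""
    spec = importlib.util.find_spec("cv2")
    if spec is None:
        return None
    return spec.origin or str(list(spec.submodule_search_locations or []))


if __name__ == "__main__":
    print("pip packages still installed:", installed_opencv_wheels() or "none")
    print("cv2 importable from:", cv2_location() or "nowhere (clean)")
```

Both lines should report nothing left over before the INSTALL step is run.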
OpenCV
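Download the sources from the GitHub repositories, or clone them locally. You need both OpenCV and the opencv-contrib release of the same version.
CMake configuration
Create a build directory for opencv and opencv-contrib, then configure with CMake. The available options are described on the official site: OpenCV: OpenCV configuration options reference.
This takes a long time. Along the way CMake downloads the third-party content referenced in the 3rdparty folder, and some libraries may fail to download and have to be fetched manually.
In this example we also enable Python: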
General configuration for OpenCV 4.10.0 =====================================
  Version control:               unknown
  Platform:
    Timestamp:                   2024-07-20T06:31:04Z
    Host:                        Windows 10.0.22631 AMD64
    CMake:                       3.29.0
    CMake generator:             Visual Studio 17 2022
    CMake build tool:            C:/Program Files/Microsoft Visual Studio/2022/Enterprise/MSBuild/Current/Bin/amd64/MSBuild.exe
    MSVC:                        1940
    Configuration:               Debug Release
  CPU/HW features:
    Baseline:                    SSE SSE2 SSE3
      requested:                 SSE3
    Dispatched code generation:  SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX
      requested:                 SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX
      SSE4_1 (18 files):         + SSSE3 SSE4_1
      SSE4_2 (2 files):          + SSSE3 SSE4_1 POPCNT SSE4_2
      FP16 (1 files):            + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX
      AVX (9 files):             + SSSE3 SSE4_1 POPCNT SSE4_2 AVX
      AVX2 (38 files):           + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2
      AVX512_SKX (8 files):      + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX
  C/C++:
    Built as dynamic libs?:      YES
    C++ standard:                11
    C++ Compiler:                C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe (ver 19.40.33812.0)
    C++ flags (Release):         /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG
    C++ flags (Debug):           /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1
    C Compiler:                  C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe
    C flags (Release):           /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /O2 /Ob2 /DNDEBUG
    C flags (Debug):             /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /Zi /Ob0 /Od /RTC1
    Linker flags (Release):      /machine:x64 /INCREMENTAL:NO
    Linker flags (Debug):        /machine:x64 /debug /INCREMENTAL
    ccache:                      NO
    Precompiled headers:         YES
    Extra dependencies:
    3rdparty dependencies:
  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann gapi highgui imgcodecs imgproc java ml objdetect photo python3 stitching ts video videoio
    Disabled:                    world
    Disabled by dependency:      -
    Unavailable:                 python2
    Applications:                tests perf_tests apps
    Documentation:               NO
    Non-free algorithms:         NO
  Windows RT support:            NO
  GUI:                           WIN32UI
    Win32 UI:                    YES
    VTK support:                 NO
  Media I/O:
    ZLib:                        build (ver 1.3.1)
    JPEG:                        build-libjpeg-turbo (ver 3.0.3-70)
      SIMD Support Request:      YES
      SIMD Support:              NO
    WEBP:                        build (ver encoder: 0x020f)
    PNG:                         build (ver 1.6.43)
      SIMD Support Request:      YES
      SIMD Support:              YES (Intel SSE)
    TIFF:                        build (ver 42 - 4.6.0)
    JPEG 2000:                   build (ver 2.5.0)
    OpenEXR:                     build (ver 2.3.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES
  Video I/O:
    DC1394:                      NO
    FFMPEG:                      YES (prebuilt binaries)
      avcodec:                   YES (58.134.100)
      avformat:                  YES (58.76.100)
      avutil:                    YES (56.70.100)
      swscale:                   YES (5.9.100)
      avresample:                YES (4.0.0)
    GStreamer:                   NO
    DirectShow:                  YES
    Media Foundation:            YES
      DXVA:                      YES
  Parallel framework:            Concurrency
  Trace:                         YES (with Intel ITT)
  Other third-party libraries:
    Intel IPP:                   2021.11.0 [2021.11.0]
           at:                   D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/icv
    Intel IPP IW:                sources (2021.11.0)
              at:                D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/iw
    Lapack:                      NO
    Eigen:                       NO
    Custom HAL:                  NO
    Protobuf:                    build (3.19.1)
    Flatbuffers:                 builtin/3rdparty (23.5.9)
  OpenCL:                        YES (NVD3D11)
    Include path:                D:/Data/source/collection/OpenCV/4100/opencv-4.10.0/3rdparty/include/opencl/1.2
    Link libraries:              Dynamic load
  Python 3:
    Interpreter:                 D:/miniconda3/python.exe (ver 3.11.7)
    Libraries:                   D:/miniconda3/libs/python311.lib (ver 3.11.7)
    Limited API:                 NO
    numpy:                       D:/miniconda3/Lib/site-packages/numpy/core/include (ver 1.26.1)
    install path:                D:/miniconda3/Lib/site-packages/cv2/python-3.11
  Python (for build):            D:/miniconda3/python.exe
  Java:
    ant:                         NO
    Java:                        YES (ver 1.8.0.371)
    JNI:                         D:/Program Files/jdk-1.8/include D:/Program Files/jdk-1.8/include/win32 D:/Program Files/jdk-1.8/include
    Java wrappers:               YES (JAVA)
    Java tests:                  NO
  Install to:                    D:/Data/source/collection/OpenCV/4100/build/install
-----------------------------------------------------------------
Configuring done (136.2s)
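The summary above is long, and after installation the same text is available from cv2.getBuildInformation(). The sketch below pulls single values out of such a summary. The helper summary_value is hypothetical, and the "NVIDIA CUDA" key is an assumption about what a CUDA-enabled summary contains, since the log above was captured before CUDA was switched on.

```python
import re


def summary_value(summary, key):
    """Return the value of the first 'key: value' line in a summary, or None."""
    pattern = re.compile(r"^\s*" + re.escape(key) + r":[ \t]*(.*)$",
                         re.MULTILINE)
    match = pattern.search(summary)
    return match.group(1).strip() if match else None


if __name__ == "__main__":
    try:
        import cv2  # only importable once the custom build is installed
    except ImportError:
        print("cv2 is not importable yet")
    else:
        info = cv2.getBuildInformation()
        # "NVIDIA CUDA" is an assumed key; None means the line was not found.
        for key in ("CMake generator", "Interpreter", "numpy", "NVIDIA CUDA"):
            print(f"{key}: {summary_value(info, key)}")
```

Checking the Interpreter and numpy entries this way makes it easy to spot a build that was configured against the wrong Python.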
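Once configuration has finished, modify the following two parameters. CUDA_ARCH_BIN is filled in automatically by current versions as soon as the CUDA Toolkit is found.
I had built VTK earlier, so I also added the VTK path (this is VTK's CMake build directory):
Click the "Generate" button to generate the Visual Studio solution.
As the screenshot shows, the Java and Python binding projects are both present.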
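The article uses the CMake GUI, but the same configuration can be expressed on the command line. The sketch below only assembles the argument list. The option set (WITH_CUDA, OPENCV_DNN_CUDA, BUILD_opencv_python3, OPENCV_EXTRA_MODULES_PATH, CUDA_ARCH_BIN) is my assumption about a typical CUDA build, not necessarily the exact switches changed in the GUI; the paths follow the directory layout visible in the configuration log, and 8.6 is the compute capability reported for the GPU later in this post.

```python
def cmake_configure_command(source_dir, build_dir, contrib_modules, arch_bin):
    """Build the argument list for a CUDA-enabled OpenCV configure step."""
    options = {
        "WITH_CUDA": "ON",
        "OPENCV_DNN_CUDA": "ON",
        "BUILD_opencv_python3": "ON",
        "OPENCV_EXTRA_MODULES_PATH": contrib_modules,
        "CUDA_ARCH_BIN": arch_bin,
    }
    command = ["cmake", "-S", source_dir, "-B", build_dir,
               "-G", "Visual Studio 17 2022", "-A", "x64"]
    command += [f"-D{name}={value}" for name, value in options.items()]
    return command


if __name__ == "__main__":
    root = "D:/Data/source/collection/OpenCV/4100"  # layout from the log
    command = cmake_configure_command(
        source_dir=f"{root}/opencv-4.10.0",
        build_dir=f"{root}/build",
        contrib_modules=f"{root}/opencv_contrib-4.10.0/modules",
        arch_bin="8.6",
    )
    # Printed only; pass the list to subprocess.run to actually configure.
    print(command)
```

Keeping the options in one dictionary makes it easy to rerun exactly the same configuration after deleting the build directory.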
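Building with Visual Studio
Build the ALL_BUILD project to compile everything, then build the INSTALL project to install.
Building and installing is another long wait...
Note: to install directly into the Python environment, open Visual Studio as administrator and then build the INSTALL project.
Results and usage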
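The two Visual Studio steps can also be driven from a terminal through CMake's build-tool mode. This is a sketch rather than the procedure used in this post: the build directory is taken from the configuration log, Release is an assumed configuration, and the terminal would need to be elevated for INSTALL to write into the Python environment.

```python
def cmake_build_command(build_dir, target, config="Release"):
    """Return the `cmake --build` argument list for one target."""
    return ["cmake", "--build", build_dir,
            "--target", target, "--config", config]


if __name__ == "__main__":
    build_dir = "D:/Data/source/collection/OpenCV/4100/build"
    # ALL_BUILD compiles everything; INSTALL copies the results into place.
    for target in ("ALL_BUILD", "INSTALL"):
        print(cmake_build_command(build_dir, target))
```

Each printed list can be handed to subprocess.run(..., check=True) to run the step and stop on the first failure.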
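// Link against the core library of the custom build.
#pragma comment(lib, "opencv_core4100.lib")
#include <iostream>
#include <opencv2/core/cuda.hpp>

int main()
{
    // Number of CUDA-capable devices that OpenCV can see.
    int deviceCount = cv::cuda::getCudaEnabledDeviceCount();
    std::cout << "CUDA Device Number: " << deviceCount << std::endl;
    // Print the properties of the first device, if there is one.
    if (deviceCount > 0)
        cv::cuda::printCudaDeviceInfo(0);
    return 0;
}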
Python 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> print(cv2.__version__)
4.10.0
>>> print(cv2.cuda.getCudaEnabledDeviceCount())
1
>>> cv2.cuda.printCudaDeviceInfo(0)
*** CUDA Device Query (Runtime API) version (CUDART static linking) ***
Device count: 1
Device 0: "NVIDIA GeForce RTX 3070 Ti Laptop GPU"
  CUDA Driver Version / Runtime Version 12.50 / 12.50
  CUDA Capability Major/Minor version number: 8.6
  Total amount of global memory: 8192 MBytes (8589410304 bytes)
  GPU Clock Speed: 1.41 GHz
  Max Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072,65536), 3D=(16384,16384,16384)
  Max Layered Texture Size (dim) x layers 1D=(32768) x 2048, 2D=(32768,32768) x 2048
  Total amount of constant memory: 65536 bytes
  Total amount of shared memory per block: 49152 bytes
  Total number of registers available per block: 65536
  Warp size: 32
  Maximum number of threads per block: 1024
  Maximum sizes of each dimension of a block: 1024 x 1024 x 64
  Maximum sizes of each dimension of a grid: 2147483647 x 65535 x 65535
  Maximum memory pitch: 2147483647 bytes
  Texture alignment: 512 bytes
  Concurrent copy and execution: Yes with 1 copy engine(s)
  Run time limit on kernels: Yes
  Integrated GPU sharing Host Memory: No
  Support host page-locked memory mapping: Yes
  Concurrent kernel execution: Yes
  Alignment requirement for Surfaces: Yes
  Device has ECC support enabled: No
  Device is using TCC driver mode: No
  Device supports Unified Addressing (UVA): Yes
  Device PCI Bus ID / PCI location ID: 1 / 0
  Compute Mode:
      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.50, CUDA Runtime Version = 12.50, NumDevs = 1
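Problems encountered
1. Interference from the Python version installed with Visual Studio
When I installed VS2022 earlier I had included the Python development workload (Python 3.9). As a result CMake locked onto that environment: even when the interpreter was pointed at the conda Python environment, Libraries still pointed to 3.9. Uninstalling Python from Visual Studio fixes this.
2. Warnings caused by the missing Nvidia Video Codec SDK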
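If uninstalling is not an option, another approach is to hand CMake the interpreter paths explicitly. The sketch below prints values for OpenCV's PYTHON3_* CMake variables from whichever interpreter runs it. The location of the import library (<prefix>/libs/pythonXY.lib) is an assumption based on the usual Windows layout, which matches the Libraries entry in the log above; I have not verified that this alone overrides the Visual Studio Python.

```python
import sys
import sysconfig
from pathlib import Path


def python_cmake_hints():
    """Collect interpreter details for OpenCV's PYTHON3_* CMake variables."""
    version = f"{sys.version_info.major}{sys.version_info.minor}"
    paths = sysconfig.get_paths()
    return {
        "PYTHON3_EXECUTABLE": sys.executable,
        "PYTHON3_INCLUDE_DIR": paths["include"],
        # Assumption: the import library lives in <prefix>/libs on Windows.
        "PYTHON3_LIBRARY": str(
            Path(sys.base_prefix) / "libs" / f"python{version}.lib"),
        "PYTHON3_PACKAGES_PATH": paths["purelib"],
    }


if __name__ == "__main__":
    # Run this with the conda interpreter that the build should target.
    for name, value in python_cmake_hints().items():
        print(f"-D{name}={value}")
```

Comparing the printed library path with the Libraries line of the CMake summary shows immediately whether the two point at the same Python version.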
CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:26 (message):
cudacodec::VideoReader requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVID=OFF
CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:30 (message):
cudacodec::VideoWriter requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVENC=OFF
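Download the Nvidia Video Codec SDK and copy its lib files and header files (the Interface directory) into the CUDA Toolkit's lib/x64 and include directories respectively. That resolves the warnings.
3. Error caused by the CUDA version: CMake Error at cmake/OpenCVDetectCUDAUtils.cmake:297 (list): list GET given empty list
My Visual Studio is version 17.10.4. Building against CUDA 12.2 triggers this error because, according to the official documentation, CUDA Toolkit 12.2 only supports Visual Studio up to 17.0, as shown in the figure below: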
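The copy step can be scripted so that it is repeatable after a CUDA reinstall. This is a sketch under an assumed SDK layout, with headers in Interface and import libraries in Lib/x64 below the SDK root. The function names are hypothetical, and dry_run defaults to True so nothing is written until the plan looks right.

```python
import shutil
from pathlib import Path


def plan_sdk_copies(sdk_root, cuda_root):
    """Return (source, destination) pairs for SDK headers and x64 libraries."""
    sdk_root, cuda_root = Path(sdk_root), Path(cuda_root)
    pairs = []
    for header in sorted((sdk_root / "Interface").glob("*.h")):
        pairs.append((header, cuda_root / "include" / header.name))
    for library in sorted((sdk_root / "Lib" / "x64").glob("*.lib")):
        pairs.append((library, cuda_root / "lib" / "x64" / library.name))
    return pairs


def copy_sdk_files(sdk_root, cuda_root, dry_run=True):
    """Copy the planned files; with dry_run=True only report the plan."""
    pairs = plan_sdk_copies(sdk_root, cuda_root)
    for source, destination in pairs:
        print("would copy" if dry_run else "copying", source, "->", destination)
        if not dry_run:
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, destination)
    return pairs
```

Writing into the CUDA Toolkit folder under Program Files normally requires an elevated prompt, so run the script as administrator when dry_run is turned off.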
CUDA Installation Guide Microsoft Windows (nvidia.com)
Switching to CUDA 12.5 resolves the problem:
https://docs.nvidia.com/cuda/archive/12.5.0/cuda-installation-guide-microsoft-windows/index.html
4. Missing DLL error when using cv2 in Python
ImportError: DLL load failed while importing cv2: The specified module could not be found.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 181, in <module>
bootstrap()
File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 153, in bootstrap
native_module = importlib.import_module("cv2")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\miniconda3\Lib\importlib\__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
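The message says a DLL is missing. Using Process Monitor with a filter on python.exe, I reproduced the error and traced its cause:
It turned out that my own VTK build was to blame. Having built VTK myself, I had to see it through, so I added its path to cv2's config.py. That caused other errors, which was my own fault for keeping the debug and release builds of VTK in the same place. Extracting only the release binaries and then either adding them to the environment variables or to cv2's config.py, or copying them directly into the site-packages/cv2/python-3.11 directory, solved the problem.
References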
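Another way to make the dependent DLLs visible, without editing config.py or copying files, is to register their folders from Python before importing cv2. The sketch below relies on os.add_dll_directory, which exists only on Windows with Python 3.8 or later. The VTK path is a placeholder, and the helper name is hypothetical.

```python
import os
from pathlib import Path


def register_dll_dirs(candidates):
    """Add each existing directory to the DLL search path; return them."""
    existing = [str(Path(d).resolve()) for d in candidates if Path(d).is_dir()]
    # os.add_dll_directory is only available on Windows (Python 3.8+).
    if hasattr(os, "add_dll_directory"):
        for directory in existing:
            os.add_dll_directory(directory)
    return existing


if __name__ == "__main__":
    dirs = []
    if os.environ.get("CUDA_PATH"):
        dirs.append(os.path.join(os.environ["CUDA_PATH"], "bin"))
    # Placeholder: replace with the folder holding the release VTK DLLs.
    dirs.append(r"D:\path\to\vtk\release\bin")
    print("registered:", register_dll_dirs(dirs))
    try:
        import cv2
        print("cv2", cv2.__version__)
    except ImportError as error:
        print("import still fails:", error)
```

If the import still fails after registering the folders, Process Monitor remains the most direct way to see which DLL is being searched for.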
Quick and Easy OpenCV Python Installation with Cuda GPU in Under 10 Minutes (youtube.com)
Unable to enable Cudacodec VideoReader · Issue #11220 · opencv/opencv · GitHub
OpenCV: OpenCV configuration options reference