NVIDIA GPU Virtualization (vGPU): A Practical Guide

2020-12-16 10:04:44

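How to choose a GPU

NVIDIA's virtual GPU software products include GRID Virtual PC (GRID vPC), GRID Virtual Applications (GRID vApps), and Quadro Virtual Data Center Workstation (Quadro vDWS).

For a comparison of the recommended GPUs, see: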

https://www.nvidia.cn/data-center/graphics-cards-for-virtualization/

For the full list of supported GPUs, see:

https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html


Installing the driver on the physical host:

Just run the .run installer file: NVIDIA-Linux-x86_64-430.46-vgpu-kvm.run

Notes on the vGPU driver:

A physical GPU that is passed through to a VM is bound to the vfio-pci kernel module. A physical GPU that is bound to the vfio-pci kernel module can be used only for pass-through. To enable the GPU to be used for vGPU, the GPU must be unbound from vfio-pci kernel module and bound to the nvidia kernel module.

#  lspci -d 10de: -k

b1:00.0 3D controller: NVIDIA Corporation Device 1db4 (rev a1)

        Subsystem: NVIDIA Corporation Device 1306

        Kernel driver in use: nvidia

        Kernel modules: nvidiafb, nouveau, nvidia_vgpu_vfio, nvidia
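
The driver binding described above can also be read straight from sysfs, which is handy in scripts. The following is a minimal sketch, not an official NVIDIA tool: the function names and the SYSFS_ROOT override are hypothetical, introduced here only so the logic can be tried against a fake directory tree.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Print the kernel driver a PCI device is bound to, or "none" if unbound.
bound_driver() {
    local bdf="$1"
    local link="${SYSFS_ROOT:-/sys}/bus/pci/devices/${bdf}/driver"
    if [ -L "$link" ]; then
        # The "driver" entry is a symlink whose last component is the driver name.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Succeed only when the GPU is bound to the nvidia module, as vGPU requires.
ready_for_vgpu() {
    [ "$(bound_driver "$1")" = "nvidia" ]
}
```

For the GPU in this article, `bound_driver 0000:b1:00.0` should report the same driver as the "Kernel driver in use" line of lspci.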

Find the GPU's BDF (PCI bus:device.function address)

root@example:~# lspci | grep NVID

b1:00.0 3D controller: NVIDIA Corporation Device 1db4 (rev a1)

Find the vGPU type

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# grep -l "V100-1Q" nvidia-*/name

nvidia-105/name
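
When the exact type name is not known in advance, it can help to print every type the GPU offers next to its name. A minimal sketch, assuming the same sysfs layout as the grep above; the function names and the SYSFS_ROOT override are hypothetical.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Print "<type-id><TAB><name>" for every vGPU type the GPU supports.
list_vgpu_types() {
    local bdf="$1" t
    local dir="${SYSFS_ROOT:-/sys}/class/mdev_bus/${bdf}/mdev_supported_types"
    for t in "$dir"/*; do
        [ -r "$t/name" ] || continue
        printf '%s\t%s\n' "$(basename "$t")" "$(cat "$t/name")"
    done
}

# Print the id of the first type whose name contains the given text.
find_vgpu_type() {
    list_vgpu_types "$1" | awk -F'\t' -v n="$2" 'index($2, n) { print $1; exit }'
}
```

With the GPU from this article, `find_vgpu_type 0000:b1:00.0 V100-1Q` should give the same type id as the grep command.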

Note: which vGPU type to create depends on the physical GPU model and on your vGPU requirements. The xxxx-grid-vgpu-user-guide.pdf document lists the detailed configurations for each physical GPU; pick the one that fits your situation.

Check how many vGPU instances of this type can be created

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# cat nvidia-105/available_instances

16
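
A script that creates several vGPUs may want to check this count first, because available_instances reports how many instances can still be created and goes down as vGPUs are added. A minimal sketch; the function names and the SYSFS_ROOT override are hypothetical.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Print how many more instances of the given vGPU type can be created.
vgpu_available() {
    local bdf="$1" type="$2"
    cat "${SYSFS_ROOT:-/sys}/class/mdev_bus/${bdf}/mdev_supported_types/${type}/available_instances"
}

# Succeed if at least <want> instances are left; return 2 if the type is unknown.
require_capacity() {
    local bdf="$1" type="$2" want="$3" have
    have="$(vgpu_available "$bdf" "$type" 2>/dev/null)" || return 2
    [ "$have" -ge "$want" ]
}
```

For example, `require_capacity 0000:b1:00.0 nvidia-105 2 || echo "not enough capacity"` guards the two create steps that follow.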

Create the vGPUs

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# uuidgen

b0ff7f66-c989-4841-ba57-6d5adcd55a2d

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# echo "b0ff7f66-c989-4841-ba57-6d5adcd55a2d" > nvidia-105/create

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# uuidgen

b94a0c97-946d-4e57-b317-8bdaa38e455a

root@example:/sys/class/mdev_bus/0000:b1:00.0/mdev_supported_types# echo "b94a0c97-946d-4e57-b317-8bdaa38e455a" > nvidia-105/create
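
The two manual steps above (generate a UUID, write it to the type's create node) can be wrapped in one function. A minimal sketch, to be run as root on the host; the function name and the SYSFS_ROOT override are hypothetical.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Create one vGPU of the given type and print its UUID.
# Usage: create_vgpu <bdf> <type-id> [uuid]
create_vgpu() {
    local bdf="$1" type="$2" uuid
    local tdir="${SYSFS_ROOT:-/sys}/class/mdev_bus/${bdf}/mdev_supported_types/${type}"
    [ -e "$tdir/create" ] || { echo "no such vGPU type: $type" >&2; return 1; }
    # Use the caller's UUID if one was given, otherwise generate a fresh one.
    uuid="${3:-$(uuidgen)}" || return 1
    echo "$uuid" > "$tdir/create" || return 1
    echo "$uuid"
}
```

Calling `create_vgpu 0000:b1:00.0 nvidia-105` twice would reproduce the two vGPUs created above, each with a freshly generated UUID.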

Verify that creation succeeded

root@example:~# ls -l /sys/bus/mdev/devices/

total 0

lrwxrwxrwx 1 root root 0 Aug 21 12:41 b0ff7f66-c989-4841-ba57-6d5adcd55a2d -> ../../../devices/pci0000:ae/0000:ae:02.0/0000:b1:00.0/b0ff7f66-c989-4841-ba57-6d5adcd55a2d

lrwxrwxrwx 1 root root 0 Aug 21 13:44 b94a0c97-946d-4e57-b317-8bdaa38e455a -> ../../../devices/pci0000:ae/0000:ae:02.0/0000:b1:00.0/b94a0c97-946d-4e57-b317-8bdaa38e455a
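
The listing shows which UUIDs exist but not their vGPU type. Each mediated device also carries an mdev_type symlink pointing at its type directory, so both can be printed together. A minimal sketch; the function name and the SYSFS_ROOT override are hypothetical.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Print "<uuid><TAB><type-id>" for every mediated device on the host.
list_vgpus() {
    local d
    for d in "${SYSFS_ROOT:-/sys}"/bus/mdev/devices/*; do
        [ -e "$d" ] || continue   # the glob matched nothing
        printf '%s\t%s\n' "$(basename "$d")" "$(basename "$(readlink "$d/mdev_type")")"
    done
}
```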

Using the vGPU in a QEMU virtual machine

-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/b94a0c97-946d-4e57-b317-8bdaa38e455a -uuid xxxxxxxxxxxxxxxxxxxxx
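
When VMs are launched from a script, the fragment above can be generated from the vGPU's UUID, with a check that the device really exists. A minimal sketch; the function name and the SYSFS_ROOT override are hypothetical, and the VM UUID passed to -uuid is whatever value you already use for that VM.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Print the QEMU arguments that attach an existing vGPU to a VM.
# Usage: vgpu_qemu_args <mdev-uuid> <vm-uuid>
vgpu_qemu_args() {
    local mdev_uuid="$1" vm_uuid="$2"
    local dev="${SYSFS_ROOT:-/sys}/bus/mdev/devices/${mdev_uuid}"
    [ -e "$dev" ] || { echo "mdev device not found: $mdev_uuid" >&2; return 1; }
    printf -- '-device vfio-pci,sysfsdev=%s -uuid %s\n' "$dev" "$vm_uuid"
}
```

The output is meant to be appended to the rest of an existing qemu command line; it is not a complete invocation.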

After that, install the matching driver inside the virtual machine. The installer is typically named: xxxx_grid_win10_server2016_server2019_64bit_international.exe

Removing a vGPU:

root@example:/sys/devices/pci0000:ae/0000:ae:02.0/0000:b1:00.0/mdev_supported_types/nvidia-105/devices# echo 1 > b0ff7f66-c989-4841-ba57-6d5adcd55a2d/remove
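
The same remove node is also reachable through /sys/bus/mdev/devices/, which avoids spelling out the full PCI path. A minimal sketch, to be run as root with the VM using the vGPU already shut down; the function name and the SYSFS_ROOT override are hypothetical.

```shell
#!/usr/bin/env bash
# SYSFS_ROOT is a hypothetical override for illustration; it defaults to /sys.

# Remove the vGPU with the given UUID.
remove_vgpu() {
    local uuid="$1"
    local node="${SYSFS_ROOT:-/sys}/bus/mdev/devices/${uuid}/remove"
    [ -e "$node" ] || { echo "no such vGPU: $uuid" >&2; return 1; }
    echo 1 > "$node"
}
```

For example, `remove_vgpu b0ff7f66-c989-4841-ba57-6d5adcd55a2d` has the same effect as the command above.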

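About licensing:

Using an NVIDIA vGPU inside a virtual machine requires a purchased license. Deployment involves setting up a license server; after the graphics driver is installed in the VM, the license server's address and port must be configured there. The VM and the license server must be able to reach each other over the network, because the VM contacts the license server to obtain its license every time it boots.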
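
Since licensing fails when the VM cannot reach the license server, a quick TCP check from a Linux machine on the VM's network can rule out connectivity problems. A minimal sketch using bash's /dev/tcp redirection and the coreutils timeout command; the function name is hypothetical, and the host and port are whatever your license server actually uses.

```shell
#!/usr/bin/env bash

# Succeed if a TCP connection to <host>:<port> opens within 3 seconds.
license_server_reachable() {
    local host="$1" port="$2"
    # Inside the child bash, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

This only shows that the port accepts connections; it does not confirm that a license can actually be checked out.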

