BBS-DeePMD

View: 164|Reply: 2
Error: CUDA driver version is insufficient

Post time: 2021-03-29 11:32:06
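Some context first: we are on a supercomputer-style (HPC) cluster. I installed DeePMD-kit under my own user account and then ran into the following problem.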

After installing the GPU version of DeePMD-kit, I submitted a job to a GPU node, and it failed with this error:
tensorflow.python.framework.errors_impl.InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

I then checked with nvidia-smi. Our driver is reported as NVIDIA-SMI 450.80.02    Driver Version: 450.80.02, which should be compatible with CUDA 10.1.
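The compatibility check described above can be sketched in Python. This is only an illustration: the helper names are hypothetical, and the minimum Linux driver versions in the table are quoted from memory of NVIDIA's CUDA release notes, so they should be verified against the official table before relying on them.

```python
import re

# Minimum Linux driver version per CUDA toolkit release.
# ASSUMPTION: values recalled from NVIDIA's CUDA release notes; verify them.
MIN_LINUX_DRIVER = {
    "10.0": (410, 48),
    "10.1": (418, 39),
    "10.2": (440, 33),
}


def parse_driver_version(smi_text):
    """Extract the driver version from nvidia-smi header text as a tuple of ints."""
    match = re.search(r"Driver Version:\s*([0-9]+(?:\.[0-9]+)*)", smi_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def driver_supports(driver, cuda):
    """Return True if the driver tuple meets the minimum for the given CUDA release."""
    return driver >= MIN_LINUX_DRIVER[cuda]


# The header line quoted in this post.
header = "NVIDIA-SMI 450.80.02    Driver Version: 450.80.02"
driver = parse_driver_version(header)
print(driver, driver_supports(driver, "10.1"))
```

Note that this only checks the driver on the node where nvidia-smi was run; it says nothing about the node where the job actually executes.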

My guess is that when the job is submitted to the GPU node for training, the NVIDIA driver is not being picked up correctly.

Could some kind soul help me out? I am willing to pay.

Thanks in advance, everyone.

Post time: 2021-03-29 19:21:47
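I ran into a similar problem. After upgrading the GPU driver to match the CUDA version, everything worked normally. I installed the CUDA 10.0 build of DeePMD-kit, and I think your case is probably similar.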

Author | Post time: 2021-03-29 21:59:03 (from mobile)
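Actually, it runs fine on the head node, but as soon as it goes to a compute node it fails. I don't understand what the problem is.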
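One way to narrow down a head-node versus compute-node difference is to run the same small probe on both and compare the output. The sketch below is an assumption-laden illustration, not a DeePMD-kit tool: `probe_node` and `decode_cuda_version` are hypothetical names, and it assumes a Linux node where the driver library is named `libcuda.so.1`. It asks the driver library for the highest CUDA version it supports via the CUDA Driver API function `cuDriverGetVersion`.

```python
import ctypes
import os
import socket


def decode_cuda_version(encoded):
    """Decode CUDA's integer version (1000 * major + 10 * minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def probe_node(libname="libcuda.so.1"):
    """Collect the hostname, relevant environment, and the driver's supported CUDA version."""
    info = {
        "hostname": socket.gethostname(),
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        "driver_cuda_version": None,  # stays None if the driver library is unusable
    }
    try:
        lib = ctypes.CDLL(libname)
        value = ctypes.c_int(0)
        # cuDriverGetVersion returns 0 (CUDA_SUCCESS) on success.
        if lib.cuDriverGetVersion(ctypes.byref(value)) == 0:
            info["driver_cuda_version"] = decode_cuda_version(value.value)
    except (OSError, AttributeError):
        # Library not found on this node, or the symbol is missing.
        pass
    return info


print(probe_node())
```

If the compute node reports `None` or a lower version than the head node, the job is probably seeing a different (or no) driver library there, for example because of a different `LD_LIBRARY_PATH` inside the batch job.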