Could not load dynamic library 'libcudart.so.10.0'
2023-6-13


2022-04-29 03:35:16.853021: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2022-04-29 03:35:16.853249: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2022-04-29 03:35:16.853461: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2022-04-29 03:35:16.853664: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2022-04-29 03:35:16.853869: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
2022-04-29 03:35:16.854067: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.1/lib64
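
The warnings above all follow the same pattern, so a few lines of Python can summarize which libraries are missing and where the loader looked. This is only a rough sketch: the function names `missing_libraries` and `cuda_versions_wanted` are made up for illustration, and the regular expression assumes the exact warning wording shown in this log.

```python
import re

# Matches the dso_loader warnings shown above: the quoted library name, then
# the LD_LIBRARY_PATH value (everything up to the next whitespace).
WARNING_RE = re.compile(
    r"Could not load dynamic library '(?P<lib>[^']+)'.*?LD_LIBRARY_PATH: (?P<path>\S+)"
)


def missing_libraries(log_text):
    """Return (library, searched_path) pairs found in TensorFlow warning text."""
    return [(m.group("lib"), m.group("path")) for m in WARNING_RE.finditer(log_text)]


def cuda_versions_wanted(log_text):
    """Collect the version suffixes (e.g. '10.0') of the libraries that failed to load."""
    versions = set()
    for lib, _ in missing_libraries(log_text):
        match = re.search(r"\.so\.(\d+(?:\.\d+)*)$", lib)
        if match:
            versions.add(match.group(1))
    return versions
```

Fed the log above, this makes the mismatch easy to see: every missing library ends in 10.0, while the only directory searched belongs to cuda-10.1.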

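Cause of the problem

The root cause is a mismatch between the CUDA version (10.1) and the TensorFlow version (1.14.0). CUDA had already moved on to 10.1 and later, but TensorFlow 1.14.0 only supports CUDA 10.0, which is why none of the 10.0 libraries can be found. Why did this take so long to track down? Because the CUDA version is deceptive: nvcc -V reported 10.0, while watch -n 1 nvidia-smi showed 10.1 in the upper right corner. (nvidia-smi reports the highest CUDA version the driver supports, not the toolkit that is actually installed.) I recommend checking with this command instead: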
(tianguoguo) usr@ubuntu16:~$ conda list
Find the cudatoolkit entry in the output; that version number is the more reliable one.
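
If you would rather not scan the list by eye, the lookup can be sketched in Python. This assumes the usual `conda list` layout of whitespace-separated name, version, and build columns; the helper names `cudatoolkit_version` and `matches_required` are hypothetical, not part of conda.

```python
def cudatoolkit_version(conda_list_output):
    """Return the cudatoolkit version from `conda list` text, or None if absent."""
    for line in conda_list_output.splitlines():
        fields = line.split()
        # Package rows are: name, version, build, [channel]. Header rows start
        # with '#', so their first field never equals the package name.
        if len(fields) >= 2 and fields[0] == "cudatoolkit":
            return fields[1]
    return None


def matches_required(version, required="10.0"):
    """True if a version such as '10.0.130' belongs to the required major.minor."""
    return version == required or version.startswith(required + ".")
```

To run it against a real environment, the text could come from `subprocess.run(["conda", "list"], capture_output=True, text=True).stdout`, assuming `conda` is on the PATH.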

Solution

My cudatoolkit version showed 10.1. Since 10.0 is required, it can be installed directly over the existing version with the following command:
(tianguoguo) usr@ubuntu16:~$ conda install cudatoolkit=10.0
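Finally, run conda list again to confirm that the replacement succeeded; cudatoolkit should now show 10.0.
Once the version has been changed to 10.0, the code should no longer hit this problem. Original link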
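
As an extra sanity check, you can ask the dynamic loader directly whether the six libraries from the warnings can be opened. The sketch below uses `ctypes.CDLL` for that; `can_load` and `unloadable` are illustrative names. Note the assumption: this uses the search path of the current Python process, which may differ from what TensorFlow sees, so treat a failure here as a hint and not as proof.

```python
import ctypes

# The six libraries named in the TensorFlow warnings.
REQUIRED_LIBS = [
    "libcudart.so.10.0",
    "libcublas.so.10.0",
    "libcufft.so.10.0",
    "libcurand.so.10.0",
    "libcusolver.so.10.0",
    "libcusparse.so.10.0",
]


def can_load(soname):
    """Return True if the dynamic loader can open the library, else False."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


def unloadable(sonames=REQUIRED_LIBS):
    """Return the subset of sonames that still cannot be loaded."""
    return [name for name in sonames if not can_load(name)]
```

Calling `unloadable()` inside the conda environment lists whatever still cannot be opened; an empty list suggests the 10.0 libraries are now visible to the loader.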