NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running

I just installed CUDA on a laptop like this:

sudo apt-get install cuda

As described here.

The compilation works fine, but when I try to run the program I get the following error: CUDA error at file.cu:128 code=35(cudaErrorInsufficientDriver) "cudaStreamCreate(&(stream[i]))"

My nvcc version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61

Graphics card info:

lspci | egrep 'VGA|3D'
00:02.0 VGA compatible controller: Intel Corporation Skylake Integrated Graphics (rev 06)
02:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 960M] (rev a2)

I also installed VirtualGL, bumblebee-nvidia, primus, and freeglut3-dev, following this.

When I try to run something through Bumblebee, I get this:

optirun glxspheres64
[ 41.413478] [ERROR]Cannot access secondary GPU - error: Could not load GPU driver
[ 41.413520] [ERROR]Aborting because fallback start is disabled.

The NVIDIA driver is not working:

nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

It looks like version 375 of the NVIDIA driver is installed, but I can't get it to work.

whereis nvidia
nvidia: /usr/lib/nvidia /usr/share/nvidia /usr/src/nvidia-375-375.66/nvidia

Here is some driver info:

modinfo nvidia_375
filename: /lib/modules/4.8.0-54-generic/updates/dkms/nvidia_375.ko
alias: char-major-195-*
version: 375.66
supported: external
license: NVIDIA
srcversion: 68751AFD79A210CEFFB8758
alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends:
vermagic: 4.8.0-54-generic SMP mod_unload modversions
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_CheckPCIConfigSpace:int
parm: NVreg_EnablePCIeGen3:int
parm: NVreg_EnableMSI:int
parm: NVreg_TCEBypassMode:int
parm: NVreg_UseThreadedInterrupts:int
parm: NVreg_MemoryPoolSize:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RmMsg:charp
parm: NVreg_AssignGpus:charp

I think it could be a driver version problem:

dpkg -l | grep nvidia
ii bumblebee-nvidia 3.2.1-10 amd64 NVIDIA Optimus support using the proprietary NVIDIA driver
ii nvidia-375 375.66-0ubuntu0.16.04.1 amd64 NVIDIA binary driver - version 375.66
ii nvidia-375-dev 375.66-0ubuntu0.16.04.1 amd64 NVIDIA binary Xorg driver development files
ii nvidia-modprobe 375.51-0ubuntu1 amd64 Load the NVIDIA kernel driver and create device files
ii nvidia-opencl-icd-375 375.66-0ubuntu0.16.04.1 amd64 NVIDIA OpenCL ICD
ii nvidia-prime 0.8.2 amd64 Tools to enable NVIDIA's Prime

What am I missing?

9 Answers

You may want to install the CUDA toolkit. Use the following command to install it:

sudo apt install nvidia-cuda-toolkit

Once the installation is done, reboot the machine. nvidia-smi should work.
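After the reboot, the result can be confirmed by checking the exit status of nvidia-smi instead of reading its output. This is a minimal sketch that assumes nvidia-smi exits with a non-zero status when it cannot reach the driver. The function name check_driver and its optional argument are hypothetical; the argument exists only so the check can be tried with a stand-in command.

```sh
# Report whether the driver answers. By default this runs nvidia-smi;
# a different command can be passed as $1 to try the function out.
check_driver() {
    smi=${1:-nvidia-smi}
    if "$smi" >/dev/null 2>&1; then
        echo "driver reachable"
    else
        echo "driver not reachable"
        return 1
    fi
}
```

Calling check_driver with no argument runs the real nvidia-smi.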

If nvidia-smi fails to communicate even though you have installed the driver many times, check prime-select.

Run prime-select query to see which profile is currently selected. The available profiles should include at least nvidia and intel.
Select the NVIDIA profile with prime-select nvidia.
If it says nvidia is already selected, select a different one, e.g. prime-select intel, then switch back with prime-select nvidia.
Reboot and check nvidia-smi.
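The steps above can be sketched as a small helper that only prints the commands to run, so nothing is changed until you run them yourself. It assumes that prime-select query prints the name of the currently selected profile. The function name prime_switch_plan is hypothetical.

```sh
# Print the prime-select commands for the steps above.
# $1: the currently selected profile, e.g. "$(prime-select query)"
prime_switch_plan() {
    current=$1
    if [ "$current" = "nvidia" ]; then
        # Already on nvidia: go through intel first so the switch is re-applied.
        echo "sudo prime-select intel"
    fi
    echo "sudo prime-select nvidia"
    echo "sudo reboot"
}
```

For example, prime_switch_plan "$(prime-select query)" lists the commands for the current machine.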

I disabled Secure Boot and it worked fine.

@rod-smith answered another, more specific question explaining how to do it. Basically it is a setting in the firmware setup, and he also wrote a good article about how to do that here.
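Before changing firmware settings, it may help to check whether Secure Boot is actually on. This sketch assumes the mokutil package is installed and that mokutil --sb-state prints a line containing "SecureBoot enabled" or "SecureBoot disabled". The function name secure_boot_state is hypothetical.

```sh
# Classify the text printed by `mokutil --sb-state`.
# $1: that text, e.g. "$(mokutil --sb-state 2>&1)"
secure_boot_state() {
    case "$1" in
        *"SecureBoot enabled"*)  echo "enabled" ;;
        *"SecureBoot disabled"*) echo "disabled" ;;
        *)                       echo "unknown" ;;
    esac
}
```

Run it as secure_boot_state "$(mokutil --sb-state 2>&1)". A result of "unknown" usually just means the output did not match either of the assumed strings.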

Since I cannot comment on @Rodolfo's answer above (not enough reputation), I am adding a new answer.

On my machine I had to configure Secure Boot to match my OS. I have an ASUS mainboard running Ubuntu 18.04 and tried to install NVIDIA CUDA 10.1 Update 2 with the packaged NVIDIA driver. I faced the same issue as described above. As it turned out, Secure Boot was set to Windows UEFI mode. Changing it to Other OS fixed it for me.

The solution by Markus led me to a better solution. The problem does have to do with Secure Boot, but it is not necessary to deactivate it.

To fix the problem, do three things: deactivate the Nvidia driver by choosing X.Org in the Additional Drivers tool and reboot; activate the Nvidia driver again and reboot; then enroll the key in Secure Boot.

Usually when you activate the Nvidia driver with the Additional Drivers tool, you are asked for a (new) password for Secure Boot. After the reboot, the PC jumps into the Secure Boot settings and you are asked to enroll a new MOK (Machine Owner Key), which must be confirmed with that same password. Afterwards, the driver will get access to the Nvidia card and will work.
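One way to see whether the kernel module carries a signature at all is to look for a signer field in the modinfo output. The listing in the question shows none. This sketch assumes that modinfo prints a line starting with "signer:" for signed modules. The function name module_is_signed is hypothetical.

```sh
# Succeed if the given modinfo output contains a signer field.
# $1: the full text printed by `modinfo <module>`
module_is_signed() {
    printf '%s\n' "$1" | grep -q '^signer:'
}
```

For example, module_is_signed "$(modinfo nvidia_375)" && echo signed reports whether the module from the question is signed. A missing signature with Secure Boot enabled would be consistent with the module being refused at load time.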

If you are looking for a solution for Google Cloud Platform, it is best to follow Google's advice: use only a recommended Ubuntu version (at the time of writing, May 2020, either 16.04 or 18.04; the new 20.04 is not yet supported) and follow the official instructions for installing CUDA support on a Google Cloud VM here. This will give you the correct version of the driver for a GCP VM. Then restart the instance with sudo reboot or from the console.

If you install CUDA on a GCP VM any other way, you may still succeed, but you may also struggle with issues like "NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver" or dependency problems.

PS: I will not copy the instructions here, as they may change at any time. Always refer to the original GCP source for the latest working solution.

A lot of users have mentioned that they are unable to install the Nvidia toolkit and that sudo apt install nvidia-cuda-toolkit doesn't work. Be sure to check that you are using the latest GCC compiler. An older GCC such as 4.9 will not be able to compile the Nvidia CUDA toolkit. Try installing again after switching to the latest GCC, such as v9.3.

For future readers:

I am on a virtual machine instance (Google Cloud Platform), and I am following this gist to install CUDA and cuDNN on my VM.

I had to manually upload the cuDNN part. (Just putting it out there.)

Now, getting to the error:

I was having this issue but a complete restart of the instance did the job. And by complete restart I mean stopping the instance and turning it back on again.
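A full stop and start can also be done from the command line with the gcloud CLI, as an alternative to the web console. The helper below only prints the two commands. The instance name and zone are placeholders you would replace with your own, and the function name gcp_restart_commands is hypothetical.

```sh
# Print the gcloud commands for a full stop/start cycle of a VM.
# $1: instance name, $2: zone (both supplied by the caller)
gcp_restart_commands() {
    instance=$1
    zone=$2
    echo "gcloud compute instances stop $instance --zone $zone"
    echo "gcloud compute instances start $instance --zone $zone"
}
```

For example, gcp_restart_commands my-gpu-vm us-central1-a prints the commands for a hypothetical instance called my-gpu-vm. This stops and starts the instance, rather than running sudo reboot inside it.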

I hope this helps someone.

I was using driver version 470 in Ubuntu 20.04 (latest driver at time of writing).

I went to Software & Updates > Additional Drivers, then downgraded to nvidia-driver-460, clicked Apply, then rebooted.

After that I was able to see the correct output from nvidia-smi again.
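To confirm which driver branch is active after such a downgrade, the version string reported by nvidia-smi can be reduced to its leading number. This sketch assumes the string comes from nvidia-smi --query-gpu=driver_version --format=csv,noheader, which prints one line per GPU. The function name driver_branch is hypothetical.

```sh
# Reduce a driver version string such as "460.91.03" to its branch, "460".
# $1: output of `nvidia-smi --query-gpu=driver_version --format=csv,noheader`
driver_branch() {
    # One line is printed per GPU; the first line is enough.
    first_line=$(printf '%s\n' "$1" | head -n 1)
    # Drop everything from the first dot onward.
    echo "${first_line%%.*}"
}
```

For example, driver_branch "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader)" prints the branch of the running driver.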
