Opened 4 years ago

Last modified 3 years ago

#896 new defect

CUDA fails in v3.4.0 amd64 DISKLESS mode

Reported by: skylar
Owned by:
Priority: major
Milestone: 3.4.0
Component: Liberated
Version: 3.4.0
Keywords:
Cc:
Blocked By:
Blocking:
Estimated Hours: 4
Total Hours: 0

Description

NVRM: Your system is not currently configured to drive a VGA console
NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
NVRM: requires the use of a text-mode VGA console. Use of other console
NVRM: drivers including, but not limited to, vesafb, may result in
NVRM: corruption and stability problems, and is not supported.

bccd@node011:~/CUDA$ ./device-query
cudaDriverGetVersion() FAILED, status = 30 (unknown error)

Change History (33)

comment:1 Changed 3 years ago by skylar

In 5211//cluster/svnroot:

v3.4.0 needs explicit compat32 libdir re #896

comment:2 Changed 3 years ago by skylar

In 5215//cluster/svnroot:

fix nvidia_build for v3.4.0 re #896

comment:3 Changed 3 years ago by skylar

In 5219//cluster/svnroot:

remove kernel dependency for nvidia build re #896

comment:4 Changed 3 years ago by skylar

In 5240//cluster/svnroot:

merge addresses #896, #898, #915, #946, #949, #950

comment:5 Changed 3 years ago by skylar

In 5255//cluster/svnroot:

disable nouveau by default re #896
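The contents of r5255 are not shown in this ticket. As an assumption about the usual approach, nouveau is commonly kept from loading with a modprobe blacklist file. A minimal sketch follows; the function name `blacklist_nouveau` and the file name `blacklist-nouveau.conf` are hypothetical:

```shell
# Write a modprobe blacklist for nouveau into the given directory
# (default: /etc/modprobe.d). The file name is an arbitrary choice.
blacklist_nouveau() {
    dir=${1:-/etc/modprobe.d}
    cat > "$dir/blacklist-nouveau.conf" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
}
```

Calling `blacklist_nouveau` as root writes to the default directory. If nouveau is loaded from an initramfs, that image would probably need regenerating as well.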

comment:6 Changed 3 years ago by skylar

In 5260//cluster/svnroot:

rolling back gcc 5 changes, no openacc for now re #910, #896

comment:7 Changed 3 years ago by skylar

In 5261//cluster/svnroot:

cuda 5 comes with both i386 and amd64 re #896

comment:8 Changed 3 years ago by skylar

In 5264//cluster/svnroot:

roll back cuda module version re #896

comment:9 Changed 3 years ago by skylar

In 5265//cluster/svnroot:

need gcc-4.8 for cuda re #896

comment:10 Changed 3 years ago by skylar

In 5266//cluster/svnroot:

remove gcc-4.9 re #896

comment:11 Changed 3 years ago by skylar

In 5268//cluster/svnroot:

nvidia build needs ruby-ffi re #896

comment:12 Changed 3 years ago by skylar

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 60 \
    --slave /usr/bin/g++ g++ /usr/bin/g++-4.8 \
    --slave /usr/bin/cpp cpp /usr/bin/cpp-4.8 \
    --slave /usr/bin/cc cc /usr/bin/gcc-4.8
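After switching alternatives, it may be worth confirming that the default compiler really is 4.8 before building the NVIDIA module. A rough sketch, with hypothetical helper names `is_gcc_48` and `check_default_gcc`, assuming `gcc -dumpversion` prints a string such as `4.8` or `4.8.4`:

```shell
# Succeed only when a version string such as "4.8" or "4.8.4" names gcc 4.8.
is_gcc_48() {
    case $1 in
        4.8|4.8.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the compiler that `gcc` resolves to is a 4.8 release.
check_default_gcc() {
    ver=$(gcc -dumpversion) || return 1
    if is_gcc_48 "$ver"; then
        echo "gcc $ver: ok"
    else
        echo "gcc $ver: expected 4.8" >&2
        return 1
    fi
}
```

`update-alternatives --display gcc` shows which link group is currently selected if the check fails.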

comment:14 Changed 3 years ago by skylar

Apply patch:

  1. Extract NVIDIA package: ./NVIDIA-Linux-x86_64-340.76.run -x
  2. pushd NVIDIA-Linux-x86_64-340.76/kernel
  3. patch < /path/to/nv-pat.patch

comment:15 Changed 3 years ago by skylar

In 5269//cluster/svnroot:

create gcc 4.8 symlinks re #896

comment:16 Changed 3 years ago by skylar

comment:17 Changed 3 years ago by skylar

In 5271//cluster/svnroot:

remove cpp alternatives re #896

comment:18 Changed 3 years ago by skylar

In 5272//cluster/svnroot:

add cc alternative re #896

comment:19 Changed 3 years ago by skylar

In 5273//cluster/svnroot:

need to get uvm module too re #896

comment:20 Changed 3 years ago by skylar

In 5274//cluster/svnroot:

properly manage /usr/bin/cc links re #896

comment:21 Changed 3 years ago by skylar

In 5275//cluster/svnroot:

link to /usr/bin/cpp re #896

comment:22 Changed 3 years ago by skylar

In 5276//cluster/svnroot:

versioned cpp re #896

comment:23 Changed 3 years ago by skylar

In 5280//cluster/svnroot:

merge addresses #896, #910, #951, #953, #954

comment:24 Changed 3 years ago by skylar

rebuild kernel and .deb's w/ gcc 4.8

comment:25 Changed 3 years ago by skylar

In 5282//cluster/svnroot:

new kernel revision re #896

comment:26 Changed 3 years ago by skylar

Confirmed that NVIDIA-Linux-x86_64-340.76 and cuda 6.5.14 play nicely, but cuda 6.5.14 is 1.5GB.

Next up is trying cuda 6.0.37

comment:27 Changed 3 years ago by skylar

cuda 6.0.37: the bundled nvidia driver (331) does not build against linux 4.0.0, and the toolkit does not like nvidia 340.76

size is 1.8GB, could be reduced closer to 1GB though


comment:28 Changed 3 years ago by skylar

seems that loading nvidia_uvm and then running as root will solve the problem with 5.5.22
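A quick way to check the precondition above before running device-query is to look for both modules in /proc/modules. A minimal sketch; the helper names `module_loaded` and `missing_nvidia_modules` are hypothetical, and the optional file argument exists only so the parsing can be tried against a saved copy:

```shell
# Succeed if the named module appears in a /proc/modules-style table.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# Print whichever of the two NVIDIA modules are not loaded, one per line.
missing_nvidia_modules() {
    for m in nvidia nvidia_uvm; do
        module_loaded "$m" "$@" || echo "$m"
    done
}
```

Empty output from `missing_nvidia_modules` means both modules are present. Otherwise `modprobe` the names it prints.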

comment:29 Changed 3 years ago by skylar

nvidia device nodes:

crw-rw-rw- 1 root root 249,   0 Aug 19 00:21 /dev/nvidia-uvm
crw-rw-rw- 1 root root 195,   0 Aug 18 23:29 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Aug 18 23:29 /dev/nvidiactl
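The nvidia-uvm major (249 in the listing above) is assigned dynamically, so a boot script would likely need to read it from /proc/devices rather than hard-code it. A sketch under that assumption, with hypothetical function names `dev_major` and `make_uvm_node`; this is not necessarily what the later commits on this ticket do:

```shell
# Print the major number registered under a given device name,
# reading a /proc/devices-style table (default: /proc/devices).
dev_major() {
    name=$1
    table=${2:-/proc/devices}
    awk -v n="$name" '$2 == n { print $1; exit }' "$table"
}

# Create /dev/nvidia-uvm if the driver is registered and the node is missing.
# Needs root; mode 666 mirrors the listing above.
make_uvm_node() {
    major=$(dev_major nvidia-uvm) || return 1
    if [ -z "$major" ]; then
        echo "nvidia-uvm not registered; is nvidia_uvm loaded?" >&2
        return 1
    fi
    [ -e /dev/nvidia-uvm ] || mknod -m 666 /dev/nvidia-uvm c "$major" 0
}
```

Typical use would be `modprobe nvidia_uvm && make_uvm_node`, run as root.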

comment:30 Changed 3 years ago by skylar

In 5285//cluster/svnroot:

adding nvidia_uvm to load list re #896

comment:31 Changed 3 years ago by skylar

In 5288//cluster/svnroot:

create nvidia-uvm node re #896

comment:32 Changed 3 years ago by skylar

In 5291//cluster/svnroot:

merge addresses #896, #933, #939, #953
