
Installing from source using Conda and CUDA could be improved

Thanks to all contributors for their efforts in creating and open-sourcing the library. I would like to add my two cents on the installation process when building from source, for whatever that's worth.

I always like to install things in conda envs so that there are no clashes between different software versions or required libraries.

MWE:

conda create -n jax python scipy cudnn cudatoolkit
conda list

[image: output of conda list]

Now the installation process:

python build/build.py --enable_cuda --cuda_path  ~/miniconda3/envs/jax/lib/ --cudnn_path ~/miniconda3/envs/jax/include

Two problems arise:

1. nvcc cannot be found in ~/miniconda3/envs/jax/lib/bin.
That path is actually wrong; it should have been ~/miniconda3/envs/jax/bin.
Anyway, I copied nvcc from the system-wide installation, /opt/cuda/bin/nvcc, into ~/miniconda3/envs/jax/lib/bin.
So far so good.

2. Re-running the build, it complains about cuda.h:
Cuda Configuration Error: Cannot find cuda.h under ~/miniconda3/envs/jax/lib 
FAILED: Build did NOT complete successfully (4 packages loaded, 16 targets

OK, let's copy /opt/cuda/include/cuda.h into ~/miniconda3/envs/jax/lib.
Re-running the build after completely removing the Bazel cache (rm -rf ~/.cache/bazel)
gives the same error again about not being able to find cuda.h.
At this point I am out of ideas.
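
Before re-running the build, it may help to check whether a candidate --cuda_path actually has the files the two errors above mention. The following is only a minimal sketch: the layout in EXPECTED_FILES (nvcc under bin/, cuda.h under include/) is an assumption based on the error messages and on /opt/cuda, not something confirmed from the build scripts, and missing_cuda_files is a hypothetical helper name.

```python
from pathlib import Path

# Assumption: a usable CUDA root contains these files at these relative paths.
EXPECTED_FILES = ("bin/nvcc", "include/cuda.h")


def missing_cuda_files(root):
    """Return the expected files that are absent under `root`."""
    base = Path(root).expanduser()
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]


if __name__ == "__main__":
    # Candidate roots taken from the commands in this post.
    candidates = (
        "~/miniconda3/envs/jax",
        "~/miniconda3/envs/jax/lib",
        "/opt/cuda",
    )
    for candidate in candidates:
        missing = missing_cuda_files(candidate)
        status = "looks complete" if not missing else "missing " + ", ".join(missing)
        print(f"{candidate}: {status}")
```

A directory that reports nothing missing is a more plausible value for --cuda_path than one patched up by copying individual files into it.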

Does anyone have other ideas on how to resolve this?

google/jax

Answer from murphyk

I tried this (with two simple additional steps, highlighted) but got the errors below:

conda create -n jax python tensorflow-gpu scipy future cudnn=7 python=3.7
**conda activate jax**
git clone https://github.com/google/jax.git
**cd jax**
python build/build.py --enable_cuda --cuda_path /usr --cudnn_path ~/miniconda3/envs/jax --python_bin_path ~/miniconda3/envs/jax/bin/python

...

Bazel binary path: ./bazel-0.24.1-linux-x86_64
Python binary path: /home/murphyk/miniconda3/envs/jax/bin/python
MKL-DNN enabled: yes
-march=native: no
CUDA enabled: yes
CUDA toolkit path: /usr
CUDNN library path: /home/murphyk/miniconda3/envs/jax

Building XLA and installing it in the jaxlib source tree...
INFO: Build options --action_env and --python_path have changed, discarding analysis cache.
ERROR: /home/murphyk/jax/build/BUILD.bazel:21:1: error loading package 'jaxlib': in /home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/tensorflow/core/platform/default/build_config.bzl: Encountered error while reading extension file 'cuda/build_defs.bzl': no such package '@local_config_cuda//cuda': Traceback (most recent call last):
	File "/home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 1266
		_create_local_cuda_repository(repository_ctx)
	File "/home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 988, in _create_local_cuda_repository
		_get_cuda_config(repository_ctx)
	File "/home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 714, in _get_cuda_config
		find_cuda_config(repository_ctx, ["cuda", "cudnn"])
	File "/home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 694, in find_cuda_config
		auto_configure_fail(("Failed to run find_cuda_config...))
	File "/home/murphyk/.cache/bazel/_bazel_murphyk/dd8a6ab338402747dc013d7665a15b3c/external/org_tensorflow/third_party/gpus/cuda_configure.bzl", line 325, in auto_configure_fail
		fail(("\n%sCuda Configuration Error:%...)))

Cuda Configuration Error: Failed to run find_cuda_config.py: Inconsistent CUDA toolkit path: /usr vs /usr/lib
 and referenced by '//build:install_xla_in_source_tree'

However, maybe I should somehow use the locations below?

locate cuda | grep /cuda*
/usr/include/cuda.h
...

locate cudnn.h 
/usr/lib/x86_64-linux-gnu/libcudnn.so
/usr/include/cudnn.h
...
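
One way to turn the locate output into candidate values for --cuda_path and --cudnn_path is to strip everything from the first bin/include/lib directory onward. This is only an illustrative heuristic, not how find_cuda_config.py resolves paths; install_prefix and MARKERS are hypothetical names, and the assumption that a prefix ends where one of those directories begins may not hold on every system.

```python
from pathlib import PurePosixPath

# Assumption: these directory names sit directly under an installation prefix.
MARKERS = {"bin", "include", "lib", "lib64"}


def install_prefix(path):
    """Return the part of `path` before the first marker directory, or None."""
    parts = PurePosixPath(path).parts
    # Only look at directory components, not the file name itself.
    for i, part in enumerate(parts[:-1]):
        if part in MARKERS:
            return str(PurePosixPath(*parts[:i])) if i else None
    return None


# Paths copied from the locate output above.
found = [
    "/usr/include/cuda.h",
    "/usr/include/cudnn.h",
    "/usr/lib/x86_64-linux-gnu/libcudnn.so",
]
prefixes = {p: install_prefix(p) for p in found}
for path, prefix in prefixes.items():
    print(f"{path} -> {prefix}")
```

If all files map to the same prefix, passing that single prefix for both flags seems like a reasonable thing to try; whether it avoids the "Inconsistent CUDA toolkit path" error is untested here.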