What do you need for TF2? CUDA details etc. please

Continuing the weekend theme of entertaining feedback with movie/vid references… (a blast from the past)

You can’t always get what you want
But if you try sometime you find
You get what you need

So. In anticipation of an upcoming release using TF2, please could you share a specific CUDA build?

And could you check that it is possible to build it in an Anaconda environment using conda install (as I did previously for TF1.15)?

And if you can’t get exactly what you wanted, i.e. conda doesn’t have the specific version(s) you would propose by default, is there a conda supported build that you can test and then recommend?

i.e. given the runbook for CUDA & TF1.15, what are the precise CUDA related replacements (cuda toolkit, cudnn, TF versions & channels) to deliver a guaranteed working environment?

Many thanks in advance!

And if I may ask @robertl, what version of Python will be supported/required for the TF2 release?

I’m going to start building based on

  • the assumption of TF 2.4 (probably 2.4.1)
  • limiting CUDA to only versions available with conda

Which means:

  • CUDA Toolkit 11.0.221 (pretty sure TF2 requires CUDA 11)
  • cuDNN installed via conda (for example from the conda-forge channel)

But: TF 2.4 supports Python 3.6-3.8. Will PL be using 3.8?
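
For anyone planning a similar build, the constraints above can be written down as a small check. This is only a sketch: the values in TF_2_4_TESTED are my reading of TensorFlow's published "tested build configurations" table for the 2.4 GPU build and should be re-checked there, and the names major_minor and check_plan are made up for illustration. A mismatch means "untested combination", not necessarily "broken".

```python
# Assumed tested configuration for the TF 2.4 GPU build.
# Verify against TensorFlow's own tested-build table before relying on it.
TF_2_4_TESTED = {
    "python_min": (3, 6),
    "python_max": (3, 8),
    "cuda": (11, 0),
    "cudnn": (8, 0),
}


def major_minor(version):
    """Turn '11.0.221' into (11, 0); a bare '8' becomes (8, 0)."""
    nums = [int(part) for part in version.split(".")[:2]]
    while len(nums) < 2:
        nums.append(0)
    return tuple(nums)


def check_plan(python, cuda, cudnn=None, tested=TF_2_4_TESTED):
    """Return a list of deviations from the tested configuration.

    cudnn is optional so the check can run before a cuDNN build is chosen.
    """
    problems = []
    if not tested["python_min"] <= major_minor(python) <= tested["python_max"]:
        problems.append(f"Python {python} is outside the tested range")
    if major_minor(cuda) != tested["cuda"]:
        problems.append(f"CUDA {cuda} differs from the tested version")
    if cudnn is not None and major_minor(cudnn) != tested["cudnn"]:
        problems.append(f"cuDNN {cudnn} differs from the tested version")
    return problems


# The plan from this post: Python 3.8 with CUDA Toolkit 11.0.221.
print(check_plan("3.8", "11.0.221"))  # prints []
```

An empty list means the plan lines up with the assumed table; each string in a non-empty list names one component to look at.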

Tracking and providing info for the benefit of all… One build done. JupyterLab wouldn’t start because of some issue with pywin32; reinstalling with pip install pywin32==225 made the problem go away.

Haven’t tested TF2 yet…

Update 2021-04-07 08:35

I tried running a TF2 test notebook from the TensorFlow site here and hit a cuDNN error during attempted training:

c:\users\julian\anaconda3\envs\tft2_4_env_cuda_11_py3_8\lib\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     57   try:
     58     ctx.ensure_initialized()
---> 59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:

UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node my_model/conv2d/Conv2D (defined at <ipython-input-4-1e051998210b>:10) ]] [Op:__inference_train_step_527]

Errors may have originated from an input operation.

Reminder of installation details

# packages in environment at C:\Users\Julian\anaconda3\envs\TFT2_4_ENV_CUDA_11_PY3_8:
# Name                    Version                   Build  Channel
tensorflow-estimator      2.4.0                    pypi_0    pypi
tensorflow-gpu            2.4.1                    pypi_0    pypi
cudatoolkit               11.0.221             h74a9793_0    anaconda
cudnn                        h3e0f4f4_0    conda-forge

Will try a different cuDNN version
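
Before swapping versions, it can help to see which CUDA and cuDNN versions the installed TensorFlow wheel says it was built against, and whether it sees the GPU at all. A minimal sketch, assuming TF 2.3 or later (where tf.sysconfig.get_build_info is available); describe_tf_gpu_setup is a made-up helper name, and the exact format of the reported version strings varies by platform.

```python
def describe_tf_gpu_setup():
    """Collect what TensorFlow reports about its CUDA build and visible GPUs."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is missing or failed to load in this environment.
        return {"tensorflow": None, "cuda": None, "cudnn": None, "gpus": []}

    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        # .get() because CPU-only builds do not carry these keys.
        "cuda": build.get("cuda_version"),
        "cudnn": build.get("cudnn_version"),
        "gpus": [dev.name for dev in tf.config.list_physical_devices("GPU")],
    }


info = describe_tf_gpu_setup()
for key, value in info.items():
    print(f"{key}: {value}")
```

If the cuDNN version reported here does not match the major version of the cudnn package in the environment, that mismatch is a likely suspect for the "Failed to get convolution algorithm" error.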

cuDNN issue resolved by

  • removing cuDNN (conda remove cuDNN)
  • installing cuDNN (conda install -c conda-forge cudnn=

Test notebook now trains! GPU use is strong.


Epoch 1, Loss: 0.001776, Accuracy: 99.9500, Test Loss: 0.130018, Test Accuracy: 98.4100, Timing: 3.4877 seconds/epoch
Epoch 2, Loss: 0.000761, Accuracy: 99.9783, Test Loss: 0.135070, Test Accuracy: 98.4000, Timing: 3.4305 seconds/epoch
Epoch 3, Loss: 0.001070, Accuracy: 99.9667, Test Loss: 0.159565, Test Accuracy: 98.4500, Timing: 3.4037 seconds/epoch
Epoch 4, Loss: 0.001924, Accuracy: 99.9583, Test Loss: 0.148481, Test Accuracy: 98.4200, Timing: 3.4481 seconds/epoch
Epoch 5, Loss: 0.000880, Accuracy: 99.9750, Test Loss: 0.171329, Test Accuracy: 98.4700, Timing: 3.5546 seconds/epoch

The individual epochs are clearly visible in the CUDA usage


UPDATE - NVIDIA DLL summary

NVIDIA files in C:\Users\Julian\anaconda3\envs\TFT2_4_ENV_CUDA_11_PY3_8\Library\bin
Name, Company, Version
cublas64_11.dll, NVIDIA Corporation,
cublasLt64_11.dll, NVIDIA Corporation,
cudart64_110.dll, NVIDIA Corporation,
cudnn64_8.dll, NVIDIA Corporation,
cudnn_adv_infer64_8.dll, NVIDIA Corporation,
cudnn_adv_train64_8.dll, NVIDIA Corporation,
cudnn_cnn_infer64_8.dll, NVIDIA Corporation,
cudnn_cnn_train64_8.dll, NVIDIA Corporation,
cudnn_ops_infer64_8.dll, NVIDIA Corporation,
cudnn_ops_train64_8.dll, NVIDIA Corporation,
cufft64_10.dll, NVIDIA Corporation,
cufftw64_10.dll, NVIDIA Corporation,
curand64_10.dll, NVIDIA Corporation,
cusolver64_10.dll, NVIDIA Corporation,
cusolverMg64_10.dll, NVIDIA Corporation,
cusparse64_11.dll, NVIDIA Corporation,
nppc64_11.dll, NVIDIA Corporation,
nppial64_11.dll, NVIDIA Corporation,
nppicc64_11.dll, NVIDIA Corporation,
nppidei64_11.dll, NVIDIA Corporation,
nppif64_11.dll, NVIDIA Corporation,
nppig64_11.dll, NVIDIA Corporation,
nppim64_11.dll, NVIDIA Corporation,
nppist64_11.dll, NVIDIA Corporation,
nppisu64_11.dll, NVIDIA Corporation,
nppitc64_11.dll, NVIDIA Corporation,
npps64_11.dll, NVIDIA Corporation,
nvblas64_11.dll, NVIDIA Corporation,
nvjpeg64_11.dll, NVIDIA Corporation,
nvrtc64_110_0.dll, NVIDIA Corporation,
nvvm64_33_0.dll, NVIDIA Corporation,
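
The file names themselves carry a version tag (for example cudart64_110.dll is the CUDA 11.0 runtime and cudnn64_8.dll is cuDNN major version 8), so a listing like the one above can be checked mechanically. A rough sketch; parse_dll_name and summarize_dlls are hypothetical helper names, and the pattern only covers the <library>64_<tag>.dll naming seen in this listing.

```python
import re
from pathlib import Path

# Matches names such as cudnn64_8.dll or nvrtc64_110_0.dll:
# a library name, the literal "64_", then a numeric tag.
_DLL_NAME = re.compile(
    r"^(?P<lib>[A-Za-z_]+?)64_(?P<tag>\d+(?:_\d+)*)\.dll$", re.IGNORECASE
)


def parse_dll_name(filename):
    """Return (library, version_tag) for an NVIDIA-style DLL name, else None."""
    match = _DLL_NAME.match(filename)
    if match is None:
        return None
    return match.group("lib"), match.group("tag")


def summarize_dlls(directory):
    """Map library name to version tag for every matching DLL in a directory."""
    found = {}
    for path in sorted(Path(directory).glob("*.dll")):
        parsed = parse_dll_name(path.name)
        if parsed is not None:
            found[parsed[0]] = parsed[1]
    return found


for name in ["cudart64_110.dll", "cudnn64_8.dll", "nvrtc64_110_0.dll"]:
    print(name, "->", parse_dll_name(name))
# cudart64_110.dll -> ('cudart', '110')
# cudnn64_8.dll -> ('cudnn', '8')
# nvrtc64_110_0.dll -> ('nvrtc', '110_0')
```

Pointing summarize_dlls at the environment's Library\bin folder gives a quick way to confirm that the cudart and cudnn tags match what TensorFlow expects after a reinstall.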