What do you need for TF2? CUDA details etc., please

In anticipation of an upcoming release using TF2, could you please share the specific CUDA build you recommend?

And could you check that it is possible to build it in an Anaconda environment using conda install (as I did previously for TF1.15)?

And if you can’t get exactly what you wanted, i.e. conda doesn’t have the specific version(s) you would propose by default, is there a conda supported build that you can test and then recommend?

In other words, given the runbook for CUDA and TF1.15, what are the precise CUDA-related replacements (CUDA toolkit, cuDNN, TF versions and channels) needed to deliver a guaranteed working environment?

And if I may ask @robertl, what version of Python will be supported/required for the TF2 release?

I’m going to start building based on

  • the assumption of TF 2.4 (probably 2.4.1)
  • limiting CUDA to only the versions available through conda

Which means:

  • CUDA Toolkit 11.0.221 (TF 2.4 requires CUDA 11.0; earlier TF2 releases used CUDA 10.x)
  • cuDNN from conda-forge channel (but could also use from conda-forge)

But: TF 2.4 supports Python 3.6-3.8 - will PL be using 3.8?
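In case it helps anyone following along, here is a minimal sketch for checking that the interpreter in the active environment falls inside the Python 3.6-3.8 range mentioned above. The function name and constants are my own for illustration, and the range is an assumption about TF 2.4 only, not about whatever the final release ends up requiring.

```python
import sys

# Assumption: TF 2.4 wheels target Python 3.6 through 3.8 inclusive.
TF24_MIN_PYTHON = (3, 6)
TF24_MAX_PYTHON = (3, 8)


def python_ok_for_tf24(version=None):
    """Return True if the (major, minor) version is inside the assumed range.

    `version` is any tuple-like starting with (major, minor); it defaults to
    the running interpreter's sys.version_info.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return TF24_MIN_PYTHON <= major_minor <= TF24_MAX_PYTHON


if __name__ == "__main__":
    print("Python", sys.version.split()[0],
          "OK for TF 2.4:", python_ok_for_tf24())
```

Running it inside the activated conda environment shows straight away whether the environment was created with a suitable Python.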

Tracking and sharing info for the benefit of all… first build done… JupyterLab wouldn’t start because of some issue with pywin32… reinstalled with pip install pywin32==225 and the problem went away.

Haven’t tested TF2 yet…
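For a quick first test before running a full notebook, something like the sketch below should do. It asks TensorFlow which GPUs it can see and, if there is one, runs a tiny convolution so that cuDNN actually gets initialised. The function name and the result keys are just illustrative; the TensorFlow calls are standard TF2 API, but treat this as a rough smoke test rather than a proper validation.

```python
def check_gpu():
    """Report whether TensorFlow imports, sees a GPU, and can run a conv."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is missing or its DLLs failed to load.
        return {"tensorflow_installed": False}

    gpus = tf.config.list_physical_devices("GPU")
    result = {
        "tensorflow_installed": True,
        "tf_version": tf.__version__,
        "gpus": [gpu.name for gpu in gpus],
        "conv_ok": None,  # stays None when no GPU is visible
    }
    if gpus:
        try:
            # A tiny convolution is enough to make TF initialise cuDNN.
            images = tf.random.normal([1, 8, 8, 1])
            kernel = tf.random.normal([3, 3, 1, 1])
            tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding="SAME")
            result["conv_ok"] = True
        except tf.errors.OpError as exc:
            result["conv_ok"] = False
            result["error"] = str(exc)
    return result


if __name__ == "__main__":
    for key, value in check_gpu().items():
        print(f"{key}: {value}")
```

An empty gpus list usually points at the CUDA toolkit side, while a visible GPU with conv_ok False points more towards cuDNN.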

Update 2021-04-07 08:35

I tried running a TF2 test notebook from the TensorFlow site here and hit a cuDNN error during the attempted training:

c:\users\julian\anaconda3\envs\tft2_4_env_cuda_11_py3_8\lib\site-packages\tensorflow\python\eager\execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     57   try:
     58     ctx.ensure_initialized()
---> 59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:

UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node my_model/conv2d/Conv2D (defined at <ipython-input-4-1e051998210b>:10) ]] [Op:__inference_train_step_527]

Errors may have originated from an input operation.

Reminder of installation details

# packages in environment at C:\Users\Julian\anaconda3\envs\TFT2_4_ENV_CUDA_11_PY3_8:
# Name                    Version                   Build  Channel
tensorflow-estimator      2.4.0                    pypi_0    pypi
tensorflow-gpu            2.4.1                    pypi_0    pypi
# packages in environment at C:\Users\Julian\anaconda3\envs\TFT2_4_ENV_CUDA_11_PY3_8:
# Name                    Version                   Build  Channel
cudatoolkit               11.0.221             h74a9793_0    anaconda
# packages in environment at C:\Users\Julian\anaconda3\envs\TFT2_4_ENV_CUDA_11_PY3_8:
# Name                    Version                   Build  Channel
cudnn                        h3e0f4f4_0    conda-forge

Will try a different cuDNN version
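Before swapping versions it can help to ask the installed TensorFlow which CUDA and cuDNN versions it was built against, so the conda packages can be matched to it. This is a hedged sketch: the function name is mine, tf.sysconfig.get_build_info() is only present in newer TF2 releases, and the keys I read from it (cuda_version, cudnn_version) are assumed to be there, which is why the code uses .get() and tolerates them being missing.

```python
def tf_build_versions():
    """Return the CUDA/cuDNN versions the installed TensorFlow reports."""
    try:
        import tensorflow as tf
    except ImportError:
        # No TensorFlow in this environment (or its DLLs failed to load).
        return {"tensorflow": None}

    info = {"tensorflow": tf.__version__}
    # get_build_info is not available in every TF2 release, so look it up
    # defensively instead of assuming it exists.
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        build = get_build_info()
        info["cuda_version"] = build.get("cuda_version")
        info["cudnn_version"] = build.get("cudnn_version")
    return info


if __name__ == "__main__":
    for key, value in tf_build_versions().items():
        print(f"{key}: {value}")
```

The reported values can then be compared by eye against the cudatoolkit and cudnn rows in the conda list output.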

cuDNN issue resolved by

  • removing cuDNN (conda remove cuDNN)
  • installing cuDNN (conda install -c conda-forge cudnn=

Test notebook now trains! GPU utilisation is high.


Epoch 1, Loss: 0.001776, Accuracy: 99.9500, Test Loss: 0.130018, Test Accuracy: 98.4100, Timing: 3.4877 seconds/epoch
Epoch 2, Loss: 0.000761, Accuracy: 99.9783, Test Loss: 0.135070, Test Accuracy: 98.4000, Timing: 3.4305 seconds/epoch
Epoch 3, Loss: 0.001070, Accuracy: 99.9667, Test Loss: 0.159565, Test Accuracy: 98.4500, Timing: 3.4037 seconds/epoch
Epoch 4, Loss: 0.001924, Accuracy: 99.9583, Test Loss: 0.148481, Test Accuracy: 98.4200, Timing: 3.4481 seconds/epoch
Epoch 5, Loss: 0.000880, Accuracy: 99.9750, Test Loss: 0.171329, Test Accuracy: 98.4700, Timing: 3.5546 seconds/epoch

The individual epochs are clearly visible in the CUDA usage


UPDATE - nVidia DLL summary

