Initial values question

What does PL currently (v0.11.15) do when initialising values for various things, especially convolution kernels and dense weights and biases?

If the values are set randomly, what is the PDF type (expectation: either uniform or Gaussian), and what are its basic statistics (mean, variance, etc.)?

Will it be possible to specify PDF and statistics per component later on?

Dear @JulianSMoore

Funny you should ask that: I was just looking at the code for a dense component, where, without customisation, you can see

initial = tf.random.truncated_normal((n_inputs, self._n_neurons), stddev=0.1)
W = tf.compat.v1.get_variable('W', initializer=initial)  # weights: truncated normal
initial = tf.constant(0., shape=[self._n_neurons])
b = tf.compat.v1.get_variable('b', initializer=initial)  # biases: all zero

from which you can see that the weights are drawn from a truncated normal distribution (stddev = 0.1) and all biases are initialised to 0.

If you want other distributions for the initial values, just modify the code accordingly 🙂
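For example, here is a sketch of swapping in a uniform initialiser instead. This is a hedged numpy illustration of the idea, not PL's actual code: the shapes and the Glorot/Xavier-style limit are my assumptions, and in the real component you would use `tf.random.uniform` in place of `tf.random.truncated_normal`:

```python
import numpy as np

# Illustrative shapes -- in the PL dense component these come from the layer.
n_inputs, n_neurons = 64, 32
rng = np.random.default_rng(42)

# Glorot/Xavier-style uniform: W ~ U(-limit, limit) with
# limit = sqrt(6 / (fan_in + fan_out)), a common alternative default.
limit = np.sqrt(6.0 / (n_inputs + n_neurons))
W = rng.uniform(-limit, limit, size=(n_inputs, n_neurons))

print(W.shape, round(limit, 4))  # -> (64, 32) 0.25
```

Any other distribution (normal, constant, orthogonal, ...) can be dropped in the same way, since `get_variable` only needs an `initializer` tensor of the right shape.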

Note however the TF info for truncated normal

tf.random.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.dtypes.float32, seed=None, name=None)

The default mean is zero and the truncation is described as

The generated values follow a normal distribution with specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked.

Therefore, with the mean left at its default of 0 and stddev = 0.1, the 2-stddev cutoff means no initial weight will have magnitude greater than 0.2.
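This is easy to check empirically. Below is a small numpy sketch of TF's drop-and-repick truncation (my own rejection-sampling mimic of the documented behaviour, not TF's actual implementation), run with stddev = 0.1:

```python
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, seed=0):
    """Sample a normal distribution, re-picking any value whose
    magnitude is more than 2 stddevs from the mean (as the TF docs describe)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mean, stddev, size=shape)
    bad = np.abs(x - mean) > 2 * stddev
    while bad.any():
        x[bad] = rng.normal(mean, stddev, size=int(bad.sum()))
        bad = np.abs(x - mean) > 2 * stddev
    return x

w = truncated_normal((100_000,), stddev=0.1)
print(np.abs(w).max())  # never exceeds 0.2 (the 2-stddev cutoff)
print(w.std())          # effective std comes out a bit below the nominal 0.1
```

A side effect worth noting: because the tails are cut off, the *effective* standard deviation of the samples is slightly smaller than the `stddev` argument (roughly 0.88 of it for a 2-sigma cutoff).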

This might not be what you want. In fact I intend to repeat some recent tests with mean = 0.5, stddev = 0.2 to see how the behaviour changes.

Since learning rates etc. are, I believe, known to interact with the starting values, it would be nice if PL gave the initial values more prominence somehow.

HTH, Julian
