Funny you should ask that; I was just looking at the code for a dense component, where without customisation you can see:
# weights: truncated normal, default mean 0, stddev 0.1
initial = tf.random.truncated_normal((n_inputs, self._n_neurons), stddev=0.1)
W = tf.compat.v1.get_variable('W', initializer=initial)
# biases: all zeros
initial = tf.constant(0., shape=[self._n_neurons])
b = tf.compat.v1.get_variable('b', initializer=initial)
from which you can see that a truncated normal distribution is used for the weights, and all biases are set to 0.
If you want other distributions for the initial values, just modify the code accordingly (a sketch follows).
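For example, a minimal sketch substituting a uniform distribution for the weights; the [-0.1, 0.1] range here is an arbitrary illustration, and n_inputs / self._n_neurons are assumed to come from the surrounding layer code, as above:

# example: uniform initial weights in [-0.1, 0.1] instead of truncated normal
initial = tf.random.uniform((n_inputs, self._n_neurons), minval=-0.1, maxval=0.1)
W = tf.compat.v1.get_variable('W', initializer=initial)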
Note, however, the TF documentation for tf.random.truncated_normal:
tf.random.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.dtypes.float32, seed=None, name=None)
The default mean is zero, and the truncation is described as:
"The generated values follow a normal distribution with specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked."
Therefore, with the mean left at its default of zero and stddev 0.1, the 2-stddev cutoff means no initial weight will have a magnitude greater than 0.2.
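You can check that bound directly; a quick sanity check, assuming TF2 eager execution (the sample size is arbitrary):

import tensorflow as tf

# draw a large sample and confirm no magnitude exceeds 2 * stddev = 0.2
sample = tf.random.truncated_normal((100000,), stddev=0.1)
print(float(tf.reduce_max(tf.abs(sample))))  # always < 0.2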
This might not be what you want. In fact, I intend to repeat some recent tests with mean = 0.5 and stddev = 0.2 to see how the behaviour changes.
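With those values the same truncation rule keeps every initial weight in [0.5 - 0.4, 0.5 + 0.4] = [0.1, 0.9]; a minimal sketch of that variant, again assuming the surrounding layer code:

# truncated normal shifted to mean 0.5; truncation at 2 stddev keeps values in [0.1, 0.9]
initial = tf.random.truncated_normal((n_inputs, self._n_neurons), mean=0.5, stddev=0.2)
W = tf.compat.v1.get_variable('W', initializer=initial)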
Since workable learning rates and the like are, I think, known to depend on the starting values, it would be nice if PL could give the initial values more prominence somehow.