Error in HamiltonianMonteCarlo when using gather in log_prob function #1837
Comments
@martin-wiebusch-thg I think this is due to gradients with `tf.gather` being returned as `tf.IndexedSlices` rather than dense tensors. It's possible to convert to a tensor using `tf.convert_to_tensor`:

    import tensorflow as tf
    import tensorflow_probability as tfp

    i = tf.constant([0, 0, 1])

    def value(x):
        return tf.reduce_sum(tf.gather(x, i))

    def value_and_gradient(x):
        return tfp.math.value_and_gradient(value, x)

    y = tf.constant([1.0, 2.0])
    value_and_gradient(y)
    # (<tf.Tensor: shape=(), dtype=float32, numpy=4.0>,
    #  <tensorflow.python.framework.indexed_slices.IndexedSlices at 0x7bf0d6a519c0>)
    tf.convert_to_tensor(value_and_gradient(y)[1])
    # <tf.Tensor: shape=(2,), dtype=float32, numpy=array([2., 1.], dtype=float32)>

Replacing `tf.gather` with a `tf.one_hot` matrix-vector product gives a dense gradient directly:

    import tensorflow as tf
    import tensorflow_probability as tfp

    i = tf.constant([0, 0, 1])

    def value(x):
        return tf.reduce_sum(tf.linalg.matvec(tf.one_hot(i, 2), x))

    def value_and_gradient(x):
        return tfp.math.value_and_gradient(value, x)

    y = tf.constant([1.0, 2.0])
    value_and_gradient(y)
    # (<tf.Tensor: shape=(), dtype=float32, numpy=4.0>,
    #  <tf.Tensor: shape=(2,), dtype=float32, numpy=array([2., 1.], dtype=float32)>)

The JAX substrate also returns a dense gradient:

    import jax.numpy as jnp
    import tensorflow_probability.substrates.jax as tfp

    i = jnp.array([0, 0, 1])

    def value(x):
        return jnp.sum(x[i])

    def value_and_gradient(x):
        return tfp.math.value_and_gradient(value, x)

    y = jnp.array([1.0, 2.0])
    value_and_gradient(y)
    # (Array(4., dtype=float32), Array([2., 1.], dtype=float32))
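To see why the densified gradient in the examples above comes out as `[2., 1.]`, here is a minimal NumPy sketch. It is an illustration only, not TensorFlow's actual implementation, and the names `sparse_values`, `dense_grad` and `one_hot_grad` are made up for this example. The gradient of a gather is one upstream value per gathered index, and densifying it means scatter-adding those values, so the repeated index `0` accumulates twice.

```python
import numpy as np

# Same indices and input as in the examples above.
i = np.array([0, 0, 1])
x = np.array([1.0, 2.0], dtype=np.float32)

# Forward value: sum of the gathered entries, x[0] + x[0] + x[1].
value = x[i].sum()

# Sparse form of the gradient: one upstream gradient of 1.0 per gathered index.
sparse_values = np.ones(len(i), dtype=np.float32)

# Densify by scatter-adding each value into its index.
# np.add.at accumulates over repeated indices (plain fancy assignment would not).
dense_grad = np.zeros_like(x)
np.add.at(dense_grad, i, sparse_values)

# The one-hot formulation yields the same vector as a matrix product.
one_hot = np.eye(len(x), dtype=np.float32)[i]  # shape (3, 2)
one_hot_grad = one_hot.T @ sparse_values

print(value, dense_grad, one_hot_grad)
```

The one-hot variant trades memory for a dense result: it materialises a matrix of shape (number of indices, length of `x`), which may matter if either dimension is large.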
I am trying to run a Hamiltonian MCMC on a target distribution whose implementation involves a call to `tf.gather`. The following code:
raises

    ValueError: The two structures don't have the same nested structure.

followed by a very long and (to me) cryptic message. Replacing the return statement in the `logprob` function with the commented line gets rid of the error. The error seems to appear whenever the result of `logprob` contains a `tf.gather` subexpression.

The error also disappears when I remove the `@tf.function` decorator from the definition of `run_chain`. However, this comes at a huge performance cost.

How can I efficiently sample from a distribution whose log-probability involves a `tf.gather` expression?