
Issue in TrainerD #1

Open
scotwilli opened this issue Jun 14, 2017 · 7 comments

Comments

@scotwilli

Hi,

I am getting an error at the line trainerD = tf.train.AdamOptimizer().minimize(d_loss, var_list=d_vars)

Please look at the screenshot attached.

Thanks

[Screenshot: dcgan_issue]

@ghost

ghost commented Jun 15, 2017

Hi mate,

Looks like you need to re-run all the cells from the earlier sections of the Jupyter Notebook; it doesn't seem like the tensors are in the kernel's working memory.

@adeshpande3
Owner

That's definitely what fixes a lot of TF graph issues, but in this case, I think it's an issue with how TF version 1.1 handles scoping.

Go ahead and take a look at the last commit. That should fix it.

@roleiland

roleiland commented Jun 26, 2017

I have tf 1.1.0. I tried to run the latest version of the code and I get the following error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in ()
----> 1 g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(Dg, tf.ones_like(Dg)))

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_impl.pyc in sigmoid_cross_entropy_with_logits(_sentinel, labels, logits, name)
145 # pylint: disable=protected-access
146 nn_ops._ensure_xent_args("sigmoid_cross_entropy_with_logits",
--> 147 _sentinel, labels, logits)
148 # pylint: enable=protected-access
149

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/nn_ops.pyc in _ensure_xent_args(name, sentinel, labels, logits)
1560 if sentinel is not None:
1561 raise ValueError("Only call %s with "
-> 1562 "named arguments (labels=..., logits=..., ...)" % name)
1563 if labels is None or logits is None:
1564 raise ValueError("Both labels and logits must be provided.")

ValueError: Only call sigmoid_cross_entropy_with_logits with named arguments (labels=..., logits=..., ...)

With the previous commit (e2b9c7f), I got the error that was reported at the beginning of this post.

I am new to tf, so I don't know whether I fixed the scope issues correctly, but I wrote the following script: https://drive.google.com/open?id=0B8gMQqp3oacBN25qMHpkbE1aam8

Now the problem is that no matter what values of the noise vector I feed into the generator, it always outputs the same image. Any suggestions/ideas on what could be wrong?

@adeshpande3
Owner

Hopefully the latest commit fixes the labels and logits error. As for always outputting the same image, that is a well-known problem called mode collapse. The GAN implementation in this repo isn't the most optimized code, per se; I emphasized simplicity over performance. If you want to see repos with better performance in terms of image quality, be sure to check out https://github.com/carpedm20/DCGAN-tensorflow and https://github.com/soumith/ganhacks :)
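
For reference, TF 1.0+ only accepts keyword arguments there, so the fixed call should look roughly like this (with Dg being the discriminator's output on the generated images, as in the notebook):

# TF 1.0+ requires labels/logits to be passed by keyword
g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=tf.ones_like(Dg), logits=Dg))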

@roleiland

roleiland commented Jun 27, 2017

@adeshpande3 thank you for your comment. The logits problem is solved now, but when I run it, I still have the problem that the optimizer does not find the variable names:

ValueError: Variable d_wconv1/Adam/ does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

I am not sure whether it is correct, but may I suggest wrapping the discriminator and generator functions in a tf.variable_scope? I did it this way, and it works:

def discriminator(x_image, reuse=False):
    with tf.variable_scope('discriminator') as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()
        #First Conv and Pool Layers
        W_conv1 = tf.get_variable('d_wconv1', [5, 5, 1, 8], initializer=tf.truncated_normal_initializer(stddev=0.02))
        b_conv1 = tf.get_variable('d_bconv1', [8], initializer=tf.constant_initializer(0))
        h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
        h_pool1 = avg_pool_2x2(h_conv1)

        #Second Conv and Pool Layers
        W_conv2 = tf.get_variable('d_wconv2', [5, 5, 8, 16], initializer=tf.truncated_normal_initializer(stddev=0.02))
        b_conv2 = tf.get_variable('d_bconv2', [16], initializer=tf.constant_initializer(0))
        h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
        h_pool2 = avg_pool_2x2(h_conv2)

        #First Fully Connected Layer
        W_fc1 = tf.get_variable('d_wfc1', [7 * 7 * 16, 32], initializer=tf.truncated_normal_initializer(stddev=0.02))
        b_fc1 = tf.get_variable('d_bfc1', [32], initializer=tf.constant_initializer(0))
        h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*16])
        h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

        #Second Fully Connected Layer
        W_fc2 = tf.get_variable('d_wfc2', [32, 1], initializer=tf.truncated_normal_initializer(stddev=0.02))
        b_fc2 = tf.get_variable('d_bfc2', [1], initializer=tf.constant_initializer(0))

        #Final Layer
        y_conv = tf.matmul(h_fc1, W_fc2) + b_fc2
    return y_conv

and

def generator(z, batch_size, z_dim, reuse=False):
    with tf.variable_scope("generator") as scope:
        if (reuse):
            tf.get_variable_scope().reuse_variables()
        g_dim = 64 #Number of filters of first layer of generator 
        c_dim = 1 #Color dimension of output (MNIST is grayscale, so c_dim = 1 for us)
        s = 28 #Output size of the image
        s2, s4, s8, s16 = int(s/2), int(s/4), int(s/8), int(s/16) #We want to slowly upscale the image, so these values will help
                                                                  #make that change gradual.

        h0 = tf.reshape(z, [batch_size, s16+1, s16+1, 25])
        h0 = tf.nn.relu(h0)
        #Dimensions of h0 = batch_size x 2 x 2 x 25

        #First DeConv Layer
        output1_shape = [batch_size, s8, s8, g_dim*4]
        W_conv1 = tf.get_variable('g_wconv1', [5, 5, output1_shape[-1], int(h0.get_shape()[-1])], 
                                  initializer=tf.truncated_normal_initializer(stddev=0.1))
        b_conv1 = tf.get_variable('g_bconv1', [output1_shape[-1]], initializer=tf.constant_initializer(.1))
        H_conv1 = tf.nn.conv2d_transpose(h0, W_conv1, output_shape=output1_shape, strides=[1, 2, 2, 1], padding='SAME')
        H_conv1 = tf.contrib.layers.batch_norm(inputs = H_conv1, center=True, scale=True, is_training=True, scope="g_bn1")
        H_conv1 = tf.nn.relu(H_conv1)
        #Dimensions of H_conv1 = batch_size x 3 x 3 x 256

        #Second DeConv Layer
        output2_shape = [batch_size, s4 - 1, s4 - 1, g_dim*2]
        W_conv2 = tf.get_variable('g_wconv2', [5, 5, output2_shape[-1], int(H_conv1.get_shape()[-1])], 
                                  initializer=tf.truncated_normal_initializer(stddev=0.1))
        b_conv2 = tf.get_variable('g_bconv2', [output2_shape[-1]], initializer=tf.constant_initializer(.1))
        H_conv2 = tf.nn.conv2d_transpose(H_conv1, W_conv2, output_shape=output2_shape, strides=[1, 2, 2, 1], padding='SAME')
        H_conv2 = tf.contrib.layers.batch_norm(inputs = H_conv2, center=True, scale=True, is_training=True, scope="g_bn2")
        H_conv2 = tf.nn.relu(H_conv2)
        #Dimensions of H_conv2 = batch_size x 6 x 6 x 128

        #Third DeConv Layer
        output3_shape = [batch_size, s2 - 2, s2 - 2, g_dim*1]
        W_conv3 = tf.get_variable('g_wconv3', [5, 5, output3_shape[-1], int(H_conv2.get_shape()[-1])], 
                                  initializer=tf.truncated_normal_initializer(stddev=0.1))
        b_conv3 = tf.get_variable('g_bconv3', [output3_shape[-1]], initializer=tf.constant_initializer(.1))
        H_conv3 = tf.nn.conv2d_transpose(H_conv2, W_conv3, output_shape=output3_shape, strides=[1, 2, 2, 1], padding='SAME')
        H_conv3 = tf.contrib.layers.batch_norm(inputs = H_conv3, center=True, scale=True, is_training=True, scope="g_bn3")
        H_conv3 = tf.nn.relu(H_conv3)
        #Dimensions of H_conv3 = batch_size x 12 x 12 x 64

        #Fourth DeConv Layer
        output4_shape = [batch_size, s, s, c_dim]
        W_conv4 = tf.get_variable('g_wconv4', [5, 5, output4_shape[-1], int(H_conv3.get_shape()[-1])], 
                                  initializer=tf.truncated_normal_initializer(stddev=0.1))
        b_conv4 = tf.get_variable('g_bconv4', [output4_shape[-1]], initializer=tf.constant_initializer(.1))
        H_conv4 = tf.nn.conv2d_transpose(H_conv3, W_conv4, output_shape=output4_shape, strides=[1, 2, 2, 1], padding='VALID')
        H_conv4 = tf.nn.tanh(H_conv4)
        #Dimensions of H_conv4 = batch_size x 28 x 28 x 1

    return H_conv4

Also later:
sample_image = generator(z_placeholder, 1, z_dimensions, reuse=True)

But if I run the code, the generated images are strange, so I am not sure if my solution is correct.
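
For completeness, here is how I hook the scoped variables into the optimizers; just a sketch, where trainerG and g_vars are simply the generator counterparts of the trainerD and d_vars names from the original issue:

# Collect each network's variables by its scope prefix
tvars = tf.trainable_variables()
d_vars = [var for var in tvars if var.name.startswith('discriminator')]
g_vars = [var for var in tvars if var.name.startswith('generator')]

trainerD = tf.train.AdamOptimizer().minimize(d_loss, var_list=d_vars)
trainerG = tf.train.AdamOptimizer().minimize(g_loss, var_list=g_vars)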

@adeshpande3
Owner

Thanks for the suggestion. I agree that that's the best way to fix the problem. As for the strange generated images, I'm not particularly sure what the problem might be. Like I said before, this isn't the most optimized and hyperparameter-tuned code, so it could be anything from the structure of the network to the length of training time to the application of batch norm, etc.

@roleiland

roleiland commented Jun 28, 2017

I just started working with GANs about a week ago, so I am not an expert. But it seems that one of the biggest problems comes from the discriminator function: it is very easy to end up in the saturated part of the sigmoid curve, where the generated distribution cannot be moved toward the real one:

https://drive.google.com/open?id=0B8gMQqp3oacBMWZvSURuQTJjY2M

The image is from the Wasserstein GAN paper. In it, the red curve is the decision boundary of a GAN with a sigmoid discriminator, and the light blue curve is the Earth Mover (Wasserstein) distance of the critic. As you can see, it does not saturate like the sigmoid does, so the generated distribution can be moved toward the real one.

I am going to try to implement it based on your code; let's see whether it is better. Thanks for your nice tutorial :)
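
Roughly what I plan to try, going by the paper (just a sketch; D_real and D_fake stand for the critic's raw scores on a real and a generated batch):

# WGAN: the critic outputs an unbounded score, with no sigmoid at the end
d_loss = tf.reduce_mean(D_fake) - tf.reduce_mean(D_real)  # critic minimizes this
g_loss = -tf.reduce_mean(D_fake)  # generator minimizes this

# Clip the critic weights after every update to keep it (approximately) Lipschitz
clip_d_weights = [var.assign(tf.clip_by_value(var, -0.01, 0.01)) for var in d_vars]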
