While working on the first part of my style-transfer project, I had:

- A new input variable, which would have to be initialized from scratch.
- The VGG-19 network’s variables, which would be restored from a checkpoint.

In most use cases, the input would be a placeholder and hence would not require initialization. However, since the input was the image I was optimizing, it needed to be a variable, and being a variable, it needed initialization.
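As a rough TF1-style sketch of that distinction (the shape and variable name here are my own illustration, not from the project):

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API (via tf.compat.v1 on TF 2.x)
tf.disable_eager_execution()

# A plain feed-forward input would be a placeholder -- nothing to initialize:
#   input_image = tf.placeholder(tf.float32, [1, 224, 224, 3])

# Since the input itself is being optimized, it is a variable instead,
# and like any variable it must be initialized before use.
# (The 1x224x224x3 shape is just an illustrative image size.)
input_image = tf.get_variable(
    'input_image',
    shape=[1, 224, 224, 3],
    initializer=tf.random_normal_initializer())
```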
The initialization methods were as follows:
```python
sess.run(tf.global_variables_initializer())
```

```python
saver.restore(sess, 'vgg_19.ckpt')
```
On my first pass through, I hadn’t considered that the order of these mattered. I assumed that if the global initializer was run first, all variables would be initialized and then the saver would restore the appropriate subset corresponding to the checkpoint file. On the other hand, I assumed that if the saver restored the subset first, the global initializer would only touch the remaining variables, in this case, the newly introduced input variable.
After playing around a little, I found that my second assumption was incorrect! I stumbled across this experimentally: if both my assumptions were correct, then passing a fixed input variable through the pre-trained network should always give the same output. This happened consistently when the global initializer was run before the saver, but not the other way around; in that case, repeated initializations followed by passes through the network gave different outputs.
This clued me in to the fact that the correct way to initialize a combination of new variables and pre-trained variables is to run the global initializer before the saver.
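Here is a self-contained sketch of the difference, using a toy one-variable “network” standing in for VGG-19 (all names and values below are made up for illustration):

```python
import os
import tempfile
import tensorflow.compat.v1 as tf  # TF1-style graph API (via tf.compat.v1 on TF 2.x)
tf.disable_eager_execution()

# Stand-in for the pre-trained network: save a known weight to a checkpoint.
ckpt_path = os.path.join(tempfile.mkdtemp(), 'toy.ckpt')
with tf.Graph().as_default():
    w = tf.get_variable('w', initializer=tf.constant(3.0))
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        tf.train.Saver().save(sess, ckpt_path)

def init_and_restore(restore_first):
    """Build a graph with the 'pre-trained' w plus a new input variable,
    then run the two initialization steps in the given order."""
    with tf.Graph().as_default():
        w = tf.get_variable('w', initializer=tf.constant(0.0))
        new_input = tf.get_variable('new_input', shape=[],
                                    initializer=tf.random_normal_initializer())
        saver = tf.train.Saver(var_list={'w': w})  # only the pre-trained subset
        with tf.Session() as sess:
            if restore_first:
                saver.restore(sess, ckpt_path)
                sess.run(tf.global_variables_initializer())  # clobbers the restore!
            else:
                sess.run(tf.global_variables_initializer())
                saver.restore(sess, ckpt_path)  # overwrites the random init
            return sess.run(w)

print(init_and_restore(restore_first=False))  # 3.0: the checkpoint value survives
print(init_and_restore(restore_first=True))   # 0.0: the restore was clobbered
```

In other words, the global initializer re-runs every variable’s initializer, including those the saver had already restored, which is why it has to come first.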
This subtle difference can have insidious effects! If your VGG-19 network was not initialized with pre-trained weights, it’s possible that the effects aren’t seen! For example, in my case, I was trying to find a new input image that had the same network response as another input image. In this instance, whether the weights are pre-trained or random, the optimization procedure will still attempt to match the responses! The incorrect initialization would have eventually caught up to me once I started making stronger assumptions about the network, such as what its deeper layers were encoding. Without pre-trained weights, the network’s layers would not have been capturing the correct abstractions of the image (e.g. objects, concepts, etc.).
If you’d like to see for yourself the difference in output between correct and incorrect initialization orders, here’s a notebook for you to tinker with!