While working on the first part of my style-transfer project, I had:

- A new input variable, which would have to be initialized from scratch.
- The VGG-19 network’s variables, which would be restored from a checkpoint.

In most use cases, the input would be a placeholder and hence would not require initialization. However, since the input was the image I was optimizing, it needed to be a variable, and being a variable, it needed initialization.
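As a rough TF1-style sketch of that distinction (the shape and variable name here are my own illustration, not from the project):

```python
import tensorflow.compat.v1 as tf  # TF1-style graph API (via tf.compat.v1 on TF 2.x)
tf.disable_eager_execution()

# A plain feed-forward input would be a placeholder -- nothing to initialize:
#   input_image = tf.placeholder(tf.float32, [1, 224, 224, 3])

# Since the input itself is being optimized, it is a variable instead,
# and like any variable it must be initialized before use.
# (The 1x224x224x3 shape is just an illustrative image size.)
input_image = tf.get_variable(
    'input_image',
    shape=[1, 224, 224, 3],
    initializer=tf.random_normal_initializer())
```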
The initialization methods were as follows:
```python
sess.run(tf.global_variables_initializer())
```

```python
saver.restore(sess, 'vgg_19.ckpt')
```
On my first pass through, I hadn’t considered that the order of these mattered. I assumed that if the global initializer was run first, all variables would be initialized and then the saver would restore the appropriate subset corresponding to the checkpoint file. On the other hand, I assumed that if the saver restored the subset first, the global initializer would only touch the remaining variables, in this case, the newly introduced input variable.
After playing around a little, I found that my second assumption was incorrect! I stumbled across this experimentally: if both my assumptions were correct, then passing a fixed input variable through the pre-trained network should always give the same output. This happened consistently when the global initializer was run before the saver, but not the other way around; in that case, repeated initializations followed by passes through the network gave different outputs.
This clued me in to the fact that the correct way to initialize a combination of new variables and pre-trained variables is to run the global initializer before the saver.
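Here is a self-contained sketch of the difference, using a toy one-variable “network” standing in for VGG-19 (all names and values below are made up for illustration):

```python
import os
import tempfile
import tensorflow.compat.v1 as tf  # TF1-style graph API (via tf.compat.v1 on TF 2.x)
tf.disable_eager_execution()

# Stand-in for the pre-trained network: save a known weight to a checkpoint.
ckpt_path = os.path.join(tempfile.mkdtemp(), 'toy.ckpt')
with tf.Graph().as_default():
    w = tf.get_variable('w', initializer=tf.constant(3.0))
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        tf.train.Saver().save(sess, ckpt_path)

def init_and_restore(restore_first):
    """Build a graph with the 'pre-trained' w plus a new input variable,
    then run the two initialization steps in the given order."""
    with tf.Graph().as_default():
        w = tf.get_variable('w', initializer=tf.constant(0.0))
        new_input = tf.get_variable('new_input', shape=[],
                                    initializer=tf.random_normal_initializer())
        saver = tf.train.Saver(var_list={'w': w})  # only the pre-trained subset
        with tf.Session() as sess:
            if restore_first:
                saver.restore(sess, ckpt_path)
                sess.run(tf.global_variables_initializer())  # clobbers the restore!
            else:
                sess.run(tf.global_variables_initializer())
                saver.restore(sess, ckpt_path)  # overwrites the random init
            return sess.run(w)

print(init_and_restore(restore_first=False))  # 3.0: the checkpoint value survives
print(init_and_restore(restore_first=True))   # 0.0: the restore was clobbered
```

In other words, the global initializer re-runs every variable’s initializer, including those the saver had already restored, which is why it has to come first.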
This subtle difference can have insidious effects! If your VGG-19 network was not initialized with pre-trained weights, it’s possible that the effects aren’t seen! For example, in my case, I was trying to find a new input image that had the same network response as another input image. In this instance, whether the weights are pre-trained or random, the optimization procedure will still attempt to match the responses! The incorrect initialization would have eventually caught up to me once I started making stronger assumptions about the network, such as what its deeper layers were encoding. Without pre-trained weights, the network’s layers would not have been capturing the correct abstractions of the image (e.g. objects, concepts, etc.).
If you’d like to see for yourself the difference in output between correct and incorrect initialization orders, here’s a notebook for you to tinker with!