TENSORFLOW HUB — THE FIRST GOOD TUTORIAL

Tensorflow Hub is a library which makes it easier to use pretrained models. You can take a model that was trained on millions of images and get state-of-the-art results with few resources, because only the last layers of the model need to be trained. Using pretrained models like this is called transfer learning, and with Tensorflow Hub anyone can do it easily. With transfer learning, training time decreases dramatically and accuracy can be very high already in the first epoch.

I wrote this article after teaching myself to use Tensorflow Hub. It was a trial-and-error process because I could not find any good tutorials. I was lucky to have a friend who helped me, but not everyone has someone to ask for help. I asked questions about this on different forums without getting a response. I assume Tensorflow Hub is not very popular, although it is super handy once you understand it.

Simple example

With this code we try to predict whether the object in an image is a motorbike, a watch, or a plane. It is an easy job for a human but not for a machine.

First, we just import all the needed libraries. The only thing you should notice here is the second line: import tensorflow_hub as hub

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import os
from PIL import Image
import matplotlib.pyplot as plt
import random
import pandas as pd
from sklearn.model_selection import train_test_split

Then we define our hyperparameters and some other settings. Important here is image_size; I'll explain later why we choose 299x299. Then there is num_classes, which just means how many different objects we want to recognize. In our case we have motorbikes, planes, and watches, which means we have 3 classes. The number of epochs doesn't matter that much and 10 is good, but if your model's loss is still decreasing after 10 epochs you can increase it. This is just basic stuff and doesn't have anything to do with Tensorflow Hub. The last important thing I want to mention is learning_rate. I found that using a very small learning rate produces good results. I struggled for some time when I got nan as a loss, until I finally found that all the gradients grew to infinity. So if you get the same error, it is probably because your learning_rate is too big.

image_size = [299,299]
num_classes = 3
learning_rate = 0.00005
epochs = 10
batch_size = 32
display_step = 1
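The nan-loss symptom from too large a learning rate can be seen even in a toy setting. This sketch (plain numpy, not the article's model, with made-up numbers) runs gradient descent on f(w) = w², whose gradient is 2w:

```python
import numpy as np

def final_loss(lr, steps=2000):
    """Gradient descent on f(w) = w**2, starting from w = 1."""
    w = np.float64(1.0)
    with np.errstate(over='ignore', invalid='ignore'):
        for _ in range(steps):
            w = w - lr * 2 * w  # gradient of w**2 is 2*w
        return w * w

print(final_loss(0.05))  # small learning rate: loss shrinks toward 0
print(final_loss(1.5))   # too-large learning rate: w overshoots, blows up to inf, loss becomes nan
```

With lr=1.5 each update multiplies w by -2, so w diverges, overflows to inf, and the next update produces nan, just like the gradients growing to infinity described above.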

Then we just import all the images into the code. This is not the most important part, because importing data is often case-specific and I'm not even very good at it.

PATH = 'random_objects/'
class_dirs = ['airplanes/', 'motorbikes/', 'watches/']

class_files = []
for d in class_dirs:
    files = os.listdir(PATH + d)
    random.shuffle(files)
    class_files.append(files)

images, labels = np.array([]), np.array([])
counters = [0, 0, 0]
for indx in range(sum(len(f) for f in class_files)):
    # Pick a random class that still has unused files left.
    la = random.randint(0, 2)
    while counters[la] >= len(class_files[la]):
        la = random.randint(0, 2)

    image_path = PATH + class_dirs[la] + class_files[la][counters[la]]
    counters[la] += 1  # advance even if the file turns out to be corrupted
    try:
        im = Image.open(image_path)
    except:
        print("Skip a corrupted file:", image_path)
        continue
    im = im.resize([image_size[0], image_size[1]], Image.ANTIALIAS)
    pixels = np.asarray(im)
    #pixels = pixels/255.0

    lab = np.zeros(num_classes)
    lab[la] = 1  # one-hot label: airplanes, motorbikes, watches

    if images.size == 0:
        images = pixels[None, :]
        labels = lab[None, :]
    else:
        images = np.append(images, pixels[None, :], axis=0)
        labels = np.append(labels, lab[None, :], axis=0)

Then we split our data into train and test data sets.

x_train, x_test, y_train, y_test = train_test_split(images, labels, test_size=0.1)

We just print the shapes of our data sets and find out that we have 1647 images in the train set and 183 in the test set. From here you should also see that the images are 299x299x3, so they have 3 channels for red, green, and blue.

x_train.shape, x_test.shape, y_train.shape, y_test.shape

OUTPUT: ((1647, 299, 299, 3), (183, 299, 299, 3), (1647, 3), (183, 3))

Then we define placeholders for our x and y values.

X_TF = tf.placeholder(tf.float32,[None,image_size[0],image_size[1],3])
Y_TF = tf.placeholder(tf.float32, [None, num_classes])

Now the most important part. Take a break, drink some water, and after that continue reading.

We set module to be hub.Module. Because we imported tensorflow_hub as hub, you can think of hub.Module as tensorflow_hub.Module. The first parameter is just a link to the pretrained model. We got this link from tfhub.dev. There are a lot of different models to choose from. This example is about image classification, but there are also text and video models. I recommend looking through the site to find out what is there.

I'm using the inception_v3 model because it is good for this problem. Basically, all the models produce good results, but some models might be better for certain tasks. Then you take the link to the model you want, BUT don't close the window yet. There is something you have to know about these models. There is a Usage header in the middle of the page with example code. The models work a little bit differently from each other, and you should check at least the expected shape of the input data. In our example the size is [batch_size,299,299,3]. That is why you have to define the image shape at the beginning: when you import the images, you resize them so they fit the model. The last number is also important, because some models don't take 3 channels; they are designed for grayscale images.

The second parameter of hub.Module is trainable. It is optional, and I found that the model trains better when it is True.

On the second line we feed our x placeholder into the module and save the output to the features variable. Finally, we add one dense layer to get num_classes outputs.

module = hub.Module("https://tfhub.dev/google/imagenet/inception_v3/classification/1", trainable=True)
features = module(X_TF)
logits = tf.layers.dense(inputs=features, units=num_classes)

Then we define our cost and optimizer. We also define our accuracy function.

cost = tf.losses.softmax_cross_entropy(Y_TF,logits)
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(Y_TF, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
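To see what these ops compute, here is the same cost and accuracy written out in plain numpy on two made-up logit rows (the numbers are only for illustration):

```python
import numpy as np

logits = np.array([[2.0, 0.5, 0.1],    # predicts class 0
                   [0.2, 0.1, 3.0]])   # predicts class 2
labels = np.array([[1., 0., 0.],       # true class 0
                   [0., 0., 1.]])      # true class 2

# softmax cross-entropy, the quantity tf.losses.softmax_cross_entropy computes
shifted = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)
cost = -np.mean(np.sum(labels * np.log(probs), axis=1))

# accuracy: fraction of rows where argmax of logits matches argmax of labels
accuracy = np.mean(np.argmax(logits, 1) == np.argmax(labels, 1))

print(cost)      # a small positive number: both rows are confident and correct
print(accuracy)  # 1.0
```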

Finally, we train the model as we normally would. There is nothing special or Tensorflow Hub related here.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter("/tmp/summaries/2", sess.graph)

    test_images = x_test[:batch_size]
    test_labels = y_test[:batch_size]
    test_images = tf.image.resize_images(test_images, [image_size[0], image_size[1]])
    test_images = sess.run(test_images)

    for epoch in range(epochs):
        av_ac = 0
        num_steps = int(len(x_train)/batch_size)
        for step in range(num_steps):
            batch_xs = x_train[batch_size*step:batch_size*(step+1)]
            batch_ys = y_train[batch_size*step:batch_size*(step+1)]
            batch_xs = tf.image.resize_images(batch_xs, [image_size[0], image_size[1]])
            batch_xs = sess.run(batch_xs)

            summary, _ = sess.run([merged, optimizer], feed_dict={X_TF: batch_xs, Y_TF: batch_ys})
            writer.add_summary(summary, step+(epoch*num_steps))

            av_ac += sess.run(accuracy, feed_dict={X_TF: test_images, Y_TF: test_labels})
            if step % display_step == 0:
                print("step:", step, "- epoch:", epoch)
                test_pred = sess.run(logits, feed_dict={X_TF: test_images})
                print("y_hat:", np.argmax(test_pred[:20], axis=1))
                print("y    :", np.argmax(test_labels[:20], axis=1))
                print("train_cost:", sess.run(cost, feed_dict={X_TF: batch_xs, Y_TF: batch_ys}))
                print("valid_cost:", sess.run(cost, feed_dict={X_TF: test_images, Y_TF: test_labels}))
                print("Accuracy:", sess.run(accuracy, feed_dict={X_TF: test_images, Y_TF: test_labels}))
                print("________________________________________")
        print("Average Accuracy:", av_ac/num_steps)

    print()
    print()
    print("Optimization Finished!")

    av_ac = 0
    rounds = int(len(x_test)/batch_size)
    print("len(x_test):", len(x_test))
    print("batch_size:", batch_size)
    print("rounds:", rounds)
    for st in range(rounds):
        ac = sess.run(accuracy, feed_dict={X_TF: x_test[st*batch_size:(st+1)*batch_size],
                                           Y_TF: y_test[st*batch_size:(st+1)*batch_size]})
        av_ac += ac
    av_ac /= rounds
    print("Test accuracy:", av_ac)

And finally we got 97.5% test accuracy. The model reached very good accuracy (around 90%) already in the first epoch and improved a little bit with more epochs. I plotted the training using Tensorboard and got nice graphs.
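A sketch of how those graphs can be opened, assuming Tensorboard came with your Tensorflow install (the log directory matches the one given to tf.summary.FileWriter above):

```shell
tensorboard --logdir /tmp/summaries
# then open http://localhost:6006 in a browser
```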

(Tensorboard graph: cost)

~Lankinen