Docker: Creating a portable image recognition app with TensorFlow and Shiny

Disclaimer

I know I said I would be showing you how to retrain two neural nets to detect cats. However, I accidentally left a test running for a little longer than expected on Google Cloud ML and I ran out of free credits! If any of you fine readers would like to send me some credits or hook me up with a GPU I will get round to doing the next part of this blog series very swiftly! However, until then you can have a sneaky peak into the third part of the blog series!

Docker

What the heck is Docker?

Docker is incredible, the more I use it the more I love it. The reason being it allows you to create an environment that does not depend on your computer’s operating system. This means if you create a docker container with something like python in you will be able to run that same container on a Mac, Window or a Linux PC. This has many advantages, a particular one being that anything you do inside the docker container will be reproducible. Since then one does not have to worry about whether someone has to install the Windows version or the Mac version etc. Docker containers also facilitate the rapid development work for Data Scientists, Web Developers, Software Engineers,… you name it! Here’s a list of some reputable companies currently using Docker: BBC News, PayPal, Expedia, General Electric, Google, Amazon. There are also countless other things you can do with Docker, such as:

    • – Setting up a Docker swarm (or using Kubernetes) to orchestrate your containers.

 

    • – Using Docker containers to run continuous integration (using tools such as Jenkins, Travis, etc).

 

    – Multiple users can use the same Docker container at once! This makes running browser based like IDE’s Rstudio Server or JupyterHub very easy to setup and deploy.

How I have used Docker

First off I’ll answer a slightly different question, ‘Why I have used Docker in this blog’. The reason is, that I wanted the content of my blog to be fully reproducible for anyone that reads it and easily deployable by any keen readers too. The best way to do this in my experience was to ‘Dockerize’ anything that I was going to publish. So in the process of creating this blog I only used docker containers. I actually used 3 different containers for this development, one container running Jupyter notebooks using jupyter/tensorflow-notebook (with a few pip installs here and there). Another based off of Rstudio’s rocker/geospatial which I used to develop the App and the final container which runs the finished Shiny App is based off of rocker/shiny. See below for the Dockerfiles for each of these builds:

RStudio Dockerfile

Shiny Dockerfile

How to get in on the Docker action

If I have persuaded you to try out docker then follow the install instructions here, or click the links below for your operating system:

Windows Apple Ubuntu

Python – TensorFlow

To use the RCNN model simpy change the  MODEL_NAME parameter to ‘faster_rcnn_resnet101_coco_11_06_2017’

To see the Python code that I adapted from Google’s TensorFlow GitHub repo simply toggle the buttons above.

The two models that I decided to implement were a Faster-RCNN-ResNet and an SSD-MobileNet. These sound scary but if we break down what they actually mean it might make a little more sense as to what they actually are.

Faster-RCNN-ResNet

This model can be broken down into three parts as you can probably tell by the name! An RCNN is a Region-based Convolution Neural Network and it aims to propose regions of an image to be passed through a Convolutional Neural Network to compute features, these features are then passed through an SVM (Support Vector Machine) to be classified with an associated probability. This is easier to understand in a diagram:

Now these models can take weeks to train and to have a powerful model you would normally try to have a deep network. But the deeper the network the more expensive it is to train. Luckily some boffins at Microsoft came up with the idea of using a Residual Network or ResNet for short. These ResNets require less parameters (weights) than their regular counterparts and can therefore be used to create deeper models with a reduced expense. The residual block in the network essentially provides a kind of shortcut if the input of the next block and output of the previous block are of the dimension.

Finally the Faster RCNN part, this should be obvious! It’s a faster version, it does this by sharing convolutions across region proposals. The region proposals are usually done using algorithms like Selective Search or EdgeBoxes.

SSD-MobileNet

Let’s break this down again, an SSD is a Single Shot MultiBox Detector, it aims to be just as accurate as the Faster-RCNN-ResNet is but much faster! In fact it clocks in at around 59 frames per second in comparison to 7 FPS for the bulky Faster-RCNN-ResNet. It does this by ‘eliminating bounding box proposals and the subsequent pixel or feature resampling stage’. The model does this by a few improvements such as ‘using a small convolutional filter to predict object categories and offsets in bounding box locations, using separate predictors (filters) for different aspect ratio detections, and applying these filters to multiple feature maps from the later stages of a network in order to perform detection at multiple scales’

MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions to build light weight deep neural networks. The name pointing out an obvious use case, for the deployment of models on devices with less resources such as mobiles. This means it is perfect to deploy into a Shiny App! For a comparison in model size if you download the Shiny App I created the SSD-MobileNet takes up 29.2MB vs the Faster-RCNN-ResNet coming in at a whopping 196.9MB. I’d urge you to give the Shiny App a go so you can see how much faster the SSD-MobileNet is than the Faster-RCNN-ResNet model, it’s very impressive!

R – Shiny

Download Full Shiny App

 

To see the code that is used to create the Shiny App (minus the Python code) simply toggle the buttons above.

How deploy this Shiny App on your machine

Note: To be able to run the Faster-RCNN-ResNet model from the Docker container you may have to increase your allocated memory for the Docker virtual machine to around 5GB.

Final outcome

Leave a comment

Your email address will not be published. Required fields are marked *