Lesson 8 wiki

(Jeremy Howard) #1

This post is editable by all of you! Please edit it to add any useful information for this week's class, including links brought up during class, other helpful readings, useful code/shell snippets, etc. Also, please help organize this wiki post by putting things in sections, adding/editing prose, etc.

NB: Discourse does not allow multiple edits at the same time - if two people save, most recent wins. So copy your edits before saving, just in case!

Lesson 8 Wiki

Class links


A Neural Algorithm of Artistic Style (https://arxiv.org/abs/1508.06576v2)

What problem are they solving?

Generate an image that has the content of one image and the style of another. “Style” here means colors and textures.

What is the general idea that they are using to solve it?

Use a pre-trained CNN to extract the content of one image and the style of the other. Combine the content with the style.

A step-by-step explanation:

  1. Choose a CNN pre-trained for image classification (e.g. VGG trained on ImageNet).
  2. Generate a random input image.
  3. Choose a measure and measure the difference between the input image’s CNN activations and the content image’s CNN activations (e.g. MSE).
  4. Choose another measure and measure the difference between the input image’s CNN activations and the style image’s CNN activations (e.g. MSE of the Gram matrices).
  5. Choose a way to combine the differences and combine them (e.g. a weighted sum).
  6. Minimize the combination by changing the pixels of the input image (which will then change the CNN activations, the differences, and finally the combination).
  7. Repeat using the modified input image until it looks good or doesn’t change.

The CNN activations can come from multiple convolutional layers.
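Steps 3 to 5 above can be sketched in NumPy. This is a minimal illustration, not the paper's code or the lesson notebook's: it assumes the activations of a single layer are already available as arrays of shape (height, width, channels), and the function names, the Gram matrix normalization, and the weights `alpha` and `beta` are hypothetical choices.

```python
import numpy as np

def gram_matrix(features):
    """Inner products between the channels of a (height, width, channels) activation map."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)   # one row per spatial position
    return flat.T @ flat / (h * w)      # (channels, channels); dividing by h*w is an assumption

def content_loss(gen_act, content_act):
    """MSE between the activations themselves (step 3)."""
    return float(np.mean((gen_act - content_act) ** 2))

def style_loss(gen_act, style_act):
    """MSE between the Gram matrices of the activations (step 4)."""
    return float(np.mean((gram_matrix(gen_act) - gram_matrix(style_act)) ** 2))

def total_loss(gen_act, content_act, style_act, alpha=1.0, beta=1.0):
    """Weighted sum of the two differences (step 5)."""
    return alpha * content_loss(gen_act, content_act) + beta * style_loss(gen_act, style_act)

# Random arrays stand in for real CNN activations.
rng = np.random.default_rng(0)
content_act = rng.normal(size=(4, 4, 3))
style_act = rng.normal(size=(4, 4, 3))
gen_act = rng.normal(size=(4, 4, 3))
print(total_loss(gen_act, content_act, style_act))
```

The Gram matrix sums over spatial positions, so it does not depend on where in the image a feature occurs. That fits the description of style as colors and textures rather than layout.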

What kind of results are they getting?

What previous work are they building on?

Cool examples of neural style

Links mentioned during class

Code snippets

  • To get all files for the lesson (h/t @ibarinov):
    wget -r -nH -nd -np -R index.html* http://files.fast.ai/part2/lesson8/
    (files out-of-date)

Steps needed for Style Transfer using VGG:

Content extraction

  1. Read the cont_image
  2. Resize cont_image
  3. Preprocess: RGB -> BGR and normalize
  4. Create VGG_avg
  5. Generate P(l) = activations for the cont_image at layer l
  6. Generate F(l) = activations for white noise image at layer l
  7. content_loss = MSE(P(l), F(l))
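The preprocessing in step 3 can be sketched as follows. This is an illustration under assumptions: "normalize" is taken to mean subtracting per-channel ImageNet means, the mean values are the ones commonly used with VGG, and the function names are hypothetical.

```python
import numpy as np

# Per-channel means in RGB order (assumed values, commonly used with VGG).
RGB_MEAN = np.array([123.68, 116.779, 103.939], dtype=np.float32)

def preprocess(img_rgb):
    """Subtract the channel means, then reverse the channel axis (RGB -> BGR)."""
    return (img_rgb.astype(np.float32) - RGB_MEAN)[..., ::-1]

def deprocess(img_bgr):
    """Undo preprocess: BGR -> RGB, add the means back, round and clip to valid pixels."""
    restored = img_bgr[..., ::-1] + RGB_MEAN
    return np.clip(np.rint(restored), 0, 255).astype(np.uint8)

# A random batch of one 8x8 RGB image stands in for the resized content image.
img = np.random.default_rng(0).integers(0, 256, size=(1, 8, 8, 3), dtype=np.uint8)
x = preprocess(img)
restored = deprocess(x)
print(x.shape, restored.dtype)
```

The inverse function is useful at the end of style transfer, when the optimized array has to be turned back into a viewable image.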

Style extraction

  1. Read the style_image
  2. Resize style_image
  3. Preprocess: RGB -> BGR and normalize
  4. Create VGG_avg
  5. Generate the Gram matrix for the style image, A(L) = F * Ft (the inner products between feature maps) for the layers L, where F holds the vectorized feature maps. (In the paper, each layer's contribution to the style loss has its own weight.)
  6. Generate the Gram matrix for the white noise image, G(L), in the same way
  7. style_loss = MSE(A(L), G(L))

Style transfer

  1. loss(c,s,x) = a * content_loss(c, x) + b * style_loss(s, x), where c = content image, s = style image, x = generated image
  2. Use scipy’s implementation of L-BFGS to find the values of “x” that minimize the loss (fmin_l_bfgs_b(loss, x0=x, args=(c, s))). In our case, “x” happens to be image pixels, and thus we end up searching for the image that is close to both the content image (c) and the style image (s).
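The optimization step can be tried on a toy problem. In this sketch the content and style "losses" are plain mean squared differences, used here only as stand-ins so that the answer is known in advance; `c`, `s`, and the weights are made-up values. When no separate gradient function is passed, `fmin_l_bfgs_b` expects the objective to return both the loss and its gradient, and it works on a 1-D array, so a real image would need to be flattened first.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def loss_and_grad(x, c, s, a, b):
    """Return a * content_loss + b * style_loss and its gradient with respect to x."""
    content_diff = x - c
    style_diff = x - s
    loss = a * np.mean(content_diff ** 2) + b * np.mean(style_diff ** 2)
    grad = 2.0 * (a * content_diff + b * style_diff) / x.size
    return loss, grad

c = np.array([0.0, 1.0, 2.0, 3.0])   # stand-in for the content target
s = np.array([4.0, 4.0, 4.0, 4.0])   # stand-in for the style target
x0 = np.zeros_like(c)                # starting "image"

x_opt, final_loss, info = fmin_l_bfgs_b(loss_and_grad, x0=x0, args=(c, s, 1.0, 3.0))
print(x_opt, final_loss)
```

For this toy loss the minimizer is the weighted average (a * c + b * s) / (a + b), which makes it easy to see how the weights pull the result toward one target or the other.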

An Introduction to TensorFlow

The library is called TensorFlow, and its central unit of data is the tensor. A tensor is a set of values arranged in an array with any number of dimensions, from zero upward. The “rank” of a tensor is the number of dimensions of that array.


3                  # A rank 0 tensor
[1, 2, 3]          # A rank 1 tensor
[[1, 2, 3]]        # A rank 2 tensor
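The ranks above can be checked with NumPy, used here as a stand-in for TensorFlow since both count array dimensions the same way. The variable names are only for illustration.

```python
import numpy as np

examples = [3, [1, 2, 3], [[1, 2, 3]]]

# np.ndim gives the number of dimensions (the rank); np.shape gives the size of each one.
ranks = [np.ndim(value) for value in examples]
shapes = [np.shape(value) for value in examples]

for value, rank, shape in zip(examples, ranks, shapes):
    print(value, "rank:", rank, "shape:", shape)
```

Note that the last example has rank 2 even though it holds a single row: the rank counts the levels of nesting, not the number of elements.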

TensorFlow Dev Summit 2017 notes

TensorFlow Dev Summit Videos


Keynote

  • History:
    • DistBelief in 2012
      • Scalable and worked well in production
      • But not flexible
        • Designed for CPUs, GPUs were bolted on
        • Worked for simple models but sequence models and reinforcement learning problems were hard to express
    • TensorFlow
      • Supports many platforms
        • CPU, GPU, Android, iOS, Raspberry Pi, ASICs (e.g. TPU)
      • Supports cloud platforms and virtual machines
      • Supports many languages
      • Visualization and monitoring tools
        • TensorBoard
    • Progress to date
      • Nov '15 (v0.5): Launched
      • Dec '15 (v0.6): Faster on GPUs; Python 3.3+
      • Feb '16 (v0.7): TF Serving
      • Apr '16 (v0.8): Distributed TensorFlow
      • Jun '16 (v0.9): iOS; Mac GPU
      • Aug '16 (v0.10): TF Slim
      • Oct '16 (v0.11): HDFS; CUDA 8; CuDNN 5
      • Nov '16 (v0.12): Windows 7, 10, and Server 2016; TensorBoard Embedding Visualizer
      • Feb '17 (v1.0): XLA (Accelerated Linear Algebra); TF debugger; pip package; better API; stable API
  • Applications:
    • Word lens (augmented reality; translating text in images to other languages)
    • Spam detection
    • Smart reply (“Are you free tomorrow?”–> “Yeah, what’s up?”)
    • Cucumber sorter
    • Dermatologist-level classification of skin cancer
    • Ophthalmologist-level detection of diabetic retinopathy


XLA

  • Just-in-time (JIT) compiler for TF
    • Better on GPUs, worse on CPUs (for now)
  • Ahead-of-time (AOT) compiler for TF
    • Smaller binaries on mobile (600KB instead of 2.6MB)
  • Makes models faster
  • Makes models smaller
  • Experimental


TensorBoard

  • Visualizing the computation graph
    • tf.summary.FileWriter
      • A Python class that writes data for TensorBoard
      • Code example
        writer = tf.summary.FileWriter("/tmp/mnist_demo/1")
        writer.add_graph(sess.graph)  # write the graph so TensorBoard can display it
        $ tensorboard --logdir /tmp/mnist_demo/1

  • Summaries
    • tf.summary.scalar
    • tf.summary.image
    • tf.summary.audio
    • tf.summary.histogram
    • Code example
        merged_summary = tf.summary.merge_all()
        writer = tf.summary.FileWriter("/tmp/mnist_demo/3")
        for i in range(2001):
            batch = mnist.train.next_batch(100)
            feed_dict = {x: batch[0], y: batch[1]}
            if i % 5 == 0:  # record summaries on every 5th step
                s = sess.run(merged_summary, feed_dict=feed_dict)
                writer.add_summary(s, i)
            sess.run(train_step, feed_dict=feed_dict)
        $ tensorboard --logdir /tmp/mnist_demo/3
  • Hyperparameter search
    • Parent directory
    • Child directory for each hyperparameter
  • Embedding visualizer

High-Level API

  • Layers: Layer-oriented API (like Keras)
    • tf.layers
    • An API that maps to architecture descriptions
    • Compatible with Keras
  • Estimator
    • model_fn
    • API:
      • train_op: fit()
      • eval_op: evaluate()
      • predictions: predict()
      • export_savedmodel()
        • For TensorFlow Serving
  • Canned Estimators

Integrating Keras & TensorFlow

  • Keras: An API for building deep learning models
  • Integrating with TensorFlow means:
    • New features with the Keras API:
      • Distributed training
      • Cloud ML
      • Hyperparameter tuning
      • TF Serving
  • Take-aways slide:
    • For TF users: an accessible high-level API with good defaults
    • For Keras users: powerful TF features for your Keras models
    • tf.contrib.keras by TF 1.1 (mid-March 2017)
    • tf.keras by TF 1.2
    • A big step in making TensorFlow and deep learning accessible to as many people as possible.

TensorFlow at DeepMind

  • DeepMind moved to TensorFlow
  • Data center cooling
  • AlphaGo
  • WaveNet
    • Natural-sounding speech
    • Music generation (nonvocal)
  • Learning to learn by gradient descent by gradient descent
    • Trained a neural network to train a neural network

Skin Cancer Image Classification

  • Stanford
  • AI lab + medical school
  • Skin cancer
    • Most common cancer in the US
    • 1 in 5 Americans will develop skin cancer
    • Estimate for 2017: 87k new cases of melanoma and 9.7k deaths from it.
    • Survival rate for melanoma is 98% if detected early
    • Estimate for 2020: 6.1 billion smartphones in circulation
  • “If your program can differentiate between hundreds of dog breeds, I believe it could make a great contribution to dermatology.” - Dr. Rob Novoa, Jan 27, 2015, to the Stanford AI lab
  • Dataset:
    • 129k images, 2k diseases
    • Base classes: 2k
    • Superclasses: Benign, ambiguous, dangerous, deadly
  • Training:
    • “We find that training on finer classes results in better performance.”
    • If they want less-fine classification, they sum the probabilities of the constituent classes.
    • Transfer learning with Inception-V3 (V3 worked better than V1)
  • Evaluation:
    • Sensitivity: true positives / all actual positives
    • Specificity: true negatives / all actual negatives
    • Confusion matrices
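Two of the ideas above lend themselves to a short sketch: summing fine-class probabilities into a coarser class, and computing sensitivity and specificity from binary labels. The class names, probabilities, and labels below are made up for illustration, and the helper names are hypothetical.

```python
def superclass_probabilities(fine_probs, superclass_of):
    """Sum the probabilities of the fine classes that belong to each superclass."""
    totals = {}
    for fine_class, prob in fine_probs.items():
        parent = superclass_of[fine_class]
        totals[parent] = totals.get(parent, 0.0) + prob
    return totals

def sensitivity_specificity(y_true, y_pred):
    """Return (true positives / actual positives, true negatives / actual negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    positives = sum(1 for t in y_true if t)
    negatives = len(y_true) - positives
    return tp / positives, tn / negatives

# Made-up fine classes mapped onto two of the superclasses.
fine_probs = {"class_a": 0.5, "class_b": 0.25, "class_c": 0.25}
superclass_of = {"class_a": "dangerous", "class_b": "dangerous", "class_c": "benign"}
coarse = superclass_probabilities(fine_probs, superclass_of)

# Made-up binary labels: 1 = disease present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(coarse, sens, spec)
```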

Mobile and Embedded TensorFlow

  • A low-level talk
  • Offering unique user experiences
    • Real-time translations (e.g. Word Lens)
    • Predicting next words on keyboards
    • Scanning old photos
    • Detecting diseases real-time
  • Working closely with hardware builders
    • ARM, CEVA, Movidius, IBM, Intel, Qualcomm
  • TF support for:
    • Android, iOS, Raspberry Pi
  • Tutorials for getting started with TF + Android/iOS/RaspberryPi
  • TF + Android examples
    • TF Classify
    • TF Detect
    • TF Stylize
  • Managing binary size

Distributed TensorFlow

  • Data parallelism
  • Model parallelism
  • Very large models (wide and deep)
  • Outrageously large models
  • Core concepts of distributed TF:
    • Replicating your model
    • Device placement for Variables
    • Sessions and Servers
    • Fault tolerance

TensorFlow Ecosystem: Integrating TensorFlow with Your Infrastructure

  • Data preparation, training, and serving.
  • Data prep
    • Import from various sources
    • Preprocess the data
    • Export in a file format that TF supports (e.g. TFRecords)
  • Data prep tools
    • Apache Spark, Hadoop MapReduce, Beam
  • Training
    • Local vs. Distributed
  • Distributed training tools (cluster managers)
    • Kubernetes, Apache Hadoop, Mesos, Slurm
  • Distributed storage tools
    • Hadoop HDFS, Google Cloud Storage, AWS
  • Container engines:
    • Docker, Rkt
  • Distributed training concepts:
    • Parameter servers
    • Workers
  • TensorBoard is compatible with distributed training
  • Be careful with your choice of file formats (for data and for models)
  • Serving models has nuances; TF Serving handles them:
    • Loading a new version of a model
    • Batching inputs efficiently
    • Isolation between multiple models (by default they contend for hardware resources)
  • When to use in-process TF over TF Serving?:
    • Mobile
    • Batch inference
    • Very strict latency requirements (TF Serving involves round-trip RPCs)
    • Run one fewer service

Serving Models in Production with TensorFlow Serving

  • History of software:
    • 2005: No source control
    • 2010: Source control & continuous build, but not for ML
    • 2017: Great tools for ML but still have a way to go
  • “Just because we have a best practice, doesn’t mean that everyone uses it yet.”
  • Goal: Develop best practices for ML, and make them the default configuration for ML tools
  • “Serving”: How you use your ML model after you’ve trained it
  • Continually training models and continually deploying them
  • TF Serving: A flexible, high-performance serving system for machine learned models, designed for production environments

ML Toolkit

  • Tools
    • Linear / logistic regression
    • Clustering
      • KMeans clustering
      • Gaussian mixture model (GMM)
    • WALS matrix factorization
    • Support vector machine (SVM)
    • Stochastic dual coordinate ascent (SDCA)
    • Random forest
    • DNN, RNN, LSTM, Wide & Deep, …
  • Properties (of the implementations of the tools):
    • Usable
      • scikit-learn-inspired Estimator APIs
    • Extensible
      • Combining the tools in novel ways?
      • Chaining a DNN after KMeans?
    • Scalable
      • Distributed implementations of the tools
    • Fast

Sequence Models and the RNN API

  • Case study: Google Translate
    • Sequence to sequence
    • Encoder and decoder
      • Encode from source language to vector
      • Decode from vector to target language
  • Topics
    • Reading sequence data
      • Goal: Pad, but minimize the amount of padding needed
        • Approach 1: Static padding
          • Need to know the maximum length ahead of time
          • Wastes time
          • Wastes space
        • Approach 2: Dynamic padding
          • Max sequence in the batch sets the sequence length
          • This is still pretty close to static padding in terms of waste
        • Approach 3: Bucketing
          • When reading a sequence, place it into the shortest bucket it can fit into
        • Approach 4: Truncated BPTT via State Saver
          • ?
    • The RNN API
      • A library of RNN architectures
      • A flexible API for implementing your own RNN architectures
    • Fully dynamic calculation
      • Fast and memory efficient custom loops
      • Goal: Handle sequences of unknown length
    • Fused RNN cells (flexibility vs. efficiency)
      • Optimization for special cases
      • XLA fused time steps (use on GPUs, not CPUs; no speedup on embedded devices, but makes the models smaller)
      • Recommendation: Try XLA and benchmark your intended use case
    • Dynamic decoding
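The bucketing approach from the list above (place each sequence into the shortest bucket it fits into, then pad to that bucket's length) can be sketched in plain Python. This is not the TensorFlow RNN API; the function name, bucket lengths, and padding value are assumptions made for the example.

```python
import bisect

def bucket_and_pad(sequences, bucket_lengths, pad_value=0):
    """Group sequences by the shortest bucket that fits and pad them to that length."""
    bucket_lengths = sorted(bucket_lengths)
    buckets = {length: [] for length in bucket_lengths}
    for seq in sequences:
        # Index of the first bucket whose length is >= len(seq).
        idx = bisect.bisect_left(bucket_lengths, len(seq))
        if idx == len(bucket_lengths):
            raise ValueError("sequence is longer than the largest bucket")
        length = bucket_lengths[idx]
        buckets[length].append(list(seq) + [pad_value] * (length - len(seq)))
    return buckets

sequences = [[1, 2], [3, 4, 5, 6, 7], [8], [9, 10, 11, 12]]
buckets = bucket_and_pad(sequences, bucket_lengths=[4, 8])
for length, padded in buckets.items():
    print(length, padded)
```

With static padding to length 8, these four sequences would need 20 padding values in total; with the two buckets they need 8, which is the saving the approach is after.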

Wide & Deep Learning: Memorization + Generalization with TensorFlow

  • Wide: Memorization: “Seagulls can fly.” and “Pigeons can fly.”
  • Deep: Generalization: “Animals with wings can fly.”
  • Wide & Deep: Generalization + memorizing exceptions: “Animals with wings can fly, but penguins cannot fly.”
  • Wide & Deep tutorials available; The Wide & Deep Project

Magenta: Music and Art Generation

  • Can ML generate compelling media?
    • Compelling media
      • Music
      • Images and video
      • Text
        • Jokes
        • Stories
  • Interface for creative coders and artists
    • pip install magenta
  • The importance of critical feedback
    • Musicians and listeners
  • Music and art co-evolve with technology
    • Turning technologies into artistic instruments
      • Using them in novel, unintended ways
      • “Breaking” them
      • Merging them
  • Examples:
    • Image inpainting
    • Music generation (similar to image inpainting)
    • Artistic style transfer
  • “If you’re not using TensorBoard, you want to be using TensorBoard.”
  • magenta.tensorflow.org

Case Study: TensorFlow in Medicine - Retinal Imaging

  • Diabetic retinopathy: fastest growing cause of blindness
  • 415M people with diabetes, each at risk of going blind
  • Regular screening is key to preventing blindness (e.g. once a year)
    • The doctor takes a picture of the back of the eye with a special camera, and then looks at the picture
  • Shortage of eye doctors to do this task
  • Doctors are inconsistent with each other
  • Dataset: 30k images, 880k diagnoses (labels)
    • Hired an army of doctors to label the images (54 doctors)
  • F-score of model: 0.95; Median F-score of doctors: 0.91
  • How did TF help?
    • Quick prototyping
      • Starter architectures
      • Pre-trained models
    • Experiment at scale
      • GPU support
      • Fast training
    • The above helped them focus on their other problems:
      • Finding the right problem
      • Getting the data & labels
      • Validating & deploying
  • Next steps:
    • Validation of the model by hospitals
    • Custom hardware to take the images and run the models
      • Low-cost and easy to use hardware

(Jeremy Howard) #2

(sravya8) #3

I recommend not using -nd as all files in future will be dumped in the same folder, rest of the flags are useful.

(Jeremy Howard) #4

Here’s the description of that flag, for those (like me) that weren’t aware of it:

Do not create a hierarchy of directories when retrieving recursively. With this option turned on, all files will get saved to the current directory, without clobbering (if a name shows up more than once, the filenames will get extensions ‘.n’).

Whether you use it I guess depends on whether you create your own folder structure in advance or not.

(lin.crampton) #9

made a text file transcript of lesson 8 (otherwise known as part2 lesson1). file is lesson8.txt and it resides at https://drive.google.com/open?id=0BxXRvbqKucuNVUFBU3NIT1dtb3M

let me know if you see things i should change


(Ljubomir Buturovic) #11

I’d like to share a tensorflow tip

You may have noticed that tensorflow outputs messages like these when it starts:

import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally

I find these unhelpful because they make it harder to see the actually useful output.

To control the level of messages produced by tensorflow use the environment variable
TF_CPP_MIN_LOG_LEVEL. If you set it to 1, tensorflow will produce only W (Warning)
and E (Error) messages, not I (Information):


If you set it to 2, tensorflow will only output E messages.
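One way to set the variable from inside a Python script is sketched below. The helper name is hypothetical, and setting the variable before the tensorflow import is an assumption about the usual approach; exporting the variable in the shell before starting Python is another option.

```python
import os

def set_tf_log_level(level):
    """Set TF_CPP_MIN_LOG_LEVEL; meant to be called before tensorflow is imported."""
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = str(level)

set_tf_log_level(1)  # 1: hide I messages; 2: also hide W messages

# import tensorflow as tf  # import only after the variable has been set
print(os.environ["TF_CPP_MIN_LOG_LEVEL"])
```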

For the record, I found this here (the second answer, by user “craymichael”):

There are several other solutions proposed on stackoverflow, but they don’t seem
to work in tensorflow 1.0.0.

Hope this is useful



@lin.crampton Thanks a lot for the transcript Lin, this is great :wink:

(Jeremy Howard) #13

Thanks folks for all the great edits to the wiki! Lots of good info there now. :slight_smile:

(Jeremy Howard) #14

@Matthew those tensor summit notes are great! Any interest in writing them in prose and turning it into a blog post? That would be of interest to a lot of people, I think.

(Matthew Kleinsmith) #15

No, but anyone is welcome to take the notes and expand and modify them into whatever they’d like.

When I reflect on my skims through the videos, here’s what comes out of my head (not much):

  • Inner thought
    • Video title
  • TensorFlow is newer than I thought (late 2015). It’s growing fast. Wow!
    • Keynote
  • When deploying, check out XLA and TF Serving.
    • XLA
    • Distributed TensorFlow
    • TensorFlow Ecosystem: Integrating TensorFlow with Your Infrastructure
    • Serving Models in Production with TensorFlow Serving
  • When training, use TensorBoard to improve one’s intuition for models and to look for bugs.
    • TensorBoard
  • Keras is heavily influencing TensorFlow’s API. Sweet.
    • High-level API
    • Integrating Keras & TensorFlow
  • A bunch of applications. Hopefully one day I’ll think of one or discover one. One approach to attempting an application seems to be “get the data, and then throw every trick in the book at it.”
    • TensorFlow at DeepMind
    • Skin Cancer Image Classification
    • Magenta: Music and Art Generation
    • Case Study: TensorFlow in Medicine - Retinal Imaging
  • TF on mobile. Deep learning on mobile. Oh yeah, mobile is a thing. I should keep that in mind when looking for applications. When working with mobile, google “mobile and embedded tensorflow”
    • Mobile and Embedded TensorFlow
  • TF is trying to replace scikit-learn? Hopefully this means more for users than just a change in import statements and syntax (e.g. faster code; better APIs)
    • ML Toolkit
  • RNNs. Not focused on them yet. I’ll return to these nuances when I am. I’m glad to know these nuances exist, even if I don’t fully understand them yet.
    • Sequence Models and the RNN API
  • I thought memorization was a bad thing. Memorizing exceptions to generalizations seems like a good thing. I never connected memorization to network width. Cool.
    • Wide & Deep Learning: Memorization + Generalization with TensorFlow

(lin.crampton) #16

created a text transcript earlier in the week, https://drive.google.com/open?id=0BxXRvbqKucuNVUFBU3NIT1dtb3M.

if there’s something that needs to be changed, let me know.

(Jeremy Howard) #17

Thanks @lin.crampton - I’ve added those captions directly to the video now.

(sravya8) #18

You are just awesome @lin.crampton! Thank you!

(Mariya) #19

I built my own deep learning server this week and migrated my old homework code off of AWS. My neural-style.ipynb which ran fine on AWS instance is now producing this error. Ideas on how to fix?

(Jeremy Howard) #20

Looks like you have keras 2. Install keras 1 instead - there’s a thread on the main forum about this.


Hi looking for the style images used in the lesson8 neural_style notebook

(Jeremy Howard) #22

Just find some you like through Google Images! :slight_smile:


Please could someone explain what the ‘f’ stands for in the following code snippet? When I see this form of code, my instinct is to remove the ‘f’, because the syntax highlighting treats the rest of the line as a string and removing the ‘f’ corrects the highlighting. This is confusing.

imsave(f'{path}/results/res_at_iteration_{i}.png', deproc(x.copy(), shp)[0])

Thanks :blush:

(Jeremy Howard) #24

Python 3.6 string formatting. Upgrade jupyter notebook to fix the syntax highlighting.
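A small illustration of what the ‘f’ prefix does, using made-up values for `path` and `i`: the expressions inside the braces are evaluated and substituted into the string, which is why removing the prefix changes the result.

```python
path = "data"  # hypothetical value
i = 3          # hypothetical iteration number

with_f_string = f'{path}/results/res_at_iteration_{i}.png'
with_format = '{}/results/res_at_iteration_{}.png'.format(path, i)
without_prefix = '{path}/results/res_at_iteration_{i}.png'

print(with_f_string)
print(without_prefix)
```

Without the prefix the braces are kept literally, so every iteration would write to the same oddly named file.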

(Jakub Arnold) #25

Just a small note, this will only work in bash and if there already isn’t a file called index.htmlSOMETHING. If there is, the shell auto-expands the command before passing it to wget, resulting in -R index.htmlSOMETHING.

The correct command is using index.html\* or "index.html*" to prevent shell expansion, the one mentioned only works because bash has inconsistent behavior in this case. I’m saying this mostly since people tend to get used to this and then are surprised when the * actually does get expanded. The correct command should be

wget -r -nH -nd -np -R index.html\* http://files.fast.ai/part2/lesson8/

or

wget -r -nH -nd -np -R "index.html*" http://files.fast.ai/part2/lesson8/