Hands-On Machine Learning with Scikit-Learn and TensorFlow (VII)

Posted by Kaiyuan Chen on September 23, 2017

CNN (Chapter 13)

Local receptive field: neurons that react only to a limited region of the visual field. These fields may overlap and together cover the whole visual field.

Convolution: slides one function over another and measures the integral of their pointwise multiplication.

A neuron located in row i, column j of a given layer is connected to the outputs of the neurons in the previous layer located in rows i to i + fh – 1, columns j to j + fw – 1, where fh and fw are the height and width of the receptive field. With a stride s, the neuron is instead connected to rows i × s to i × s + fh – 1 and columns j × s to j × s + fw – 1.
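The index arithmetic above can be sketched in plain Python (`receptive_field` is a helper name of my own, not from the book):

```python
def receptive_field(i, j, fh, fw, s=1):
    """Rows and columns in the previous layer feeding the neuron at (i, j).

    fh, fw: height and width of the receptive field; s: stride.
    Returns ((row_start, row_end), (col_start, col_end)), inclusive.
    """
    return (i * s, i * s + fh - 1), (j * s, j * s + fw - 1)

# Neuron (3, 4) with a 3x3 field and stride 2 reads rows 6-8, columns 8-10:
print(receptive_field(3, 4, 3, 3, s=2))  # ((6, 8), (8, 10))
```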

Zero padding: in order for a layer to have the same height/width as the previous layer, add zeros around the input.
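The effect of padding on the output size can be worked out directly; this helper (my own, not from the book) follows TensorFlow's "SAME"/"VALID" conventions:

```python
import math

def conv_output_size(n, f, s, padding):
    """Output length along one dimension of a convolution.

    n: input size, f: filter size, s: stride.
    "SAME" zero-pads so the output size depends only on the stride;
    "VALID" uses no padding and may ignore some border pixels.
    """
    if padding == "SAME":
        return math.ceil(n / s)
    if padding == "VALID":
        return math.ceil((n - f + 1) / s)
    raise ValueError(padding)

print(conv_output_size(28, 3, 1, "SAME"))   # 28: same size as the input
print(conv_output_size(28, 3, 1, "VALID"))  # 26
print(conv_output_size(28, 3, 2, "SAME"))   # 14
```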


Each neuron can have different weights, and a small image the size of the receptive field can serve as a filter. A CNN can find the most useful filters (e.g. horizontal and vertical line detectors) and combine them into complex patterns.

Feature map: a layer full of neurons using the same filter

  • all neurons on one feature map share the same parameters (weights and bias)
  • a neuron’s receptive field extends across all the feature maps of the previous layer
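Because all neurons on a feature map share one set of weights, the parameter count of a convolutional layer is tiny compared to a fully connected layer. A quick sanity check (`conv_params` is an illustrative helper, not from the book):

```python
def conv_params(fh, fw, in_maps, out_maps):
    """Parameters of a conv layer: one fh x fw x in_maps kernel plus one
    bias per output feature map, shared across all neuron positions."""
    return out_maps * (fh * fw * in_maps + 1)

# 5x5 filters, 3 input channels, 200 feature maps:
print(conv_params(5, 5, 3, 200))  # 15200 parameters, regardless of image size
```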


When creating a convolutional layer, you have to specify the filter size, the stride, and the padding type.

import numpy as np
import tensorflow as tf

# height, width, channels and dataset are assumed to be defined earlier
X = tf.placeholder(tf.float32, shape=(None, height, width, channels))  # mini-batch input
# filters must have shape (filter_height, filter_width, in_channels, out_channels)
filters = np.zeros((7, 7, channels, 2), dtype=np.float32)
filters[:, 3, :, 0] = 1  # vertical line detector
filters[3, :, :, 1] = 1  # horizontal line detector
convolution = tf.nn.conv2d(X, filters, strides=[1, 2, 2, 1], padding="SAME")

# Run it
with tf.Session() as sess:
    output = sess.run(convolution, feed_dict={X: dataset})

Pooling layer

Subsamples (shrinks) the input image to reduce the computational load. It has no weights: it just takes the max or mean over each pooling window.
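What a 2×2 max pooling layer with stride 2 computes can be sketched in NumPy, independent of TensorFlow (a minimal single-channel sketch, not the book's code):

```python
import numpy as np

def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a single-channel image."""
    h, w = image.shape
    # Group pixels into non-overlapping 2x2 blocks, then take each block's max.
    blocks = image[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

img = np.array([[1, 2, 5, 6],
                [3, 4, 7, 8],
                [9, 1, 2, 3],
                [4, 5, 6, 7]])
print(max_pool_2x2(img))
# [[4 8]
#  [9 7]]
```

Each 2×2 block of the input collapses to a single value, halving the height and width.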

Main drawback: high memory usage, especially during training, since the reverse pass of backpropagation needs all the intermediate values computed during the forward pass.


Typical CNN architectures stack a few convolutional layers (each one generally followed by a ReLU layer), then a pooling layer, then another few convolutional layers (+ReLU), then another pooling layer, and so on