Wednesday, February 22, 2017

Drawing deep learning networks

Deep learning: convolutional nets: filter param sizing

I'm still trying to understand the code below from Michael Nielsen's e-book Neural Networks and Deep Learning, but the text doesn't explain where n_in=40*4*4 comes from. The 40 is from the 40 feature maps in the previous layer, but where does the 4*4 come from?
>>> net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 28, 28), 
                      filter_shape=(20, 1, 5, 5), 
                      poolsize=(2, 2), 
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 12), 
                      filter_shape=(40, 20, 5, 5), 
                      poolsize=(2, 2), 
                      activation_fn=ReLU),
        FullyConnectedLayer(
            n_in=40*4*4, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        FullyConnectedLayer(
            n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)], 
        mini_batch_size)
>>> net.SGD(expanded_training_data, 40, mini_batch_size, 0.03, 
            validation_data, test_data)
For instance, if I do a similar analysis in 1D as shown below, what should that n_in term be?
>>> net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 81, 1), 
                      filter_shape=(20, 1, 5, 1), 
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 1), 
                      filter_shape=(40, 20, 5, 1), 
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        FullyConnectedLayer(
            n_in=40*???, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        FullyConnectedLayer(
            n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)], 
        mini_batch_size)
>>> net.SGD(expanded_training_data, 40, mini_batch_size, 0.03, 
            validation_data, test_data)
Thanks!
Answer:
In the given example from the e-book, the 4 comes from (12-5+1)/2, where 12 is the input size (12*12) to the second convolutional layer, 5 is the filter size (5*5) used in that layer, and 2 is the poolsize.
The 12 from the first convolutional layer is obtained the same way: 12 = (28-5+1)/2. This is explained in the chapter itself.
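As a quick sanity check, here is the same arithmetic in plain Python (the conv_pool_out helper is mine, not from the book; it assumes the "valid" convolution with stride 1 and non-overlapping pooling that network3.py uses):
>>> def conv_pool_out(n, f, p):
...     # a "valid" convolution shrinks an n-wide input to (n - f + 1);
...     # non-overlapping pooling of width p then divides that by p
...     return (n - f + 1) // p
...
>>> conv_pool_out(28, 5, 2)   # after the first ConvPoolLayer
12
>>> conv_pool_out(12, 5, 2)   # after the second ConvPoolLayer
4
>>> 40 * 4 * 4                # n_in for the first FullyConnectedLayer
640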
Regarding your "For instance" code, the image_shape in your second ConvPoolLayer is not correct:
ConvPoolLayer(image_shape=(mini_batch_size, 20, 12, 1), 
The 12 should be (81-5+1)/2, which unfortunately is not an integer. You may want to change the filter_shape in the first layer to (6, 1) to make it work; then (81-6+1)/2 = 38, and that line becomes:
ConvPoolLayer(image_shape=(mini_batch_size, 20, 38, 1), 
and, since (38-5+1)/2 = 17, the n_in of your first FullyConnectedLayer should be:
n_in=40*17*1, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
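Putting those corrections together, the full 1D network would read as follows (a sketch against the same API as the book's network3.py; only the first filter_shape, the second image_shape, and the n_in change from your version):
>>> net = Network([
        ConvPoolLayer(image_shape=(mini_batch_size, 1, 81, 1), 
                      filter_shape=(20, 1, 6, 1),   # width 6 so (81-6+1)/2 = 38
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        ConvPoolLayer(image_shape=(mini_batch_size, 20, 38, 1), 
                      filter_shape=(40, 20, 5, 1),  # (38-5+1)/2 = 17
                      poolsize=(2, 1), 
                      activation_fn=ReLU),
        FullyConnectedLayer(
            n_in=40*17*1, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        FullyConnectedLayer(
            n_in=1000, n_out=1000, activation_fn=ReLU, p_dropout=0.5),
        SoftmaxLayer(n_in=1000, n_out=10, p_dropout=0.5)], 
        mini_batch_size)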