Residual Neural Networks (ResNets) are a special type of neural network used primarily in image processing. They are characterized by very deep architectures that nevertheless achieve low error rates.
What architecture has been used in image recognition so far?
After the great success of a Convolutional Neural Network (CNN) at the ImageNet competition in 2012, CNNs became the dominant architecture in computer vision. The approach is modeled on how the human eye works: when we see an image, we automatically split it into many small sub-images and analyze them individually, then assemble these sub-images to process and interpret the whole. How can this principle be implemented in a Convolutional Neural Network?
The work happens in the so-called Convolution Layer. Here we define a filter, which determines how large the sub-images we look at should be, and a stride, which decides how many pixels the filter moves between calculations, i.e. how closely the sub-images lie to each other. This step greatly reduces the dimensionality of the image.
The next step is the Pooling Layer. Computationally, it works much like the Convolution Layer, with the difference that the result keeps only either the average or the maximum value of each window, depending on the application. This condenses the small features that are crucial for solving the task into a few pixels.
Finally, the Convolutional Neural Network ends with a Fully-Connected Layer, as we already know it from regular neural networks. Now that the dimensions of the image have been greatly reduced, we can afford these densely connected layers. Here, the individual sub-images are linked together again in order to recognize the relationships between them and carry out the classification.
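The convolution and pooling steps above can be sketched in plain NumPy (a toy illustration of the mechanics, not the TensorFlow API; the image and filter values are made up):

```python
import numpy as np

def convolve(image, kernel, stride):
    """Slide a square kernel over the image with the given stride (valid padding)."""
    k = kernel.shape[0]
    out_size = (image.shape[0] - k) // stride + 1
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            patch = image[i*stride:i*stride+k, j*stride:j*stride+k]
            out[i, j] = np.sum(patch * kernel)  # one sub-image -> one value
    return out

def max_pool(feature_map, size=2):
    """Keep only the maximum value of each size x size window."""
    out_size = feature_map.shape[0] // size
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            out[i, j] = feature_map[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

image = np.random.rand(8, 8)       # toy grayscale image
kernel = np.ones((3, 3)) / 9.0     # 3x3 averaging filter
features = convolve(image, kernel, stride=1)  # 8x8 -> 6x6
pooled = max_pool(features)                   # 6x6 -> 3x3
print(features.shape, pooled.shape)
```

Each stage shrinks the representation, which is exactly what makes the final fully-connected layers affordable.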
What is the problem with deep neural networks?
In order to achieve better results, the architectures used became deeper and deeper: several CNN blocks were simply stacked on top of each other in the hope of improving performance. With deep neural networks, however, the problem of the so-called vanishing gradient arises.
A network is trained via so-called backpropagation: in short, the error travels through the network from back to front. In each layer, the gradient is computed to determine how much each neuron contributed to the error. The closer this process gets to the initial layers, however, the smaller the gradient can become, so that the weights of neurons in the front layers are adjusted only slightly or not at all. As a result, deep network structures often have a comparatively high error.
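A toy calculation illustrates the effect (a sketch, not code from the article): with a sigmoid activation, each layer multiplies the backpropagated gradient by a derivative of at most 0.25, so the signal reaching the front layers shrinks exponentially with depth.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # maximum value is 0.25, reached at x = 0

# Gradient signal reaching the first layer of a 20-layer chain,
# assuming the best case (every pre-activation is exactly 0).
gradient = 1.0
for layer in range(20):
    gradient *= sigmoid_derivative(0.0)

print(gradient)  # 0.25**20, on the order of 1e-13
```

Even in this best case the front layers receive almost no learning signal, which matches the behavior described above.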
In practice, however, we cannot simply blame the decreasing performance on the vanishing gradient problem alone. In fact, it can be handled relatively well with so-called batch normalization layers. The worse performance of deeper neural networks can also be due to the initialization of the layers or to the optimization function.
How do residual neural networks solve the problem?
The basic building blocks of a residual neural network are the so-called residual blocks. The core idea is to build so-called “skip connections” into the network. These ensure that the input of a block is added directly to the output of a layer further ahead.
This architecture allows the network to effectively skip certain layers, especially if they do not contribute anything to a better result. A residual neural network is composed of several of these residual blocks.
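In symbols, a residual block computes y = F(x) + x, where F is the transformation learned by the skipped layers. A minimal NumPy sketch (illustrative only, with made-up weights):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))

def relu(x):
    return np.maximum(x, 0)

def residual_block(x):
    """y = relu(F(x) + x): the skip connection adds the input to the block output."""
    f = relu(x @ W1) @ W2   # F(x): the layers that may be 'skipped'
    return relu(f + x)      # add the shortcut, then activate

x = rng.normal(size=(4,))
y = residual_block(x)
print(y.shape)  # (4,)

# If F contributes nothing (weights ~ 0), the block simply passes x through:
W1_zero, W2_zero = np.zeros((4, 4)), np.zeros((4, 4))
identity_out = relu(relu(x @ W1_zero) @ W2_zero + x)
print(np.allclose(identity_out, relu(x)))  # True
```

The second part shows why unhelpful layers cost little: the block can fall back to (almost) the identity, which a plain stack of layers cannot do as easily.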
What problems can arise with ResNets?
Especially with Convolutional Neural Networks, it can easily happen that the dimensionality at the beginning of the skip connection does not match that at its end, particularly when several layers are skipped: in CNNs, the dimensionality changes in each block through the filters. The skip connection then cannot simply add the input of an earlier layer to the output of a later one.
To solve this problem, the shortcut can be multiplied by a linear projection that aligns the dimensions. In many cases, a 1×1 convolutional layer is used for this purpose. Often, however, no alignment of dimensions is necessary at all.
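A 1×1 convolution is effectively a linear projection applied independently at every pixel, so it can change the number of channels of the shortcut to match the block output. A NumPy sketch of this idea (shapes are assumptions for illustration, not the TensorFlow API):

```python
import numpy as np

rng = np.random.default_rng(1)

# Shortcut input: 8x8 spatial size, 64 channels
x = rng.normal(size=(8, 8, 64))
# Block output: same spatial size, but 128 channels
block_output = rng.normal(size=(8, 8, 128))

# 1x1 convolution = per-pixel matrix multiplication over the channel axis:
# a (64, 128) weight matrix maps 64 input channels to 128 output channels.
W_s = rng.normal(size=(64, 128))
projected_shortcut = np.einsum('hwc,cd->hwd', x, W_s)

print(projected_shortcut.shape)  # (8, 8, 128)
y = block_output + projected_shortcut  # dimensions now match
print(y.shape)  # (8, 8, 128)
```

In Keras this projection would typically be a `Conv2D` layer with kernel size 1 on the shortcut path.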
How to build a ResNet block in TensorFlow?
A ResNet block is relatively easy to program in TensorFlow, especially if you ensure that the dimensions are the same when merging.
In this case, the input first passes through a dense layer with 1024 neurons. This is followed by a block consisting of a dropout layer and two dense layers, which first reduce the number of neurons to 512 before increasing it again to 1024. Then the merging takes place in the add layer. Since both inputs have a dimensionality of 1024, they can be added without any problems.
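The block described above might look roughly as follows with the Keras functional API (a sketch: the layer sizes follow the text, while details such as the dropout rate and activations are assumptions):

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(1024,))

# Main path: dense layer with 1024 neurons
x = tf.keras.layers.Dense(1024, activation="relu")(inputs)

# Residual branch: dropout, bottleneck to 512, back up to 1024
f = tf.keras.layers.Dropout(0.3)(x)  # dropout rate is an assumption
f = tf.keras.layers.Dense(512, activation="relu")(f)
f = tf.keras.layers.Dense(1024, activation="relu")(f)

# Skip connection: both tensors have 1024 units, so they can be added directly
outputs = tf.keras.layers.Add()([x, f])

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```

Because the last dense layer restores the dimensionality to 1024, the `Add` layer needs no projection on the shortcut.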
This is what you should take with you
- Residual Neural Networks, or ResNets for short, offer a way to train deep neural networks without a high error rate.
- For this purpose, they are composed of many so-called residual blocks, which are characterized by a skip connection.
- The skip connection allows the network to skip one or more layers if they do not improve the result.
Other Articles on the Topic of ResNets
- Here you can find the original paper on Residual Neural Networks: Deep Residual Learning for Image Recognition.