[UDL Study Notes] Chapter 8 - Measuring performance

Overview

This series of posts is a set of study notes recording my progress through the book “Understanding Deep Learning”. This post covers Chapter 8, Measuring performance.

Understanding Deep Learning

https://udlbook.github.io/udlbook/

Noise, Bias, Variance

The book explains noise, bias, and variance in detail, but I did not follow it on the first read. After reading it several times, the meaning became clear. Put simply, noise is the uncertainty that comes from the data itself. For example, a wrong value may have been recorded by mistake when the data was first created, or the input alone may not determine the correct output, so several outputs remain possible for the same input.

Bias is related to the model's capacity. It arises because the set of functions the model can represent is limited, so the model cannot match the true underlying function exactly.

Variance depends on the particular training data. A training set can deviate from the true underlying function, so a model trained on it differs from a model trained on another training set, for example one drawn with a different batch or a different seed. Variance refers to this spread among the trained models.

In summary, noise comes from the input data itself, bias from the model's limited capacity, and variance from the differences between one training set and another.
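To make the three terms concrete, here is a minimal simulation sketch, not taken from the book. It assumes a hypothetical true function `sin(2πx)`, Gaussian noise with standard deviation `sigma`, and polynomial models whose degree stands in for capacity. To keep it simple, the training inputs sit on a fixed grid and only the noise changes between training sets. All names and default values are illustrative assumptions.

```python
import numpy as np

def true_fn(x):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(2 * np.pi * x)

def estimate_noise_bias_variance(degree, n_train=20, n_datasets=500,
                                 sigma=0.3, seed=0):
    """Estimate noise, squared bias, and variance for polynomial fits."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_train)  # fixed inputs (an assumption)
    x_test = np.linspace(0.0, 1.0, 50)
    preds = np.empty((n_datasets, x_test.size))
    for d in range(n_datasets):
        # Each training set differs only in its noise realization.
        y_train = true_fn(x_train) + rng.normal(0.0, sigma, n_train)
        coeffs = np.polyfit(x_train, y_train, degree)
        preds[d] = np.polyval(coeffs, x_test)
    mean_pred = preds.mean(axis=0)  # average model over all training sets
    noise = sigma ** 2              # irreducible, set by the data itself
    bias_sq = np.mean((mean_pred - true_fn(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    return noise, bias_sq, variance

for degree in (1, 3):
    noise, bias_sq, variance = estimate_noise_bias_variance(degree)
    print(f"degree={degree}: noise={noise:.3f}, "
          f"bias^2={bias_sq:.3f}, variance={variance:.3f}")
```

Under these assumptions, the straight line (degree 1) should show the larger squared bias because it cannot bend to follow the sine curve, while the cubic (degree 3) should show the larger variance because it reacts more to each noise realization. The noise term is the same for both, since it does not depend on the model.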

Fixed Weights and Biases

Problem 8.3 asks us to express the parameters of the simple model in the left image of the two figures above in closed form. The problem adds the condition that the weights and biases between the input and hidden layers are fixed. At first I did not understand what this meant, or what actually differs between the fixed and non-fixed cases. Once you see it, it seems almost too obvious: compare the model with the right image of the two figures above, from Chapter 3, Shallow neural networks. The difference is that the parameters between the input and hidden layers are literally held fixed rather than learned. With them fixed, the hidden unit activations are known functions of the input, so the output is linear in the remaining parameters, and a least squares fit of those parameters has a closed-form solution.
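The following is a minimal sketch of that idea, not the book's solution. It assumes ReLU hidden units with slope 1 and fixed "joint" positions, a least squares loss, and a bias `beta` plus one weight `omega` per hidden unit in the output layer. The joint positions, function names, and symbols are hypothetical choices for illustration.

```python
import numpy as np

def hidden_activations(x, joints):
    # First layer is fixed: each ReLU unit switches on at a preset joint.
    return np.maximum(0.0, x[:, None] - joints[None, :])

def fit_output_layer(x, y, joints):
    """Closed-form least squares fit of [beta, omega_1, ..., omega_D]."""
    h = hidden_activations(x, joints)
    # A column of ones for beta, then one column per hidden unit.
    design = np.column_stack([np.ones_like(x), h])
    # The model is linear in these parameters, so ordinary least squares
    # gives the solution directly, with no gradient descent needed.
    params, *_ = np.linalg.lstsq(design, y, rcond=None)
    return params

def predict(x, params, joints):
    return params[0] + hidden_activations(x, joints) @ params[1:]

# Usage: recover known output-layer parameters from noiseless data.
joints = np.array([0.0, 1.0 / 3.0, 2.0 / 3.0])  # assumed fixed joints
x = np.linspace(0.0, 1.0, 30)
true_params = np.array([0.5, 1.0, -2.0, 3.0])   # illustrative values
y = predict(x, true_params, joints)
print(fit_output_layer(x, y, joints))
```

If the joints were also learnable, the output would no longer be linear in all the parameters, and this direct solve would not apply.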

Reference

[1] Prince, S. J. D. (2023). Understanding Deep Learning. The MIT Press. Retrieved from http://udlbook.com