Hyperparameter Tuning
Tuning Process
Try random values for the hyperparameters; don't use a grid
Coarse to fine: zoom in on the best-performing region and sample more densely there (see the sketch below)
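A minimal sketch of random search with a coarse-to-fine pass. This is not the lecture code; `evaluate` is a hypothetical stand-in for training a model and returning its dev-set score, and the ranges are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def evaluate(hidden_units, minibatch_size):
    # Hypothetical objective; in practice this would train a model and return dev-set accuracy.
    return -((hidden_units - 70) ** 2 + (minibatch_size - 128) ** 2)

# Coarse pass: sample both hyperparameters at random (not on a grid),
# so every trial tries a new value of each hyperparameter.
coarse = [(rng.integers(10, 200), rng.integers(32, 512)) for _ in range(25)]
best = max(coarse, key=lambda hp: evaluate(*hp))

# Fine pass: re-sample more densely in a smaller region around the best coarse point.
fine = [(rng.integers(max(1, best[0] - 20), best[0] + 20),
         rng.integers(max(1, best[1] - 32), best[1] + 32)) for _ in range(25)]
best = max(fine, key=lambda hp: evaluate(*hp))
```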
Using an Appropriate Scale to pick Hyperparameters
For example, sample \( \alpha \) on a log scale
For \( \beta \), sample \( 1 - \beta \) on a log scale instead
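A minimal sketch of sampling on an appropriate scale, assuming the usual example ranges \( \alpha \in [10^{-4}, 1] \) and \( \beta \in [0.9, 0.999] \):

```python
import numpy as np

rng = np.random.default_rng(0)

# Learning rate alpha in [0.0001, 1]: sample the exponent uniformly (log scale),
# so each decade (0.0001-0.001, 0.001-0.01, ...) is equally likely.
r = rng.uniform(-4, 0)
alpha = 10 ** r

# Momentum beta in [0.9, 0.999]: sample 1 - beta on a log scale instead of beta itself,
# because beta (roughly averaging over 1 / (1 - beta) values) is most sensitive near 1.
r = rng.uniform(-3, -1)
beta = 1 - 10 ** r
```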
Hyperparameters Tuning in Practice: Pandas vs. Caviar
Babysitting one model vs Training many models in parallel
Batch Normalization
Normalizing Activations in a Network
Batch Norm: normalize Z, then set its mean and variance to the learnable parameters \( \beta, \gamma \)
Fitting Batch Norm into a Neural Network
If we use batch norm, the bias term \( b^{[l]} \) becomes meaningless (subtracting the mini-batch mean cancels it), so it can be removed; \( \beta^{[l]} \) plays that role instead (sketched below).
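A minimal sketch of the batch-norm transform for one layer on one mini-batch, assuming \( Z \) has shape (units, batch size); `gamma` and `beta` are the learnable scale and shift, and the usual bias `b` is left out because the mean subtraction would cancel it anyway.

```python
import numpy as np

def batch_norm_forward(Z, gamma, beta, eps=1e-8):
    mu = np.mean(Z, axis=1, keepdims=True)     # per-unit mean over the mini-batch
    var = np.var(Z, axis=1, keepdims=True)     # per-unit variance over the mini-batch
    Z_norm = (Z - mu) / np.sqrt(var + eps)     # mean 0, variance 1
    Z_tilde = gamma * Z_norm + beta            # learnable variance (gamma) and mean (beta)
    return Z_tilde, mu, var

# Usage: Z = W @ A_prev (no + b); the activation g is then applied to Z_tilde.
```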
Why does Batch Norm work?
Covariate shift
Covariate shift means the training set and test set have different input distributions, while the relationship between x and y stays the same.
Even a small change in the input can be amplified as it passes through the layers; normalization limits how much the distribution of each layer's inputs shifts.
Each mini-batch is scaled by that mini-batch's own mean and variance.
This adds some noise to the z values, which gives a slight regularization effect similar to dropout.
Batch Norm at Test Time
z -> normalize (mean 0, variance 1) -> change its mean and variance using the learnable \( \gamma, \beta \). At test time, \( \mu \) and \( \sigma^2 \) come from exponentially weighted averages estimated across the training mini-batches.
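A minimal sketch of that test-time behavior; the smoothing constant 0.9 is an assumption, not a value fixed by the lecture.

```python
import numpy as np

class BatchNormStats:
    """Tracks exponentially weighted averages of a layer's mini-batch statistics."""

    def __init__(self, units, momentum=0.9):
        self.mu = np.zeros((units, 1))
        self.var = np.ones((units, 1))
        self.momentum = momentum

    def update(self, batch_mu, batch_var):
        # Called once per training mini-batch with that batch's mean and variance.
        self.mu = self.momentum * self.mu + (1 - self.momentum) * batch_mu
        self.var = self.momentum * self.var + (1 - self.momentum) * batch_var

    def normalize(self, z, gamma, beta, eps=1e-8):
        # At test time a single example is normalized with the running estimates.
        z_norm = (z - self.mu) / np.sqrt(self.var + eps)
        return gamma * z_norm + beta
```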
Multi-class Classification
Softmax Regression
Compute each class's probability
Training a Softmax Classifier
\( z^{[L]} \rightarrow t = e^{z^{[L]}} \rightarrow a^{[L]}_i = \frac{t_i}{\sum_j t_j} \)
hardmax -> 1 (100%) for the largest entry, 0 (0%) for the others
softmax -> a probability for every class (see the sketch below)
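A minimal numpy sketch of the two, following the \( z^{[L]} \rightarrow t \rightarrow a^{[L]} \) steps above; subtracting the max is only for numerical stability and does not change the result.

```python
import numpy as np

def softmax(z):
    t = np.exp(z - np.max(z))      # exponentiate (shifted for numerical stability)
    return t / np.sum(t)           # normalize so the entries form a probability distribution

def hardmax(z):
    a = np.zeros_like(z, dtype=float)
    a[np.argmax(z)] = 1.0          # 1 for the largest entry, 0 for the others
    return a

z = np.array([5.0, 2.0, -1.0, 3.0])
print(softmax(z))   # approx. [0.842, 0.042, 0.002, 0.114] -> probabilities that sum to 1
print(hardmax(z))   # [1. 0. 0. 0.]
```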
Introduction to Programming Frameworks
Deep Learning Frameworks
Choosing deep learning frameworks
- Ease of programming (development and deployment)
- Running speed
- Truly open (open source with good governance)