
Google Machine Learning Bootcamp 2022

1. ML Strategy Introduction to ML Strategy Why ML Strategy? Ideas to improve an ML model: collect more data, collect a more diverse training set, train the algorithm longer with gradient descent, try Adam instead of gradient descent, try a bigger network, try a smaller network, try dropout, add L2 regularization, modify the network architecture, ... Naturally, the best approach is to analyze the ML problem and try the idea that looks most effective (ML strategy). Ort.. 2022. 7. 20.
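As a hedged illustration of two of the ideas listed above (dropout and L2 regularization, plus swapping in Adam), a minimal Keras sketch might look like the following; the layer sizes, rates, and loss are arbitrary placeholders, not values from the course.

    # Minimal sketch (assumed sizes/rates): a small network that adds
    # L2 weight regularization and dropout, two of the ideas listed above.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(
            64, activation="relu",
            kernel_regularizer=tf.keras.regularizers.l2(0.01)),  # L2 penalty on weights
        tf.keras.layers.Dropout(0.5),                            # randomly drop 50% of units
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

    # "try Adam instead of gradient descent" amounts to swapping the optimizer:
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])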
3. Hyperparameter Tuning, Batch Normalization and Programming Frameworks Hyperparameter Tuning Tuning Process Try random values for hyperparameters; don't use a grid Coarse to fine Using an Appropriate Scale to Pick Hyperparameters For example, search \( \alpha \) on a log scale For \( \beta \), consider the value of \( 1 - \beta \) Hyperparameters Tuning in Practice: Pandas vs. Caviar Babysitting one model vs. training many models in parallel Batch Normalization Normaliz.. 2022. 7. 15.
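A small NumPy sketch of what "random values on an appropriate scale" could look like; the search ranges below are illustrative assumptions, not values from the course.

    import numpy as np

    rng = np.random.default_rng(0)

    # Learning rate alpha: sample the exponent uniformly, so values are spread
    # evenly on a log scale between 1e-4 and 1e0 (assumed range).
    r = rng.uniform(-4, 0)
    alpha = 10 ** r

    # Momentum beta: sample 1 - beta on a log scale between 0.001 and 0.1,
    # so beta itself lands between 0.9 and 0.999 (assumed range).
    r = rng.uniform(-3, -1)
    beta = 1 - 10 ** r

    print(alpha, beta)

Sampling the exponent rather than the value itself makes, say, 0.0001-0.001 and 0.1-1 equally likely to be explored, which is the point of using a log scale.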
2. Optimization Algorithms Mini-batch Gradient Descent Batch Gradient Descent - one training pass over all m training examples Mini-batch Gradient Descent - t updates per pass, each on a mini-batch of m / t training examples Understanding Mini-batch Gradient Descent Batch Gradient Descent - mini-batch size is m Stochastic Gradient Descent - mini-batch size is 1 In practice, mini-batch size is between 1 and m Exponentially Weighted Averages \( v_t = \beta \).. 2022. 7. 14.
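A rough NumPy sketch of the two ideas in this excerpt: splitting the training set into mini-batches, and the exponentially weighted average \( v_t = \beta v_{t-1} + (1 - \beta)\theta_t \). The array shapes and default batch size are illustrative assumptions.

    import numpy as np

    def random_mini_batches(X, Y, batch_size=64, seed=0):
        """Shuffle (X, Y) and split into mini-batches of about batch_size examples.
        Assumes X has shape (n_x, m) and Y has shape (1, m), one example per column."""
        rng = np.random.default_rng(seed)
        m = X.shape[1]
        perm = rng.permutation(m)
        X, Y = X[:, perm], Y[:, perm]
        return [(X[:, k:k + batch_size], Y[:, k:k + batch_size])
                for k in range(0, m, batch_size)]

    def exponentially_weighted_average(thetas, beta=0.9):
        """v_t = beta * v_{t-1} + (1 - beta) * theta_t, starting from v_0 = 0."""
        v, out = 0.0, []
        for theta in thetas:
            v = beta * v + (1 - beta) * theta
            out.append(v)
        return out

With batch_size equal to m this reduces to batch gradient descent, and with batch_size 1 to stochastic gradient descent, matching the two extremes noted above.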
1. Practical Aspects of Deep Learning #3 Setting Up your Optimization Problem Normalizing Inputs 1. Subtract mean \( \mu = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} \), \( x := x - \mu \) 2. Normalize variance \( \sigma^2 = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)})^{2} \) (element-wise square), \( x := x / \sigma \) Vanishing / Exploding Gradients If the network is very deep, gradients can shrink until they vanish or grow until they explode. Weight Initialization for Deep Networks Numerical Approximation of Gradients Gradient .. 2022. 7. 10.
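A short NumPy sketch of the normalization step above; the shape convention (one example per column) and the reuse of training-set statistics on the test set are assumptions consistent with standard practice, not details quoted from the excerpt.

    import numpy as np

    def normalize_inputs(X_train, X_test):
        """Zero-center and scale to unit variance, using training-set statistics
        for both splits. Assumes shape (n_x, m): one example per column."""
        mu = X_train.mean(axis=1, keepdims=True)           # mu = (1/m) * sum_i x^(i)
        sigma = X_train.std(axis=1, keepdims=True) + 1e-8  # avoid division by zero
        return (X_train - mu) / sigma, (X_test - mu) / sigma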