1. ML Strategy

Introduction to ML Strategy

Why ML Strategy?

Ideas to improve an ML model: collect more data, collect a more diverse training set, train the algorithm longer with gradient descent, try Adam instead of gradient descent, try a bigger network, try a smaller network, try dropout, add L2 regularization, modify the network architecture, ...

Naturally, it is best to analyze the ML problem and try the ideas that look most likely to be effective. (ML strategy)

Orthogonalization

Chain of assumptions in ML

Fit training set well on cost function -> Fit dev set well on cost function -> Fit test set well on cost function -> Performs well in real world

Setting Up your Goal

Single Number Evaluation Metric

Precision: \( \frac{\text{true positives}}{\text{examples the model classifies as positive}} \)

Recall: \( \frac{\text{true positives}}{\text{examples that are actually positive}} \)

F1 score: Harmonic mean of precision and recall ( \( \frac{2}{\frac{1}{\text{precision}} + \frac{1}{\text{recall}}} \) )
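As a minimal sketch of the three definitions above, assuming binary labels encoded as 0/1 with 1 meaning positive (the function name `precision_recall_f1` and the sample labels are made up for illustration):

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels where 1 = positive."""
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    predicted_pos = sum(1 for p in y_pred if p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)

    # Guard against division by zero when there are no positives.
    precision = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Harmonic mean, algebraically equal to 2 / (1/precision + 1/recall).
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labels, only for illustration.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Returning 0.0 when a denominator is empty is one convention among several; some libraries raise a warning instead.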

Dev set + single real-number evaluation metric -> faster iteration

Satisficing and Optimizing Metric

With n metrics: 1 optimizing metric (make it as good as possible) + n - 1 satisficing metrics (each only has to meet a threshold)
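A rough sketch of how such a selection rule could look, assuming accuracy is the metric being optimized and latency is a threshold constraint. The `Candidate` type, the `pick_model` helper, and all numbers are hypothetical:

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    accuracy: float    # metric to maximize
    latency_ms: float  # metric that only has to stay under a limit


def pick_model(candidates, max_latency_ms):
    """Keep models that meet the latency limit, then take the most accurate."""
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None  # no model meets the constraint
    return max(feasible, key=lambda c: c.accuracy)


# Hypothetical candidates, only for illustration.
candidates = [
    Candidate("A", 0.90, 80.0),
    Candidate("B", 0.92, 95.0),
    Candidate("C", 0.95, 1500.0),
]
best = pick_model(candidates, max_latency_ms=100.0)
print(best)
```

Model C has the highest accuracy but is filtered out by the latency limit, so the choice falls to the most accurate of the remaining models.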

Train / Dev / Test Distributions

The development (dev) set is also called the hold-out set or the cross-validation set.

Choose a dev set and a test set that come from the same distribution.
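One simple way to get matching distributions is to pool the evaluation data from every source, shuffle it, and only then split. The sketch below assumes that approach; `split_dev_test` and the region-tagged examples are hypothetical:

```python
import random


def split_dev_test(examples, dev_fraction=0.5, seed=0):
    """Shuffle the pooled examples, then cut them into dev and test sets."""
    shuffled = list(examples)              # copy so the input is not mutated
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical examples tagged with the source they came from.
pooled = [("us", i) for i in range(500)] + [("asia", i) for i in range(500)]
dev_set, test_set = split_dev_test(pooled)
print(len(dev_set), len(test_set))
```

Splitting by source instead (for example, one region as dev and another as test) would give the two sets different distributions, which is what the advice above is meant to prevent.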

Size of the Dev and Test Sets

Old way of splitting data: 70% / 30% (train / test), or 60% / 20% / 20% (train / dev / test)

Set your test set to be big enough to give high confidence in the overall performance of your system.
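To get a feel for what "big enough" means, the sketch below uses the standard normal approximation for a binomial proportion. It assumes the test examples are independent and identically distributed; the helper `accuracy_margin` and the 95% accuracy figure are illustrative, not from the course:

```python
import math


def accuracy_margin(accuracy, n_examples, z=1.96):
    """Approximate half-width of a 95% confidence interval for accuracy.

    Uses the normal approximation: z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(accuracy * (1.0 - accuracy) / n_examples)


# The margin shrinks with the square root of the test set size.
for n in (1_000, 10_000, 100_000):
    print(n, round(accuracy_margin(0.95, n), 4))
```

Because the margin only shrinks with the square root of the number of examples, a tenfold larger test set narrows the interval by roughly a factor of three.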

When to Change Dev / Test Sets and Metrics

If you are not satisfied with the current metric, then you can define another metric you prefer

1. Place target -> 2. Aim / Shoot at target

If doing well on your metric + dev/test set does not correspond to doing well on your application, change your metric and/or dev/test set
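One possible way to redefine a metric is to weight the mistakes that matter most to the application. The sketch below assumes a classification error in which some examples are marked as critical and count ten times as much; `weighted_error`, the weight of 10, and the sample data are all hypothetical:

```python
def weighted_error(y_true, y_pred, weights):
    """Misclassification rate where each example counts by its weight."""
    total = sum(weights)
    wrong = sum(w for t, p, w in zip(y_true, y_pred, weights) if t != p)
    return wrong / total


# Hypothetical data: the first example is one the application must not get wrong.
y_true = [0, 0, 1, 1]
y_pred = [1, 0, 1, 0]
is_critical = [True, False, False, False]
weights = [10.0 if c else 1.0 for c in is_critical]

plain = weighted_error(y_true, y_pred, [1.0] * len(y_true))
weighted = weighted_error(y_true, y_pred, weights)
print(plain, weighted)
```

With uniform weights this reduces to the ordinary error rate, so the weighted version only changes the ranking of models when they differ on the critical examples.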

Comparing to Human-level Performance

Why Human-level Performance?

- Get labeled data from humans

- Gain insight from manual error analysis: why did a person get this right?

- Better analysis of bias/variance

Avoidable Bias

Human-level error as a proxy for Bayes error

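The gap between the training error and the human-level error (strictly speaking, the Bayes error) is the avoidable bias. The training error cannot go below the Bayes error without overfitting, so only this gap is worth trying to close.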

Understanding Human-level Error

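Until your model surpasses human-level performance, comparing the human-level error, the training error, and the dev error tells you whether to focus on reducing bias or on reducing variance.

After that, improving the model becomes much harder, because the human-level error no longer tells you how much avoidable bias is left.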

Surpassing Human-level Performance

Problems where ML significantly surpasses human-level performance

- Online advertising, product recommendations, logistics (predicting transit time), loan approvals, some speech recognition, some image recognition, some medical tasks

Improving your Model Performance

The two fundamental assumptions of supervised learning

1. You can fit the training set pretty well. (~ Avoidable bias)

2. The training set performance generalizes pretty well to the dev/test set. (~ variance)

Human-level <- Avoidable bias -> Training error <- Variance -> Dev error

Reducing avoidable bias: train a bigger model, train longer or use a better optimization algorithm, search over NN architectures / hyperparameters

Reducing variance: get more data, add regularization, search over NN architectures / hyperparameters
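The diagram and the two lists above can be sketched as a small decision helper. It assumes the human-level error stands in for the Bayes error; the function `diagnose` and the error values are hypothetical:

```python
def diagnose(human_error, train_error, dev_error):
    """Return (avoidable_bias, variance, focus) from three error rates."""
    avoidable_bias = train_error - human_error  # gap to the Bayes error proxy
    variance = dev_error - train_error          # gap between train and dev
    # Work on whichever gap is larger; ties go to variance in this sketch.
    focus = "bias" if avoidable_bias > variance else "variance"
    return avoidable_bias, variance, focus


# Same training and dev errors, different human-level error (hypothetical values).
print(diagnose(human_error=0.01, train_error=0.08, dev_error=0.10))
print(diagnose(human_error=0.075, train_error=0.08, dev_error=0.10))
```

The two calls use identical training and dev errors, yet the recommended focus differs, because the human-level error changes how much of the training error counts as avoidable.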

