Bagging, Boosting

Published by onesixx on 16-06-25

https://becominghuman.ai/ensemble-learning-bagging-and-boosting-d20f38be9b1e?gi=dbd9ea407f3d

Methods to optimize Machine Learning models will help you understand ensemble models.
Bagging and Random Forest Ensemble Algorithms for Machine Learning covers the Random Forest algorithm.
A Primer to Ensemble Learning

The gap between the actual value and the predicted value comes from bias, variance, and noise.
The goal of machine learning is to reduce bias and variance (noise is left aside).
Combining several classifiers can reduce variance, which makes the model more robust.
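
The variance claim can be checked with a small simulation. This is only a sketch under a strong assumption: every "model" is an independent, unbiased, noisy guess at the same value. Real ensemble members are correlated, so the reduction is smaller in practice. The names `n_models`, `n_trials` and `truth` are made up for this example.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

n_models <- 25     # hypothetical ensemble size
n_trials <- 5000   # number of repeated experiments
truth    <- 2      # the value every model tries to predict

# Each row is one experiment; each column is one model's noisy prediction.
preds <- matrix(rnorm(n_trials * n_models, mean = truth, sd = 1),
                nrow = n_trials, ncol = n_models)

single_var   <- var(preds[, 1])       # variance of one model on its own
ensemble_var <- var(rowMeans(preds))  # variance of the averaged prediction

cat("single model variance:", round(single_var, 3), "\n")
cat("ensemble variance    :", round(ensemble_var, 3), "\n")
```

Under the independence assumption the ensemble variance should land near `1 / n_models` of the single-model variance, while the average itself stays centered on `truth`.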

Bagging (Bootstrap Aggregation)

Bagging is an ensemble technique that combines statistics computed on samples drawn by the bootstrap (random sampling with replacement).
[An ensemble technique combines a set of weak learners (only slightly better than random guessing) to produce a strong learner.]

> mean(c(1,2,3))
[1] 2

> mean(c(mean(c(1,2)), mean(c(1,3)), mean(c(2,3)),
         mean(c(1,1)),  mean(c(2,2)), mean(c(3,3))))
[1] 2
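
The same idea can be sketched with random draws instead of hand-enumerated ones. This is a minimal illustration, not part of the original post: `B` and the seed are arbitrary choices, and each bootstrap sample here has the same size as `x` (the enumeration above uses samples of size 2).

```r
set.seed(42)  # arbitrary seed, only for reproducibility

x <- c(1, 2, 3)
B <- 2000  # number of bootstrap samples (arbitrary choice)

# Draw B bootstrap samples (with replacement, same size as x)
# and keep the mean of each sample.
boot_means <- replicate(B, mean(sample(x, size = length(x), replace = TRUE)))

# Aggregate: the average of the bootstrap means.
bagged_mean <- mean(boot_means)

cat("original mean:", mean(x), "\n")
cat("bagged mean  :", round(bagged_mean, 3), "\n")
```

The bagged mean should come out close to the original mean, which is what the enumerated example shows exactly.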

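Bagging always uses the same classifier.
When a bootstrap sub-dataset is drawn, every observation has the same probability of being selected.
The number of bootstrap samples equals the number of trees; keep increasing it until accuracy stops improving.
If you grow decision trees deep without pruning and without worrying about overfitting, you get trees with low bias and high variance. Bagging then reduces that variance.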
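
A rough sketch of that effect in base R follows. To avoid depending on a tree package, a 1-nearest-neighbour regressor stands in for the unpruned tree. That substitution is an assumption of this example: both fit the training data exactly, so both have low bias and high variance. The data, `predict_1nn`, `bagged_predict` and `mse` are all made up for illustration.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Toy 1-D regression data (made up for illustration).
n       <- 200
x_train <- runif(n, 0, 2 * pi)
y_train <- sin(x_train) + rnorm(n, sd = 0.3)
x_test  <- seq(0.2, 2 * pi - 0.2, length.out = 100)
y_true  <- sin(x_test)

# Base learner: predict with the single nearest training point.
# Like an unpruned tree, it reproduces the training data exactly.
predict_1nn <- function(x_tr, y_tr, x_new) {
  sapply(x_new, function(x0) y_tr[which.min(abs(x_tr - x0))])
}

# Bagging: fit the same learner on B bootstrap samples, then average.
bagged_predict <- function(x_tr, y_tr, x_new, B = 100) {
  preds <- replicate(B, {
    idx <- sample(length(x_tr), replace = TRUE)  # bootstrap row indices
    predict_1nn(x_tr[idx], y_tr[idx], x_new)
  })
  rowMeans(preds)  # preds has one column per bootstrap sample
}

mse <- function(pred) mean((pred - y_true)^2)

single_mse <- mse(predict_1nn(x_train, y_train, x_test))
bagged_mse <- mse(bagged_predict(x_train, y_train, x_test, B = 100))

cat("single learner MSE:", round(single_mse, 4), "\n")
cat("bagged MSE        :", round(bagged_mse, 4), "\n")
```

Rerunning with a larger `B` is one way to see the point about adding samples until accuracy stops improving.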

Boosting

– https://www.hackerearth.com/practice/machine-learning/machine-learning-algorithms/beginners-tutorial-on-xgboost-parameter-tuning-r/tutorial/
– https://www.analyticsvidhya.com/blog/2015/11/quick-introduction-boosting-algorithms-machine-learning/

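Boosting is an ensemble technique that applies a set of algorithms sequentially and combines them with weighted averages.

Each new classifier is built with a focus on the data that the previous classifier misclassified.
In other words, the new classifier learns based on how well the previous classifier performed.
The next model looks at the previous model's misclassifications (or errors) and moves in the direction that reduces them.
Bagging uses the same classifier to build models that are independent of each other,
whereas Boosting builds a different classifier at each step of a sequential process and combines them with weighted averages, so each model influences the next.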
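
The sequential reweighting can be sketched with AdaBoost, one specific boosting algorithm, using one-threshold decision stumps as weak learners. This is a minimal base-R illustration under assumptions, not a production implementation: the toy data and the helper names (`fit_stump`, `predict_stump`, `adaboost`, `predict_adaboost`) are made up for this example.

```r
set.seed(3)  # arbitrary seed, only for reproducibility

# Toy 1-D data with labels in {-1, +1}: positive only inside an interval,
# so no single threshold (stump) can separate the classes.
x <- runif(200, 0, 10)
y <- ifelse(x > 3 & x < 7, 1, -1)

# Weak learner: the threshold and direction with the lowest weighted error.
fit_stump <- function(x, y, w) {
  best <- list(err = Inf)
  for (thr in sort(unique(x))) {
    for (dir in c(1, -1)) {
      pred <- ifelse(x > thr, dir, -dir)
      err  <- sum(w[pred != y])
      if (err < best$err) best <- list(thr = thr, dir = dir, err = err)
    }
  }
  best
}

predict_stump <- function(s, x) ifelse(x > s$thr, s$dir, -s$dir)

adaboost <- function(x, y, rounds = 20) {
  w      <- rep(1 / length(y), length(y))  # start with equal weights
  stumps <- list()
  alphas <- numeric(0)
  for (m in seq_len(rounds)) {
    s    <- fit_stump(x, y, w)
    err  <- min(max(s$err, 1e-10), 1 - 1e-10)  # guard against log(0)
    a    <- 0.5 * log((1 - err) / err)         # voting weight of this stump
    pred <- predict_stump(s, x)
    w    <- w * exp(-a * y * pred)             # up-weight misclassified points
    w    <- w / sum(w)
    stumps[[m]] <- s
    alphas[m]   <- a
  }
  list(stumps = stumps, alphas = alphas)
}

predict_adaboost <- function(model, x) {
  score <- rep(0, length(x))
  for (m in seq_along(model$stumps)) {
    score <- score + model$alphas[m] * predict_stump(model$stumps[[m]], x)
  }
  ifelse(score >= 0, 1, -1)  # weighted majority vote
}

model        <- adaboost(x, y, rounds = 20)
stump_acc    <- mean(predict_stump(model$stumps[[1]], x) == y)
ensemble_acc <- mean(predict_adaboost(model, x) == y)

cat("first stump accuracy:", round(stump_acc, 3), "\n")
cat("boosted accuracy    :", round(ensemble_acc, 3), "\n")
```

The weight update is the step that makes each model depend on the one before it: points the current stump gets wrong carry more weight when the next stump is fitted.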

Basic idea behind boosting

The task is to group points that have the same sign. The four boxes correspond to four classifiers.

Which is better, Bagging or Boosting?

https://analyticsindiamag.com/primer-ensemble-learning-bagging-boosting/

If the single model's problem is high bias, Bagging does little to reduce that bias.
If the single model's problem is over-fitting, Boosting does not help much, because Boosting itself is prone to over-fitting.

	Bagging	Boosting
Partitioning of data	Random	Higher weight to misclassified samples
Goal to achieve	Minimum variance	Increase accuracy
Methods used	Random subspace	Gradient descent
Functions to combine single models	Average or majority vote (equal weights)	Weighted majority vote
Example	Random Forest	AdaBoost
