A decision tree is a simple algorithm: it greedily splits the data on the best of all possible features. A deeper tree will have high variance and a shallower tree will have high bias. Naturally, we want to find the smallest tree that fits all the data (we call a tree with zero training error consistent). However, in general this…
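Here is a minimal sketch of that greedy splitting for binary classification, assuming a real-valued feature matrix X and labels y in {0, 1}; the helper names (gini, best_split, grow_tree) are illustrative, not from any library:

```python
import numpy as np

def gini(y):
    """Gini impurity of a binary label vector."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 2 * p * (1 - p)

def best_split(X, y):
    """Try every feature and threshold; keep the split that most reduces impurity."""
    best = (None, None, gini(y))  # (feature, threshold, weighted child impurity)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[2]:
                best = (j, t, score)
    return best

def grow_tree(X, y, depth=0, max_depth=3):
    """Split recursively; max_depth is the bias/variance knob discussed above."""
    if depth == max_depth or gini(y) == 0.0:
        return {"leaf": int(np.mean(y) >= 0.5)}  # majority label at this node
    j, t, _ = best_split(X, y)
    if j is None:  # no split improves impurity
        return {"leaf": int(np.mean(y) >= 0.5)}
    mask = X[:, j] <= t
    return {"feature": j, "threshold": t,
            "left": grow_tree(X[mask], y[mask], depth + 1, max_depth),
            "right": grow_tree(X[~mask], y[~mask], depth + 1, max_depth)}

def predict(node, x):
    """Route one example down the tree to its leaf label."""
    while "leaf" not in node:
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node["leaf"]
```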
Random Forest
It is basically bagging of decision trees. Say we bootstrap $latex M$ datasets from the original dataset. We then train a decision tree on each bootstrapped dataset; to ensure diversity and reduce complexity, each split considers only $latex k \approx \sqrt{d}$ randomly chosen dimensions, where $latex d$ is the original dimension. The nice thing about a random forest is that it…
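Here is a minimal sketch of this, assuming binary labels in {0, 1} and using scikit-learn's DecisionTreeClassifier as the base tree; its max_features="sqrt" option performs the $latex k \approx \sqrt{d}$ per-split feature subsampling, while M, seed, and the function names are illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def random_forest_fit(X, y, M=100, seed=0):
    """Train M trees, each on a bootstrap sample, with per-split feature subsampling."""
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(M):
        idx = rng.integers(0, n, size=n)  # bootstrap: n rows drawn with replacement
        tree = DecisionTreeClassifier(max_features="sqrt")  # ~sqrt(d) features per split
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def random_forest_predict(trees, X):
    """Majority vote across the forest (assumes labels in {0, 1})."""
    votes = np.stack([t.predict(X) for t in trees])  # shape (M, n_samples)
    return (votes.mean(axis=0) > 0.5).astype(int)
```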
Bagging
The goal of bagging is to avoid overfitting (high variance). Instead of training one model, we can bootstrap the dataset and train multiple models. The output is then simply the average (for regression) or the majority vote (for classification) of the outputs of the trained models. There are at least…
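Here is a minimal sketch of the bagging loop itself, assuming scikit-learn-style estimators with fit/predict and binary labels in {0, 1}; base_model, n_models, and the function names are illustrative:

```python
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, base_model, n_models=25, seed=0):
    """Train n_models copies of base_model, each on a bootstrap resample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # bootstrap: n rows drawn with replacement
        models.append(clone(base_model).fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Majority vote for classification; for regression, return votes.mean(axis=0)."""
    votes = np.stack([m.predict(X) for m in models])
    return (votes.mean(axis=0) > 0.5).astype(int)

# Usage: models = bagging_fit(X, y, DecisionTreeClassifier())
```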