Random forest is basically bagging applied to decision trees. Say we bootstrap $latex M$ datasets from the original dataset. We then train a decision tree on each bootstrap dataset; to ensure diversity among the trees and reduce complexity, each split considers only a random subset of $latex k \approx \sqrt{d}$ dimensions, where $latex d$ is the original dimension. A sketch of this training loop follows below.
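To make the procedure concrete, here is a minimal sketch of that loop, using scikit-learn's `DecisionTreeClassifier` as the base learner (the helper names `fit_random_forest` and `predict_random_forest`, and the synthetic dataset, are mine, just for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, M=100, rng=None):
    """Train M trees, each on a bootstrap sample, considering only
    k ~ sqrt(d) random features at every split (max_features='sqrt')."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X)
    trees = []
    for _ in range(M):
        idx = rng.integers(0, n, size=n)  # bootstrap: draw n rows with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(1 << 31)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_random_forest(trees, X):
    """Aggregate by majority vote over the M trees."""
    votes = np.stack([t.predict(X) for t in trees])  # shape (M, n_samples)
    # most common class label per sample
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes
    )

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
forest = fit_random_forest(X, y, M=100)
print(predict_random_forest(forest, X[:5]), y[:5])
```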
The nice thing about random forest is that it works well on most datasets out of the box. It has almost no hyperparameters to tune: $latex k \approx \sqrt{d}$ often works well, and $latex M$ can be anything from hundreds to thousands.
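In practice you would reach for a library implementation; for example, scikit-learn's `RandomForestClassifier` exposes these two knobs as `n_estimators` ($latex M$) and `max_features` ($latex k$). A quick usage sketch, on a synthetic dataset of my own choosing:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# n_estimators is M; max_features="sqrt" gives k = sqrt(d) at each split,
# which is in fact scikit-learn's default for classification.
clf = RandomForestClassifier(n_estimators=500, max_features="sqrt", random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```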