Random forest is basically bagging applied to decision trees. Say we bootstrap $latex M$ datasets from the original dataset. We then train a decision tree on each bootstrap dataset; to ensure diversity among the trees and reduce complexity, each split considers only a random subset of $latex k \approx \sqrt{d}$ dimensions, where $latex d$ is the original dimension. A sketch of this training loop follows below.
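To make the procedure concrete, here is a minimal sketch of that loop, using scikit-learn's `DecisionTreeClassifier` as the base learner (the helper names `fit_random_forest` and `predict_random_forest`, and the synthetic dataset, are mine, just for illustration):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def fit_random_forest(X, y, M=100, rng=None):
    """Train M trees, each on a bootstrap sample, considering only
    k ~ sqrt(d) random features at every split (max_features='sqrt')."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = len(X)
    trees = []
    for _ in range(M):
        idx = rng.integers(0, n, size=n)  # bootstrap: draw n rows with replacement
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(1 << 31)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict_random_forest(trees, X):
    """Aggregate by majority vote over the M trees."""
    votes = np.stack([t.predict(X) for t in trees])  # shape (M, n_samples)
    # most common class label per sample
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes
    )

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
forest = fit_random_forest(X, y, M=100)
print(predict_random_forest(forest, X[:5]), y[:5])
```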
The nice thing about random forest is that it works well on most datasets out of the box. It has almost no hyperparameters to tune: $latex k \approx \sqrt{d}$ often works well, and $latex M$ can be anything from hundreds to thousands.
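In practice you would reach for a library implementation; for example, scikit-learn's `RandomForestClassifier` exposes these two knobs as `n_estimators` ($latex M$) and `max_features` ($latex k$). A quick usage sketch, on a synthetic dataset of my own choosing:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# n_estimators is M; max_features="sqrt" gives k = sqrt(d) at each split,
# which is in fact scikit-learn's default for classification.
clf = RandomForestClassifier(n_estimators=500, max_features="sqrt", random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())
```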