I have an assignment to implement a random tree classifier in MATLAB.
The lecture says:
Input: observations and labels
While stopping criterion not reached:
1. Node optimization:
   - several split candidates are randomly generated
   - the best splitting function is chosen according to some quality measure
2. Data splitting: observations are pushed to the left or right branch.
3. Move to next node
Stopping criteria:
- Quality measure
- Number of data points in the current node/leaf
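To make the steps above concrete, here is how I currently read "randomly generated split candidates" and "data splitting", as a minimal sketch in Python (for illustration only, not the MATLAB stub). The names random_split_candidates and split_node are my own, and drawing each threshold uniformly between the minimum and maximum of a randomly chosen feature is my assumption, not something the lecture states.

```python
import random

def random_split_candidates(X, n_candidates, rng):
    """Draw (feature index, threshold) pairs at random.

    Assumption: the threshold is drawn uniformly between the smallest
    and largest value of the chosen feature in the current node.
    """
    n_features = len(X[0])
    candidates = []
    for _ in range(n_candidates):
        feature = rng.randrange(n_features)
        values = [row[feature] for row in X]
        threshold = rng.uniform(min(values), max(values))
        candidates.append((feature, threshold))
    return candidates

def split_node(X, feature, threshold):
    """Return the row indices pushed to the left and right branch."""
    left = [i for i, row in enumerate(X) if row[feature] < threshold]
    right = [i for i, row in enumerate(X) if row[feature] >= threshold]
    return left, right

# Toy data: four observations with two features each.
rng = random.Random(0)
X = [[1.0, 5.0], [2.0, 4.0], [3.0, 1.0], [4.0, 0.5]]
candidates = random_split_candidates(X, 5, rng)
splits = [split_node(X, f, t) for f, t in candidates]
```

Each candidate would then be scored with the quality measure, and the best one kept for the node.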
My problem is that I do not understand how to generate the random split candidates. Do I take them from the input values? But then I would just get a decision tree (pick a random element x and send values > x to the right node and values < x to the left node). I also do not understand what the difference between a random tree and a decision tree is in the end.
Also the lecture says:
Choosing the best candidates: according to a quality measure
- Out-of-bag error (OOB): minimize error rate after splitting using a test set
- Information gain: maximize information gain after splitting
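For the information gain measure, this is the standard entropy-based definition as I understand it, again as a small Python sketch for illustration. The function names entropy and information_gain are my own, and I am assuming Shannon entropy with base-2 logarithms; the lecture does not say which impurity measure it uses.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# A split that separates a balanced two-class node perfectly
# removes all uncertainty; a split that keeps the mix removes none.
perfect = information_gain([0, 0, 1, 1], [0, 0], [1, 1])
useless = information_gain([0, 0, 1, 1], [0, 1], [0, 1])
```

With this measure, the candidate with the largest gain would be chosen for the node.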
But which test set should I use? The same data that was already used to train the tree?
Wikipedia and Google did not help me either. The code of the MATLAB stub can be found here: http://pastebin.com/iuzqF8gG
I appreciate your help.