Created
November 8, 2018 05:27
C4.5 upper error limit formula
In his seminal book "C4.5: Programs for Machine Learning," Quinlan uses a criterion U_CF to determine upper error limits at the nodes of a decision tree. This criterion matters because it drives the tree pruning heuristic. The problem is that the text gives no explicit formula for it. The formula can, however, be found in the source code for C4.5 and C5.0:
U_CF(E, N) ≡ RawExtraErrs(N, E)/N, where RawExtraErrs is defined in pruning.c (note the swapped argument order).
To check the published results, you can define the following macro somewhere RawExtraErrs is visible and call it from main() or any other convenient place:
#define U_CF(E, N, expectation) printf("U_CF(%d, %d) = %0.3f (expected to be " #expectation ")\n", (E), (N), RawExtraErrs((N), (E)) / (N))