QUEST Classification Tree (version 1.10.1)
© Yu-Shan Shih 1997-2013
QUEST is a binary-split decision tree algorithm for classification and data mining developed by Wei-Yin Loh (University of Wisconsin-Madison, USA) and Yu-Shan Shih (National Chung Cheng University, Taiwan). QUEST stands for Quick, Unbiased and Efficient Statistical Tree.
The objective of QUEST is similar to that of the CART(TM) algorithm described in the book, Classification and Regression Trees, by Breiman, Friedman, Olshen and Stone (1984). [CART is a registered trademark of California Statistical Software, Inc.]
QUEST has the following properties:
- It uses an unbiased variable selection technique by default.
- It provides linear splits using Fisher's LDA method.
- It uses imputation instead of surrogate splits to deal with missing values.
- It can easily handle categorical predictor variables with many categories.
- If there are no missing values in the data, it can optionally use the CART algorithm to produce a tree with univariate splits.
- It can optionally produce LaTeX (MiKTeX) or allCLEAR source code for the tree diagrams. The LaTeX code, which requires the PSTricks package, can be compiled to PDF or PostScript files (the latter can be viewed and printed using Ghostscript and GSView).
See Table 1 for a feature comparison between QUEST and other classification tree algorithms.
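As a rough illustration of the first property, the idea described in Loh and Shih (1997) is to choose the split variable with a statistical test first, and only afterwards search for a split point on that one variable. The Python sketch below is not QUEST's code. It assumes numeric predictors only, no missing values, and at least two classes with more observations than classes; under those assumptions every predictor has the same degrees of freedom, so ranking by the one-way ANOVA F statistic gives the same order as ranking by p-value. The names `anova_f`, `select_variable`, `columns` and `labels` are hypothetical.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F statistic of a numeric predictor across the classes."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)  # assumes k >= 2 and n > k
    grand = mean(values)
    # Between-class and within-class sums of squares.
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if within == 0:
        # Classes are perfectly separated (or the predictor is constant).
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def select_variable(columns, labels):
    """Return the name of the predictor with the largest F statistic.

    No split points are examined here, so a predictor gains no advantage
    merely from offering more candidate split points.
    """
    return max(columns, key=lambda name: anova_f(columns[name], labels))


# Toy data (made up for illustration): x1 separates the classes, x2 does not.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [1.0, 9.0, 2.0, 8.0, 3.0, 7.0],
}
labels = ["a", "a", "a", "b", "b", "b"]
chosen = select_variable(columns, labels)
```

With the toy data, `chosen` is `"x1"`. Handling categorical predictors and comparing p-values across tests with different degrees of freedom are left out of this sketch.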
- Loh, W.-Y. and Shih, Y.-S. (1997), Split selection methods for classification trees, Statistica Sinica, vol. 7, 815-840. [This is the definitive reference for QUEST.]
- Lim, T.-S., Loh, W.-Y., and Shih, Y.-S. (2000), A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms, Machine Learning, vol. 40, 203-228. [This paper compares the performance of version 1.7 of QUEST against other methods.] A separate appendix contains more detailed results. The datasets used in the study are available in a gzipped tar archive (5.8 MB).
- Shih, Y.-S. (1999), Families of splitting criteria for classification trees, Statistics and Computing, vol. 9, 309-315. [This paper documents the enlarged class of splitting criteria in QUEST.]
- Shih, Y.-S. (2004), A note on split selection bias in classification trees, Computational Statistics and Data Analysis, vol. 45, 457-466. [This paper explains the existence of CART's split selection bias.]
Compiled binaries: The following files may be freely distributed but not sold for profit.
Revision history: See history.txt
Related algorithms with unbiased splits:
- CRUISE: Classification trees with more than two splits per node
- GUIDE: Classification trees and piecewise-linear least-squares, quantile, Poisson, proportional hazards or multi-response (e.g., longitudinal) regression trees
QUEST is free software. You may use the Program without restriction. You may copy and distribute the Program in executable form provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; and give any other recipients of the Program a copy of this license along with the Program.
Disclaimer of Warranty:
The copyright holder provides the Program "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the Program is with you. Should the Program prove defective, you assume the cost of all necessary servicing, repair or correction. In no event will the copyright holder be liable to you for damages, including any general, special, incidental or consequential damages arising out of the use or inability to use the Program (including but not limited to loss of data or data being rendered inaccurate or losses sustained by you or third parties or a failure of the Program to operate with any other programs), even if such holder has been advised of the possibility of such damages.
Last modified: October 27, 2013 by Yu-Shan Shih