GUIDE Regression Tree (version 7.9)
© Wei-Yin Loh 1997-2009
GUIDE is a multi-purpose machine learning algorithm for
constructing classification and regression trees. It is designed and maintained by
Wei-Yin Loh at the University of Wisconsin-Madison. GUIDE stands for
Generalized, Unbiased, Interaction Detection and Estimation.
This material is based upon work supported by grants from the
U.S. Army Research Office and the National Science Foundation.
Properties and features:
- Choice of classification or regression trees
- Negligible bias in split variable selection
- Importance ranking and identification of unimportant variables
- Power to detect local interactions between pairs of
predictor variables
- Ability to use ordered (continuous) and unordered (categorical)
predictor variables
- Automatic handling of missing values
- Automatic prediction of new samples
- Choice of weighted least squares (Gaussian), least median of
squares, Poisson, quantile (including median), or proportional
hazards regression tree models
- Choice of piecewise constant, best simple polynomial, multiple
linear, or stepwise regression models
- Choice of roles for predictor variables (splitting only, node modeling
only, both, or none)
- Choice of using categorical variables for splitting only or both
splitting and fitting through dummy 0-1 vectors
- Choice of stopping rules: no pruning, pruning by
cross-validation, or pruning with a test sample
- Choice of batch or interactive mode of operation
- Automatic generation of products and powers of predictor
variables as regressor variables
- Automatic generation of LaTeX (MiKTeX) or allCLEAR source code for
the tree diagrams in PostScript and PDF formats. The LaTeX code requires the
PSTricks package, which is included in most LaTeX
distributions. The PostScript files require
Ghostscript and Ghostview for display and printing.
- Free executables for Windows, Macintosh, and Linux computers
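As a rough illustration of the simplest model type listed above, a piecewise constant least squares tree predicts the mean response within each terminal node. The following Python sketch fits a single split on one ordered predictor by exhaustive search over cut points. It is a generic textbook illustration only: it is not GUIDE's split selection procedure, and the function names (sse, best_split, predict) and the toy data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, left_mean, right_mean) for the split
    x <= threshold that minimizes the total within-node SSE.

    Generic exhaustive search; NOT the GUIDE selection method.
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        # A cut is only possible between two distinct x values.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [p[1] for p in pairs[:i]]
        right = [p[1] for p in pairs[i:]]
        total = sse(left) + sse(right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or total < best[0]:
            best = (total, threshold,
                    sum(left) / len(left), sum(right) / len(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict(x_new, threshold, left_mean, right_mean):
    """Piecewise constant prediction: the mean of the matching node."""
    return left_mean if x_new <= threshold else right_mean


# Toy data (made up for illustration): two well-separated groups.
x = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
y = [5.0, 6.0, 7.0, 20.0, 21.0, 22.0]
threshold, left_mean, right_mean = best_split(x, y)
print(threshold, left_mean, right_mean)
```

The other node models in the list (simple polynomial, multiple linear, stepwise) replace the node mean with a regression fitted to the observations in each node.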
Documentation:
- Loh, W.-Y. (2009), Improving the precision of classification
trees, Annals of Applied
Statistics, to appear. [This is the definitive reference
for GUIDE classification.]
- Loh, W.-Y., Chen, C.-W., and Zheng, Z. (2007),
Extrapolation errors in linear model trees,
ACM Transactions on Knowledge Discovery in
Data, vol. 1.
- Loh, W.-Y. (2007),
Regression by parts: Fitting visually interpretable models with GUIDE,
Handbook of Computational Statistics, vol. III, Springer, in press.
- Kim, H., Loh, W.-Y., Shih, Y.-S., and Chaudhuri, P. (2007),
Visualizable and interpretable regression models with good prediction power,
IIE Transactions, vol. 39, issue 6, June 2007, pp. 565-579.
[This is the author's version of the work. It is posted here by
permission of Taylor & Francis for personal use, not for
redistribution.]
- Loh, W.-Y. (2006), Regression tree models
for designed experiments, Second Lehmann
Symposium, Institute of Mathematical Statistics Lecture
Notes-Monograph Series, vol. 49, 210-228.
- Loh, W.-Y. (2002),
Regression trees with unbiased variable selection and interaction
detection, Statistica Sinica,
vol. 12, 361-386. [This is the original reference for GUIDE regression.]
- Chaudhuri, P. and Loh, W.-Y. (2002),
Nonparametric estimation of conditional quantiles using quantile
regression trees, Bernoulli,
vol. 8, 561-576. [This paper extends GUIDE to quantile regression.]
- Chaudhuri, P., Lo, W.-D., Loh, W.-Y., and Yang, C.-C. (1995),
Generalized regression trees, Statistica
Sinica, vol. 5, 641-666. [This is the first paper on
Poisson and logistic regression trees.]
- Chaudhuri, P., Huang, M.-C., Loh, W.-Y., and Yao, R. (1994),
Piecewise-polynomial regression trees, Statistica Sinica, vol. 4,
143-167. [This is the first paper on polynomial regression trees.]
- GUIDE manual in PDF format. The manual uses the example data and
description files bbdat.txt, bbdsc.txt, irisdata.txt, irisdsc.txt,
solderdat.txt, and solderdsc.txt for illustration.
Compiled binaries: The following files may be freely
distributed but not sold for profit.
- Apple Macintosh (OS X Intel) in gzip format --- guide7.9
- Intel and compatibles (Windows 9x/NT/2000/XP) in pkzip format --- guide7.9 (guide8.0b beta)
- Intel and compatibles (Linux) in gzip format --- guide7.9
Revision history: See the file history.txt
Closely related algorithms developed by Wei-Yin Loh and his students:
- QUEST: A binary classification tree
- CRUISE: A classification tree that splits each node into two or
more subnodes
- LOTUS: A logistic regression tree
Application papers that use CRUISE, GUIDE, LOTUS, or QUEST: See file
License:
GUIDE is free software. You may use the Program without
restriction. You may copy and distribute the Program in executable
form provided that you conspicuously and appropriately publish on each
copy an appropriate copyright notice and disclaimer of warranty; and
give any other recipients of the Program a copy of this license along
with the Program.
Disclaimer of Warranty:
The copyright holder provides the Program "as is" without warranty of
any kind, either expressed or implied, including, but not limited to,
the implied warranties of merchantability and fitness for a
particular purpose. The entire risk as to the quality and
performance of the Program is with you. Should the Program prove
defective, you assume the cost of all necessary servicing, repair or
correction. In no event will the copyright holder be liable to you
for damages, including any general, special, incidental or
consequential damages arising out of the use or inability to use the
Program (including but not limited to loss of data or data being
rendered inaccurate or losses sustained by you or third parties or a
failure of the Program to operate with any other programs), even if
such holder has been advised of the possibility of such damages.
Last modified: November 14, 2009 by Wei-Yin Loh