ATTRIBUTE SELECTION MEASURE IN DECISION TREE GROWING

Badulescu, Laviniu Aurelian (2007) ATTRIBUTE SELECTION MEASURE IN DECISION TREE GROWING. In Nicolae, Ileana Diana and Doicaru, Elena, Eds. Proceedings SINTES 13, The International Symposium on System Theory, Automation, Robotics, Computers, Informatics, Electronics and Instrumentation 2(1), pp. 1-6, Craiova, Romania.

Abstract

Classification is one of the major tasks in Data Mining. Growing Decision Trees from data is a very efficient technique for learning classifiers. The attribute chosen to split the data set at each Decision Tree node is fundamental to classifying objects correctly; a good choice improves classification accuracy. In this paper, we study the behavior of Decision Trees induced with 14 attribute selection measures on three data sets taken from the UCI Machine Learning Repository.
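
As an illustration of what an attribute selection measure does at a single node, the following minimal sketch scores candidate attributes by information gain and picks the highest-scoring one. Information gain is used here only as a familiar example; the abstract does not list the 14 measures studied, so this is not a claim about the paper's method. The function names (`entropy`, `information_gain`, `best_attribute`) and the toy data are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the node on one attribute."""
    # Group the class labels by the value each row takes for the attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest score under the chosen measure."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data: "outlook" separates the classes perfectly,
# while "windy" carries no information about the class.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_attribute(rows, labels, ["outlook", "windy"]))
```

Swapping `information_gain` for another scoring function with the same signature changes which attribute is selected at each node, which is the kind of variation the study compares.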

EPrint Type: Conference Paper
Keywords: decision trees, classification, error rates
Subjects: Data Mining; Classification; Computer Science
ID Code: 2390
Deposited On: 07 July 2008
