A NOVEL STRATEGY FOR A VERTICAL WEB PAGE CLASSIFIER BASED ON CONTINUOUS LEARNING NAÏVE BAYES ALGORITHM I

doi:10.2316/Journal.202.2007.3.202-1961

H.A. Ali, A.I. El-Desouky, and A.I. Saleh

References

[1] H. Chen & S. Dumais, Bringing order to the web: Automatically categorizing search results, Proc. Computer–Human Interaction 2000 (CHI ’00) Conf. on Human Factors in Computing Systems, The Hague, Netherlands, April 2000, 145–152.
[2] J.M. Pierre, Practical issues for automatic categorization of web sites, Proc. European Conf. on Digital Libraries (ECDL ’00) Workshop on the Semantic Web, Lisbon, Portugal, September 2000, 83–94.
[3] W. Lam & Y. Han, Automatic textual document categorization based on generalized instance sets and a meta-model, Proc. of the IEEE Trans. on Pattern Analysis and Machine Intelligence, 25(5), 2003, 628–633. doi:10.1109/TPAMI.2003.1195997
[4] A. Sun & E. Lim, Web classification using support vector machine, Proc. ACM Workshop on Web Information and Data Management (WIDM ’02), McLean, Virginia, USA, November 2002, 96–99.
[5] H. Yu, J. Yang, & J. Han, Classifying large data sets using SVM with hierarchical clusters, Proc. 9th Int. Conf. on Knowledge Discovery and Data Mining (KDD ’03), Washington, DC, USA, August 2003, 25–44.
[6] O. Kwon & J. Lee, Text categorization based on k-nearest neighbor approach for web site classification, Proc. of the Int. Journal of Information Processing and Management, 39(1), 2003, 25–44. doi:10.1016/S0306-4573(02)00022-5
[7] C. Apte, F. Damerau, & S. Weiss, Text mining with decision trees and decision rules, Proc. Conf. on Automated Learning and Discovery, Carnegie-Mellon University, Pittsburgh, PA, June 1998, 172–179.
[8] F. Ciravegna, Adaptive information extraction from text by rule induction and generalization, Proc. 17th Int. Joint Conf. on Artificial Intelligence (IJCAI ’01), Seattle, Washington, vol. 2, August 2001, 751–756.
[9] M. Ruiz & P. Srinivasan, Hierarchical text categorization using neural networks, Proc. of the Int. Journal of Information Retrieval, 5(1), 2002, 87–118. doi:10.1023/A:1012782908347
[10] A. Selamat, S. Omatu, & H. Yanagimoto, Web news classification using neural networks based on PCA, Proc. of the Society of Instrument and Control Engineers (SICE ’02) Journal, Osaka, Japan, August 2002, 2388–2393.
[11] J. Rennie, L. Shih, J. Teevan, & D. Karger, Tackling the poor assumptions of Naive Bayes text classifiers, Proc. of the Int. Conf. on Machine Learning, Washington, DC, USA, August 2003, 199–205.
[12] G. Tur, D. Hakkani-Tür, & R.E. Schapire, Combining active and semi-supervised learning for spoken language understanding, Journal of Speech Communication, 45(2), 2005, 171–186. doi:10.1016/j.specom.2004.08.002
[13] J. Xu, Y. Cao, H. Li, & M. Zhao, Ranking definitions with supervised learning methods, Special interest tracks and posters, 14th Int. Conf. on World Wide Web (WWW ’05), Chiba, Japan, May 2005, 811–819.
[14] Z. Solan, D. Horn, E. Ruppin, & S. Edelman, Unsupervised learning of natural languages, Proc. of the National Academy of Sciences, 102(33), 2005, 11629–11634.
[15] V. Vapnik, The nature of statistical learning theory (New York: Springer-Verlag, 1999).
[16] G. Salton, Automatic text processing: The transformation, analysis, and retrieval of information by computer (Boston: Addison Wesley, 1989).
[17] R. Kurino, M. Sugisaka, & K. Shibata, Growing neural network with hidden neurons, Proc. 9th Int. Symp. on Artificial Life and Robotics (AROB ’04), 1, Oita, Japan, January 2004, 144–147.
[18] A.L. Blum & R.L. Rivest, Training a 3-node neural network is NP-complete, Proc. of the Int. Journal of Neural Networks, 5(1), 1992, 117–127. doi:10.1016/S0893-6080(05)80010-3
[19] A. Sperduti & A. Starita, Speed up learning and network optimization with extended back-propagation, Proc. of the Int. Journal of Neural Networks, 6, 1993, 365–383. doi:10.1016/0893-6080(93)90004-G
[20] R.D. Koller & B.M. Wilamowski, A relaxation/regression algorithm for efficient training of multilayer neural networks, World Congress of Neural Networks, vol. 1, Washington, DC, USA, July 17–21, 1995, 683–686.
[21] ISO 2788, Documentation guidelines for the establishment and development of monolingual thesauri, Second Edition (Geneva, Switzerland: International Organization for Standardization, 1986).
[22] M. Bates, Subject access in online catalogs: A design model, Proc. American Society for Information Science Journal (ASIST ’86), 37(6), November 1986, 357–376.
[23] C.E. Shannon & W. Weaver, The mathematical theory of communication, Proc. Bell System Technical Journal, vol. 27, July 1948, 379–423.
[24] K. Aas & L. Eikvil, Text categorization: A survey, Technical Report no. 941, Norwegian Computing Center, June 1999.
[25] T.Y. Chen, F. Kuo, & R. Merkel, On the statistical properties of the F-measure, Proc. 4th Int. Conf. on Quality Software (QSIC ’04), Braunschweig, Germany, September 2004, 146–153.

From Journal (202) International Journal of Computers and Applications - 2007