International Journal of Information Technology & Computer Science ( IJITCS )
Text handling has become an important task in the current age of information processing. In this context, text categorization (TC) is a fundamental task in handling text documents. Several algorithms are available for implementing text categorization. This paper proposes a new text categorization algorithm, termed the NGramsSA algorithm. Its performance is compared with that of the NGrams algorithm, and the results are presented at the end of the paper.
Keywords: N-Grams, Test Profile, Simulated Annealing, Precision, Execution Time.