International Journal of Computer Science & Engineering Technology

ISSN : 2229-3345

Open Access

ABSTRACT

Title : An Effective Approach for Web Document Classification using FP-Growth and Naïve Bayes Techniques
Authors : Rajendra Kumar Roul, Dr. Sanjay Kumar Sahay
Keywords : Classification, FP-growth, Gensim, Naïve Bayes, Vector space model
Issue Date : October 2012
Abstract :
The exponential growth of the web has increased the importance of web document classification and data mining. Obtaining exact information about which classes a web document belongs to is expensive. Automatic classification of web documents is of great use to search engines, as it provides this information at low cost. In this paper, we propose an approach for classifying web documents using the frequent word item sets generated by the Frequent Pattern (FP) Growth technique. These sets of associated words act as the feature set. The final classification is obtained by applying a Naïve Bayes classifier to the feature set. For the experimental work, we use the Gensim package, as it is simple and robust. Results show that our approach can effectively classify web documents.
Page(s) : 483-491
Source : Vol. 3, Issue 10
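
The pipeline described in the abstract (frequent word sets as features, then a Naïve Bayes classifier) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library rather than Gensim, and it finds frequent word sets by brute-force counting instead of building an FP-tree (FP-Growth produces the same frequent sets, only more efficiently). The toy documents, the labels, the `min_support` threshold, the `max_size` cap, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def frequent_word_sets(docs, min_support, max_size=2):
    """Return word sets (up to max_size words) found in >= min_support docs.

    Assumption: brute-force counting stands in for FP-Growth here.
    """
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return sorted(s for s, c in counts.items() if c >= min_support)


def featurize(words, word_sets):
    """Binary feature vector: is each frequent word set fully present?"""
    present = set(words)
    return [all(w in present for w in s) for s in word_sets]


def train_naive_bayes(vectors, labels):
    """Bernoulli Naive Bayes with Laplace (add-one) smoothing."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(lambda: [0] * len(vectors[0]))
    for vec, label in zip(vectors, labels):
        for i, on in enumerate(vec):
            feature_counts[label][i] += on
    model = {}
    for label, n in class_counts.items():
        log_prior = math.log(n / len(labels))
        probs = [(c + 1) / (n + 2) for c in feature_counts[label]]
        model[label] = (log_prior, probs)
    return model


def predict(model, vec):
    """Return the class with the highest log-posterior score."""
    def score(label):
        log_prior, probs = model[label]
        return log_prior + sum(
            math.log(p if on else 1 - p) for p, on in zip(probs, vec)
        )
    return max(model, key=score)


# Hypothetical toy corpus: tokenized documents with known classes.
train_docs = [
    ["football", "match", "goal", "team"],
    ["team", "goal", "league", "match"],
    ["stock", "market", "shares", "price"],
    ["market", "price", "shares", "bank"],
]
train_labels = ["sports", "sports", "finance", "finance"]

word_sets = frequent_word_sets(train_docs, min_support=2)
model = train_naive_bayes(
    [featurize(d, word_sets) for d in train_docs], train_labels
)

print(predict(model, featurize(["goal", "team", "win"], word_sets)))
print(predict(model, featurize(["shares", "price", "drop"], word_sets)))
```

In this sketch, each feature records whether a whole frequent word set occurs in a document, so co-occurring words such as "goal" and "team" contribute as a unit rather than only as independent terms.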

Copyright © 2010-2024 IJCSET KEJA Publications