International Journal of Computer Science & Engineering Technology

ISSN : 2229-3345

Open Access

ABSTRACT

Title : FIDOOP-HD: Mining Frequent Datasets on Layer Clusters by Balanced Iterative Reducing and Clustering using Hierarchies
Authors : E.Shapna Rani, T.Aarthi
Keywords : Parallel Mining, Data distribution, Hadoop Clusters, BIRCH
Issue Date : Mar 2017
Abstract :
Mining data items plays a crucial role today, as the world must manage huge volumes of data. Existing parallel mining algorithms for frequent itemsets lack mechanisms such as automatic parallelization, load balancing, data distribution, and fault tolerance on large clusters. To address this, a parallel frequent itemset mining algorithm called FiDoop, which uses Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH), is designed. FiDoop aims to achieve compressed storage and to avoid building conditional pattern bases. It is implemented on an in-house Hadoop cluster, and the results show that FiDoop on the cluster is sensitive to data distribution and dimensions, because itemsets of different lengths have different decomposition and construction costs. BIRCH also performs faster, scans the whole dataset only once, handles outliers better, and is superior to other algorithms in stability and scalability. To improve FiDoop's performance, a workload balance metric is developed to measure load balance across the cluster's computing nodes. FiDoop-HD, an extension of FiDoop that speeds up mining for high-dimensional data analysis, is also developed. Extensive experiments using real-world celestial spectral data demonstrate that the proposed solution is efficient and scalable.
Page(s) : 95-101
Source : Vol. 8, Issue.03
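
The abstract mentions a workload balance metric but does not give its formula. As an illustration only, the sketch below assumes one plausible form: the normalized entropy of the per-node loads, where 1.0 means the work is spread evenly and 0.0 means a single node holds all of it. The function name `workload_balance` and the choice of entropy are assumptions, not taken from the paper.

```python
import math

def workload_balance(loads):
    """Normalized entropy of per-node loads (assumed metric, not from the paper).

    Returns 1.0 for a perfectly even distribution and 0.0 when one node
    carries all of the work.
    """
    total = sum(loads)
    n = len(loads)
    if n < 2 or total <= 0:
        return 1.0  # a single node, or no work at all, is trivially balanced
    entropy = 0.0
    for load in loads:
        if load > 0:  # idle nodes contribute nothing to the entropy
            p = load / total
            entropy -= p * math.log(p)
    return entropy / math.log(n)  # divide by the maximum possible entropy
```

Here `loads` would be any per-node measure of work, for example the number of itemsets assigned to each reducer; values closer to 1.0 indicate a more even distribution.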

Copyright © 2010-2024 IJCSET KEJA Publications