FEATURE SUBSET SELECTION FOR HIGH DIMENSIONAL DATA BASED ON CLUSTERING

Authors

  • Prof. S. N. Zaware, Computer Department, AISSMS IOIT Pune
  • Heena Shaikh, Computer Department, AISSMS IOIT Pune
  • Sheefa Shaikh, Computer Department, AISSMS IOIT Pune
  • Asmita Orpe, Computer Department, AISSMS IOIT Pune
  • Pooja Rokade, Computer Department, AISSMS IOIT Pune

Keywords

Markov Blanket, MST Creation, Gaussian Distribution, Shannon Infogain, Bayesian Probability, Fuzzy Logic

Abstract

Feature selection is the process of evaluating the features of a dataset and extracting those that can be grouped into
subsets which retain the integrity of the original data. A feature selection algorithm should be both efficient and
effective: efficient means that it requires minimal time, and effective means that the quality of the generated subset is
not compromised. We propose an algorithm that consists of the following steps: Markov Blanket, Shannon Infogain,
Minimum Spanning Tree, Tree Partition, Gaussian Distribution, and Bayesian Probability. Applying these steps yields the
desired feature subset from the clusters. The system removes irrelevant data as well as redundant data, which most
existing systems fail to do.
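
The abstract names the steps but does not give their details, so the following is only a minimal sketch of three of them
(Shannon Infogain, Minimum Spanning Tree, Tree Partition) under stated assumptions, not the authors' implementation. The
Markov Blanket, Gaussian Distribution and Bayesian Probability steps are omitted. The edge weight (one minus symmetric
uncertainty), the thresholds relevance_min and cut, the function names, and the toy data are all hypothetical choices
made for illustration.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def info_gain(feature, target):
    """Information gain: H(target) - H(target | feature)."""
    n = len(target)
    conditional = 0.0
    for v in set(feature):
        subset = [t for f, t in zip(feature, target) if f == v]
        conditional += len(subset) / n * entropy(subset)
    return entropy(target) - conditional

def symmetric_uncertainty(x, y):
    """Normalised information gain in [0, 1]; assumed here as the redundancy measure."""
    hx, hy = entropy(x), entropy(y)
    return 0.0 if hx + hy == 0 else 2 * info_gain(x, y) / (hx + hy)

def select_features(columns, target, relevance_min=0.1, cut=0.5):
    # Step 1 (assumed): discard irrelevant features with low information gain.
    gains = {name: info_gain(col, target) for name, col in columns.items()}
    relevant = [name for name in columns if gains[name] >= relevance_min]

    # Step 2: minimum spanning tree over the relevant features (Kruskal).
    # Redundant features have high symmetric uncertainty, hence a small weight.
    edges = sorted(
        (1.0 - symmetric_uncertainty(columns[a], columns[b]), a, b)
        for i, a in enumerate(relevant) for b in relevant[i + 1:]
    )
    parent = {name: name for name in relevant}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    mst = []
    for weight, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
            mst.append((weight, a, b))

    # Step 3: partition the tree by removing edges heavier than `cut`.
    parent = {name: name for name in relevant}
    for weight, a, b in mst:
        if weight <= cut:
            parent[find(a)] = find(b)
    clusters = {}
    for name in relevant:
        clusters.setdefault(find(name), []).append(name)

    # Step 4 (assumed): keep the most relevant feature of each cluster.
    return sorted(max(members, key=lambda m: gains[m]) for members in clusters.values())

# Toy data: f2 duplicates f1 (inverted), f3 is unrelated to the class label.
target = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f1": [0, 0, 0, 0, 1, 1, 1, 1],
    "f2": [1, 1, 1, 1, 0, 0, 0, 0],
    "f3": [0, 1, 0, 1, 0, 1, 0, 1],
    "f4": [0, 0, 0, 1, 0, 1, 1, 1],
}
selected = select_features(columns, target)
print(selected)
```

In this toy run, the information gain filter is what drops the irrelevant feature f3, and the tree partition is what
places the redundant pair f1 and f2 in one cluster so that only one of them is kept.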

Published

2015-12-25

How to Cite

Prof. S.N.Zaware, Heena Shaikh, Sheefa Shaikh, Asmita Orpe, & Pooja Rokade. (2015). FEATURE SUBSET SELECTION FOR HIGH DIMENSIONAL DATA BASED ON CLUSTERING. International Journal of Advance Engineering and Research Development (IJAERD), 2(12), 105–107. Retrieved from https://www.ijaerd.org/index.php/IJAERD/article/view/5271