Dhaka University Journal of Applied Science & Engineering

Issue: Vol. 6, No. 2, July 2021
Title: A Relief Based Feature Subset Selection Method
Authors:
  • Suravi Akhter
    Institute of Information Technology Dhaka, Bangladesh
  • Sumon Ahmed
    Institute of Information Technology Dhaka, Bangladesh
  • Ahmedul Kabir
    Institute of Information Technology Dhaka, Bangladesh
  • Mohammad Shoyaib
    Institute of Information Technology Dhaka, Bangladesh
DOI:
Keywords: Relief, Feature Subset Selection, Relief based Feature Subset Selection (RFSS), Greedy Forward Search
Abstract:

Feature selection is used as a preliminary step in many areas of machine learning. It usually involves ranking the features or extracting a subset of features from the original dataset. Among the various types of feature selection methods, distance-based methods are popular for their simplicity and accuracy; moreover, they can capture interactions among features for a particular application. However, it is difficult to decide on an appropriate feature subset from a ranked feature list. To solve this problem, this paper proposes Relief based Feature Subset Selection (RFSS), a method that captures a more interactive and relevant feature subset for better classification accuracy. Experimental results on 16 benchmark datasets demonstrate that the proposed method performs better than state-of-the-art methods.
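
The following is a minimal Python sketch of the general idea described in the abstract: Relief-style weights rank the features, and a greedy forward search then grows a subset from that ranking. It is an illustrative approximation under assumed choices (Manhattan distance, a 1-NN cross-validation criterion for accepting a feature, a fixed number of sampling iterations), not the authors' exact RFSS procedure; the function names are hypothetical.

    # Illustrative sketch only: Relief-style feature weighting followed by a
    # greedy forward search over the ranked features. The distance metric,
    # scoring classifier, and acceptance rule are assumptions, not the
    # published RFSS algorithm.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    def relief_weights(X, y, n_iter=100, seed=0):
        # Reward features that separate an instance from its nearest miss and
        # penalize features that differ from its nearest hit (basic Relief).
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(n_iter):
            i = rng.integers(n)
            dist = np.abs(X - X[i]).sum(axis=1)   # Manhattan distance to all instances
            dist[i] = np.inf                      # exclude the sampled instance itself
            same = (y == y[i])
            hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class instance
            miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class instance
            w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
        return w / n_iter

    def greedy_forward_subset(X, y, ranked):
        # Walk down the Relief ranking and keep a feature only if it improves
        # cross-validated 1-NN accuracy (an assumed acceptance criterion).
        selected, best = [], 0.0
        for f in ranked:
            trial = selected + [int(f)]
            score = cross_val_score(KNeighborsClassifier(1), X[:, trial], y, cv=5).mean()
            if score > best:
                selected, best = trial, score
        return selected, best

    # Usage:
    #   ranked = np.argsort(-relief_weights(X, y))
    #   subset, acc = greedy_forward_subset(X, y, ranked)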
