• Printed Journal
  • Indexed Journal
  • Peer Reviewed Journal
Journal of Applied Science & Engineering

Dhaka University Journal of Applied Science & Engineering

Issue: Vol. 7, No. 1, January 2022
Title: A Vertex-extension based Algorithm for Frequent Pattern Mining from Graph Databases
Authors:
  • Md Tanvir Alam
    Department of Computer Science and Engineering, University of D haka, Bangladesh
  • Chowdhury Farhan Ahmed
    Department of Computer Science and Engineering, University of D haka, Bangladesh
  • Md Samiullah
    Department of Computer Science and Engineering, University of D haka, Bangladesh
DOI:
Keywords: Graph mining, Knowledge graphs.
Abstract:

Frequent pattern mining is a core problem in data mining. Algorithms for frequent pattern mining have been proposed for itemsets, sequences, and graphs. However, existing graph mining frameworks follow an edge-growth approach to building patterns which limits many applications. Motivated by real-life problems, in this work, we define a novel graph mining framework that incorporates vertex-based extensions along with the edge-growth approach. We also propose an efficient algorithm for mining frequent subgraphs. To deal with the exploding search space, we introduce a canonical labeling technique for isomorphic candidates as well as downward closure property-based search space pruning. We present an experimental analysis of our algorithm on real-life benchmark graph datasets to demonstrate the performance in terms of runtime.

References:
  1. H. Cheng, X. Yan, J. Han, and C.-W. Hsu, “Discriminative frequent pattern analysis for effective classification,” in 2007 IEEE 23rd international conference on data engineering. IEEE, pp. 716–725, 2007
  2. M. T. Alam, C. F. Ahmed, M. Samiullah, and C. K. Leung, “Discriminating frequent pattern based supervised graph embedding for classification,” in Advances in Knowledge Discovery and Data Mining, pp. 16–28, 2021
  3. J. Han, J. Pei, and Y. Yin, “Mining frequent patterns without candidate generation,” vol. 29, no. 2. ACM, pp. 1–12, 2000.
  4. J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, and M.-C. Hsu, “Prefixspan: Mining sequential patterns efficiently by prefixprojected pattern growth,” in ICDE. IEEE, 2001.
  5. R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tompkins, and E. Upfal, “The web as a graph,” in Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pp. 1–10, 2000
  6. L. Tang and H. Liu, “Graph mining applications to social network analysis,” in Managing and Mining Graph Data. Springer, pp. 487–513, 2010.
  7. X. Yan and J. Han, “gspan: graph-based substructure pattern mining,” in ICDM, 2002
  8. M. Kuramochi and G. Karypis, “Frequent subgraph discovery,” in ICDM. IEEE, 2001
  9. M. T. Alam, C. F. Ahmed, M. Samiullah, and C. K. Leung, “Mining frequent patterns from hypergraph databases,” in Advances in Knowledge Discovery and Data Mining, pp. 3–15, 2021
  10. Q. Wang, Z. Mao, B. Wang, and L. Guo, “Knowledge graph embedding: A survey of approaches and applications,” IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 12, pp. 2724–2743, 2017
  11. J. Pujara, H. Miao, L. Getoor, and W. Cohen, “Knowledge graph identification,” in International Semantic Web Conference. Springer, pp. 542–557, 2013
  12. X. Chen, S. Jia, and Y. Xiang, “A review: Knowledge reasoning over knowledge graph,” Expert Systems with Applications, vol. 141, p. 112948, 2020
  13. R. Srikant, Q. Vu, and R. Agrawal, “Mining association rules with item constraints.” in KDD, vol. 97, pp. 67–73, 1997.
  14. A. Inokuchi, T. Washio, and H. Motoda, “An apriori-based algorithm for mining frequent substructures from graph data,” in European conference on principles of data mining and knowledge discovery. Springer, 2000
  15. N. Wale, I. A. Watson, and G. Karypis, “Comparison of descriptor spaces for chemical compound retrieval and classification,” Knowledge and Information Systems, vol. 14, no. 3, pp. 347–375, 2008
  16. K. M. Borgwardt, C. S. Ong, S. Schönauer, S. Vishwanathan, A. J. Smola, and H.-P. Kriegel, “Protein function prediction via graph kernels,” Bioinformatics, vol. 21, no. suppl_1, pp. i47–i56, 2005
  17. P. Yanardag and S. Vishwanathan, “Deep graph kernels,” in ACM SIGKDD, pp. 1365–1374, 2015.
  18. A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Hansch, “Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity,” Journal of medicinal chemistry, vol. 34, no. 2, pp. 786–797, 1991
  19. H. Toivonen, A. Srinivasan, R. D. King, S. Kramer, and C. Helma, “Statistical evaluation of the predictive toxicology challenge 2000–2001,” Bioinformatics, vol. 19, no. 10, pp. 1183–1193, 2003
  20. S. Nijssen and J. N. Kok, “A quickstart in frequent structure mining can make a difference,” in Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 647–652, 2004
  21. B. Vo, D. Nguyen, and T.-L. Nguyen, “A parallel algorithm for frequent subgraph mining,” in Advanced Computational Methods for Knowledge Engineering. Springer, pp. 163–173, 2015.
  22. D. Nguyen, W. Luo, T. D. Nguyen, S. Venkatesh, and D. Phung, “Learning graph representation via frequent subgraphs,” in Proceedings of the 2018 SIAM International Conference on Data Mining. SIAM, pp. 306–314, 2018.