A Survey of Data Mining Techniques on Information Networks

Authors

  • Sadhana Kodali

  • Madhavi Dabbiru

  • B Thirumala Rao

How to Cite

Kodali, S., Dabbiru, M., & Rao, B. T. (2018). A Survey of Data Mining Techniques on Information Networks. International Journal of Engineering and Technology, 7(2.6), 293-300. https://doi.org/10.14419/ijet.v7i2.6.11267

Received date: April 8, 2018

Accepted date: April 8, 2018

Published date: March 11, 2018

DOI:

https://doi.org/10.14419/ijet.v7i2.6.11267

Keywords:

Information Networks, Data Mining Techniques, Homogeneous Information Networks, Heterogeneous Information Networks

Abstract

An information network is a network formed by the interconnections that arise among objects as they interact. Such networks appear in day-to-day life, for example social media networks and the networks formed by interacting web objects. This paper surveys data mining techniques that are applicable to information networks. Techniques for both homogeneous and heterogeneous information networks are discussed in detail, and a comparative study is presented for each problem category.


