IMPROVING RANKING PROCEDURE IN INFORMATION RETRIEVAL PROCESS USING SCRANK ALGORITHM

Published 30 November 2020 • vol 144


Authors:

 

Shadab Irfan, Department of SCSE, Galgotias University, Greater Noida, Uttar Pradesh, India
D Rajesh Kumar, Department of SCSE, Galgotias University, Greater Noida, Uttar Pradesh, India

Abstract:

 

The exponential growth of the World Wide Web has made it difficult to find the web pages relevant to a search query, and users spend much of their time retrieving the information they need. The Internet holds an abundance of information, and people explore it to find what is relevant to their areas of interest. The core component of any search engine is its ranking framework, which ranks web pages in response to user queries. A good ranking system should not rest on a single criterion but should combine inputs from multiple sources to produce a comprehensive ranking. In particular, Information Retrieval can be improved by combining content similarity with link analysis. In this paper, a ranking algorithm, scRank, is proposed for ranking web pages efficiently. The web pages are clustered on the basis of the user's query by applying certain conditions and are then ranked. An illustrative example given in the paper demonstrates the working of the model and compares scRank with the traditional PageRank approach. The proposed approach reduces complexity by reducing the number of iterations and ranks documents in less time. The work is tested on a dataset; ranking quality, assessed with mean average precision and normalized discounted cumulative gain, shows an improvement, along with reductions in execution time and number of iterations.
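The abstract names two ranking-quality measures, mean average precision (MAP) and normalized discounted cumulative gain (NDCG). As a minimal sketch of how these measures are commonly computed, and not the authors' evaluation code, the Python below uses the textbook definitions. The function names are hypothetical. Two assumptions are made: each ranked list is taken to contain all relevant documents for its query, and DCG uses the linear-gain form (an exponential-gain variant also exists).

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of binary labels (1 = relevant).

    Assumption: every relevant document for the query appears in the list,
    so the sum is normalized by the number of relevant documents found.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the average precision over several queries."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0


def dcg(gains):
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

Each inner list passed to `mean_average_precision` holds the relevance labels of one query's results in ranked order; `ndcg` takes graded relevance scores in the same order. Both measures lie between 0 and 1, and higher values indicate a better ranking.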

Keywords:

 

Information Retrieval, IR Models, Ranking Algorithm, Web Mining, Clustering, Document Similarity

Citations:

 

APA:
Irfan, S., & Kumar, D. R. (2020). Improving Ranking Procedure in Information Retrieval Process Using Scrank Algorithm. International Journal of Advanced Science and Technology (IJAST), ISSN: 2005-4238 (Print); 2207-6360 (Online), NADIA, 144, 1-16. doi: 10.33832/ijast.2020.144.01.

MLA:
Irfan, Shadab, et al. “Improving Ranking Procedure in Information Retrieval Process Using Scrank Algorithm.” International Journal of Advanced Science and Technology, ISSN: 2005-4238(Print); 2207-6360 (Online), NADIA, vol. 144, 2020, pp. 1-16. IJAST, http://article.nadiapub.com/IJAST/Vol144/1.html.

IEEE:
[1] S. Irfan and D. Rajesh Kumar, "Improving Ranking Procedure in Information Retrieval Process Using Scrank Algorithm," International Journal of Advanced Science and Technology (IJAST), ISSN: 2005-4238 (Print); 2207-6360 (Online), NADIA, vol. 144, pp. 1-16, November 2020.