UNIFORM SELECTION ALGORITHM FOR BIG DATA RANDOM PROCESS AND REPEAT USE BASED ON TOPOLOGICAL LOCATION

Published 30 Jun 2019 • vol 127


Authors:

 

Jaehoon You, Department of Computer Engineering, Chung-Nam National University, Korea
Cheong Youn, Department of Computer Engineering, Chung-Nam National University, Korea

Abstract:

 

Since the information revolution, massive amounts of online content have been placed on the Internet, and such content can be used indefinitely as long as there is demand for it. In the past, such content was disposable, but with the development of technology the number of products with a high frequency of use has increased explosively. Existing algorithms are biased towards simple sorting and aggregation. In this paper, we define an algorithm based on topological location that guarantees uniform selection, weighted selection, and fast processing for problems that arise from the mass consumption and repeated reuse of tangible and intangible products, and we attempt to demonstrate these properties through experiments. The simple random sampling method has two disadvantages: it cannot guarantee uniform selection, and it offers no way to select using different weights. Hence, the proposed algorithm gives priority to a uniform probability distribution without making an assumption about the distribution. Further, by digitizing the selection frequency into a topographical position, which narrows the selection range, the uniformity of random selection is ensured and the processing speed for mass selection can be dramatically increased.
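
The idea of narrowing the selection range by selection frequency can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are not given in the abstract. It assumes that each item's "position" is simply the number of times it has been selected so far, and that every draw is restricted to the items at the lowest position. The class name `LevelSelector` and all of its attributes are hypothetical, and the items are assumed to be unique and hashable.

```python
import random
from collections import defaultdict


class LevelSelector:
    """Hypothetical sketch: draw only from the least-selected items."""

    def __init__(self, items, seed=None):
        self._rng = random.Random(seed)
        items = list(items)
        # counts[item] is the number of times the item has been selected.
        self.counts = {item: 0 for item in items}
        # _levels[k] holds the items that have been selected exactly k times.
        self._levels = defaultdict(list)
        self._levels[0] = items
        self._lowest = 0

    def select(self):
        """Return one item chosen uniformly from the lowest level."""
        if not self.counts:
            raise ValueError("no items to select from")
        # Advance past exhausted levels; the next level is never empty
        # because every selected item was moved into it.
        while not self._levels[self._lowest]:
            del self._levels[self._lowest]
            self._lowest += 1
        bucket = self._levels[self._lowest]
        # Swap the chosen item to the end so removal is O(1).
        i = self._rng.randrange(len(bucket))
        bucket[i], bucket[-1] = bucket[-1], bucket[i]
        item = bucket.pop()
        self.counts[item] += 1
        self._levels[self._lowest + 1].append(item)
        return item
```

Under these assumptions, the selection counts of any two items never differ by more than one, whereas repeated calls to `random.choice` on the full set give no such bound. Each call to `select` also touches only one bucket, so its cost does not grow with the number of draws already made.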

Keywords:

 

Algorithm, Random, Big data, Select, Topography, Process

Citations:

 

APA:
You, J., & Youn, C. (2019). Uniform Selection Algorithm for Big Data Random Process and Repeat Use Based on Topological Location. International Journal of Advanced Science and Technology (IJAST), ISSN: 2005-4238 (Print); 2207-6360 (Online), NADIA, 127, 25-40. doi: 10.33832/ijast.2019.127.03.

MLA:
You, Jaehoon, and Cheong Youn. “Uniform Selection Algorithm for Big Data Random Process and Repeat Use Based on Topological Location.” International Journal of Advanced Science and Technology, ISSN: 2005-4238 (Print); 2207-6360 (Online), NADIA, vol. 127, 2019, pp. 25-40. IJAST, http://article.nadiapub.com/IJAST/Vol127/3.html.

IEEE:
[1] J. You and C. Youn, “Uniform Selection Algorithm for Big Data Random Process and Repeat Use Based on Topological Location,” International Journal of Advanced Science and Technology (IJAST), ISSN: 2005-4238 (Print); 2207-6360 (Online), NADIA, vol. 127, pp. 25-40, Jun. 2019.