Abstract: Without altering the decentralized storage structure of HDFS, dynamic replica storage was combined with Galois finite field theory to optimize the computation scheme of the Vandermonde code, reducing the time cost and memory pressure of encoding and decoding. About 35% of HDFS storage cost was saved, and both node load balancing and decoding efficiency were improved. The algorithm is better suited to processing professional medical documents and meets the needs of clinical research and data supply. It saves storage capacity, accommodates growing and increasingly complex medical data, lowers hardware server costs and therefore capital expenditure for hospitals, and lets effective data in the data pool be queried and retrieved quickly, so that dormant data are put to use and their clinical and research value is fully realized. This complete and systematic optimization scheme offers an effective path for the future development of HDFS.
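To make the coding idea concrete, the following is a minimal sketch of a Vandermonde erasure code over the Galois field GF(2^8). It is not the authors' optimized scheme: the field polynomial 0x11d, the evaluation points 0..n-1, the parameters k = 4 and n = 6, and all function names are assumptions chosen for illustration only.

```python
# Illustrative Vandermonde erasure code over GF(2^8); all names are hypothetical.
# Assumption: primitive polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d), generator 2.
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]  # duplicate the table so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two field elements using log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    return EXP[255 - LOG[a]]


def gf_pow(a, e):
    """Raise a to the e-th power, with the convention a^0 = 1."""
    if e == 0:
        return 1
    if a == 0:
        return 0
    return EXP[(LOG[a] * e) % 255]


def vandermonde_row(point, k):
    """Row [point^0, point^1, ..., point^(k-1)] of the Vandermonde matrix."""
    return [gf_pow(point, j) for j in range(k)]


def encode(data, n):
    """Encode k data symbols into n coded symbols (non-systematic)."""
    k = len(data)
    coded = []
    for point in range(n):  # distinct evaluation points 0..n-1
        acc = 0
        for coeff, symbol in zip(vandermonde_row(point, k), data):
            acc ^= gf_mul(coeff, symbol)  # addition in GF(2^8) is XOR
        coded.append(acc)
    return coded


def decode(shares, k):
    """Recover the k data symbols from any k surviving {index: symbol} shares."""
    idx = sorted(shares)[:k]
    # Augmented matrix [V | c], solved by Gauss-Jordan elimination.
    a = [vandermonde_row(i, k) + [shares[i]] for i in idx]
    for col in range(k):
        pivot = next(r for r in range(col, k) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        inv = gf_inv(a[col][col])
        a[col] = [gf_mul(inv, v) for v in a[col]]
        for r in range(k):
            if r != col and a[r][col] != 0:
                f = a[r][col]
                a[r] = [v ^ gf_mul(f, w) for v, w in zip(a[r], a[col])]
    return [a[r][k] for r in range(k)]


# Usage: 4 data symbols, 6 coded symbols, symbols 0 and 3 lost.
data = [12, 200, 7, 99]
coded = encode(data, 6)
survivors = {i: coded[i] for i in (1, 2, 4, 5)}
assert decode(survivors, 4) == data
```

With these illustrative parameters the coded data occupy n/k = 1.5 times the original size and tolerate the loss of any two symbols, whereas plain three-way replication occupies 3 times the original size. The parameters actually used in the study are not stated in the abstract.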
杨莲, 郭良君, 马磊, 王圣芳. 大数据环境下hadoop分布式文件系统分散式动态副本存储优化策略研究[J]. 中国医院统计, 2019, 26(1): 75-78.
Yang Lian, Guo Liangjun, Ma Lei, Wang Shengfang. Research on HDFS decentralized dynamic replica storage optimization strategy in big data environment[J]. Chinese Journal of Hospital Statistics, 2019, 26(1): 75-78.