A Distributed Storage Model for Healthcare Big Data Designed on HBase.

Lu Zhang, Qi Li, Ye Li, Yunpeng Cai

Abstract

With the explosive growth of healthcare data, traditional relational database management systems (RDBMS) are limited in scalability, storage of unstructured data, concurrency, and cost. We therefore propose a snowflake model based on HBase, with a multi-table structure and three kinds of index tables, for efficient access to large-scale health records. We also propose a guideline for designing index tables that covers the demands of most healthcare data processing applications. A benchmark test with six types of queries was carried out on a large dataset comprising 750 million records to compare the performance of the proposed model against the traditional tall-table model on HBase. The snowflake model was more efficient than the tall-table model. Adopting index tables greatly improved query speed and enabled real-time queries for both models. Overall, the snowflake model is an advantageous alternative for managing large-scale healthcare data.
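The role of an index table can be illustrated with a minimal sketch. The abstract does not specify the row-key layout or the three index types, so everything below is an assumption made for illustration: the `SortedTable` class is a toy in-memory stand-in for a table sorted by row key (it is not the HBase client API), and the `diagnosis|record_id` key format, the column names, and the sample records are hypothetical.

```python
from bisect import bisect_left


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys in sorted order
        self._rows = {}   # row key -> dict of column values

    def put(self, row_key, columns):
        if row_key not in self._rows:
            self._keys.insert(bisect_left(self._keys, row_key), row_key)
            self._rows[row_key] = {}
        self._rows[row_key].update(columns)

    def full_scan(self):
        # Visits every row; cost grows with the size of the table.
        for key in self._keys:
            yield key, self._rows[key]

    def prefix_scan(self, prefix):
        # Seeks to the first key >= prefix, then reads only matching rows.
        i = bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


records = SortedTable()          # hypothetical data table, keyed by record id
diagnosis_index = SortedTable()  # hypothetical index table, keyed by value|record id


def store_record(record_id, patient_id, diagnosis):
    # Write the record and its index entry together.
    records.put(record_id, {"patient_id": patient_id, "diagnosis": diagnosis})
    diagnosis_index.put(f"{diagnosis}|{record_id}", {})


def find_by_diagnosis_scan(diagnosis):
    # Without an index: filter every row of the data table.
    return [key for key, cols in records.full_scan()
            if cols["diagnosis"] == diagnosis]


def find_by_diagnosis_index(diagnosis):
    # With an index: one short prefix scan over the index table.
    prefix = f"{diagnosis}|"
    return [key[len(prefix):] for key, _ in diagnosis_index.prefix_scan(prefix)]


store_record("r001", "p1", "diabetes")
store_record("r002", "p2", "asthma")
store_record("r003", "p1", "diabetes")
```

Both lookup functions return the same record ids, but the indexed version reads only the rows whose key starts with the queried value, which is the general reason a secondary index table can turn a table-wide filter into a short range read.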

MeSH Term

Benchmarking
Big Data
Database Management Systems
Medical Records Systems, Computerized
