
Health and medical big data review

2017-09-01 Source: World Medical Devices

Author: Zhang Jiwu. Published in August 2017 in a special issue of "World of Medical Devices". (If you cite this article, please indicate the source. Thank you.)

The progress of human society, like the progress of nature and the advancement of science, follows its own laws and connections. Big data did not emerge suddenly; it is a stage, or form, of the development of information technology.
Take the health industry as an example.
(1) Medical informatization converts analog quantities into digital information;
(2) The cloud provides a technology platform for distributed management and services;
(3) Big data: building on these technologies, humans can acquire, store, and process data in large quantities, and thereby extract information, rules, and knowledge to serve humanity.
The three do not replace one another but support each other, and all three are essentially forms of informatization.
Definition of big data: a data collection so large or complex that traditional data-processing application software is inadequate to handle it (adapted from Baidu and Google).
The development of big data involves the medical industry, the information industry, standardization, precision medicine, and more. Correspondingly, data ownership, security, and privacy are supported by different methodologies and regulations internationally and domestically.
At present, big data is recognized as having four basic characteristics (this article focuses on the field of health care):

First, the amount of data is huge

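The smallest basic unit of data is the bit. The units, in ascending order, are: bit, Byte, KB, MB, GB, TB, PB, EB, ZB, YB, BB, NB, DB (the last three are informal names rather than standardized units). One Byte is 8 bits; from Byte upward, each unit is 1024 (2 to the power of ten) times the previous one.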
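The unit ladder can be turned into a small conversion helper. This is a minimal sketch, assuming 8 bits per Byte and a factor of 1024 between successive units from Byte upward; the function names `to_bytes`, `to_bits`, and `humanize` are illustrative and not part of the original article.

```python
# Unit names as listed in the text, from Byte upward. BB, NB and DB are
# informal names; they appear here only because the text lists them.
UNITS = ["Byte", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB", "BB", "NB", "DB"]

BITS_PER_BYTE = 8
STEP = 1024  # 2 ** 10


def to_bytes(value, unit):
    """Convert a value expressed in `unit` to a number of bytes."""
    exponent = UNITS.index(unit)
    return value * STEP ** exponent


def to_bits(value, unit):
    """Convert a value expressed in `unit` to a number of bits."""
    return to_bytes(value, unit) * BITS_PER_BYTE


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < STEP or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= STEP


print(to_bytes(1, "TB"))   # number of bytes in one TB
print(humanize(1536))      # 1.50 KB
```

Keeping the units in one ordered list means the exponent is simply the position of the unit in the list, so adding or removing a unit name does not require changing the arithmetic.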

Second, the variety of data types
Big data includes structured, semi-structured, and unstructured data, and unstructured data makes up a growing share of the total. The main sources of health care big data include:

*Structured electronic medical record (EMR) data
*Unstructured clinical notes
*Medical imaging data
*Gene data
*Other data (epidemiological and behavioral)

1. Electronic medical record data
Data types commonly used in the United States:
a. International Classification of Diseases (ICD): a classification of diseases, signs, symptoms, and procedure codes, maintained by the World Health Organization (WHO);
b. Current Procedural Terminology (CPT): a set of medical procedure codes maintained by the American Medical Association through the CPT Editorial Panel;
c. Laboratory results (Lab): the standard coding system for laboratory results is Logical Observation Identifiers Names and Codes (LOINC®);
d. Medication: the standard coding system is the National Drug Code (NDC) of the US Food and Drug Administration (FDA), which provides a unique identifier for each drug;
e. Clinical notes.
2. Medical image data
As of 2015, a typical hospital holds about 665 terabytes of patient data, 80% of which is unstructured medical image data such as CT, magnetic resonance, and digital X-ray.
The main challenge of medical imaging data is not only its sheer volume but also its high dimensionality and complexity. Extracting the important and relevant features of an image is a daunting task. The challenges include:
a. Extracting meaningful features
b. Selecting relevant features (sparsity and dimensionality-reduction techniques)
c. Integrating with other clinical data
At present, the main results have been in extracting image features for image retrieval.
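Feature selection, the second challenge listed above, can be illustrated with a very simple filter. This is a minimal sketch that drops feature columns whose variance falls below a threshold; variance thresholding is used here only as a stand-in for the sparsity and dimensionality-reduction techniques the text mentions, and the feature vectors, the threshold, and the function names are all hypothetical.

```python
from statistics import pvariance


def select_by_variance(feature_rows, threshold):
    """Return the indices of feature columns whose variance exceeds `threshold`.

    feature_rows: list of equal-length feature vectors, one per image.
    """
    columns = list(zip(*feature_rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def project(feature_rows, keep):
    """Keep only the selected feature columns in every row."""
    return [[row[i] for i in keep] for row in feature_rows]


# Toy feature vectors (made up). Column 1 is constant across images,
# so it cannot help distinguish one image from another.
features = [
    [0.1, 5.0, 3.0],
    [0.9, 5.0, 1.0],
    [0.5, 5.0, 2.0],
]
keep = select_by_variance(features, threshold=0.01)
reduced = project(features, keep)
print(keep)     # [0, 2]
print(reduced)  # [[0.1, 3.0], [0.9, 1.0], [0.5, 2.0]]
```

Real imaging pipelines work with thousands of features per image, where such a filter would typically be only a first pass before more capable methods.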

3. Gene data
The human genome contains approximately 3 billion DNA base pairs, whose sequence is written with four bases: thymine (T), adenine (A), cytosine (C), and guanine (G). Some of these base pairs make up about 20,000 to 25,000 genes. The genetic data for one person amounts to about 3 GB.
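The 3 GB figure is consistent with storing one character per base. The short calculation below is a sketch under stated assumptions: one byte per base for plain-text storage, 2 bits per base for a packed encoding of the four letters, and 1 GB taken as 10^9 bytes. None of these storage choices come from the original article.

```python
BASE_PAIRS = 3_000_000_000  # approximate figure from the text

# Plain text: one ASCII character (1 byte) per base.
bytes_one_char_per_base = BASE_PAIRS * 1

# Packed: four possible bases need only 2 bits each, so 4 bases fit in 1 byte.
bytes_two_bits_per_base = BASE_PAIRS * 2 // 8

GB = 10 ** 9
print(bytes_one_char_per_base / GB)  # 3.0
print(bytes_two_bits_per_base / GB)  # 0.75
```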

4. Behavioral and public health data
Social media generates a large amount of data through blogs, WeChat, SMS, Facebook, Twitter, and so on. Teams outside China have used this data to analyze behavior, the use of certain drugs, and even to predict epidemics (almost in real time). The chart below shows the correspondence between Google Flu Trends analysis and actual flu outbreaks.
There are also organizations such as PatientsLikeMe (which connects patients by disease) that use big data analysis of social media to offer patients symptom-based treatment advice, treatment analysis, and so on. Continuously recording an individual's data at home (for example, with the iPhone's motion sensors) makes it possible to analyze that individual's health and behavior and to raise alerts.
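The correspondence between a search-trend signal and reported flu cases, mentioned above, is commonly quantified with a correlation coefficient. The following is a minimal sketch of a Pearson correlation; it is not the method used by Google Flu Trends, and the two weekly series are invented for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented weekly values, for illustration only (not real surveillance data).
search_index = [12, 15, 22, 40, 38, 25]
reported_cases = [100, 130, 210, 390, 360, 240]

r = pearson(search_index, reported_cases)
print(round(r, 3))
```

A value of r close to 1 means the two series rise and fall together; it does not by itself show that the search signal can predict outbreaks ahead of time.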

Third, the keys to the current development of health and medical big data
The application and study of big data are multi-layered: first data acquisition and data modeling, then data processing, analysis, knowledge acquisition, cognition, and application.
For big data applications in China, the current strategic priority is to break through the bottlenecks of data acquisition and data modeling, and to study the corresponding system platforms and methodologies. This includes: key technologies for data acquisition, and the establishment and promotion of interoperability standards for data collection; data quality, including data patterns, management of heterogeneous data, correlation between data, and the time distribution of data; data mining, meaning the extraction of feature parameters from clinical data; data application, meaning clinical data mining methodology applied to clinical decision support system (CDSS) models for assisted diagnosis; and precision medicine research.
1. Collection of medical big data
With a large amount of data, the correlations among diseases, symptoms, and laboratory data can be analyzed, helping clinical researchers build predictive models for some typical diseases. In the course of diagnosis and treatment, each hospital department accumulates long-term clinical monitoring parameters related to specific diseases for its own applications, so a large amount of data builds up as the hospital operates.
At the same time, with the development of the mobile Internet and of wearable medical devices, vital signs collected through various wearable devices make it much easier to acquire users' health data.
On the one hand, this health data can be analyzed to obtain information about the user's health and to guide habits such as exercise and diet; on the other hand, combining it with medical data can make the diagnosis of the user's disease more scientific and more accurate.
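As an illustration of turning wearable readings into health information, the sketch below summarizes a series of heart-rate samples and flags values outside a range. The bounds `LOW_BPM` and `HIGH_BPM`, the function name, and the sample readings are hypothetical and are not clinical guidance.

```python
from statistics import mean

# Hypothetical resting heart-rate bounds in beats per minute.
# Chosen for illustration only; not clinical guidance.
LOW_BPM = 50
HIGH_BPM = 100


def summarize_heart_rate(readings, low=LOW_BPM, high=HIGH_BPM):
    """Summarize heart-rate readings and list those outside [low, high]."""
    if not readings:
        raise ValueError("no readings")
    out_of_range = [r for r in readings if r < low or r > high]
    return {
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
        "out_of_range": out_of_range,
    }


# Made-up samples from a hypothetical wearable device.
summary = summarize_heart_rate([62, 70, 68, 110, 66])
print(summary["out_of_range"])  # [110]
```

In practice such a summary would be only one input; as the text notes, it becomes more useful when combined with the user's medical data.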

2. Analysis of medical big data
In the traditional medical industry, hospital information systems already handle process control and data accumulation within the hospital. The medical industry has long faced the challenge of massive and unstructured data. In recent years, many countries have actively promoted medical informatization, which has given many medical institutions the funds to carry out big data analysis. Medical data is the data generated by medical personnel during a patient's diagnosis and treatment, including the patient's basic condition, behavior data, treatment data, management data, examination data, electronic medical records, and so on. In modern hospitals, this data is stored in the hospital's various information systems and forms the basis of medical big data analysis.
Medical and health data is complex, continuous, and fast-growing, and the information it contains is rich and varied. Effectively storing, processing, querying, and analyzing this data, tapping its potential value, and discovering medical knowledge will deeply affect the level of human health and of treatment. On top of traditional medical statistical methods, new models and technologies offer new ways of acquiring knowledge from data.
For different types of patients, different types of physiological data and health-sensing data are used for reasoning and judgment. In this way, big data analysis serves clinical treatment, predicts the incidence of diseases, and tracks the patient's condition.

3. Application of medical big data
Based on the collection and analysis of a user's medical data and health monitoring data, it is possible to predict and monitor the user's physical condition, and even to determine which types of disease the user is susceptible to, thereby improving the user's health and reducing the risk of disease. Accurate analysis of large data sets that include patient vital-sign data, cost data, and efficacy data can help doctors determine the most effective and most cost-effective treatments in the clinic. Medical care systems will be able to reduce over-treatment, for example by avoiding treatments whose side effects outweigh their efficacy.
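The idea of combining cost and efficacy data, and of avoiding treatments whose side effects outweigh their benefit, can be sketched as a simple ranking. Everything below is hypothetical: the `Treatment` fields, the scoring rule (cost per unit of net benefit), and the example values are illustrative assumptions, not a method described in the article.

```python
from dataclasses import dataclass


@dataclass
class Treatment:
    name: str
    cost: float          # total cost per patient
    efficacy: float      # benefit score, higher is better
    side_effects: float  # harm score on the same scale as efficacy


def rank_treatments(treatments):
    """Drop treatments whose harm is not below their benefit,
    then sort the rest by cost per unit of net benefit (lowest first)."""
    viable = [t for t in treatments if t.efficacy > t.side_effects]
    return sorted(viable, key=lambda t: t.cost / (t.efficacy - t.side_effects))


# Made-up options for illustration only.
options = [
    Treatment("A", cost=1000.0, efficacy=0.8, side_effects=0.1),
    Treatment("B", cost=400.0, efficacy=0.5, side_effects=0.1),
    Treatment("C", cost=300.0, efficacy=0.2, side_effects=0.4),
]
ranked = [t.name for t in rank_treatments(options)]
print(ranked)  # ['B', 'A']
```

Treatment C is excluded because its side-effect score exceeds its efficacy score, which mirrors the over-treatment case described in the text.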

Fourth, technical fields related to big data
The technology areas related to big data include wearable devices and the Internet of Things (IoT).

Fifth, summary
McKinsey research shows that applying big data technology could save the health care industry $450 billion in costs. The benefits fall into five areas:
1. Right living: sufficient information and knowledge help people effectively prevent specific diseases, maintain their health and treatment over time, and adopt a more active lifestyle to improve their health;
2. Right care: big data analysis helps ensure that the treatment plan is correct (clinical decision support, clinical pathways) and lets different medical staff cooperate on the basis of shared information;
3. Right provider: analysis of health care providers' capabilities and track records helps both parties choose the right service provider;
4. Right value: big data can clearly help control medical expenses and improve the quality of care;
5. Right innovation: big data is important for developing new drugs and new treatment plans, and can shorten the clinical phase required by drug regulators.