Cardiovascular disease is an increasing threat to human health, and accurate prediction of a patient’s condition is important for early prevention. This paper describes our research on predicting patients’ risk of cardiovascular disease from their physical examination reports. We quantify this risk with five indicators: systolic pressure, diastolic pressure, triglyceride, high-density lipoprotein cholesterol, and low-density lipoprotein cholesterol. To extract useful information from the medical records, we use natural language processing (NLP) methods. To convert sentences into numerical data, we use the term frequency-inverse document frequency (TF-IDF) algorithm, which extracts the major information from the medical reports. Principal component analysis (PCA) is then used to reduce the high dimensionality of the text features. Additionally, we extract numerical and categorical features that are easy to transform. Combining all these features, we use the XGBoost algorithm to make the final predictions. The results show that the mean square error and relative error can be kept at an acceptably low level.
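To make the TF-IDF step concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes whitespace tokenization and the unsmoothed weighting tf × log(N/df); the abstract does not state which tokenizer or TF-IDF variant was used, and the report snippets are invented for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Assumption: simple lowercase + whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical report snippets, invented for illustration only.
reports = [
    "blood pressure elevated mild fatty liver",
    "blood pressure normal no abnormality",
    "fatty liver blood lipids elevated",
]
vectors = tfidf(reports)
```

A term that occurs in every report (here "blood") receives weight zero, while a term unique to one report (here "mild") receives the largest weight in that report. In the pipeline described above, vectors of this kind would then be reduced with PCA and combined with the numerical and categorical features before regression.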
10 January 2019
INTERNATIONAL CONFERENCE ON FRONTIERS OF BIOLOGICAL SCIENCES AND ENGINEERING (FBSE 2018)
23–24 November 2018
Chongqing City, China
Research Article
A data mining approach to predict risk of cardiovascular
Shaopeng Ma 1,a); Xiong Chen 1,b)
1 Department of Electronic Engineering, Fudan University, Shanghai 200433, China
b) Corresponding author email: chenxiong@fudan.edu.cn
AIP Conf. Proc. 2058, 020014 (2019)
Citation
Shaopeng Ma, Xiong Chen; A data mining approach to predict risk of cardiovascular. AIP Conf. Proc. 10 January 2019; 2058 (1): 020014. https://doi.org/10.1063/1.5085527