Enhanced K-means Clustering Approach for Health Care Analysis Using Clinical Documents

  • Effat Naaz SCSE Department of VIT, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014
  • Divya Sharma SCSE Department of VIT, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014
  • D. Sirisha SCSE Department of VIT, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014
  • M. Venkatesan SCSE Department of VIT, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014
Keywords: clinical note; clinical document; document clustering; K-means clustering; machine learning


Clinical documents contain an enormous amount of medical information. They are a gold mine of information on diseases, their symptoms, and the medications prescribed to treat them. Applying data mining techniques to this clinical data is a vital way to improve the current healthcare system by making it more efficient. We define an approach to building a system that first pre-processes the clinical documents, since pre-processing textual data improves clustering performance. We then apply K-means clustering to the pre-processed notes. Extracting symptoms and medication names from the clustered data results in improved medication recommendation. Our experiments show that K-means clustering is a well-suited approach for clustering clinical documents.
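
The pre-process-then-cluster pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the stop-word list, the TF-IDF weighting, the choice of the first k notes as initial centroids, and the four toy notes are all assumptions made for illustration, and the function names (`preprocess`, `tfidf_vectors`, `kmeans`) are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not taken from the paper).
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "with", "was", "is", "on", "to", "in"}


def preprocess(note):
    """Lowercase the note, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z]+", note.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf_vectors(token_lists):
    """Build L2-normalised TF-IDF vectors over a shared vocabulary."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    n_docs = len(token_lists)
    doc_freq = Counter(t for tokens in token_lists for t in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vec = [counts[t] * (math.log(n_docs / doc_freq[t]) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's algorithm; the first k vectors seed the centroids (a simplification)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up notes, for illustration only.
notes = [
    "Patient reports chest pain and shortness of breath; prescribed aspirin.",
    "Persistent cough with fever; prescribed amoxicillin.",
    "Chest pain on exertion, shortness of breath; continued aspirin.",
    "Fever and cough for three days; started amoxicillin.",
]

labels = kmeans(tfidf_vectors([preprocess(n) for n in notes]), k=2)
print(labels)
```

On these toy notes the two chest-pain notes end up in one cluster and the two cough-and-fever notes in the other. A real system would likely need a better initialisation (for example, several random restarts) and a far richer pre-processing step than this sketch shows.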