Abstract: Drawing on the Shanghai medical big data center, this paper takes as its research object medical big data that has passed quality control but has not yet been put to use. It establishes a data cleaning framework, proposes a method for evaluating data availability, derives cleaning strategies from a cluster analysis of data characteristics, and repeatedly verifies the accuracy and reliability of those strategies, thereby providing strong support for the analysis and utilization of medical big data.