Abstract: Processing and mining diabetes Electronic Medical Records (EMR), which contain massive amounts of data, is of great significance. Using SQL statements and functions in SQL Server 2008, this paper preprocesses diabetes EMR data, including data cleaning, integration, transformation, and reduction. The preprocessing eliminates noisy, incomplete, and inconsistent data and transforms unstructured text data into structured numerical data.