Abstract: Purpose/Significance A method is proposed for extracting data elements from electronic medical records (EMR) based on national standards, supporting fine-grained sharing of EMR data. Method/Process The ALBERT, BiLSTM and CRF models are used to perform sequence labeling on EMR text, and a set of candidate data elements is generated from the labeling results. For each candidate data element, contextual information is collected to form an enhanced key vector. The similarity between this vector and the standard vector is then calculated to determine whether the candidate data element is valid. Result/Conclusion The proposed method achieves an F1 score of 90.32%, indicating that it performs well.
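
The validation step described above, comparing a candidate's enhanced key vector with standard vectors, might be sketched as follows. This is only an illustrative sketch under stated assumptions: the abstract does not name the similarity measure or the decision rule, so cosine similarity, the threshold of 0.8, the function names `cosine_similarity` and `match_candidate`, and the toy vectors are all hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def match_candidate(
    candidate_vec: Sequence[float],
    standard_vecs: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the paper
) -> Tuple[Optional[str], float]:
    """Return the best-matching standard data element and its score.

    The candidate is treated as valid only if its best score reaches the
    threshold; otherwise the returned name is None.
    """
    best_name: Optional[str] = None
    best_score = -1.0
    for name, vec in standard_vecs.items():
        score = cosine_similarity(candidate_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score


# Toy standard vectors; real ones would come from the standard's data elements.
standard = {
    "body_temperature": [1.0, 0.0, 0.0],
    "heart_rate": [0.0, 1.0, 0.0],
}
name, score = match_candidate([0.9, 0.1, 0.0], standard)
rejected, low_score = match_candidate([1.0, 1.0, 1.0], standard)
```

In this toy example the first candidate lies close to the `body_temperature` vector and is accepted, while the second is roughly equidistant from both standard vectors and falls below the assumed threshold, so it is rejected.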