Abstract: This paper introduces the data structure and conceptual hierarchy of pathology texts. Starting from unstructured pathology texts, it first analyzes the structure of the texts other than their prefaces. It then extracts patterns from the texts and generalizes them by pattern matching. Finally, it extracts structured information from the segmented word sequence. Experiments show that the method achieves high accuracy and recall.
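The last step, pulling structured fields out of a sequence of segmented words by matching it against a pattern, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the pattern format (literal tokens mixed with named regex slots), the name `match_pattern`, and the sample tokens.

```python
import re

# Hypothetical pattern format: each element is either a literal token (str)
# or a (slot_name, regex) pair that captures a field value.
SIZE_PATTERN = ["tumor", "size", ("value", r"\d+(\.\d+)?"), ("unit", r"cm|mm")]


def match_pattern(tokens, pattern):
    """Return the slot values of the first contiguous token window that
    matches the pattern, or None if no window matches."""
    n = len(pattern)
    for start in range(len(tokens) - n + 1):
        slots = {}
        for tok, elem in zip(tokens[start:start + n], pattern):
            if isinstance(elem, str):
                # Literal element: the token must be identical.
                if tok != elem:
                    break
            else:
                # Slot element: the whole token must match the regex.
                name, regex = elem
                if re.fullmatch(regex, tok) is None:
                    break
                slots[name] = tok
        else:
            # Inner loop finished without break: every element matched.
            return slots
    return None


# Illustrative, already-segmented input (invented for this sketch).
tokens = ["the", "tumor", "size", "2.5", "cm", "with", "clear", "margin"]
record = match_pattern(tokens, SIZE_PATTERN)
```

In this sketch, generalizing a pattern would amount to replacing literal tokens with slots, so that one pattern covers many concrete sentences.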