Your conditions: 吴凯莹
  • Clinical Term Normalization Based on Multiple Strategies

    Subjects: Computer Science >> Natural Language Understanding and Machine Translation; submitted 2023-07-11

    Abstract: Clinical term normalization is an important research problem because clinical terminology in electronic medical records is often non-standardized. The current mainstream solution adopts a "recall-sort" strategy. Based on the dataset provided in Evaluation 3 of the China Conference of Health Information Processing, we propose a multi-strategy normalization method for clinical terms. In the recall phase, the candidate standard word set is recalled using three strategies: full matching, recommendation of the standard words of similar original words, and similarity calculation based on TF-IDF and an improved Jaccard coefficient. At the same time, we construct a BERT-based model that predicts the number of standard words (the quantity prediction model), and use adversarial training, focal loss, and label smoothing to improve its prediction and generalization performance. In the ranking stage, we use a BERT implicit score ranking model, based on adversarial training and the fusion of diagnostic information, to rank the candidate word set, and then generate the final predicted standard words based on the output of the quantity prediction model. On the final evaluation test set, our method reached an accuracy of 0.6356, ranking second among the participating teams.
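
    The recall phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the improved Jaccard coefficient, so the plain Jaccard coefficient over character bigrams is used here as a stand-in, the TF-IDF and similar-original-word strategies are omitted, and the function names (`char_bigrams`, `jaccard`, `recall_candidates`) and the `top_k` parameter are hypothetical.

```python
def char_bigrams(term):
    """Split a term into character bigrams (the term itself if shorter than 2)."""
    if len(term) < 2:
        return [term]
    return [term[i:i + 2] for i in range(len(term) - 1)]


def jaccard(a, b):
    """Plain Jaccard coefficient of two token lists (assumed stand-in for
    the paper's improved variant, which the abstract does not define)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def recall_candidates(original, standard_terms, top_k=3):
    """Recall candidate standard terms for one original term.

    Strategy 1: full matching. If the original term is itself a standard
    term, return it directly.
    Strategy 2: otherwise, score every standard term by Jaccard similarity
    of character bigrams and keep the top_k terms with a non-zero score.
    """
    if original in standard_terms:
        return [original]
    grams = char_bigrams(original)
    scored = [(jaccard(grams, char_bigrams(s)), s) for s in standard_terms]
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [s for score, s in scored[:top_k] if score > 0]
```

    In the pipeline the abstract outlines, a candidate set like this would then be passed to the ranking model, and the quantity prediction model would decide how many of the ranked candidates to output.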