Integrative Complexity Modeling in English and Chinese Texts Based on Large Language Models

Author: Li Dongqi^1,2, Zhu Tingshao^1,2
Institute:

1. Institute of Psychology, Chinese Academy of Sciences, Beijing

2. Department of Psychology, University of Chinese Academy of Sciences, Beijing
Submit Time:2024-04-10 17:09:58

Abstract: Integrative complexity is a psychological construct that measures the structure of an individual’s thinking along two dimensions: differentiation and integration. It is measured primarily by manual analysis of text, which may be written material, speeches, interview transcripts, or any other form of oral or written expression. To address the high cost of manual assessment, the low accuracy of existing automated methods, and the lack of an assessment scheme for Chinese text, this study designed an automated assessment scheme for integrative complexity in Chinese and English texts. We applied large language model text data augmentation and model transfer to the assessment of integrative complexity, and explored automated assessment of its two sub-structures, elaborative integrative complexity and dialectical integrative complexity. Two studies were designed and implemented. Study 1 built a prediction model for the integrative complexity of English text using large language model text data augmentation; Study 2 built a prediction model for the integrative complexity of Chinese text using model transfer. The results showed that: 1) In Study 1, we used GPT-3.5-Turbo for English text data augmentation, a pre-trained multilingual RoBERTa model for word vector extraction, and a text convolutional neural network as the downstream model. The Spearman correlation coefficient between this model’s predictions and the manual scores was 0.62 for integrative complexity, 0.51 for dialectical integrative complexity, and 0.60 for elaborative integrative complexity. It outperformed machine learning methods and neural network models without data augmentation.
2) In Study 2, we built a model with the same structure as the neural network in Study 1 and transferred the final Study 1 parameters to it in order to train an integrative complexity prediction model for Chinese text. In the zero-shot setting, the transfer learning model reached Spearman correlation coefficients of 0.31 for integrative complexity, 0.31 for dialectical integrative complexity, and 0.33 for elaborative integrative complexity, all higher than those of the same model with random parameters (integrative complexity: 0.17, dialectical integrative complexity: 0.10, elaborative integrative complexity: 0.10). In the few-shot setting, the transfer learning model reached Spearman correlation coefficients of 0.73 for integrative complexity, 0.51 for dialectical integrative complexity, and 0.73 for elaborative integrative complexity.
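Every result reported above is a Spearman correlation between model predictions and manual scores. As a minimal sketch of how such a coefficient could be computed, the following pure-Python example ranks both score lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names and the sample scores are illustrative assumptions, not the authors' code or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the window over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical manual and predicted scores (illustrative only, not from the study).
human_scores = [1, 2, 2, 3, 5, 4, 1, 6]
model_scores = [1.2, 1.9, 2.8, 3.1, 4.0, 4.4, 1.5, 5.2]
print(round(spearman(human_scores, model_scores), 3))
```

Because the coefficient depends only on rank order, it rewards a model for ordering texts the way human coders do even when its raw predictions sit on a slightly different scale.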

Keywords: Integrative Complexity; Neural Networks; Large Language Models; Transfer Learning

From: Zhu Tingshao
Subject: Psychology >> Applied Psychology; Computer Science >> Computer Application Technology
Contribution: Not submitted
Cite as: ChinaXiv:202404.00195 (or this version ChinaXiv:202404.00195V1)
DOI:10.12074/202404.00195V1
CSTR:32003.36.ChinaXiv.202404.00195.V1
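Recommended citation: Li Dongqi, Zhu Tingshao. (2024). Integrative Complexity Modeling in English and Chinese Texts Based on Large Language Models. ChinaXiv (Chinese Academy of Sciences preprint platform). doi:10.12074/202404.00195V1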

