
Galaxy Morphology Classification Using a Semi-Supervised Learning Algorithm Based on Dynamic Threshold

  • Authors:
  • Affiliation: not available
  • Submitted: 2023-09-08

Abstract: Machine learning has become a crucial technique for classifying galaxy morphology as the volume of galaxy data has grown rapidly. Traditional supervised learning, however, is costly because it needs a large amount of labeled data to be effective. FixMatch, a semi-supervised learning algorithm, has become a key tool for exploiting large amounts of unlabeled data. Nevertheless, because FixMatch filters pseudo labels with a fixed threshold, its performance degrades significantly on large, imbalanced datasets. This study therefore proposes a dynamic threshold alignment (DTA) algorithm based on the FixMatch model. First, the reliable pseudo-label ratio is determined for the class with the most samples, and the reliable pseudo-label ratios of the remaining classes are estimated accordingly. Second, the threshold for selecting pseudo labels is calculated dynamically from the estimated reliable pseudo-label ratio of each class. This dynamic threshold reduces the accuracy bias across classes and improves learning for classes with fewer samples. Experimental results show that, in galaxy morphology classification tasks, the proposed algorithm significantly outperforms supervised learning: with 100 labeled samples, accuracy and F1-score improve by 12.8% and 12.6%, respectively. Compared with popular semi-supervised algorithms such as FixMatch and MixMatch, the proposed algorithm achieves better classification performance and greatly reduces the accuracy bias across classes. With 1000 labeled samples, the accuracy for cigar-shaped smooth galaxies, the class with the fewest samples, is 25.87% higher than with FixMatch.
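
The two steps described in the abstract can be illustrated with a small NumPy sketch. This is only one possible reading, not the authors' implementation: the abstract does not give the formulas, so the sketch assumes that (a) the majority class is the most frequently predicted class in the unlabeled batch, (b) every other class is aligned to the same reliable pseudo-label ratio as the majority class, and (c) each class threshold is the confidence quantile that retains that ratio. The function names `dynamic_thresholds` and `select_pseudo_labels` and the base threshold of 0.95 are hypothetical.

```python
import numpy as np


def dynamic_thresholds(probs, base_threshold=0.95):
    """Return per-class confidence thresholds and the majority-class ratio.

    probs: array of shape (n_samples, n_classes) with softmax outputs on
    unlabeled data. All modelling choices here are assumptions made for
    illustration; the abstract does not specify them.
    """
    probs = np.asarray(probs, dtype=float)
    n_classes = probs.shape[1]
    pred = probs.argmax(axis=1)   # pseudo label of each sample
    conf = probs.max(axis=1)      # confidence of that pseudo label

    # Step 1: reliable pseudo-label ratio of the majority class
    # (assumed: the class predicted most often in this batch).
    counts = np.bincount(pred, minlength=n_classes)
    majority = int(counts.argmax())
    ratio = float(np.mean(conf[pred == majority] >= base_threshold))

    # Step 2: per-class thresholds that keep (approximately) the same
    # ratio of pseudo labels in every other class.
    thresholds = np.full(n_classes, base_threshold, dtype=float)
    for c in range(n_classes):
        class_conf = conf[pred == c]
        if c == majority or class_conf.size == 0:
            continue  # majority class keeps the base threshold
        quantile = np.quantile(class_conf, 1.0 - ratio)
        thresholds[c] = min(base_threshold, quantile)
    return thresholds, ratio


def select_pseudo_labels(probs, thresholds):
    """Return pseudo labels and a mask of samples passing their class threshold."""
    probs = np.asarray(probs, dtype=float)
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return pred, conf >= thresholds[pred]
```

Under these assumptions, a minority class whose predictions never reach the fixed threshold still contributes its most confident pseudo labels, which is the kind of effect the abstract attributes to the dynamic threshold.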

Version history

[V1] 2023-09-08 14:39:45 ChinaXiv:202309.00137V1
