Your conditions: 丁敬安
  • Research on Space Occupancy Detection Based on a Gradient Boosting Decision Model

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-18; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: As green buildings and green-economy, environmentally friendly cities gradually take shape, "big data green building" energy conservation systems have emerged. However, large volumes of multi-dimensional building data are not fully utilized, and traditional occupancy detection algorithms suffer from low accuracy and high time complexity. This article uses the Occupancy Detection dataset from UCI. Adding a timestamp to the original dataset increases accuracy. The MCMR method is used to select features with maximum correlation and minimum redundancy, and a random forest is used as the classifier to verify the classification effect. An XGBoost model built on the optimal feature subset is then compared with the random forest (RF) model; the XGBoost model achieves higher classification accuracy and lower time complexity.

  • A Phishing Website Detection Method Based on Feature Selection and Ensemble Learning

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-05-02; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Most phishing website detection methods suffer from low detection accuracy and a high false positive rate. To address this, the paper proposes a phishing website detection method based on feature selection and ensemble learning. The FSIGR algorithm, which combines the advantages of the filter and wrapper modes, is first used to select features. First, it measures each feature comprehensively from two aspects: information correlation and classification ability. Second, it selects features with a strategy of forward addition followed by recursive elimination, using classification accuracy as the evaluation index for each candidate feature subset. Finally, it obtains the optimal feature subset. A random forest ensemble classifier is then trained on the selected optimal feature subset. Experiments on the UCI dataset show that this method improves the accuracy of phishing website detection and reduces the false positive rate.

  • A Feature Selection Algorithm Based on the Filter+Wrapper Mode

    Subjects: Computer Science >> Integration Theory of Computer Science; Submitted: 2018-04-17; Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Feature selection is one of the most important issues in data mining, machine learning and pattern recognition. The traditional information gain algorithm shows a selection preference when classes and features are unevenly distributed. To address this problem, the paper proposes a new feature selection algorithm based on information gain ratio and random forest. The proposed algorithm combines the advantages of the Filter and Wrapper modes. First, each feature is measured comprehensively from two aspects: information correlation and classification ability. Second, a Sequential Forward Selection (SFS) strategy is used to select features, with classification accuracy as the evaluation index for each feature subset. Finally, the optimal feature subset is obtained. The experimental results show that the proposed algorithm not only reduces the dimensionality of the feature space, but also effectively improves the classification performance and recall rate of the classification algorithm.
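The filter+wrapper idea described in the abstracts above (score features with information gain ratio, then run Sequential Forward Selection with classification accuracy as the criterion) can be illustrated with a minimal sketch. It is not the authors' implementation. The toy data, the feature names (`motion`, `light`, `row_id`, `constant`), the `top_k` cutoff, and the helper names are all hypothetical, and a leave-one-out majority-vote lookup table stands in for the random forest the papers use.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:  # constant feature carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info


def loo_accuracy(columns, labels, names):
    """Leave-one-out accuracy of a majority-vote lookup table.

    Assumption: this simple classifier is only a stand-in for a random forest.
    """
    n = len(labels)
    keys = [tuple(columns[f][i] for f in names) for i in range(n)]
    hits = 0
    for i in range(n):
        votes = Counter(labels[j] for j in range(n) if j != i and keys[j] == keys[i])
        if not votes:  # unseen value combination: use the majority of the other rows
            votes = Counter(labels[j] for j in range(n) if j != i)
        hits += votes.most_common(1)[0][0] == labels[i]
    return hits / n


def filter_then_forward_select(columns, labels, top_k):
    """Filter stage: keep top_k features by gain ratio. Wrapper stage: greedy SFS."""
    ranked = sorted(columns, key=lambda f: gain_ratio(columns[f], labels), reverse=True)
    candidates = ranked[:top_k]
    selected, best = [], loo_accuracy(columns, labels, [])
    while candidates:
        scored = [(loo_accuracy(columns, labels, selected + [f]), f) for f in candidates]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best:  # stop when no candidate improves accuracy
            break
        selected.append(feature)
        candidates.remove(feature)
        best = score
    return selected, best


# Hypothetical toy data: 8 samples, binary occupancy label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "motion": [1, 1, 1, 0, 0, 0, 0, 0],
    "light": [1, 1, 0, 1, 0, 0, 1, 0],
    "row_id": [0, 1, 2, 3, 4, 5, 6, 7],
    "constant": [0, 0, 0, 0, 0, 0, 0, 0],
}
selected, best = filter_then_forward_select(columns, labels, top_k=3)
print(selected, best)
```

In this toy setup, `row_id` has the highest raw information gain because every value is unique, but dividing by split information lowers its score below that of `motion`. The wrapper stage then rejects it because it does not generalize under leave-one-out evaluation.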