Abstract: Principal component analysis (PCA) is a frequently used machine learning method. In this paper, the PCA procedure is explained through examples illustrated with Python programs. A proof of the diagonalizability of real symmetric matrices is also included, which may help readers understand the mathematics behind PCA.
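As a rough illustration of the connection the abstract draws between PCA and the diagonalizability of real symmetric matrices, the following minimal sketch computes PCA by eigendecomposing the sample covariance matrix with NumPy. It is not taken from the paper: the function name `pca`, the synthetic data, and the choice of two components are assumptions made for this example.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components (illustrative helper)."""
    Xc = X - X.mean(axis=0)                  # center each feature
    C = Xc.T @ Xc / (X.shape[0] - 1)         # sample covariance, a real symmetric matrix
    # eigh is meant for symmetric matrices: real eigenvalues in ascending order,
    # orthonormal eigenvectors as columns
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    components = eigvecs[:, order]
    return Xc @ components, eigvals[order], components

# Synthetic data (assumed for illustration): 200 samples, 3 correlated features
rng = np.random.default_rng(0)
mix = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.1]])
X = rng.normal(size=(200, 3)) @ mix
scores, variances, components = pca(X, 2)
```

Because the covariance matrix is real and symmetric, it can be diagonalized by an orthogonal matrix, so the projected scores are uncorrelated and their variances equal the selected eigenvalues.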
Abstract: From a modelling perspective, our first contribution is to propose the generalised linear regression GARMA (GLRGARMA) model and the generalised linear regression SARMA (GLRSARMA) model, which add an innovative function of explanatory variables to extend GLGARMA so that it incorporates relevant information for model fitting and forecasting in the tourism area. In addition, the generalised Poisson (GP) distribution is adopted to accommodate over-, equal- and under-dispersion in certain tourism data. The performance of the GLRGARMA and GLRSARMA models and their nested sub-models is compared and evaluated using several well-known selection criteria. Our second contribution is to investigate the behaviour of tourism data. The pattern of long memory is examined: analysis of the Hurst exponent, ACF plot and periodogram plot shows that Gegenbauer long memory features are present in tourism data. Furthermore, the distinct characteristics of Gegenbauer long memory and seasonality are demonstrated to reveal that the GLRGARMA model is more suitable for modelling tourism data. Our third contribution is to derive a Bayesian approach for estimating our proposed models via the efficient and user-friendly Rstan package. For the ML approach, the likelihood function is intractable because it involves very high-dimensional integrals. Several monitors of the convergence of posterior samples are discussed, such as the effective sample size and the R-hat estimate. The criteria for modelling performance are also derived.
Abstract: Objective: The lifetime difference between adjacent components of a parallel structure becomes small as the number of components belonging to the same parallel structure increases. To infer the system structure, we must identify the components that belong to the same parallel structure. Methods: A strengthened change point detection model (SCPDM) is established for weak mean difference data (WMDD), that is, data in which, owing to a large variance, the mean difference between two subsignals of one data sequence becomes nonsignificant. For repeatedly retrievable WMDD, we performed two enhancement operations that doubled the mean difference by using the variance information, and analyzed the asymptotic properties of the enhanced data. We then proposed an SCPDM based on the asymptotic results. Results: We compared the SCPDM with two other main change point detection models by simulation and verified that the SCPDM is superior to the other models for WMDD change point detection. Limitations: This paper also has several limitations. First, we only discussed data that are independent and normally distributed with a single change point. Second, the reason why the relationship between and has an important influence on the accuracy of change point detection is not discussed in depth. We only defined the ratio boundary of WMDD by experience and simulation. Conclusions: Traditional change point detection models may become insensitive or ineffective for WMDD. We gave some asymptotic analysis and established a strengthened change point detection model (SCPDM) based on the asymptotic results. Compared with the traditional method, the SCPDM can effectively detect the change point.
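To make the notion of weak mean difference data more concrete, the sketch below applies a standard CUSUM-type single change point estimator to a simulated sequence whose mean shift is small relative to its standard deviation. This is a generic textbook estimator, not the SCPDM described in the abstract; the function name `cusum_change_point` and all simulation parameters (sequence length, shift size, standard deviation, seed) are assumptions chosen for illustration.

```python
import numpy as np

def cusum_change_point(x):
    """Estimate one mean change point; returns the length of the first segment."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(1, n)                            # candidate split positions
    csum = np.cumsum(x)
    left_mean = csum[:-1] / k                      # mean of x[:k]
    right_mean = (csum[-1] - csum[:-1]) / (n - k)  # mean of x[k:]
    # Weighted mean difference: largest where the split best separates the two segments
    stat = np.sqrt(k * (n - k) / n) * np.abs(left_mean - right_mean)
    return int(k[np.argmax(stat)])

# Simulated weak-mean-difference sequence (assumed values): shift 0.5, standard deviation 3
rng = np.random.default_rng(1)
weak = np.concatenate([rng.normal(0.0, 3.0, 100), rng.normal(0.5, 3.0, 100)])
weak_estimate = cusum_change_point(weak)           # may land far from the true split at 100

# Noise-free step sequence for comparison: the estimator recovers the split exactly
step = np.concatenate([np.zeros(50), np.ones(50)])
step_estimate = cusum_change_point(step)
```

With a large variance relative to the mean shift, the statistic is dominated by noise, which is the kind of setting in which the abstract argues traditional detectors become insensitive.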