## 1. chinaXiv:202105.00070 [pdf]

Subjects: Statistics >> Mathematical Statistics
Subjects: Computer Science >> Computer Application Technology
Subjects: Information Science and Systems Science >> Basic Disciplines of Information Science and Systems Science

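 Statistical independence is a fundamental concept in statistics and machine learning, and how to represent and measure it is a basic problem in these fields. Copula theory provides a theoretical tool for representing statistical dependence, while copula entropy theory provides a conceptual tool for measuring statistical independence. This paper reviews the theory and applications of copula entropy, outlining its basic concepts, theorems, and estimation methods. It presents recent progress in copula entropy research, including its theoretical applications to four basic problems in statistics: structure learning, association discovery, variable selection, and temporal causal discovery. It discusses the relationships among these four applications and the connections between the underlying concepts of correlation and causality, and compares the copula entropy framework for measuring (conditional) independence with the kernel-based and distance-based frameworks for measuring dependence. It also briefly describes practical applications of copula entropy in hydrology, environmental meteorology, ecology, cognitive neuroscience, systems biology, gerontology, and public health, as well as in energy engineering, manufacturing engineering, and reliability engineering.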
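
For orientation, the central quantity can be written down compactly. The notation here is an assumption chosen for illustration and is not taken from the abstract: $c(\mathbf{u})$ denotes the copula density of a random vector $\mathbf{x} = (x_1, \dots, x_d)$, $H(\cdot)$ denotes differential entropy, and $I(\mathbf{x})$ denotes multivariate mutual information. Under those assumptions, copula entropy and its standard relationship to mutual information read:

$$
H_c(\mathbf{x}) = -\int_{[0,1]^d} c(\mathbf{u}) \log c(\mathbf{u}) \, d\mathbf{u},
\qquad
I(\mathbf{x}) = -H_c(\mathbf{x}),
\qquad
H(\mathbf{x}) = \sum_{i=1}^{d} H(x_i) + H_c(\mathbf{x}).
$$

Independence corresponds to $c(\mathbf{u}) \equiv 1$, which gives $H_c(\mathbf{x}) = 0$; any dependence makes $H_c(\mathbf{x})$ negative.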

submitted time 2021-08-19

## 2. chinaXiv:202102.00001 [pdf]

Subjects: Statistics >> Applied Statistical Mathematics

 From a modelling perspective, our first contribution is to propose a generalised linear regression GARMA (GLRGARMA) model and a generalised linear regression SARMA (GLRSARMA) model with an innovative function of explanatory variables, extending GLGARMA to incorporate relevant information for model fitting and forecasting in the tourism area. In addition, the generalised Poisson (GP) distribution is adopted to accommodate over-, equal- and under-dispersion in certain tourism data. Moreover, the performance of the GLRGARMA and GLRSARMA models and their nested sub-models is compared and evaluated using several well-known selection criteria. Our second contribution is to investigate the behaviour of tourism data. The pattern of long memory is examined. Analysis of the Hurst exponent, ACF plot and periodogram plot shows that Gegenbauer long memory features are present in tourism data. Furthermore, the distinct characteristics of Gegenbauer long memory and seasonality are demonstrated to reveal that the GLRGARMA model is more suitable for modelling tourism data. Our third contribution is to derive a Bayesian approach for estimating our proposed models via the efficient and user-friendly Rstan package. For the maximum likelihood (ML) approach, the likelihood function is intractable because it involves very high-dimensional integrals. Several monitors of convergence of posterior samples are discussed, such as the effective sample size and the $\hat{R}$ estimate. The criteria for modelling performance are also derived.
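
To make the dispersion claim concrete, here is a minimal sketch of the generalised Poisson probability mass function. It assumes Consul's standard parameterisation with rate `theta` and dispersion `lam`, which may differ from the parameterisation used in the paper. The function names are hypothetical, and the sketch is restricted to `0 <= lam < 1`, since negative `lam` (under-dispersion) requires a truncated support that is not handled here.

```python
import math


def generalized_poisson_pmf(x: int, theta: float, lam: float) -> float:
    """P(X = x) for Consul's generalised Poisson, assuming theta > 0 and 0 <= lam < 1."""
    if theta <= 0 or not (0 <= lam < 1):
        raise ValueError("this sketch requires theta > 0 and 0 <= lam < 1")
    if x < 0:
        return 0.0
    # Work in log space to avoid overflow in (theta + lam*x)**(x-1) and x!.
    log_p = (
        math.log(theta)
        + (x - 1) * math.log(theta + lam * x)
        - theta
        - lam * x
        - math.lgamma(x + 1)
    )
    return math.exp(log_p)


def gp_moments(theta: float, lam: float, upper: int = 500):
    """Numerically sum the pmf over 0..upper-1 and return (mean, variance)."""
    probs = [generalized_poisson_pmf(x, theta, lam) for x in range(upper)]
    mean = sum(x * p for x, p in enumerate(probs))
    second = sum(x * x * p for x, p in enumerate(probs))
    return mean, second - mean * mean
```

In this parameterisation the mean is `theta / (1 - lam)` and the variance is `theta / (1 - lam)**3`, so `lam = 0` recovers the ordinary Poisson (equal dispersion) and `lam > 0` gives a variance larger than the mean (over-dispersion).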

submitted time 2021-01-30

## 3. chinaXiv:202002.00079 [pdf]

Subjects: Medicine, Pharmacy >> Preventive Medicine and Hygienics

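 Objective: To establish a practical, data-driven method for predicting the epidemic trend of a sudden, novel infectious disease, and to provide a quantitative basis for prevention and control strategies through dynamic assessment and grading of epidemic risk. Methods: We improve on the moving average method and propose a Moving Average Prediction Limits (MAPL) method. Data from the earlier Severe Acute Respiratory Syndrome (SARS) epidemic are used to validate the practicality of MAPL for anticipating epidemic trends and risk. We track the officially published data on the novel coronavirus (COVID-19) epidemic from January 16, 2020 onward, and use MAPL to anticipate changes in the epidemic trend and to grade the risk of affected areas in a timely manner. Results: The MAPL analysis shows that the national COVID-19 epidemic peaked in early February 2020. Following active early prevention and control, the national epidemic showed an overall downward trend from mid-February. By late February there were clear regional differences. Compared with Hubei, the number of new cases outside Hubei declined faster and the risk of the epidemic worsening was relatively small. In several important provinces with imported cases, the trends in newly confirmed cases and suspected cases were consistent, but the rate of decline differed among provinces. Conclusions: The MAPL method can help assess epidemic trends and grade risk in a timely manner, and areas with imported cases can combine local conditions with the risk grading to plan and implement differentiated, targeted prevention and control strategies.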
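
The abstract does not define how the MAPL limits are computed, so the following is only a generic illustration of the idea of attaching limits to a moving average. It is not the authors' MAPL method. The function name, the 7-day default window, and the normal-style multiplier `z` are all assumptions made for this sketch.

```python
from statistics import mean, stdev


def moving_average_limits(series, window=7, z=1.96):
    """For each t >= window, return (t, centre, lower, upper).

    centre is the mean of the previous `window` observations, and the band is
    centre +/- z * sample standard deviation of those observations.
    The lower limit is clipped at zero because case counts cannot be negative.
    This is a hypothetical illustration, not the MAPL definition.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    results = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        centre = mean(past)
        spread = stdev(past)
        lower = max(0.0, centre - z * spread)
        upper = centre + z * spread
        results.append((t, centre, lower, upper))
    return results
```

A new daily count above `upper` would suggest the trend is worsening relative to the recent window, and a count below `lower` would suggest it is easing. How such signals map onto risk grades is a design decision that the abstract leaves unspecified.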

submitted time 2020-02-28

## 4. chinaXiv:202002.00028 [pdf]

Subjects: Survey & Drawing Science and Technology >> Photogrammetry and Remote Sensing
Subjects: Statistics >> Biomedical Statistics

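 The novel coronavirus (2019-nCoV) epidemic is currently receiving wide attention from researchers around the world. However, there is not yet an official channel that releases 2019-nCoV epidemic data openly and in real time. To promote research related to this epidemic, this study aims to provide researchers with an authoritative, open, and multi-scale spatiotemporal dataset for 2019-nCoV, serving as an important data source for epidemic monitoring, prevention and control, prediction, and early warning. In addition, the dataset can be used for multi-scale, multi-temporal mapping and visualization of the 2019-nCoV epidemic, providing guidance for analysis of its spatial distribution, evolution, and trends, and for simulation and prediction.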

submitted time 2020-02-19

## 6. chinaXiv:201904.00096 [pdf]

Subjects: Statistics >> Applied Statistical Mathematics

 Objective: The lifetime difference between adjacent components in a parallel structure becomes small as the number of components belonging to the same parallel structure increases. To infer the system structure, we must identify the components that belong to the same parallel structure. Methods: A strengthened change point detection model (SCPDM) is established for weak mean difference data (WMDD), a term that usually indicates that, because of a large variance, the mean difference between two subsignals of one data sequence becomes nonsignificant. For repeatedly retrievable WMDD, we performed two enhancement operations that doubled the mean difference by using the variance information, and analyzed the asymptotic properties of the enhanced data. Then, we proposed an SCPDM based on the asymptotic results. Results: We compared the SCPDM with two other main change point detection models and verified by simulation that the SCPDM is superior to the other models for WMDD change point detection. Limitations: This paper also has several limitations. First, we only discussed data that are independent and normally distributed with a single change point. Second, the reason why the relationship between and has an important influence on the accuracy of change point detection is not discussed in depth. We only defined the ratio boundary of WMDD by experience and simulation. Conclusions: Traditional change point detection models may become insensitive or ineffective for WMDD. We gave some asymptotic analysis and established a strengthened change point detection model (SCPDM) based on the asymptotic results. Compared with the traditional method, SCPDM can effectively detect the change point.
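
For context, the kind of traditional detector that the abstract says loses sensitivity on WMDD can be sketched as a least-squares search for a single change in mean. This is a generic baseline written for illustration, not the SCPDM and not necessarily either of the comparison models used in the paper. The function name is hypothetical.

```python
def single_mean_change_point(x):
    """Return k (1 <= k < n) such that splitting x into x[:k] and x[k:]
    minimises the total within-segment sum of squared errors.

    Minimising that error is equivalent to maximising
    (sum of left segment)^2 / k + (sum of right segment)^2 / (n - k),
    which lets the search run in a single pass.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(x)
    best_k = 1
    best_gain = float("-inf")
    left_sum = 0.0
    for k in range(1, n):
        left_sum += x[k - 1]
        right_sum = total - left_sum
        gain = left_sum * left_sum / k + right_sum * right_sum / (n - k)
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k
```

When the shift in mean is large relative to the noise, the maximiser sits at the true change point. When the variance dominates, the gain curve is nearly flat and the estimate becomes unreliable, which is the regime the abstract calls WMDD.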

submitted time 2019-04-22

## 7. chinaXiv:201809.00188 [pdf]

Subjects: Statistics >> Mathematical Statistics

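 Inspired by the Zipf phenomenon observed in many settings, this paper proposes a method of PPS (probability proportional to size) sampling after ranking, and gives a modified Hansen-Hurwitz estimator and its variance. In doing so, the paper resolves a long-standing problem in survey sampling practice: when important units are included in the sample with certainty, there has been no clear method for deciding how many of them to include. The paper provides a theoretical basis and a concrete method for determining this number. Finally, an example and a case study of a sample survey of China's urban population demonstrate the advantages of the modified Hansen-Hurwitz estimator, and the paper concludes with a summary and outlook for this research approach.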
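
As background, the classical (unmodified) Hansen-Hurwitz estimator for PPS sampling with replacement can be sketched as follows. The paper's modified estimator is not specified in the abstract and is not reproduced here. The function name is hypothetical, and `p` is assumed to hold the per-draw selection probability of each sampled unit.

```python
def hansen_hurwitz_total(y, p):
    """Classical Hansen-Hurwitz estimate of a population total.

    y: observed values of the n sampled units (repeats allowed, since
       sampling is with replacement)
    p: per-draw selection probabilities of those same units
    Returns (estimate, estimated variance of the estimate).
    """
    n = len(y)
    if n < 2 or n != len(p):
        raise ValueError("need at least two draws and matching lengths")
    ratios = [yi / pi for yi, pi in zip(y, p)]
    estimate = sum(ratios) / n
    # Standard unbiased variance estimator for with-replacement PPS sampling.
    variance = sum((r - estimate) ** 2 for r in ratios) / (n * (n - 1))
    return estimate, variance
```

When the study variable is exactly proportional to the size measure, every ratio `y / p` equals the population total and the estimated variance is zero, which is why PPS designs are attractive for heavily skewed, Zipf-like populations.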

submitted time 2018-09-26

