
1. chinaXiv:202105.00070 [pdf]

Copula Entropy: Theory and Applications

马健
Subjects: Statistics >> Mathematical Statistics
Subjects: Computer Science >> Computer Application Technology
Subjects: Information Science and Systems Science >> Basic Disciplines of Information Science and Systems Science

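Statistical independence is a fundamental concept in statistics and machine learning, and how to represent and measure it is a basic problem in the field. Copula theory provides a theoretical tool for representing statistical dependence, while copula entropy theory provides a conceptual tool for measuring statistical independence. This paper reviews the theory and applications of copula entropy, outlining its basic definitions, theorems, and properties, as well as its estimation methods. It introduces recent progress in copula entropy research, including theoretical applications to four basic problems in statistics (structure learning, association discovery, variable selection, and temporal causal discovery, among others). It discusses the relationships among the four theoretical applications and the connections between the underlying concepts of correlation and causality, and compares the copula entropy framework for measuring (conditional) independence with frameworks based on kernel functions and on distances. It briefly describes practical applications of copula entropy in theoretical physics, chemoinformatics, hydrology, environmental meteorology, ecology, agronomy, cognitive neuroscience, motor neuroscience, computational neuroscience, systems biology, bioinformatics, clinical diagnostics, geriatrics, public health, economic policy, sociology, and political science, as well as in energy engineering, civil engineering, manufacturing engineering, reliability engineering, aerospace, communication engineering, surveying and mapping engineering, and financial engineering.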

submitted time 2022-05-12 Hits 34415 Downloads 3675 Comment 0
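
As a rough illustration of the quantity reviewed in entry 1, the sketch below estimates bivariate copula entropy by rank-transforming each variable and binning the result. It relies on the known identity that mutual information equals negative copula entropy. The histogram estimator, the function names, and the `bins` parameter are assumptions made for this example; they are not taken from the paper, whose own estimation method may differ.

```python
import math
from collections import Counter

def pseudo_observations(values):
    """Map values into (0, 1) via rank / (n + 1). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    n = len(values)
    return [r / (n + 1) for r in ranks]

def copula_entropy_hist(x, y, bins=4):
    """Histogram estimate of bivariate copula entropy, in nats (illustrative only)."""
    u, v = pseudo_observations(x), pseudo_observations(y)

    def cell(t):
        return min(int(t * bins), bins - 1)

    counts = Counter((cell(a), cell(b)) for a, b in zip(u, v))
    n = len(x)
    # Copula density in a cell is approximated by p_cell / cell_area = p_cell * bins**2.
    return -sum((c / n) * math.log((c / n) * bins ** 2) for c in counts.values())
```

Values near zero suggest independence, and more negative values suggest stronger dependence; a coarse histogram like this is biased for small samples, so it is only meant to make the definition concrete.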

2. chinaXiv:202205.00012 [pdf]

Comparison of Sample Size Calculation Methods for One-Sample Proportion Comparison (Single-Group Target Value Method)

曾治宇; 李青
Subjects: Statistics >> Biomedical Statistics

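Objective: To compare different methods of calculating sample size for one-sample proportion comparison (single-group target value method), so as to provide a basis for choosing an appropriate method in practice. Methods: Target values π0 and expected values π1 were constructed. The required sample size was calculated by each of five methods (normal approximation, general method, arcsine method, exact classical method, and exact conservative method), the corresponding minimum success rates were computed by program, and computer simulation was used to obtain the power. Results: The five methods performed similarly when π0 and π1 were not close to 0 or 1. As π0 approached 0, the normal approximation and general methods gave relatively small sample sizes and gradually lost power; as π0 approached 1, they gave relatively large sample sizes, and power rose progressively above the preset value. In terms of power, the results of the arcsine method were close to those of the exact classical method but more dispersed, whereas the exact conservative method almost always guaranteed the preset power. However, when π0 > 0.5, the sample size required by the exact conservative method grew progressively larger than that of the exact classical method. The methods' requirements on the actual success rate were broadly similar, with minor differences. Conclusion: Choosing a sample size calculation method for a single proportion is relatively complex. When the power requirement is high, the exact classical and exact conservative methods are preferred, followed by the arcsine method. The general and normal approximation methods give sample sizes that are too large or too small when the proportion lies toward either extreme, and should be weighed case by case.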

submitted time 2022-05-02 Hits 828 Downloads 124 Comment 0
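
Two of the five methods named in entry 2 can be sketched with common textbook formulas. The code below assumes a one-sided test at level `alpha` and uses hypothetical function names; the paper may use different formula variants, and the general and exact methods are not shown.

```python
import math
from statistics import NormalDist

def n_normal_approx(pi0, pi1, alpha=0.025, power=0.8):
    """Sample size by the normal approximation (textbook form, one-sided alpha)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    num = z_a * math.sqrt(pi0 * (1 - pi0)) + z_b * math.sqrt(pi1 * (1 - pi1))
    return math.ceil((num / (pi1 - pi0)) ** 2)

def n_arcsine(pi0, pi1, alpha=0.025, power=0.8):
    """Sample size by the arcsine (variance-stabilizing) transformation."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    diff = math.asin(math.sqrt(pi1)) - math.asin(math.sqrt(pi0))
    return math.ceil(((z_a + z_b) / (2 * diff)) ** 2)
```

Calling both functions over a grid of π0 values near 0 and near 1 is one way to see where the two formulas start to diverge.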

3. chinaXiv:202203.00018 [pdf]

A Statistical Study of the Impact of the Twenty-Four Solar Terms on Returns of China's Shanghai Composite Index

Tianbao ZHOU; Xinghao LI; Junguang ZHAO
Subjects: Statistics >> Economic Statistics

This study examines the impact of the twenty-four solar terms, a traditional Chinese division of the year into seasons, on the Chinese stock index. Using data from the past 26 years, it tests whether the daily return of the Shanghai Index shows significant values or distinctive features on and after each solar term. On some solar terms, such as No. 1 and No. 3, the index return had a large mean and a high probability of extreme values, while on solar terms No. 2 and No. 4 the results were the opposite. The study also found that the volatility of index returns was much higher during solar terms at the beginning of the year than during the rest. Index returns 10 and 15 days after solar terms No. 6 and No. 8 showed high final returns and large volatility, whereas the index was very steady after solar term No. 18 in all cases. The study also argues that numeric prediction is almost impossible with current technical analysis tools, and that the effective approach in stock analysis is to collect more features and characteristics from historical data and to identify whether a similar situation is occurring when similar features appear in the future.

submitted time 2022-03-04 Hits 2035 Downloads 310 Comment 0

4. chinaXiv:202109.00067 [pdf]

A Method for Measuring Economic Complexity and Its Application to the Analysis of Economic and Technological Progress

刘新建
Subjects: Management Science >> Management Metrology
Subjects: Statistics >> Economic Statistics

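Economic development makes production processes increasingly complex, and a complexity index can reflect the level of technological progress of an economy and its industrial sectors. This article revises an economic complexity index based on input-output techniques and uses it to analyze the level of technological progress of a region's industrial sectors and of its economy as a whole. The empirical results show that the revised complexity index expresses the level of economic and technological progress of the economy and its industrial sectors well, and is more reasonable than the formula used before the revision.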

submitted time 2021-09-27 Hits 10633 Downloads 694 Comment 0

5. chinaXiv:202102.00001 [pdf]

Application of generalised linear regression GARMA in tourism area

闫弘轩
Subjects: Statistics >> Applied Statistical Mathematics

From a modelling perspective, our first contribution is to propose the generalised linear regression GARMA (GLRGARMA) model and the generalised linear regression SARMA (GLRSARMA) model, which add an innovative function of explanatory variables to extend GLGARMA so that relevant information can be incorporated in model fitting and forecasting in the tourism area. In addition, the generalised Poisson (GP) distribution is adopted to accommodate over-, equal-, and under-dispersion in certain tourism data. The performance of the GLRGARMA and GLRSARMA models and their nested sub-models is compared and evaluated using several well-known selection criteria. Our second contribution is to investigate the behaviour of tourism data. The pattern of long memory is examined: analysis of the Hurst exponent, ACF plot, and periodogram plot shows that Gegenbauer long memory features are present in tourism data. Furthermore, the distinct characteristics of Gegenbauer long memory and seasonality are demonstrated, revealing that the GLRGARMA model is more suitable for modelling tourism data. Our third contribution is to derive a Bayesian approach to estimating the proposed models via the efficient and user-friendly Rstan package. For the ML approach, the likelihood function is intractable because it involves very high-dimensional integrals. Several monitors of the convergence of posterior samples are discussed, such as the effective sample size and the R-hat estimate. The criteria for modelling performance are also derived.

submitted time 2021-01-30 Hits 8282 Downloads 1082 Comment 0

6. chinaXiv:202002.00079 [pdf]

Forecasting the Trend of the Novel Coronavirus Epidemic and Timely Risk Classification Based on Moving Average Prediction Limits

何豪; 何韵婷; 翟晶; 王筱金; 王炳顺
Subjects: Medicine, Pharmacy >> Preventive Medicine and Hygienics

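Objective: To establish a practical, data-driven method for predicting the epidemic trend of an emerging novel infectious disease, and to provide a quantitative basis for prevention and control strategies by dynamically forecasting epidemic risk and risk classification. Methods: Building on the moving average method, we propose a Moving Average Prediction Limits (MAPL) method and use data from the earlier Severe Acute Respiratory Syndrome (SARS) epidemic to verify the practicality of MAPL for forecasting epidemic trend and risk. We tracked officially published data on the novel coronavirus (COVID-19) epidemic from January 16, 2020, and used MAPL to forecast changes in the epidemic trend and the timely risk classification of affected areas. Results: Analysis based on MAPL shows that the national COVID-19 epidemic peaked in early February 2020. Following active early prevention and control, the national epidemic showed an overall downward trend from mid-February. By late February there were clear regional differences. Compared with Hubei, the number of new cases outside Hubei fell faster, and the risk of the epidemic worsening there was relatively small. In several major provinces with imported cases, the trends in newly confirmed cases and suspected cases were consistent, but the rate of decline differed among provinces. Conclusion: MAPL can assist in judging the epidemic trend and in forecasting risk classification in a timely manner. Areas with imported cases can combine local conditions with the epidemic risk classification to plan and implement differentiated, precise prevention and control strategies.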

submitted time 2020-02-28 Hits 36523 Downloads 5390 Comment 0
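
The abstract of entry 6 does not give the MAPL formula, so the following is only a generic sketch of the underlying idea: use the mean of the most recent observations as the forecast for the next one, and bracket it with limits derived from the spread of the same window. The function name, the default `window` and `z` values, and the clipping of the lower limit at zero (for case counts) are all assumptions of this example, not the authors' specification.

```python
from statistics import mean, stdev

def moving_average_limits(series, window=7, z=1.96):
    """Forecast each next value from the previous `window` values.

    Returns a list of (center, lower, upper) tuples; the last tuple is the
    forecast for the first not-yet-observed point. Requires window >= 2.
    """
    out = []
    for t in range(window, len(series) + 1):
        recent = series[t - window:t]
        center = mean(recent)
        spread = z * stdev(recent)
        # Counts cannot be negative, so the lower limit is clipped at zero.
        out.append((center, max(0.0, center - spread), center + spread))
    return out
```

An observation that lands above the upper limit for its day could be read as a warning sign, and one below the lower limit as evidence of decline; how such signals map to risk grades is left to the paper.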

7. chinaXiv:202002.00028 [pdf]

A Spatiotemporal Dataset of the Novel Coronavirus (2019-nCoV) and Its Typical Applications

林浩; 鲍君雅
Subjects: Survey & Drawing Science and Technology >> Photogrammetry and Remote Sensing
Subjects: Statistics >> Biomedical Statistics

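The novel coronavirus (2019-nCoV) epidemic is currently receiving wide attention from researchers around the world. However, there is as yet no official channel that makes 2019-nCoV epidemic data openly available in real time. To promote research related to this epidemic, this study aims to provide researchers with an authoritative, open, and multi-scale spatiotemporal dataset of the novel coronavirus (2019-nCoV), serving as an important data source for epidemic monitoring, prevention and control, prediction, and early warning. In addition, the dataset can be used for multi-scale, multi-temporal mapping and visualization of the 2019-nCoV epidemic, providing guidance for analyzing its spatial distribution, evolution, and trends and for simulation and prediction.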

submitted time 2020-02-19 Hits 46003 Downloads 6876 Comment 0

8. chinaXiv:202002.00028 [pdf]

A Spatiotemporal Dataset of the Novel Coronavirus (2019-nCoV) and Its Typical Applications

林浩; 鲍君雅
Subjects: Survey & Drawing Science and Technology >> Photogrammetry and Remote Sensing
Subjects: Statistics >> Biomedical Statistics

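The novel coronavirus (2019-nCoV) epidemic is currently receiving wide attention from researchers around the world. However, there is as yet no official channel that makes 2019-nCoV epidemic data openly available in real time. To promote research related to this epidemic, this study aims to provide researchers with an authoritative, open, and multi-scale spatiotemporal dataset of the novel coronavirus (2019-nCoV), serving as an important data source for epidemic monitoring, prevention and control, prediction, and early warning. In addition, the dataset can be used for multi-scale, multi-temporal mapping and visualization of the 2019-nCoV epidemic, providing guidance for analyzing its spatial distribution, evolution, and trends and for simulation and prediction.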

submitted time 2020-02-17 Hits 201 Downloads 130 Comment 0

9. chinaXiv:201904.00096 [pdf]

Strengthened change point detection model for weak mean difference data

Zhou, Qi; Huang, Shaoqian
Subjects: Statistics >> Applied Statistical Mathematics

Objective: The lifetime difference in adjacent parallel structure components becomes small as the number of components belonging to the same parallel structure increases. To infer the system structure, we must clarify the components that belong to the same parallel structure. Methods: A strengthened change point detection model (SCPDM) is established for weak mean difference data (WMDD), a term that usually indicates that, under the influence of a large variance, the mean difference between two subsignals of one data sequence becomes nonsignificant. For repeatedly retrievable WMDD, we performed two enhancement operations that doubled the mean difference by using the variance information, and analyzed the asymptotic properties of the enhanced data. Then, we proposed an SCPDM based on the asymptotic results. Results: Finally, we compared the SCPDM with two other main change point detection models and verified by simulation that the SCPDM is superior to the other models for WMDD change point detection. Limitations: This paper also has several limitations. First, we only discussed that are independent with normal distribution and single change point. Second, the reason why the relationship between and has an important influence on the accuracy of change point detection is not discussed in depth. We only defined the ratio boundary of WMDD by experience and simulation. Conclusions: Traditional change point detection models may become insensitive or ineffective for WMDD. We gave some asymptotic analysis and established a strengthened change point detection model (SCPDM) based on the asymptotic results. Compared with the traditional method, SCPDM can effectively detect the change point.

submitted time 2019-04-22 Hits 21086 Downloads 2213 Comment 0
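
For context on entry 9, a conventional single change point detector for a shift in mean can be sketched as a least-squares split. This is a generic baseline of the kind the abstract calls traditional, not the SCPDM itself, and the function name is hypothetical.

```python
def best_single_change_point(data):
    """Return the index k (1 <= k < n) that splits data into data[:k] and
    data[k:] with the smallest total within-segment sum of squared errors."""
    def sse(segment):
        m = sum(segment) / len(segment)
        return sum((v - m) ** 2 for v in segment)

    return min(range(1, len(data)), key=lambda k: sse(data[:k]) + sse(data[k:]))
```

When the mean shift is small relative to the variance, the objective becomes nearly flat in `k`, which is the regime the abstract describes as weak mean difference data.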

10. chinaXiv:201809.00188 [pdf]

Modification and Application of the PPS Sampling Estimator under Ranking

王峰
Subjects: Statistics >> Mathematical Statistics

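Inspired by the Zipf phenomenon exhibited by many things, this paper proposes a PPS sampling method applied after ranking and gives a modified Hansen-Hurwitz estimator and its variance. In doing so, it addresses a long-standing problem in survey sampling practice: when important units are included in the sample with certainty, there has been no clear method for deciding how many to include. The paper provides a theoretical basis and a specific method for determining this number. Finally, an example and a case study of a Chinese urban population sample survey demonstrate the advantages of the modified Hansen-Hurwitz estimator, and the paper closes with a summary and outlook for this research approach.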

submitted time 2018-09-26 Hits 28307 Downloads 2887 Comment 0
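
The estimator that entry 10 modifies can be illustrated in its standard form. The sketch below draws a PPS sample with replacement and computes the classical Hansen-Hurwitz estimate of the population total together with its usual variance estimate. The function names and the `seed` parameter are assumptions of this example, and the paper's ranking step and modified estimator are not reproduced.

```python
import random

def pps_sample(sizes, n, seed=0):
    """Draw n unit indices with replacement, with probability proportional to size."""
    rng = random.Random(seed)
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def hansen_hurwitz(y_sample, p_sample):
    """Classical Hansen-Hurwitz estimate of the total and its variance estimate.

    y_sample: observed values of the drawn units
    p_sample: per-draw selection probabilities of those units
    Requires at least two draws.
    """
    n = len(y_sample)
    ratios = [y / p for y, p in zip(y_sample, p_sample)]
    total_hat = sum(ratios) / n
    var_hat = sum((r - total_hat) ** 2 for r in ratios) / (n * (n - 1))
    return total_hat, var_hat
```

When the study variable is exactly proportional to the size measure, every ratio `y / p` equals the population total and the variance estimate is zero, which is why PPS sampling pays off for highly skewed, Zipf-like populations.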
