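Statistical independence is a fundamental concept in statistics and machine learning, and how to represent and measure it is a basic problem in these fields. Copula theory provides a theoretical tool for representing statistical dependence, while copula entropy theory provides a conceptual tool for measuring statistical independence. This paper surveys the theory and applications of copula entropy, outlining its basic definitions, theorems, and properties, as well as its estimation methods. It introduces recent advances in copula entropy research, including its theoretical applications to four basic problems in statistics: structure learning, association discovery, variable selection, and temporal causal discovery. It discusses the relationships among these four theoretical applications and the connections between the underlying concepts of dependence and causality, and it compares the copula entropy framework for measuring (conditional) independence with dependence measures based on kernel functions and distances. It also briefly describes practical applications of copula entropy in theoretical physics, cheminformatics, hydrology, environmental meteorology, ecology, agronomy, cognitive neuroscience, motor neuroscience, computational neuroscience, systems biology, bioinformatics, clinical diagnostics, geriatrics, public health, economic policy, sociology, and political science, as well as in energy engineering, civil engineering, manufacturing engineering, reliability engineering, aerospace, communication engineering, surveying and mapping engineering, and financial engineering.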
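For orientation, the definition usually given in the copula entropy literature can be written as follows. The symbols ($\mathbf{X}$, $\mathbf{u}$, $c$, $H_c$, $I$) are notation chosen here for illustration and are not taken from the abstract above.

```latex
% Copula entropy of a d-dimensional random vector X with copula density c(u),
% where u = (u_1, ..., u_d) are the marginal CDF values of X.
H_c(\mathbf{X}) = -\int_{[0,1]^d} c(\mathbf{u}) \, \log c(\mathbf{u}) \, d\mathbf{u}

% Mutual information equals negative copula entropy, so H_c(X) = 0
% exactly when the components of X are independent.
I(\mathbf{X}) = -H_c(\mathbf{X})
```

Under this definition, measuring independence reduces to estimating the entropy of the copula density, which depends only on the ranks of the data and not on the marginal distributions.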
Liu, Yu; Di, Zengru; Gerlee, Philip
Information and complexity are important concepts in many scientific fields, such as molecular biology, evolutionary theory, and exobiology. Most measures of these quantities, such as Shannon entropy and related complexity measures, are defined only for objects drawn from a statistical ensemble and cannot be computed for single objects. Based on assembly theory, we attempt to fill this gap by introducing the notion of a ladderpath, which describes how an object can be decomposed into a hierarchical structure using repetitive elements. From the ladderpath, two measures naturally emerge: the ladderpath-index and the order-index, which represent two axes of complexity. We show how the ladderpath approach can be applied to both strings and spatial patterns and argue that all systems that undergo evolution can be described as ladderpaths. Further, we discuss possible applications to human language and the origins of life. The ladderpath approach provides a novel characterization of the information contained in a single object (or system) and could aid our understanding of evolving systems, and of the origin of life in particular.
The process industry is a pillar of the national economy, and the production of magnesia in a fused magnesia furnace system is a typical process-industry operation. Because of the complex smelting mechanism and changing production factors, abnormal working conditions often occur in fused magnesia furnaces. The semi-molten condition is the most typical and harmful of these. This paper proposes an adaptive pretraining-inference-dynamic training-validation semantic segmentation method, based on industrial video, for dynamic prediction of the semi-molten condition in multiple fused magnesium furnaces. The experimental results show that, compared with a prediction model without adaptive learning, the proposed adaptive learning model significantly improves prediction performance across multiple fused magnesium melting processes.
For the cofactor-free 1-H-3-hydroxy-4-oxoquinaldine-2,4-dioxygenase (HOD), the dioxygen (O2) dependent steps are rate-limiting, along with a spin-state crossover to the singlet state. Here, the primary activation of the triplet O2 molecule on 2-methyl-3-hydroxy-4(1H)-quinolone (MHQ) is investigated, and the catalytic role of intersystem crossing effects is highlighted by directly comparing results from Born-Oppenheimer dynamics and non-adiabatic surface hopping dynamics. This work confirms that non-adiabatic dynamical effects are essential in modulating O2 activation on the substrate MHQ. The time scale of equilibration and conversion from the triplet to the singlet state should be in the range of a few hundred femtoseconds. We hope this work offers a fresh look at the underlying physics of dioxygen activation reactions involving more than one spin state.
许建峰; 刘振宇; 王树良; 郑涛; 王雅实; 王赢飞; 党迎旭
Liu, Yu; Lin, Qiguang; Hong, Binbin; Hjerpe, Daniel; Liu, Xiaofeng
The shortest path problem (SPP) is a classic problem that appears in a wide range of applications. Although a variety of algorithms already exist, new advances are still being made, mainly tuned to perform better in particular scenarios. As a result, the algorithms have become more and more technically complex and sophisticated. Here we develop a novel nature-inspired algorithm to compute all possible shortest paths between two nodes in a graph: the Resonance Algorithm (RA), which is surprisingly simple and intuitive. Besides its simplicity, RA turns out to be much more time-efficient for large-scale graphs than the extended Dijkstra's algorithm (extended so that it gives all possible shortest paths). Moreover, RA can handle undirected, directed, or mixed graphs, with or without loops and with unweighted or positively weighted edges, and it can be implemented in a fully decentralized manner. These properties make RA suitable for a wide range of applications.
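The abstract does not describe how RA works, so the sketch below illustrates only the baseline it is compared against: a Dijkstra variant extended to return all shortest paths between two nodes. The function name `all_shortest_paths` and the adjacency-list format are assumptions made for this example, not the authors' code.

```python
import heapq
from collections import defaultdict


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target.

    graph maps each node to a list of (neighbor, weight) pairs with
    positive weights (hypothetical input format for this sketch).
    """
    dist = {source: 0}
    preds = defaultdict(list)  # node -> predecessors on some shortest path
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            old = dist.get(v, float("inf"))
            if nd < old:
                # Strictly better route: reset the predecessor list.
                dist[v] = nd
                preds[v] = [u]
                heapq.heappush(heap, (nd, v))
            elif nd == old and u not in preds[v]:
                # Equally good route: remember the extra predecessor.
                preds[v].append(u)
    if target not in dist:
        return []

    paths = []

    def walk(node, suffix):
        # Backtrack from target to source through the predecessor lists.
        if node == source:
            paths.append([source] + suffix)
            return
        for p in preds[node]:
            walk(p, [node] + suffix)

    walk(target, [])
    return sorted(paths)


# A small directed graph with two equally short routes from A to D.
example_graph = {
    "A": [("B", 1), ("C", 1)],
    "B": [("D", 1)],
    "C": [("D", 1)],
    "D": [],
}
example_paths = all_shortest_paths(example_graph, "A", "D")
```

The equality test on distances is exact, so this sketch is best used with integer weights; floating-point weights would need a tolerance.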
Starting from the problem of finding an approximate value of a function, this paper introduces a measure of the degree of approximation between two numerical values, proposes the concepts of "strict approximation" and "strict approximation region", derives the corresponding one-dimensional interpolation methods and formulas, and then presents a calculation model called the "sum-times-difference formula" for high-dimensional interpolation, thereby developing a new interpolation approach: ADB interpolation. ADB interpolation is applied to the interpolation of actual functions with satisfactory results. In both principle and effect, the approach is novel, and it offers simple calculation, stable accuracy, easy parallel processing, good suitability for high-dimensional interpolation, and straightforward extension to vector-valued functions. Applying the approach to instance-based learning yields a new instance-based learning method: learning using ADB interpolation. The method is technically distinctive and has a definite mathematical basis, implicit distance weights, avoidance of misclassification, high efficiency, a wide range of applications, and interpretability. In principle, it is a kind of learning by analogy; it and deep learning, which belongs to inductive learning, can complement each other, and for some problems the two can even reach the same result by different approaches in big data and cloud computing environments. Learning using ADB interpolation can therefore also be regarded as a kind of "wide learning" that is dual to deep learning.
Lipeng Pan; Yong Deng
Dempster-Shafer evidence theory, as an extension of probability theory, is widely used in information fusion because it satisfies weaker conditions than probability theory when dealing with uncertain information. Nevertheless, the description space of current evidence theory is only the real space, so it cannot effectively describe and process uncertain information in multidimensional characteristic data or in periodic data with phase angle changes. To address this gap, this paper extends Dempster-Shafer evidence theory to complex Dempster-Shafer evidence theory. In the complex theory, the mass function used to describe uncertain information is extended from the real space to the complex space and is called the complex mass function; its modulus indicates the degree of support for a proposition. On this basis, other basic concepts used to describe uncertain information, such as the complex belief function and the complex plausibility function, are also defined and discussed. To complete the complex theory, the complex Dempster combination rule (CDCR) is added. CDCR is an extension of the Dempster combination rule (CDR); it satisfies the commutative and associative laws just as CDR does, and it degenerates into CDR under certain conditions. In addition, we propose a method to generate complex mass functions and apply it to target recognition. The recognition results show that, compared with a mass function on the real plane, using a complex mass function to describe uncertain information can yield a higher target recognition rate.
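The abstract does not give the exact form of CDCR, so the sketch below shows only the classical real-valued Dempster combination rule that CDCR is said to extend. The representation of a mass function as a dictionary from `frozenset` focal elements to masses, and the name `dempster_combine`, are assumptions made for this example.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two real-valued mass functions with Dempster's rule.

    Each mass function maps a frozenset (focal element) to its mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalize so the combined masses sum to 1.
    return {focal: mass / norm for focal, mass in combined.items()}


A = frozenset({"a"})
B = frozenset({"b"})
AB = frozenset({"a", "b"})

m1 = {A: 0.6, AB: 0.4}
m2 = {B: 0.5, AB: 0.5}
fused = dempster_combine(m1, m2)
```

In this example the conflict mass is 0.6 × 0.5 = 0.3, so the combined masses are 3/7 for {a}, 2/7 for {b}, and 2/7 for {a, b}. Swapping the arguments gives the same result, which illustrates the commutativity mentioned in the abstract.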