Your conditions: 谭光明
  • Computing system for simulation intelligence

    Subjects: Statistics >> Social Statistics submitted time 2024-03-27 Cooperative journals: 《中国科学院院刊》

    Abstract: This study refers to computer simulation in scientific research as scientific simulation. Based on its narrow and broad definitions, the study divides scientific simulation into three stages: numerical computation, simulation intelligence, and science brain. Scientific simulation is now entering the era of simulation intelligence; that is, driven by scientific big data and artificial intelligence technology, it is shifting from traditional numerical simulation to simulation integrated with artificial intelligence. To understand what the right computing system for simulation intelligence is, the study discusses design guidelines, basic methods, and key technical problems.

  • Development and Policy of High Performance Computer

    Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-28 Cooperative journals: 《中国科学院院刊》

    Abstract: High performance computing technology and its industry have been developing and flourishing. Having moved from serving only the peak-performance demands of national strategic departments to being driven by the market, application promotion, and industrialization, China’s high performance computing has crossed the three mountains of “breaking blockade”, “breaking monopoly”, and “leading innovation”, gradually narrowing the gap with advanced foreign research and development, and has achieved world-leading results in whole-machine system design and other key technologies. In this paper, we analyze the development trend of supercomputers and the development modes of the Sugon supercomputer, then summarize the challenges facing high-performance computers in two points: the sustainable construction of Exascale systems before the failure of Moore’s law, and revolutionary supercomputer system technology in the post-Moore’s law era. In response to these challenges, the paper elaborates policy recommendations for “facing the world’s scientific and technological frontiers, facing the country’s major needs, and facing the national economy’s main battlefield”.

  • Agricultural Simulator: Using Intelligent Technology to Get Data Flow for Black Land Protection

    Subjects: Other Disciplines >> Synthetic discipline submitted time 2023-03-28 Cooperative journals: 《中国科学院院刊》

    Abstract: Information technology is penetrating deeply into all walks of life. Acquiring massive data and then modeling and analyzing it in the information space has become an effective means of solving practical problems in the information society. At present, China is vigorously implementing black soil conservation projects. Because protection is a complex systems engineering task, it is necessary to rely on information technology to carry out problem modeling and algorithmic solving for the utilization and protection of black soil, and to find the best protection approach through simulation and emulation. Based on an analysis of black land protection measures worldwide, the study puts forward the design idea of an agricultural simulator based on the fifth paradigm from the perspective of intelligent technology, gives the organizational structure of the total-factor agricultural simulator, and realizes rapid operation and iteration of the data flow through an intelligent OODA (observe, orient, decide, act) loop to continuously optimize black soil protection technology. Finally, the study proposes the idea and framework for building an agricultural simulator in the black land protection demonstration area, as well as policy suggestions for applying and promoting the agricultural simulator in black land protection.

  • A Novel and Efficient Algorithm-Level Fault Tolerance Technique and Its Implementation

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2017-03-10

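    Abstract: As high-performance computing systems continue to grow in scale, node failures occur ever more frequently. Traditional fault tolerance techniques are mostly based on checkpointing. However, the overhead of checkpointing keeps increasing with system scale, and at exaflops scale its fault tolerance efficiency can hardly meet system requirements. Algorithm-based failure recovery is more efficient than checkpointing, but it still relies on a stop-and-wait model, which to a large extent hurts the parallel efficiency of programs on large-scale systems. This paper proposes a non-stop-and-wait algorithm-level fault tolerance strategy, the hot replacement strategy. If a node fails while the program is running, the data on the failed node is not recovered by stopping and waiting; instead, a redundant node replaces the failed node so that the computation can continue. The final correct result can be obtained through a linear transformation. To demonstrate the effectiveness of the scheme, we implemented a fault-tolerant High Performance Linpack (HPL) using the fault tolerance features of MPICH and evaluated its performance. Experimental results show that, even at small scale, our scheme clearly outperforms algorithm-based failure recovery.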
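    The idea of continuing without a pause and fixing the result at the end with a linear transformation can be illustrated with a toy checksum scheme. This is only a minimal sketch under stated assumptions, not the paper's HPL/MPICH implementation: the "nodes" are plain Python lists, the computation is a sequence of scalar multiplications (a linear operation that preserves the checksum relation), and the names `checksum`, `linear_step`, and `run_with_hot_replacement` are hypothetical.

```python
from typing import List, Optional

Vector = List[int]


def checksum(blocks: List[Vector]) -> Vector:
    """Element-wise sum of the given blocks (held by the redundant node)."""
    return [sum(column) for column in zip(*blocks)]


def linear_step(block: Vector, scale: int) -> Vector:
    """One step of a linear computation: scale every element."""
    return [scale * value for value in block]


def run_with_hot_replacement(
    blocks: List[Vector], scales: List[int], failed: int, fail_before_step: int
) -> List[Vector]:
    """Apply all scaling steps; node `failed` is lost before step `fail_before_step`.

    The redundant node keeps computing on the checksum, so nobody stops to
    rebuild the lost block. The lost block is reconstructed only once, at the end.
    """
    data: List[Optional[Vector]] = [list(b) for b in blocks]
    spare = checksum(blocks)  # redundant node starts with the checksum
    for step, scale in enumerate(scales):
        if step == fail_before_step:
            data[failed] = None  # node failure: its block is gone
        data = [linear_step(b, scale) if b is not None else None for b in data]
        spare = linear_step(spare, scale)  # checksum relation is preserved
    if data[failed] is None:
        # Final linear transformation: lost block = checksum - sum of survivors.
        survivors = checksum([b for b in data if b is not None])
        data[failed] = [c - s for c, s in zip(spare, survivors)]
    return [b for b in data if b is not None]


if __name__ == "__main__":
    result = run_with_hot_replacement([[1, 2], [3, 4], [5, 6]], [2, 3], 1, 1)
    print(result)
```

    The sketch works only because every step is linear, so the checksum of the results equals the result of the checksum; a nonlinear step would break the recovery at the end.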

  • Performance Study of DGEMM on a CPU/ATI GPU Hybrid Architecture

    Subjects: Computer Science >> Computer Hardware Technology submitted time 2017-03-10

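    Abstract: This paper reports our work on optimizing double-precision matrix multiplication (DGEMM) on a CPU/ATI GPU hybrid architecture. In real applications, data transfer between the CPU and the graphics processing unit (GPU) is a key factor affecting performance. Because software pipelining can reduce data transfer overhead, we propose three software pipelining algorithms: Double Buffering, Data Reuse, and Data Placement. On AMD's ATI HD5970 GPU, the optimized DGEMM reaches 758 GFLOP/s, corresponding to an efficiency of 82%, twice the performance of ACML-GPU v1.1. On a heterogeneous system composed of Intel Westmere EP and ATI HD5970, performance reaches 844 GFLOP/s with an efficiency of 80%. We further examine the scalability of DGEMM on multiple CPUs and multiple GPUs and analyze in detail the architectural factors involved. The analysis shows that contention on the PCIe bus and the memory bus is an important factor in the performance degradation of programs on heterogeneous systems.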

  • A Study on Selecting the Optimal Storage Format for Sparse Matrix Multiplication

    Subjects: Computer Science >> Computer Application Technology submitted time 2016-11-15

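    Abstract: Sparse matrix-vector multiplication is one of the important core kernels in science and engineering and an important component of sparse BLAS (Basic Linear Algebra Subprograms) libraries. This paper presents SMAT, an auto-tuner for sparse matrix-vector multiplication. For a given sparse matrix, SMAT selects and returns the optimal storage format. Using 2,316 sparse matrices from the University of Florida as the test set, SMAT attains more than 96% of the best performance of the selected formats. SMAT's prediction accuracy is 89.34% (single precision) and 86.18% (double precision) on the Intel X5680 platform, and 85.10% (single precision) and 82.09% (double precision) on the AMD Opteron 6168 platform. Moreover, SMAT's online search time is acceptable for applications that invoke sparse matrix-vector multiplication hundreds of times.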
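    To make the notion of choosing a storage format per matrix concrete, the sketch below runs the same sparse matrix-vector product from two common formats (COO and CSR) and picks whichever is faster on the given matrix. It is an illustration only: it uses brute-force timing rather than the prediction approach the abstract attributes to SMAT, and the function names (`spmv_coo`, `coo_to_csr`, `spmv_csr`, `pick_format`) are hypothetical.

```python
import time
from typing import Callable, Dict, List, Tuple

CooEntries = List[Tuple[int, int, float]]  # (row, column, value) triples
Csr = Tuple[List[int], List[int], List[float]]  # (row_ptr, col_idx, values)


def spmv_coo(entries: CooEntries, x: List[float], n_rows: int) -> List[float]:
    """y = A @ x with A stored as coordinate triples."""
    y = [0.0] * n_rows
    for row, col, value in entries:
        y[row] += value * x[col]
    return y


def coo_to_csr(entries: CooEntries, n_rows: int) -> Csr:
    """Convert coordinate triples to compressed sparse row arrays."""
    row_ptr = [0] * (n_rows + 1)
    for row, _, _ in entries:
        row_ptr[row + 1] += 1  # count nonzeros per row
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]  # prefix sum gives row start offsets
    col_idx = [0] * len(entries)
    values = [0.0] * len(entries)
    next_slot = row_ptr[:-1]  # slicing copies; next free slot in each row
    for row, col, value in entries:
        k = next_slot[row]
        col_idx[k], values[k] = col, value
        next_slot[row] += 1
    return row_ptr, col_idx, values


def spmv_csr(csr: Csr, x: List[float]) -> List[float]:
    """y = A @ x with A stored in CSR."""
    row_ptr, col_idx, values = csr
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def pick_format(entries: CooEntries, x: List[float], n_rows: int, repeats: int = 5) -> str:
    """Time each candidate kernel on this matrix and return the fastest format name."""
    csr = coo_to_csr(entries, n_rows)
    candidates: Dict[str, Callable[[], List[float]]] = {
        "COO": lambda: spmv_coo(entries, x, n_rows),
        "CSR": lambda: spmv_csr(csr, x),
    }
    timings = {}
    for name, kernel in candidates.items():
        start = time.perf_counter()
        for _ in range(repeats):
            kernel()
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get)


if __name__ == "__main__":
    # Matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]] as coordinate triples.
    a = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0), (2, 0, 4.0), (2, 2, 5.0)]
    print(spmv_csr(coo_to_csr(a, 3), [1.0, 2.0, 3.0]), pick_format(a, [1.0, 2.0, 3.0], 3))
```

    Exhaustive timing pays the cost of every format on every matrix, which is the overhead a predictive tuner tries to avoid; the winner also depends on the machine and the matrix, so no particular result should be expected from `pick_format`.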

  • Survey: Scalable Applications and Scalable Systems

    Subjects: Computer Science >> Integration Theory of Computer Science submitted time 2016-11-02

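    Abstract: Scalable computer systems are being used ever more widely across many fields. The applications involved often have scalability requirements, yet the characteristics of these scalable applications differ greatly. Over the past 20 years, system platforms for scalable applications have emerged one after another, each with different strengths, and assessing how well a class of applications fits a given system platform has become a key concern for users. This paper surveys and analyzes scalable applications and scalable systems and proposes some reference factors for evaluating how well an application matches a system platform. It also explains and analyzes several popular new terms recently introduced in industry and compares their similarities and differences. The aim of this paper is to help readers understand in depth the characteristics of scalable applications and scalable systems, to help users choose suitable platforms so as to improve application efficiency and resource utilization, and to encourage researchers to further explore system platform technologies that meet the new demands of applications.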

  • A Novel and Efficient Algorithm-Level Fault Tolerance Technique and Its Implementation

    Subjects: Computer Science >> Computer Software submitted time 2016-06-08

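    Abstract: As high-performance computing systems continue to grow in scale, node failures occur ever more frequently. Traditional fault tolerance techniques are mostly based on checkpointing. However, the overhead of checkpointing keeps increasing with system scale, and at exaflops scale its fault tolerance efficiency can hardly meet system requirements. Algorithm-based failure recovery is more efficient than checkpointing, but it still relies on a stop-and-wait model, which to a large extent hurts the parallel efficiency of programs on large-scale systems. This paper proposes a non-stop-and-wait algorithm-level fault tolerance strategy, the hot replacement strategy. If a node fails while the program is running, the data on the failed node is not recovered by stopping and waiting; instead, a redundant node replaces the failed node so that the computation can continue. The final correct result can be obtained through a linear transformation. To demonstrate the effectiveness of the scheme, we implemented a fault-tolerant High Performance Linpack (HPL) using the fault tolerance features of MPICH and evaluated its performance. Experimental results show that, even at small scale, our scheme clearly outperforms algorithm-based failure recovery.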

  • Performance Study of DGEMM on a CPU/ATI GPU Hybrid Architecture

    Subjects: Computer Science >> Computer Software submitted time 2016-06-08

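    Abstract: This paper reports our work on optimizing double-precision matrix multiplication (DGEMM) on a CPU/ATI GPU hybrid architecture. In real applications, data transfer between the CPU and the graphics processing unit (GPU) is a key factor affecting performance. Because software pipelining can reduce data transfer overhead, we propose three software pipelining algorithms: Double Buffering, Data Reuse, and Data Placement. On AMD's ATI HD5970 GPU, the optimized DGEMM reaches 758 GFLOP/s, corresponding to an efficiency of 82%, twice the performance of ACML-GPU v1.1. On a heterogeneous system composed of Intel Westmere EP and ATI HD5970, performance reaches 844 GFLOP/s with an efficiency of 80%. We further examine the scalability of DGEMM on multiple CPUs and multiple GPUs and analyze in detail the architectural factors involved. The analysis shows that contention on the PCIe bus and the memory bus is an important factor in the performance degradation of programs on heterogeneous systems.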

  • A Low-Overhead, Hybrid Software-Hardware, Fine-Grained Memory

    Subjects: Computer Science >> Computer Application Technology submitted time 2016-05-04

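    Abstract: Memory behavior analysis is the foundation for optimizing memory system scheduling, architecture, and application memory access performance. Fine-grained memory behavior analysis can identify the source of memory system performance bottlenecks and provide rich semantic information for optimization. Common approaches to memory behavior analysis include instrumentation, simulators, and hardware counters, but they suffer respectively from high overhead, insufficient accuracy, and an inability to provide detailed information. This paper proposes a hybrid software-hardware method for fine-grained memory behavior analysis that can analyze a program's complete memory access sequence at the function level and the object level. On the hardware side, an HMTT card monitors the system's memory access requests. On the software side, binary instrumentation captures function entry and exit information, and the memory region of each object is obtained by exporting the kernel page table and object memory allocation information. Experimental results show that the proposed method can accurately capture function-level and object-level memory access sequences on a real system with low overhead.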