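National University of Defense Technology
College of Computer Science and Technology, National University of Defense Technology
ISSN 1007-130X
CN 43-1258/TP
1973
Computer Engineering & Science (计算机工程与科学)
Information science and technology
Monthly
1-3 months
95,955 times
42-153
Changsha, Hunan Province
410073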
Exponential and logarithmic functions are important transcendental functions in floating-point computation and are widely used across application domains. Vector registers in modern processors have grown wider with each generation, so optimizing vector exponential and logarithmic functions is of both scientific and practical value for raising the utilization of vector units by upper-layer applications. To address the performance bottlenecks of existing vector function implementations, this paper designs and implements optimization methods for exponential and logarithmic functions targeting vector units, comprising vector table-lookup optimization based on hardware acceleration instructions, branch optimization, and precision-performance trade-off optimization. Experiments on a simulator show that the optimized vector exponential and logarithmic functions both meet the industry's high-accuracy standard and outperform the best current open-source implementations, with a speedup of at least 1.44. Tests on real applications further show that, with the optimized vector functions, applications can be vectorized efficiently, achieving an average performance improvement of 2.53 times over the original scalar implementations.
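The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea behind table-lookup and branch-free evaluation of exp, not the paper's implementation. It uses the textbook decomposition exp(x) = 2^k · 2^(j/N) · exp(r). The table size (`TABLE_BITS`), the degree-4 polynomial, the clamp bounds, and the names `exp_lane` and `vexp` are all assumptions chosen for illustration. A real vector version would apply the same steps to all lanes at once with hardware gather or table instructions.

```python
import math

# Assumed table size: 2**5 = 32 entries holding 2^(j/N) for j = 0..N-1.
TABLE_BITS = 5
N = 1 << TABLE_BITS
TABLE = [2.0 ** (j / N) for j in range(N)]

N_OVER_LN2 = N / math.log(2.0)
LN2_OVER_N = math.log(2.0) / N

# Assumed clamp bounds that keep the result a finite, normal double.
X_MIN, X_MAX = -700.0, 700.0


def exp_lane(x):
    """Approximate exp(x) for one lane using a lookup table and a short polynomial."""
    # Clamp instead of branching on out-of-range inputs (a simplification:
    # a full implementation would return inf, 0, or NaN as appropriate).
    x = min(max(x, X_MIN), X_MAX)

    # Range reduction: x = m * ln2/N + r with |r| <= ln2/(2N).
    m = round(x * N_OVER_LN2)
    r = x - m * LN2_OVER_N

    # Split m = k*N + j: k becomes the power-of-two exponent, j the table index.
    k = m >> TABLE_BITS
    j = m & (N - 1)

    # Degree-4 Taylor polynomial for exp(r), evaluated with Horner's rule.
    p = 1.0 + r * (1.0 + r * (0.5 + r * (1.0 / 6.0 + r * (1.0 / 24.0))))

    # Reconstruct: exp(x) = 2^k * 2^(j/N) * exp(r).
    return math.ldexp(TABLE[j] * p, k)


def vexp(xs):
    """Apply exp_lane to every element; stands in for a vector-wide operation."""
    return [exp_lane(x) for x in xs]
```

In this kind of scheme, the precision-performance trade-off is typically exposed through the table size and the polynomial degree: a larger table shrinks the reduced argument `r`, so a lower-degree polynomial reaches the same accuracy at the cost of more table memory.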