Book Group Author: IEEE COMPUTER SOC
Source:
Pages: 179-185
DOI: 10.1109/EEET61723.2023.00015
Published: 2023
Indexed: 2024-10-18
Document Type: Proceedings Paper
Conference
Meeting: 6th International Conference on Electronics and Electrical Engineering Technology (EEET)
Location: Nanjing, PEOPLES R CHINA
Date: DEC 01-03, 2023
Sponsors: SE Univ; Beihang Univ; Beijing Cas Spark Inst Informat Technol; Joint Int Res Lab Informat Display & Visualizat; Tiangong Univ; Univ Sains Malaysia; APEX
Abstract
Recently, Transformer models have achieved better accuracy than traditional models in fields such as computer vision (CV) and natural language processing (NLP). However, compared with traditional convolutional neural networks (CNNs), Transformer models require a large number of softmax evaluations, whose expensive exponential and division operations consume substantial computing, storage, and power resources, which prevents Transformer networks from being deployed effectively in edge computing. To address this problem, this paper proposes an approximate softmax computation architecture. Compared with the classic base-e softmax function or the base-2 softmax functions proposed in recent years, the proposed design replaces the complex exponential and division operations with simpler shift and fixed-point addition operations, significantly reducing resource consumption while maintaining high computation accuracy. The proposed architecture can effectively accelerate the softmax computation of Transformer models in edge computing. Experimental results show that for 16-bit softmax computation, the proposed design achieves lower resource occupation and energy consumption. Vision Transformer (ViT) models are also used to verify the effectiveness of the proposed method in real Transformer models.
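As a rough illustration of the kind of arithmetic substitution the abstract describes, the sketch below computes a base-2 approximate softmax in fixed point using only shifts, additions, and comparisons. The Q.8 format, the linear 2^(-f) ≈ 1 - f/2 approximation for the fractional exponent, and the shift-based normalization are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch (assumptions, not the paper's design): base-2 approximate
# softmax using only shifts and fixed-point additions.

FRAC_BITS = 8               # assumed fixed-point fraction width (Q.8)
ONE = 1 << FRAC_BITS        # fixed-point 1.0

def pow2_neg_approx(d):
    """Approximate 2**(-d) for a non-negative Q.8 value d with shifts/adds:
    split d into integer part k and fraction f, then 2**(-d) ~ (1 - f/2) >> k."""
    k = d >> FRAC_BITS          # integer part of the exponent
    f = d & (ONE - 1)           # fractional part in [0, 1)
    return (ONE - (f >> 1)) >> k

def approx_softmax(scores):
    """Approximate softmax over Q.8 fixed-point scores.

    exp() is replaced by the base-2 shift approximation above, and the final
    division by the sum is replaced by a right shift by floor(log2(sum)),
    so the outputs are only roughly normalized (off by a factor below 2)."""
    m = max(scores)
    num = [pow2_neg_approx(m - s) for s in scores]   # ~ 2**(s - m), in Q.8
    total = sum(num)                                 # always >= ONE
    shift = total.bit_length() - 1                   # floor(log2(total))
    return [(n << FRAC_BITS) >> shift for n in num]  # Q.8 pseudo-probabilities

# Example: scores 3.0, 1.0, 0.0 in Q.8
print(approx_softmax([3 << FRAC_BITS, 1 << FRAC_BITS, 0]))
```

The point of the sketch is that both the exponential and the division disappear: 2**x is reduced to a shift plus one subtraction, and normalization is reduced to a shift by the position of the sum's leading one, at the cost of a bounded normalization error.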