An Implementation of Configurable and Small-Area AES IP Core Oriented Avalon Bus

X.C. Tao  
Institution of VSLI Design, Hefei University of Technology  
China

D.L. Zhang  
Institution of VSLI Design, Hefei University of Technology  
China

Y.K. Song  
Institution of VSLI Design, Hefei University of Technology  
China

Abstract—The Advanced Encryption Standard (AES) issued by the National Institute of Standards and Technology in 2001 has become the new widely-used symmetric block cipher standard. A lot of efforts have been made on the various hardware implementations of the AES algorithm. Some focus on achieving low-cost constructions, while others focus on designing high throughput. Given the specific requirement of wireless communication and portable devices, this article presents an AES IP core with an acceptable trade-off between performance and area. By introducing composite fields Sboxes and researching optimization of MixColumn can reduce resources. The AES IP is designed based on Avalon bus. It is compatible with five modes including ECB, CBC, OFB, CFB, and CTR. 128,196,256 bits key are also supported. Meanwhile, it can be flexibly configured according to the specific circumstances. This design and implementation of the AES core has a certain value for the generalization the wireless communication terminal hardware platform.

Keywords—AES; configurable; SoC; Avalon

I. INTRODUCTION

Encryption has become an essential component for secure data and telecommunications networks. It makes electronic commerce, payment systems and transactions over networks possible. It has become one of the main tools for privacy, trust, access control, corporate security and countless other areas. AES Algorithm is safe, efficient and high-performance. Hence, there is an increasing need of hardware implementation of AES algorithm.

Research efforts have been made for the implementation of AES security algorithms on hardware for embedded system applications using FPGA. In the designs [2] [3], the core is designed with Microblaze processor core. [4] [5] use Altera NIOS II core. In this paper, we propose an Advanced Encryption Standard (AES) based on Altera NIOS II core, implement the design and simulation of AES module compatible with five modes based on Avalon bus. At the same time, AES module, NIOS II core and peripherals have been integrated with SOPC technology. The whole design has a simple hard structure, and it is flexibility and safety. It implements an efficient, cost-effective solution. In this paper, our specific contributions are:

- Design of AES module compatible with five modes (including ECB(Electronic codebook) CBC(Cipher-block chaining) OFB(Output feedback) CFB(Cipher feedback) CTR(Counter)).
- The AES module supports 128,196,256 bit key.
- The design is optimized in terms of resources.
- The design based on Avalon bus can be configured by slave port and the mutual communication for data and key exchanges are achieved by a suitable implementation of the Avalon bus.

The rest of the paper organized as follows. A brief introduction to AES algorithm is stated in Section II. In Section III, we discuss on the algorithm optimization of AES. Section IV gives a description of the proposed system architecture and its major building blocks. Conclusions are made in Section V.

II. ADVANCED ENCRYPTION STANDARD

The AES is an iterative, symmetric block algorithm with input packet size of 128 bits and key lengths of 128, 192, and 256 bits. The Algorithm per round operates on the intermediate results, called state, which is a rectangular array of bytes. Since the block size is 128 bits, the rectangular array is of dimensions 4x4. Similarly, the round key, which is expanded from the initial key, can be imagined as a rectangular array.

![FIGURE I. MIXCOLUMN/INVMIXCOLUMN.](image)

Figure I. describes the AES algorithm with 128bit key. There are different steps in each round of the encryption/decryption data path. The individual transformations are Subbyte, Shiftrow, Mix-column, and Addroundkey.

© 2015. The authors - Published by Atlantis Press
III. ALGORITHM OPTIMIZATION OF AES

A. Optimization of S-Box

The single nonlinear step is the SubBytes step, which operates independently on each byte of the state using a substitution table (S-box). This nonlinear function involves two transformations, including taking the multiplicative inverse in the Galois field GF(28) and applying affine transformation. The Inverse is a complex calculation, so that a LUT is used in many current implementations.

But for hardware implementations of AES algorithm, one main drawback of the LUT approach to the S-box occurs: each copy of the LUT requires 256×1 bytes of storage. [6] [7] all put forward some implementations of S-box with combinational logic gate. The main idea is reduced order of Galois Fields. After considering the methods from the reference, we choose this solution that the multiplicative inversion in GF(28) being carried out in GF((2/4)).

The multiplicative inversion in GF(28) can be carried out in GF(24) by the architecture illustrated in Figure. II × represents multiplication, x−1 represents inversion in GF(28) by the architecture illustrated in Figure. II × represents multiplication, x−1 represents inversion in GF(28) being carried out in GF((2/4)).

δ

FIGURE II. MIXCOLUMN/INVMIXCOLUMN.

Isomorphic mapping can be described as a matrix multiplication δ(x) = T • x . [7] proposed an algorithm of searching T matrix with \( \lambda = \{1110\} \).

B. Implementation of Mixcolumn

In the step of mixcolumn/invmixcolumn, two operations are defined as following.

\[
\begin{align*}
\text{out}_x &= [2 \ 3 \ 1 \ 1] \bullet [a \ b \ c \ d]^T \\
\text{out}_y &= [E \ B \ D \ 9] \bullet [a \ b \ c \ d]^T
\end{align*}
\]

[8] Put forward optimization of mixcolumn/invmixcolumn. 03 = 02 + 01, 09 = 08 + 01, 0B = 09 + 02, 0D = 0B + 02, 0E = 0D + 01. Only the products of the byte with 01, 02, 04 and 08 are need to be calculated. All elements of the matrices can be formed by summing these products. For \( b \in GF(2^8) \) that is expressed by \( b_7b_6b_5b_4b_3b_2b_1b_0 \). ×2 will be \( b_6b_5b_4b_3b_2b_1b_0 \). which is showed in Fig.4. In this case, ×4 and ×8 can be figured out based on ×2.

\[
\begin{align*}
\times 4 &= b_2b_4(b_2^\lambda b_7)(b_2^\lambda b_6)^T(b_1^\lambda b_7)(b_1^\lambda b_6)(b_1^\lambda b_5) \\
\times 8 &= b_4(b_4^\lambda b_7)(b_4^\lambda b_6)^T(b_4^\lambda b_5)
\end{align*}
\]

IV. SYSTEM ARCHITECTURE

In this section we introduce the system design of AES, the structure of main operation module and system architecture.

A. Configurable Architectures of AES

The AES module designed in this paper is compatible with five modes, including ECB, CBC, CFB, OFB and CTR. The main structure of the model is illustrated by the following image. It consists of the following sections: two main registers, logical operators of encryption/decryption, a number of switch MUX and XOR.

B. Construction of AES Core

The following figure shows a main construction of AES core. The datapath of the core is 128 bit. The main modules are: AES arithmetic unit, key expansion unit, control unit, memory and invkey module. Key expansion unit is to expand encryption key. AES arithmetic unit is to complete the encryption or decryption operation. Control unit is to generate round signals, effective signals, enable signal and so on. The invkey module is used to “invert” the decryption key. The scheme shown in the figure requires a 60 x 32 memory. The AES core works on the principle of the “Equivalent Inverse Cipher” [9]. A memory is only needed during decryption actually. No storage is required for encryption since keys can be expanded on the fly.

C. System Design of AES Core

Now we introduce the system design of AES. We use Avalon bus as the interface of AES core. Burst mode transmission of read and basic mode transmission of write is used. The figure below shows the signal relationship between AES core and Avalon bus.
There are an interrupt and three channels in the design, including configuration channel, read channel and write channel.

Configuration channel: Avalon slave interface is used to configure registers of AES core, including an initial key, IV, mode, feedback size, number of operation, reading/writing address, start flag and interrupt identification.

Read channel: After the initialization and configure all the registers. Write “1” to start flag register, then AES core will read data from the reading address set through configuration channel. Since the datapath of Avalon master interface is 32 bits, but the AES core’ datapath is 128 bits, so shifting registers are used to transform data width.

Write channel: Shifting registers are used to transform data width(from 128 to 32). When there is result from AES arithmetical unit, 32bits data is output to the writing address configured.

IRQ: Once the input data number reachs number of operation configured, IRQ is generated. Meanwhile, the Avalon master read interface of AES core stops working. IRQ register should be cleared when starting a new operation.

D. System Structure of Verification

The system structure of verification is described as follows. 264 codec, video capture and wireless transmission are used in this system. A Digital camera is used to data collection. After testing with system clock set 64Mkhz, verify whether the AES encryption/decryption procedure is done successfully. After testing with system clock set 64Mkhz, the system work stable. We can see intact image consecutively.

The implementation in [2] has reduced the circuit area to a few hundred slices at the expanse of lowering speed, but the datapath is 32-bit, and is not based on Avalon bus architecture. [7] supports 128,196,256 bit key with normal performance. In design [3], it achieves low-cost, but it doesn’t support all 5 kinds of mode and have lower throughput than our design. Design proposed in [4] achieves higher throughput with respect to our design, but it takes much more resources than our design. The implementation in [5] is based on co-processor architecture, but it does not support all 5 modes and has lower throughput.

The AES core in our design which can be flexibly configured according to the specific circumstances is full-featured with moderate performance. Meanwhile, it can enhance the integration of system with standard Avalon bus interface. The design has high value in practical application.

V. Analysis of Results and Conclusions

We use Verilog HDL to achieve AES core, and verifies them on Cyclone V FPGA. All test schemes are based on Nios II standard core. Synthesis result and performance between those from references is shown in Table 1. The system architecture is shown in Fig. VI. Two boards are communicated via wireless. Decrypted image is displayed on the screen to verify whether the AES encryption/decryption procedure is done successfully. After testing with system clock set 64Mkhz, the system work stable. We can see intact image consecutively.

ACKNOWLEDGMENT

Support for this work was provided by The National Natural Science Foundation of China (61204024, 61179036, 61106020) and Natural Science Foundation of Jiangsu Province of China (BK2011185)

REFERENCES


Hanjia Liu. The design and verification of AES encryption algorithm IP core [D][D]. Shanghai Jiao Tong University, 2009.
