SONY

The 128-bit Blockcipher

Implementation of CLEFIA

CLEFIA is designed as a blockcipher that balances security and performance. Its performance is detailed below.

Table 1 shows the performance of CLEFIA in hardware implementations in terms of area, clock frequency, throughput, and gate efficiency.

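Table 1: Hardware Implementation Results of CLEFIA

| Key (bits) | Enc/Dec (cycles) | Key Setup (cycles) | Optimization | Area (gates) | Freq. (MHz) | Speed (Mbps) | Speed/Area (Kbps/gate) |
|------------|------------------|--------------------|--------------|--------------|-------------|--------------|------------------------|
| 128        | 18               | 12                 | area         | 5,979        | 225.83      | 1,605.94     | 268.63                 |
| 128        | 18               | 12                 | speed        | 12,009       | 422.29      | 3,003.00     | 250.06                 |
| 128        | 36               | 24                 | area         | 4,950        | 201.28      | 715.69       | 144.59                 |
| 128        | 36               | 24                 | speed        | 9,377        | 389.55      | 1,385.10     | 147.71                 |
| 192        | 22               | 20                 | area         | 8,536        | 206.56      | 1,201.85     | 140.81                 |
| 256        | 26               | 20                 | area         | 8,482        | 206.56      | 1,016.95     | 119.89                 |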

The values in the table were measured on implementations using a 0.09 µm standard cell library. The results demonstrate CLEFIA's potential for both high-speed and compact implementations. At 36 clock cycles per block, CLEFIA can be implemented in fewer than 5K gates with a throughput of over 700 Mbps. At 18 cycles per block, the area-optimized implementation achieves a throughput of over 1.6 Gbps in about 6K gates, which is currently the best hardware gate efficiency.
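The Speed and Speed/Area columns of Table 1 can be cross-checked from the other columns. The sketch below assumes (this is not stated explicitly in the text) that one 128-bit block is completed every Enc/Dec cycle count at the listed clock frequency. The function names are illustrative only, and small differences in the last digits are presumably due to rounding of the published figures.

```python
BLOCK_BITS = 128  # CLEFIA block size in bits

def throughput_mbps(freq_mhz, cycles_per_block):
    """Throughput in Mbps, assuming one block per `cycles_per_block` clock cycles."""
    return BLOCK_BITS * freq_mhz / cycles_per_block

def efficiency_kbps_per_gate(speed_mbps, gates):
    """Gate efficiency in Kbps/gate (1 Mbps = 1000 Kbps)."""
    return speed_mbps * 1000 / gates

# (cycles per block, area in gates, frequency in MHz) for the 128-bit key rows
rows = {
    "18 cycles, area":  (18, 5979, 225.83),
    "18 cycles, speed": (18, 12009, 422.29),
    "36 cycles, area":  (36, 4950, 201.28),
    "36 cycles, speed": (36, 9377, 389.55),
}

for name, (cycles, gates, freq) in rows.items():
    speed = throughput_mbps(freq, cycles)
    eff = efficiency_kbps_per_gate(speed, gates)
    print(f"{name}: {speed:.2f} Mbps, {eff:.2f} Kbps/gate")
```

Under this assumption, the computed values agree with the table to within rounding, which suggests the Speed column is derived from the clock frequency and cycle count rather than measured independently.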

CLEFIA performs well across a wide range of environments, including constrained ones such as smart cards and mobile devices.

Table 2 shows the performance of CLEFIA in compact hardware implementations. The values in the table were measured on implementations using a 0.13 µm standard cell library.

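Table 2: Compact Hardware Implementation Results of CLEFIA

| Key (bits) | Mode    | Enc/Dec (cycles) | Key Setup (cycles) | Area (gates) | Speed @ 100 kHz (Kbps) |
|------------|---------|------------------|--------------------|--------------|------------------------|
| 128        | Enc     | 176              | 128                | 2,678        | 73                     |
| 128        | Enc/Dec | 176              | 128                | 2,781        | 73                     |
| 128        | Enc     | 328              | 224                | 2,488        | 39                     |
| 128        | Enc/Dec | 328/320          | 224                | 2,604        | 39/40                  |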

This table shows that CLEFIA encryption can be implemented in fewer than 3K gates at 176 clock cycles per block, and in fewer than 2.5K gates at 328 clock cycles per block.
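The speed column of Table 2 appears to follow from the cycle counts alone once the clock is fixed at 100 kHz. A minimal sketch, assuming one 128-bit block per Enc/Dec cycle count (the helper name `speed_kbps` is hypothetical):

```python
BLOCK_BITS = 128  # CLEFIA block size in bits

def speed_kbps(cycles_per_block, clock_khz=100):
    """Throughput in Kbps at a fixed clock, assuming one block per cycle count."""
    return BLOCK_BITS * clock_khz / cycles_per_block

# Cycle counts taken from Table 2 (328/320 are the Enc/Dec counts of the last row)
for cycles in (176, 328, 320):
    print(f"{cycles} cycles/block: about {round(speed_kbps(cycles))} Kbps")
```

Rounded to the nearest integer, this gives 73, 39, and 40 Kbps, matching the table entries.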

Table 3 shows the performance of CLEFIA in software implementations.

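Table 3: Software Implementation Results of CLEFIA

| Type of implementation | Key Length (bits) | Enc. (cycles/byte) | Dec. (cycles/byte) | Key Setup (cycles) | Table Size (KB) |
|------------------------|-------------------|--------------------|--------------------|--------------------|-----------------|
| 1 block                | 128               | 12.9               | 13.3               | 217                | 8               |
| 1 block                | 192               | 15.8               | 16.2               | 272                | 8               |
| 1 block                | 256               | 18.3               | 18.4               | 328                | 8               |
| 2 block parallel       | 128               | 11.1               | 11.1               | 217                | 16              |
| 2 block parallel       | 192               | 13.3               | 13.3               | 272                | 16              |
| 2 block parallel       | 256               | 15.6               | 15.6               | 328                | 16              |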

These values were measured using assembly code on an AMD Athlon™ Processor 4000+ running Windows XP 64-bit Edition. The results show that encryption with a 128-bit key requires only 12.9 cycles per byte on average, which corresponds to a throughput of over 1.48 Gbps on this processor. CLEFIA's software performance is thus very fast compared with existing 128-bit blockciphers. Beyond the platform chosen here, CLEFIA is designed to perform well in a wide range of environments, and we have confirmed that its performance on other software platforms is also good.
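The conversion from cycles per byte to throughput can be sketched as follows. The processor's clock frequency is not given in the text. The 2.4 GHz used here is an assumption, chosen because it reproduces the quoted figure of over 1.48 Gbps for the 128-bit key case.

```python
def throughput_gbps(cycles_per_byte, clock_ghz):
    """Throughput in Gbps for a cipher costing `cycles_per_byte` at `clock_ghz`."""
    return 8 * clock_ghz / cycles_per_byte  # 8 bits per byte

ASSUMED_CLOCK_GHZ = 2.4  # assumption: the clock frequency is not stated in the text

# Encryption cost in cycles/byte for the 1-block implementation (Table 3)
enc_cycles_per_byte = {128: 12.9, 192: 15.8, 256: 18.3}

for key_bits, cpb in enc_cycles_per_byte.items():
    gbps = throughput_gbps(cpb, ASSUMED_CLOCK_GHZ)
    print(f"{key_bits}-bit key: {gbps:.2f} Gbps")
```

With a different clock frequency the absolute throughput scales proportionally, while the cycles-per-byte figures in Table 3 stay the same.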

CLEFIA achieves this balance between compactness and high speed without compromising security because it is based on several new design techniques, which are explained in the following pages.