The following are publications in Cryptology that are related to my Doctoral thesis. Some other publications are listed here and all publications in Cryptology and related fields are available at my DBLP page.
- Designing Integrated Accelerator for Stream Ciphers with Structural Similarities
IACR ePrint (2012) - S. Sen Gupta, A. Chattopadhyay and A. Khalid
To date, the basic idea for implementing stream ciphers has been confined to individual standalone designs. In this paper, we introduce the notion of integrated implementation of multiple stream ciphers within a single architecture, where the goal is to achieve area and throughput efficiency by exploiting the structural similarities of the ciphers at an algorithmic level. We present two case studies to support our idea.
First, we propose the merger of SNOW 3G and ZUC stream ciphers, which constitute a part of the 3GPP LTE-Advanced security suite. We propose HiPAcc-LTE, a high performance integrated design that combines the two ciphers in hardware, based on their structural similarities. The integrated architecture reduces the area overhead significantly compared to two distinct cores, and also provides almost double throughput in terms of keystream generation, compared with the state-of-the-art implementations of the individual ciphers.
As our second case study, we present IntAcc-RCHC, an integrated accelerator for the stream ciphers RC4 and HC-128. We show that the integrated accelerator achieves a slight reduction in area without any loss in throughput compared to our standalone implementations. We also achieve at least 1.5 times better throughput compared to general-purpose processors. The long-term vision of this hardware integration approach for cryptographic primitives is to build a flexible core supporting multiple designs that have similar algorithmic structures.
- High Performance Hardware Implementation for RC4 Stream Cipher
IEEE-TC (2012) - S. Sen Gupta, A. Chattopadhyay, K. Sinha, S. Maitra and B.P. Sinha
RC4 is the most popular stream cipher in the domain of cryptology. In this paper, we present a systematic study of the hardware implementation of RC4, and propose the fastest known architecture for the cipher. We combine the ideas of hardware pipelining and loop unrolling to design an architecture that produces 2 RC4 keystream bytes per clock cycle. We have optimized and implemented our proposed design using a VHDL description, synthesized with 130 nm, 90 nm and 65 nm fabrication technologies at clock frequencies of 625 MHz, 1.37 GHz and 1.92 GHz respectively, to obtain a final RC4 keystream throughput of 10 Gbps, 21.92 Gbps and 30.72 Gbps in the respective technologies.
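The throughput figures quoted in this abstract follow directly from the clock frequency and the 2 bytes produced per cycle. A minimal arithmetic check is sketched below; the function name `keystream_throughput_gbps` is a hypothetical helper introduced only for illustration.

```python
def keystream_throughput_gbps(clock_hz: float, bytes_per_cycle: int = 2) -> float:
    """Throughput in Gbps = clock (Hz) * bytes per cycle * 8 bits per byte / 1e9."""
    return clock_hz * bytes_per_cycle * 8 / 1e9


# Technology nodes and clock frequencies as quoted in the abstract.
for node, clock_hz in [("130 nm", 625e6), ("90 nm", 1.37e9), ("65 nm", 1.92e9)]:
    print(f"{node}: {keystream_throughput_gbps(clock_hz):.2f} Gbps")
# 130 nm: 10.00 Gbps
# 90 nm: 21.92 Gbps
# 65 nm: 30.72 Gbps
```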
- RC4: (Non-)Random Words from (Non-)Random Permutations
IACR ePrint (2011) - S. Sen Gupta, S. Maitra, G. Paul and S. Sarkar
RC4 has been the most popular stream cipher in the history of symmetric key cryptography to date. Its internal state contains a pseudo-random permutation over all $n$-bit words (typically $n = 8$) and it attempts to generate a pseudo-random sequence of words by extracting elements of this permutation. Over more than the last twenty years, numerous cryptanalytic results on the RC4 stream cipher have been published. Many of these results are based on some non-random (biased) events involving the secret key or the state variables or the output sequence, or a combination of them.
Though biases based on the secret key are common in the RC4 literature, none of the existing ones depends on the length of the secret key. In the first part of this paper, we report significant biases involving the length of the secret key, for the first time in the literature.
In the second part of the paper, theoretical proofs of some significant initial-round empirical biases observed by Sepehrdad, Vaudenay and Vuagnoux [SAC 2010] are presented. Another important result presented here is the derivation of the complete probability distribution of the first byte of the RC4 output sequence, a problem left open for a decade since the observation by Mironov [CRYPTO 2002]. Further, the existence of positive biases towards zero for all the initial bytes 3 to 255 is proved and exploited towards a generalized broadcast attack on the RC4 stream cipher.
The biases discussed above, like most of the existing biases in the RC4 literature, are short-term and do not last beyond a few initial rounds. The last part of this paper investigates the long-term manifestation of short-term biases in the RC4 output sequence. A careful analysis of the periodic structure of RC4 evolution proves the first long-term generalization of Mantin and Shamir's [FSE 2001] famous second-byte bias.
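For readers unfamiliar with the cipher, the permutation-based state described in this abstract can be sketched in a few lines. This is the standard textbook formulation of RC4 for $n = 8$ (so $N = 256$), not code from the paper; the function name `rc4_keystream` is chosen here for illustration.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Return the first n RC4 keystream bytes for the given key (N = 256)."""
    N = 256
    # Key scheduling (KSA): scramble the identity permutation using the key.
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    # Keystream generation (PRGA): one swap and one output byte per round.
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % N
        j = (j + S[i]) % N
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % N])
    return bytes(out)


# Widely published test vector: key "Key", plaintext "Plaintext".
plaintext = b"Plaintext"
keystream = rc4_keystream(b"Key", len(plaintext))
ciphertext = bytes(p ^ k for p, k in zip(plaintext, keystream))
print(ciphertext.hex())  # bbf316e8d940af0ad3
```

The secret key only influences the KSA loop, indexed modulo its length, which is where key-dependent and keylength-dependent effects can enter the state.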
- HiPAcc-LTE: An Integrated High Performance Accelerator for 3GPP LTE Stream Ciphers
Indocrypt 2011 - S. Sen Gupta, A. Chattopadhyay and A. Khalid
Stream ciphers SNOW 3G and ZUC are the major players in the domain of next generation mobile security, as both of them have been included in the security portfolio of 3GPP LTE-Advanced, the potential candidate for the 4G mobile broadband communication standard. In this paper, we propose HiPAcc-LTE, a high performance integrated design that combines the two ciphers in hardware, based on their structural similarities. The integrated architecture reduces the area overhead significantly compared to two distinct cores, and also provides almost double throughput in terms of keystream generation. This is in comparison with the state-of-the-art implementations of the individual ciphers, both in the academic literature and in the commercial domain. We present a detailed description of the design idea, optimization techniques and comparison results in this paper. The long-term vision of this hardware integration approach for cryptographic primitives is to build a flexible core supporting multiple designs that have similar algorithmic structures.
- Proof of Empirical RC4 Biases and New Key Correlations
SAC 2011 - S. Sen Gupta, S. Maitra, G. Paul and S. Sarkar
In SAC 2010, Sepehrdad, Vaudenay and Vuagnoux reported some empirical biases between the secret key, the internal state variables and the keystream bytes of RC4, by searching over a space of all linear correlations between the quantities involved. In this paper, for the first time, we give theoretical proofs for all such significant empirical biases. Our analysis not only builds a framework to justify the origin of these biases, but also brings out several new conditional biases of high order. We establish that certain conditional biases reported earlier are correlated with a third event with much higher probability. This gives rise to the discovery of new keylength-dependent biases of RC4, some as high as $50/N$, where $N$ is the size of the RC4 permutation. The new biases in turn result in successful keylength prediction from the initial keystream bytes of the cipher.
- Attack on Broadcast RC4 Revisited
FSE 2011 - S. Maitra, G. Paul and S. Sen Gupta
In this paper, contrary to the claim of Mantin and Shamir (FSE 2001), we prove that there exist biases in the initial bytes (3 to 255) of the RC4 keystream towards zero. These biases immediately provide distinguishers for RC4. Additionally, the attack on broadcast RC4 to recover the second byte of the plaintext can be extended to recover the bytes 3 to 255 of the plaintext given $\Omega(N^3)$ many ciphertexts. Further, we also study the non-randomness of index $j$ for the first two rounds of PRGA, and identify a strong bias of $j_2$ towards 4. This in turn provides us with certain state information from the second keystream byte.
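The second-byte bias of Mantin and Shamir that underlies the broadcast attack is easy to observe empirically: over random keys, the second keystream byte is zero with probability close to $2/N$ rather than $1/N$. The experiment below is a rough sketch under stated assumptions (16-byte uniformly random keys, a fixed seed, and hypothetical helper names `rc4_prefix` and `zero_frequency`); it illustrates the known bias and is not the analysis from the paper. The much smaller biases in bytes 3 to 255 would need far more samples than this sketch uses.

```python
import random

N = 256


def rc4_prefix(key: bytes, n: int) -> bytes:
    """First n keystream bytes of standard RC4 (KSA followed by PRGA)."""
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % N
        j = (j + S[i]) % N
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % N])
    return bytes(out)


def zero_frequency(position: int, trials: int, key_len: int = 16, seed: int = 1) -> float:
    """Fraction of random keys whose keystream byte at `position` (1-based) is 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        key = bytes(rng.randrange(256) for _ in range(key_len))
        if rc4_prefix(key, position)[position - 1] == 0:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    freq = zero_frequency(position=2, trials=20000)
    # Compare against the uniform value 1/N and the biased value 2/N.
    print(f"observed {freq:.5f}, uniform {1 / N:.5f}, biased {2 / N:.5f}")
```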
- One Byte per Clock: A Novel RC4 Hardware
Indocrypt 2010 - S. Sen Gupta, K. Sinha, S. Maitra and B.P. Sinha
RC4, the widely used stream cipher, is well known for its simplicity and ease of implementation in software. For special-purpose hardware designed for RC4, the best known implementation to date generates 1 byte per 3 clock cycles. In this paper, we take a fresh look at the hardware implementation of RC4 and propose a novel architecture which generates 1 keystream byte per clock cycle. Our strategy considers generation of two consecutive keystream bytes by unwrapping the RC4 cycles. The same architecture is customized to perform the key scheduling algorithm at a rate of 1 round per clock.
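The idea of unwrapping two consecutive RC4 rounds can be illustrated with a software model. The sketch below is an assumption-laden illustration, not the published hardware architecture: it shows that both indices $j$ for a pair of rounds can be derived from the state as it stands before the first swap, which is the kind of data dependency an unrolled design has to resolve. The names `ksa`, `prga_reference` and `prga_unrolled` are hypothetical.

```python
N = 256


def ksa(key: bytes) -> list:
    """Standard RC4 key scheduling; returns the initial permutation."""
    S = list(range(N))
    j = 0
    for i in range(N):
        j = (j + S[i] + key[i % len(key)]) % N
        S[i], S[j] = S[j], S[i]
    return S


def prga_reference(S: list, n: int) -> bytes:
    """Textbook PRGA: one round per loop iteration."""
    S = list(S)
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % N
        j = (j + S[i]) % N
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % N])
    return bytes(out)


def prga_unrolled(S: list, n: int) -> bytes:
    """Two rounds per loop iteration; n must be even."""
    if n % 2:
        raise ValueError("n must be even for the two-round unrolled model")
    S = list(S)
    i = j = 0
    out = bytearray()
    for _ in range(n // 2):
        i1, i2 = (i + 1) % N, (i + 2) % N
        j1 = (j + S[i1]) % N
        # Value that will sit at index i2 after the first swap, read from the
        # pre-swap state: the swap only moves S[i1] into position j1.
        s_i2 = S[i1] if i2 == j1 else S[i2]
        j2 = (j1 + s_i2) % N
        # Both index pairs are now known; apply the swaps and outputs in order.
        S[i1], S[j1] = S[j1], S[i1]
        out.append(S[(S[i1] + S[j1]) % N])
        S[i2], S[j2] = S[j2], S[i2]
        out.append(S[(S[i2] + S[j2]) % N])
        i, j = i2, j2
    return bytes(out)


state = ksa(b"Key")
print(prga_unrolled(state, 10) == prga_reference(state, 10))  # True
```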