## Two Simple Hash Functions

The input (message, file, etc.) is viewed as a sequence of $n$-bit blocks. The input is processed one block at a time in an iterative fashion to produce an $n$-bit hash value.
One of the simplest hash functions is the bit-by-bit exclusive-OR (XOR) of every block. This can be expressed as
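
$$
C_{i}=b_{i 1} \oplus b_{i 2} \oplus \cdots \oplus b_{i m}
$$

where

$$
\begin{aligned}
C_{i} & =i \text {th bit of the hash code, } 1 \leq i \leq n \\
m & =\text { number of } n \text {-bit blocks in the input } \\
b_{i j} & =i \text {th bit in the } j \text {th block } \\
\oplus & =\text { XOR operation }
\end{aligned}
$$
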
This operation produces a simple parity for each bit position and is known as a longitudinal redundancy check.
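
The XOR hash above can be sketched in a few lines of Python. This is a minimal illustration, assuming the input has already been split into equal-length byte blocks; the name `xor_hash` and the sample blocks are made up for the example.

```python
from functools import reduce

def xor_hash(blocks):
    """Bit-by-bit XOR of equal-length blocks (longitudinal redundancy check)."""
    n = len(blocks[0])
    if any(len(b) != n for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) groups the bytes that share a position across all blocks;
    # XORing each group gives the parity of every bit position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))

blocks = [b"\x0f\xf0", b"\xff\x00", b"\x01\x01"]
print(xor_hash(blocks).hex())  # f1f1
```

Because XOR is commutative, reordering the blocks leaves the result unchanged, which is one reason this function is weak on its own.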
A simple way to improve matters is to perform a one-bit circular shift, or rotation, on the hash value after each block is processed. The procedure can be summarized as follows.

1. Initially set the $n$-bit hash value to zero.
2. Process each successive $n$-bit block of data as follows:
a. Rotate the current hash value to the left by one bit.
b. XOR the block into the hash value.
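
A minimal Python sketch of this procedure follows. It assumes each block is given as an integer of at most $n$ bits; the name `rotated_xor_hash` and the default width of 64 bits are illustrative choices, not part of the original description.

```python
def rotated_xor_hash(blocks, n=64):
    """Rotate the hash left by one bit, then XOR in the next n-bit block."""
    mask = (1 << n) - 1
    h = 0  # step 1: the n-bit hash value starts at zero
    for block in blocks:
        # step 2a: one-bit circular left shift of the current hash value
        h = ((h << 1) | (h >> (n - 1))) & mask
        # step 2b: XOR the block into the hash value
        h ^= block & mask
    return h

print(rotated_xor_hash([1, 2]))  # 0
print(rotated_xor_hash([2, 1]))  # 5
```

Unlike the plain XOR, the result now depends on the order of the blocks.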

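Given a message $M$ consisting of a sequence of 64-bit blocks $X_{1}, X_{2}, \ldots, X_{N}$, define the hash code $h=\mathrm{H}(M)$ as the block-by-block XOR of all blocks and append the hash code as the final block:

$$
h=X_{N+1}=X_{1} \oplus X_{2} \oplus \cdots \oplus X_{N}
$$

Next, encrypt the entire message plus hash code using CBC mode to produce the encrypted message $Y_{1}, Y_{2}, \ldots, Y_{N+1}$. By the definition of CBC, decryption gives

$$
\begin{aligned}
X_{1} & =IV \oplus \mathrm{D}(K, Y_{1}) \\
X_{i} & =Y_{i-1} \oplus \mathrm{D}(K, Y_{i}) \\
X_{N+1} & =Y_{N} \oplus \mathrm{D}(K, Y_{N+1})
\end{aligned}
$$

But $X_{N+1}$ is the hash code:

$$
\begin{aligned}
X_{N+1} & =X_{1} \oplus X_{2} \oplus \cdots \oplus X_{N} \\
& =[IV \oplus \mathrm{D}(K, Y_{1})] \oplus[Y_{1} \oplus \mathrm{D}(K, Y_{2})] \oplus \cdots \oplus[Y_{N-1} \oplus \mathrm{D}(K, Y_{N})]
\end{aligned}
$$

Because the terms in this equation can be XORed in any order, the hash code does not change if the ciphertext blocks are permuted.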
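
The consequence of this identity can be checked numerically. The sketch below is a toy experiment, not a real cipher: `E` and `D` are a made-up invertible 64-bit mapping standing in for block encryption and decryption, and all names are hypothetical. It reorders the first $N$ ciphertext blocks and shows that the XOR check still passes after decryption even though the recovered message has changed.

```python
import random

MASK64 = (1 << 64) - 1

def rotl(x, r):
    return ((x << r) | (x >> (64 - r))) & MASK64

def rotr(x, r):
    return ((x >> r) | (x << (64 - r))) & MASK64

# Toy invertible 64-bit mapping. NOT secure; it only stands in for E(K, .) and D(K, .).
def E(key, x):
    return (rotl(x ^ key, 13) + key) & MASK64

def D(key, y):
    return rotr((y - key) & MASK64, 13) ^ key

def xor_all(blocks):
    acc = 0
    for b in blocks:
        acc ^= b
    return acc

def cbc_encrypt(key, iv, blocks):
    out, prev = [], iv
    for x in blocks:
        prev = E(key, x ^ prev)       # Y_i = E(K, X_i xor Y_{i-1})
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for y in blocks:
        out.append(D(key, y) ^ prev)  # X_i = Y_{i-1} xor D(K, Y_i)
        prev = y
    return out

def check_passes(key, iv, ciphertext):
    plain = cbc_decrypt(key, iv, ciphertext)
    return xor_all(plain[:-1]) == plain[-1]

rng = random.Random(1)
key, iv = rng.getrandbits(64), rng.getrandbits(64)
message = [rng.getrandbits(64) for _ in range(5)]
ciphertext = cbc_encrypt(key, iv, message + [xor_all(message)])

# Reverse Y_1..Y_N and keep Y_{N+1} in place.
tampered = list(reversed(ciphertext[:-1])) + ciphertext[-1:]
print(check_passes(key, iv, ciphertext), check_passes(key, iv, tampered))
print(cbc_decrypt(key, iv, tampered)[:-1] == message)
```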

The most widely used hash function has been the Secure Hash Algorithm (SHA). The following table compares the parameters of the SHA versions; all sizes are measured in bits.

|  | SHA-1 | SHA-224 | SHA-256 | SHA-384 | SHA-512 |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Message Digest Size | 160 | 224 | 256 | 384 | 512 |
| Message Size | $<2^{64}$ | $<2^{64}$ | $<2^{64}$ | $<2^{128}$ | $<2^{128}$ |
| Block Size | 512 | 512 | 512 | 1024 | 1024 |
| Word Size | 32 | 32 | 32 | 64 | 64 |
| Number of Steps | 80 | 64 | 64 | 80 | 80 |

The standard that specifies SHA is titled "Secure Hash Standard." SHA is based on the hash function MD4.

## SHA-512 Logic

The algorithm takes as input a message with a maximum length of less than $2^{128}$ bits and produces as output a 512-bit message digest. The input is processed in 1024-bit blocks.


$+$ = word-by-word addition mod $2^{64}$

Step 1 Append padding bits. The message is padded so that its length is congruent to 896 modulo 1024 [length $\equiv 896 \pmod{1024}$]. Padding is always added, even if the message is already of the desired length. Thus, the number of padding bits is in the range of 1 to 1024. The padding consists of a single 1 bit followed by the necessary number of 0 bits.

Step 2 Append length. A block of 128 bits is appended to the message. This block is treated as an unsigned 128-bit integer (most significant byte first) and contains the length of the original message (before the padding).

The outcome of the first two steps yields a message that is an integer multiple of 1024 bits in length. In Figure 11.9, the expanded message is represented as the sequence of 1024-bit blocks $M_{1}, M_{2}, \ldots, M_{N}$, so that the total length of the expanded message is $N \times 1024$ bits.
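
Steps 1 and 2 can be sketched in Python as follows. This is a minimal illustration that assumes the message is a whole number of bytes, so the single 1 bit and the first seven 0 bits of padding are written together as the byte `0x80`; the function name `sha512_pad` is chosen for the example.

```python
def sha512_pad(message: bytes) -> bytes:
    """Append padding bits (step 1) and the 128-bit length field (step 2)."""
    bit_length = 8 * len(message)
    # A single 1 bit followed by seven 0 bits.
    padded = message + b"\x80"
    # Add 0 bytes until the length is congruent to 112 bytes (896 bits)
    # modulo 128 bytes (1024 bits).
    padded += b"\x00" * ((112 - len(padded)) % 128)
    # Original message length in bits as an unsigned 128-bit big-endian integer.
    padded += bit_length.to_bytes(16, "big")
    return padded

print(len(sha512_pad(b"abc")))        # 128
print(len(sha512_pad(b"a" * 112)))    # 256
```

The second call shows that padding is added even when the message is already 896 bits long: a full 1024 bits of padding are appended before the length field.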

Step 3 Initialize hash buffer. A 512-bit buffer is used to hold intermediate and final results of the hash function. The buffer can be represented as eight 64-bit registers (a, b, c, d, e, f, g, h). These registers are initialized to the following 64-bit integers (hexadecimal values):

| Register | Value | Register | Value |
| :--- | :--- | :--- | :--- |
| a | 6A09E667F3BCC908 | e | 510E527FADE682D1 |
| b | BB67AE8584CAA73B | f | 9B05688C2B3E6C1F |
| c | 3C6EF372FE94F82B | g | 1F83D9ABFB41BD6B |
| d | A54FF53A5F1D36F1 | h | 5BE0CD19137E2179 |

These values are stored in big-endian format, in which the most significant byte of a word is in the low-address (leftmost) byte position. These words were obtained by taking the first sixty-four bits of the fractional parts of the square roots of the first eight prime numbers.
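
The derivation from square roots can be reproduced with exact integer arithmetic. The sketch below uses Python's `math.isqrt`; the helper name `sqrt_fraction_bits` is an illustrative choice.

```python
from math import isqrt

MASK64 = (1 << 64) - 1
FIRST_EIGHT_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]

def sqrt_fraction_bits(p):
    """First 64 bits of the fractional part of sqrt(p)."""
    # isqrt(p * 2**128) equals floor(sqrt(p) * 2**64); its low 64 bits are
    # the first 64 fractional bits of the square root.
    return isqrt(p << 128) & MASK64

for name, p in zip("abcdefgh", FIRST_EIGHT_PRIMES):
    print(f"{name} = {sqrt_fraction_bits(p):016X}")
```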

Step 4 Process message in 1024-bit (128-byte) blocks. The heart of the algorithm is a module that consists of 80 rounds; this module is labeled F in the figure.

Each round takes as input the 512-bit buffer value, abcdefgh, and updates the contents of the buffer. At input to the first round, the buffer has the value of the intermediate hash value, $H_{i-1}$. Each round $t$ makes use of a 64-bit value $W_{t}$, derived from the current 1024-bit block being processed ($M_{i}$). These values are derived using a message schedule described subsequently. Each round also makes use of an additive constant $K_{t}$, where $0 \leq t \leq 79$ indicates one of the 80 rounds. These constants represent the first 64 bits of the fractional parts of the cube roots of the first 80 prime numbers. The constants provide a "randomized" set of 64-bit patterns, which should eliminate any regularities in the input data.

The output of the eightieth round is added to the input to the first round ($H_{i-1}$) to produce $H_{i}$. The addition is done independently for each of the eight words in the buffer with each of the corresponding words in $H_{i-1}$, using addition modulo $2^{64}$.


Figure 11.10 SHA-512 Processing of a Single 1024-Bit Block

Step 5 Output. After all $N$ 1024-bit blocks have been processed, the output from the $N$th stage is the 512-bit message digest.

## SHA-512 Constants

428a2f98d728ae22
7137449123ef65cd
3956c25bf3
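
The round constants can be regenerated from the cube roots of the first 80 primes. The following is a minimal sketch using exact integer arithmetic; the helper names `first_primes`, `icbrt`, and `cube_root_fraction_bits` are hypothetical and chosen only for this example.

```python
MASK64 = (1 << 64) - 1

def first_primes(count):
    """Return the first `count` primes by trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def icbrt(n):
    """Largest integer r with r**3 <= n, found by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

def cube_root_fraction_bits(p):
    """First 64 bits of the fractional part of the cube root of p."""
    # icbrt(p * 2**192) equals floor(cbrt(p) * 2**64); keep the low 64 bits.
    return icbrt(p << 192) & MASK64

K = [cube_root_fraction_bits(p) for p in first_primes(80)]
print(f"{K[0]:016x}")  # 428a2f98d728ae22
print(f"{K[1]:016x}")  # 7137449123ef65cd
```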

We can summarize the behavior of SHA-512 as follows:
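
$$
\begin{aligned}
H_{0} & =IV \\
H_{i} & =\operatorname{SUM}_{64}\left(H_{i-1}, \mathrm{abcdefgh}_{i}\right) \\
MD & =H_{N}
\end{aligned}
$$

where

$$
\begin{aligned}
IV & =\text { initial value of the abcdefgh buffer, defined in step 3 } \\
\mathrm{abcdefgh}_{i} & =\text { the output of the last round of processing of the } i \text {th message block } \\
N & =\text { the number of blocks in the message (including padding and length fields) } \\
\operatorname{SUM}_{64} & =\text { addition modulo } 2^{64} \text { performed separately on each word of the pair of inputs } \\
MD & =\text { final message digest value }
\end{aligned}
$$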
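
The chaining structure in this summary can be sketched independently of the round logic. In the Python sketch below, `compress` is a hypothetical callable standing in for the 80 rounds (it returns the eight words $\mathrm{abcdefgh}_i$), and `toy_rounds` is a placeholder used only to exercise the loop; neither is the real SHA-512 round function.

```python
MASK64 = (1 << 64) - 1

def iterate_hash(blocks, iv, compress):
    """H_0 = IV; H_i = SUM64(H_{i-1}, abcdefgh_i); returns H_N."""
    h = list(iv)
    for block in blocks:
        # Output of the last round of processing for this block.
        abcdefgh = compress(h, block)
        # SUM64: word-by-word addition modulo 2**64.
        h = [(x + y) & MASK64 for x, y in zip(h, abcdefgh)]
    return h

def toy_rounds(state, block):
    # Placeholder for the 80 rounds, for demonstration only.
    return [word ^ block for word in state]

print(iterate_hash([1, 2], [0] * 8, toy_rounds))  # [4, 4, 4, 4, 4, 4, 4, 4]
```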

## SHA-512 Round Function



Figure 11.11 Elementary SHA-512 Operation (single round)
Let us look in more detail at the logic in each of the 80 steps of the processing of one 1024-bit block. Each round is defined by the following set of equations:
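
$$
\begin{aligned}
T_{1} & =h+\operatorname{Ch}(e, f, g)+\left(\sum_{1}^{512} e\right)+W_{t}+K_{t} \\
T_{2} & =\left(\sum_{0}^{512} a\right)+\operatorname{Maj}(a, b, c) \\
h & =g \\
g & =f \\
f & =e \\
e & =d+T_{1} \\
d & =c \\
c & =b \\
b & =a \\
a & =T_{1}+T_{2}
\end{aligned}
$$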

where

$$
\begin{aligned}
t & =\text { step number; } 0 \leq t \leq 79 \\
\operatorname{Ch}(e, f, g) & =(e \text { AND } f) \oplus(\text { NOT } e \text { AND } g) \\
& \quad \text { the conditional function: if } e \text { then } f \text { else } g \\
\operatorname{Maj}(a, b, c) & =(a \text { AND } b) \oplus(a \text { AND } c) \oplus(b \text { AND } c) \\
& \quad \text { the function is true only if the majority (two or three) of the arguments are true } \\
\left(\sum_{0}^{512} a\right) & =\operatorname{ROTR}^{28}(a) \oplus \operatorname{ROTR}^{34}(a) \oplus \operatorname{ROTR}^{39}(a) \\
\left(\sum_{1}^{512} e\right) & =\operatorname{ROTR}^{14}(e) \oplus \operatorname{ROTR}^{18}(e) \oplus \operatorname{ROTR}^{41}(e) \\
\operatorname{ROTR}^{n}(x) & =\text { circular right shift (rotation) of the } 64 \text {-bit argument } x \text { by } n \text { bits } \\
W_{t} & =\text { a } 64 \text {-bit word derived from the current 1024-bit input block } \\
K_{t} & =\text { a } 64 \text {-bit additive constant } \\
+ & =\text { addition modulo } 2^{64}
\end{aligned}
$$

Two observations can be made about the round function.

1. Six of the eight words of the output of the round function involve simply permutation ($b, c, d, f, g, h$) by means of rotation. This is indicated by shading in Figure 11.11.
2. Only two of the output words ($a, e$) are generated by substitution. Word $e$ is a function of input variables ($d, e, f, g, h$), as well as the round word $W_{t}$ and the constant $K_{t}$. Word $a$ is a function of all of the input variables except $d$, as well as the round word $W_{t}$ and the constant $K_{t}$.
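
A single round can be written out directly from the equations for $T_1$ and $T_2$. The Python sketch below is an illustration under the assumption that the buffer is passed as a tuple of eight 64-bit integers; the function names are chosen for the example.

```python
MASK64 = (1 << 64) - 1

def rotr(x, n):
    """Circular right shift of a 64-bit value by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK64

def ch(e, f, g):
    # If a bit of e is set, take the bit from f, otherwise from g.
    return (e & f) ^ (~e & g & MASK64)

def maj(a, b, c):
    # Each result bit is the majority of the three input bits.
    return (a & b) ^ (a & c) ^ (b & c)

def big_sigma0(a):
    return rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)

def big_sigma1(e):
    return rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)

def sha512_round(state, w_t, k_t):
    """Apply one round to (a, b, c, d, e, f, g, h) and return the new buffer."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + ch(e, f, g) + big_sigma1(e) + w_t + k_t) & MASK64
    t2 = (big_sigma0(a) + maj(a, b, c)) & MASK64
    # Six words simply move one position; only a and e are newly computed.
    return ((t1 + t2) & MASK64, a, b, c, (d + t1) & MASK64, e, f, g)

new_state = sha512_round((1, 2, 3, 4, 5, 6, 7, 8), 0, 0)
print(new_state[1:4], new_state[5:8])  # (1, 2, 3) (5, 6, 7)
```

The printed slices show the first observation: the old $a, b, c$ reappear as the new $b, c, d$, and the old $e, f, g$ as the new $f, g, h$.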

It remains to indicate how the 64-bit word values $W_{t}$ are derived from the 1024-bit message. Figure 11.12 illustrates the mapping. The first 16 values of $W_{t}$ are taken directly from the 16 words of the current block. The remaining values are defined as

$$
W_{t}=\sigma_{1}^{512}\left(W_{t-2}\right)+W_{t-7}+\sigma_{0}^{512}\left(W_{t-15}\right)+W_{t-16}
$$

where

$$
\begin{aligned}
\sigma_{0}^{512}(x) & =\operatorname{ROTR}^{1}(x) \oplus \operatorname{ROTR}^{8}(x) \oplus \operatorname{SHR}^{7}(x) \\
\sigma_{1}^{512}(x) & =\operatorname{ROTR}^{19}(x) \oplus \operatorname{ROTR}^{61}(x) \oplus \operatorname{SHR}^{6}(x) \\
\operatorname{ROTR}^{n}(x) & =\text { circular right shift (rotation) of the } 64 \text {-bit argument } x \text { by } n \text { bits } \\
\operatorname{SHR}^{n}(x) & =\text { right shift of the } 64 \text {-bit argument } x \text { by } n \text { bits with padding by zeros } \\
& \text { on the left } \\
+ & =\text { addition modulo } 2^{64}
\end{aligned}
$$

Thus, in the first 16 steps of processing, the value of $W_{t}$ is equal to the corresponding word in the message block. For the remaining 64 steps, the value of $W_{t}$ is the sum modulo $2^{64}$ of four of the preceding values of $W_{t}$, with two of those values subjected to shift and rotate operations. This introduces a great deal of redundancy and interdependence into the message blocks that are compressed, which complicates the task of finding a different message block that maps to the same compression function output.
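
The message schedule can be sketched in Python as follows. This is a minimal illustration that assumes the block is supplied as 128 bytes and parsed into big-endian 64-bit words; the function names are chosen for the example.

```python
MASK64 = (1 << 64) - 1

def rotr(x, n):
    """Circular right shift of a 64-bit value by n bits."""
    return ((x >> n) | (x << (64 - n))) & MASK64

def small_sigma0(x):
    return rotr(x, 1) ^ rotr(x, 8) ^ (x >> 7)

def small_sigma1(x):
    return rotr(x, 19) ^ rotr(x, 61) ^ (x >> 6)

def message_schedule(block: bytes):
    """Expand one 1024-bit block into the 80 words W_0 .. W_79."""
    if len(block) != 128:
        raise ValueError("a SHA-512 block is 128 bytes")
    # W_0 .. W_15 are the sixteen 64-bit words of the block.
    w = [int.from_bytes(block[i:i + 8], "big") for i in range(0, 128, 8)]
    # W_16 .. W_79 follow the recurrence, with addition modulo 2**64.
    for t in range(16, 80):
        w.append((small_sigma1(w[t - 2]) + w[t - 7]
                  + small_sigma0(w[t - 15]) + w[t - 16]) & MASK64)
    return w

block = (1).to_bytes(8, "big") + bytes(120)  # W_0 = 1, all other words 0
w = message_schedule(block)
print(len(w), w[16], hex(w[18]))  # 80 1 0x200000000008
```

Even with a single nonzero input word, the later words quickly become nonzero, which illustrates the interdependence the schedule introduces.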


Figure 11.12 Creation of 80-word Input Sequence for SHA-512 Processing of Single Block

