- The implementation essentially involves absorption and squeezing phases.
- The input data is partitioned into equal sized blocks. The block size = rate.
- Multi-rate padding rule is used.
Absorbtion
for each block:
state[0:r] = state [0:r] ^ block
permutation(state)
Squeezing
Z = ""
while length(Z) < output_length:
Z = Z || S[0:r] S = permutation(S)
State
- The state is (I think) implemented as a giant 1-D array.
- But it’s useful to also be able to think of it as a 5x5x64 3-D matrix (not sure where 64 came from).
- 5, 5 are rows and columns and each lane is 64 bits deep.
- Most operations (I think) are done lane-wise.
- This leads to state being an array of 1600 elements.
Permutation Function
- Has 5 steps: theta, rho, pi, chi, iota.
Theta
- Calculate parity of all the columns (bits of lanes in each column).
- I don’t think parity really means odd/even parities here. It’s just xor.
- Calculate deltas: xor of the left column parity with the rotated right column parity.
- D[y]=C[y−1]⊕ROTL(C[y+1],1)
- ROTL is just rotating the bits left before xoring.
- Updating the state: for each n-th bit in the state, XOR it to the n-th bit of the Delta of the column that bit belongs to.
Rho
- Each lane is literally just a left rotate by a constant.
- Rotated per lane.
- The constant is based on col, row.
Pi
- This moves the lanes into different indices.
\begin{flalign*}
(x, y) \rightarrow (x', y') \\
x' = y\\
y' = (2x + 3y)\ \%\ 5
\end{flalign*}
Chi
- Each bit is xored with the ( xor of the bits in the next col and next-to-next col, but same row).
Iota
- Each bit is XORed with a constant.