Encrypted Matrix Inversion
This article details my winning solution to the Invertible Matrix challenge.
Please refer to the challenge description page for comprehensive details, including the objective, input/output format, evaluation metrics, and other requirements.
Introduction
Matrix inversion is a cornerstone of linear algebra, but performing it on encrypted data with homomorphic encryption (HE) presents unique challenges. The FHERMA Invertible Matrix Challenge tasked us with inverting a nonsingular 64×64 matrix efficiently, within approximately 30 minutes per matrix. The difficulty stems from the deep computational circuits required for matrix operations over ciphertexts, which significantly increase both runtime and resource demands.
The Core Idea: Iterative Matrix Inversion
Based on a review of recent literature, we adopted an iterative algorithm proposed in [1], which combines Goldschmidt’s and Newton’s methods. This hybrid approach is ideal for HE, as it maintains a low multiplicative depth—a critical factor given the substantial overhead introduced by each ciphertext multiplication.
Below is the algorithm in Python:
""" Compute the inverse of a matrix A using the Goldschmidt algorithm. """
A_inv = np.linalg.inv(A) # Ground truth for comparison
I = np.eye(A.shape[0])
norm_A = 24 # Empirically chosen constant
X = A.T / (norm_A ** 2)
R = I - A @ A.T / (norm_A ** 2)
for iteration in range(max_iterations):
X = X @ (I + R)
R = R @ R
# Check for convergence
residual_norm = np.abs(X - A_inv).max()
if residual_norm < tolerance:
print(f"Converged in {iteration + 1} iterations.")
return X
In this context, the value $\mathrm{norm}_A$ represents the trace norm of the matrix product $AA^\top$. However, to reduce computational overhead, we substitute this value with an empirically chosen constant (24). Experimental results indicate that the algorithm converges within 26 iterations with high probability. Analyzing the loop body reveals that each iteration incurs a multiplicative depth of only one, consistent with the claim in [1]. Nevertheless, due to the nature of matrix multiplication over encrypted data, additional multiplicative levels are required for slot masking and realignment within the ciphertexts of $X$ and $R$ after each iteration.
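As a sanity check of these empirical claims, the following cleartext sketch counts how many iterations the loop needs across random inputs. The input distribution (standard Gaussian entries) is an assumption for illustration; the actual challenge matrices may follow a different distribution or scale.

import numpy as np

def iterations_to_converge(A, norm_A=24, tolerance=1e-4, max_iterations=40):
    """Count Goldschmidt iterations until the residual drops below tolerance."""
    A_inv = np.linalg.inv(A)
    I = np.eye(A.shape[0])
    X = A.T / (norm_A ** 2)
    R = I - A @ A.T / (norm_A ** 2)
    for iteration in range(1, max_iterations + 1):
        X = X @ (I + R)
        R = R @ R
        if np.abs(X - A_inv).max() < tolerance:
            return iteration
    return None  # did not converge within the budget

counts = [iterations_to_converge(np.random.randn(64, 64)) for _ in range(100)]
print(max(c for c in counts if c is not None))  # typically at most 26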
Key Building Block: Ciphertext Matrix Multiplication with Tile Tensors
There is extensive research on ciphertext matrix-matrix multiplication. In this work, we adopt the tile tensor abstraction introduced in [2] and implemented in the IBM HElayers SDK [3], owing to its clarity and ease of implementation. For a detailed introduction to tile tensors and the set of operations supported on this data structure, please refer to [2].
Understanding Tile Tensors
Tile tensors split matrices into smaller “tiles” that fit into ciphertext slots. For a matrix $X$ of size $a \times b$, we encode it as $X[\frac{a}{t_1}, \frac{b}{t_2}, \frac{1}{t_3}]$, where $t_1, t_2, t_3$ are tiling dimensions. This structure supports operations like duplication, summation, and multiplication along specific axes.
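To make the encoding concrete, here is a cleartext NumPy sketch of how a matrix is split into tiles. The tile sizes are illustrative, and storing the size-1 logical dimension replicated across its tile axis is one possible choice; the actual packing into ciphertext slots is handled by the HE library.

import numpy as np

# Illustrative tile shape; real choices depend on the ciphertext slot count.
a, b = 64, 64
t1, t2, t3 = 8, 8, 4
X = np.random.randn(a, b)

# X[a/t1, b/t2, 1/t3]: a grid of (a/t1) x (b/t2) tiles of t1*t2*t3 slots each.
tiles = X.reshape(a // t1, t1, b // t2, t2).transpose(0, 2, 1, 3)
tiles = np.broadcast_to(tiles[..., None], tiles.shape + (t3,))
print(tiles.shape)  # (8, 8, 8, 8, 4): tile-grid indices, then in-tile axes

# Round trip: decoding recovers the original matrix.
decoded = tiles[..., 0].transpose(0, 2, 1, 3).reshape(a, b)
assert np.allclose(decoded, X)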
Performing Matrix Multiplication
Given two matrices $X[a,b]$ and $Y[b,c]$, to compute $Z=XY$ we encode $X$ and $Y$ so that the contracted dimension (the dimension over which the summation runs) aligns in their tensor shapes. One possible encoding is $X[\frac{a}{t_1},\frac{b}{t_2},\frac{1}{t_3}]$ and $Y[\frac{1}{t_1},\frac{b}{t_2},\frac{c}{t_3}]$. The multiplication proceeds as:
$X[\frac{a}{t_1},\frac{b}{t_2},\frac{*}{t_3}] = duplicate(X[\frac{a}{t_1},\frac{b}{t_2},\frac{1}{t_3}], 3)$,
$Y[\frac{*}{t_1},\frac{b}{t_2},\frac{c}{t_3}] = duplicate(Y[\frac{1}{t_1},\frac{b}{t_2},\frac{c}{t_3}], 1)$,
$Z[\frac{a}{t_1},\frac{1?}{t_2},\frac{c}{t_3}] = sum(X[\frac{a}{t_1},\frac{b}{t_2},\frac{*}{t_3}]*Y[\frac{*}{t_1},\frac{b}{t_2},\frac{c}{t_3}], 2)$,
where $duplicate(T,i)$ duplicates the tensor along the $i$-th dimension, and $sum(T,i)$ sums over the $i$-th dimension. The resulting tensor $Z[\frac{a}{t_1},\frac{1?}{t_2},\frac{c}{t_3}]$ contains unknown values in the second dimension, denoted by $(1?)$. To enable further computations, we clear unused slots using the $clear$ operator:
$clear(Z[\frac{a}{t_1},\frac{1?}{t_2},\frac{c}{t_3}]) = Z[\frac{a}{t_1},\frac{1}{t_2},\frac{c}{t_3}]$.
The $clear$ operation involves multiplication with a plaintext mask, incurring one multiplicative level.
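This pattern can be checked in cleartext with plain NumPy, treating each tile-tensor dimension as an array axis: $duplicate$ becomes a broadcast, $sum$ a reduction, and $clear$ a no-op (under encryption it is the plaintext-mask multiply just described). This is a shape-level sanity check, not HE code.

import numpy as np

a, b, c = 4, 3, 5
X, Y = np.random.randn(a, b), np.random.randn(b, c)

Xt = X.reshape(a, b, 1)  # X[a/t1, b/t2, 1/t3]
Yt = Y.reshape(1, b, c)  # Y[1/t1, b/t2, c/t3]

Xd = np.broadcast_to(Xt, (a, b, c))  # duplicate(X, 3)
Yd = np.broadcast_to(Yt, (a, b, c))  # duplicate(Y, 1)

# sum(..., 2): elementwise product, then reduce over the contracted axis.
Zt = (Xd * Yd).sum(axis=1, keepdims=True)  # Z[a/t1, 1?/t2, c/t3]

# clear(): a no-op here; under HE, a plaintext mask zeroing unused slots.
assert np.allclose(Zt.reshape(a, c), X @ Y)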
Matrix transposition, e.g., from $X[\frac{a}{t_1},\frac{b}{t_2},\frac{1}{t_3}]$ to $X^\top[\frac{1}{t_1},\frac{b}{t_2},\frac{a}{t_3}]$, is performed as follows:
$X[\frac{a}{t_1},\frac{b}{t_2},\frac{*}{t_3}] = duplicate(X[\frac{a}{t_1},\frac{b}{t_2},\frac{1}{t_3}], 3)$,
$I[\frac{a}{t_1},\frac{*}{t_2},\frac{a}{t_3}] = duplicate(I[\frac{a}{t_1},\frac{1}{t_2},\frac{a}{t_3}], 2)$,
$X^\top[\frac{1?}{t_1},\frac{b}{t_2},\frac{a}{t_3}] = sum(X[\frac{a}{t_1},\frac{b}{t_2},\frac{*}{t_3}]*I[\frac{a}{t_1},\frac{*}{t_2},\frac{a}{t_3}], 1)$,
$X^\top[\frac{1}{t_1},\frac{b}{t_2},\frac{a}{t_3}] = clear(X^\top[\frac{1?}{t_1},\frac{b}{t_2},\frac{a}{t_3}])$,
where $I$ is the identity matrix of size $a \times a$.
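The same cleartext emulation confirms the transposition identity (again, shapes only; the duplicate, sum, and clear costs are as described above):

import numpy as np

a, b = 4, 3
X = np.random.randn(a, b)

Xt = np.broadcast_to(X.reshape(a, b, 1), (a, b, a))          # duplicate(X, 3)
It = np.broadcast_to(np.eye(a).reshape(a, 1, a), (a, b, a))  # duplicate(I, 2)

# sum(..., 1) contracts the original row index of X against the identity.
XTt = (Xt * It).sum(axis=0, keepdims=True)  # X^T[1?/t1, b/t2, a/t3]
assert np.allclose(XTt.reshape(b, a), X.T)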
The Encrypted Algorithm
We now construct the complete encrypted matrix inversion algorithm. The input ciphertext $A$ is encoded as a tile tensor of shape $[\frac{1}{t_1},\frac{a}{t_2},\frac{b}{t_3}]$, with $a=b=64$ and $t_1=t_2=64$, $t_3=32$, filling $64 \cdot 64 \cdot 32 = 131{,}072$ slots. Following the cleartext algorithm, we first compute $R = I - AA^\top / \mathrm{norm}_A^2$. We transpose $A[\frac{1}{t_1},\frac{a}{t_2},\frac{b}{t_3}]$ to $A^\top[\frac{b}{t_1},\frac{a}{t_2},\frac{1}{t_3}]$, then calculate:
$A[\frac{*}{t_1},\frac{a}{t_2},\frac{b}{t_3}] = duplicate(A[\frac{1}{t_1},\frac{a}{t_2},\frac{b}{t_3}], 1)$,
$A^\top[\frac{b}{t_1},\frac{a}{t_2},\frac{*}{t_3}] = duplicate(A^\top[\frac{b}{t_1},\frac{a}{t_2},\frac{1}{t_3}], 3)$,
$AA^\top[\frac{b}{t_1},\frac{1?}{t_2},\frac{b}{t_3}] = sum(A[\frac{*}{t_1},\frac{a}{t_2},\frac{b}{t_3}]*A^\top[\frac{b}{t_1},\frac{a}{t_2},\frac{*}{t_3}], 2)$,
$R[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}] = I[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}] - clear(AA^\top[\frac{b}{t_1},\frac{1?}{t_2},\frac{b}{t_3}]) / \mathrm{norm}_A^2$
Since $AA^\top$ and $R$ are symmetric, we can select either dimension (e.g., 1 or 3) as the contracted dimension for multiplications.
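In cleartext, this initialization step can be emulated with the generic contraction pattern from the previous section. The axis assignment below differs from the exact encodings above, but the mechanics are identical; the random input is an assumption for illustration.

import numpy as np

a = b = 64
norm_A = 24
A = np.random.randn(a, b)  # stand-in for the (decrypted) challenge input
I = np.eye(a)

# A[a/t1, b/t2, */t3] * A^T[*/t1, b/t2, a/t3], contracted over dim 2.
Ad = np.broadcast_to(A.reshape(a, b, 1), (a, b, a))
ATd = np.broadcast_to(A.T.reshape(1, b, a), (a, b, a))
AAT = (Ad * ATd).sum(axis=1, keepdims=True)  # AA^T[a/t1, 1?/t2, a/t3]

# clear() plus plaintext scaling and subtraction from the identity.
R = I - AAT.reshape(a, a) / norm_A ** 2
assert np.allclose(R, I - A @ A.T / norm_A ** 2)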
With $X[\frac{1}{t_1},\frac{a}{t_2},\frac{b}{t_3}]$ and $R[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}]$, we enter the iteration loop. At each iteration, $X$ is updated as $X = X(I+R)$, which is straightforward due to compatible shapes. However, updating $R=R^2$ requires transposing one of $R$’s dimensions (e.g., from $[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}]$ to $[\frac{1}{t_1},\frac{b}{t_2},\frac{b}{t_3}]$) to enable multiplication. As mentioned earlier, transposition involves multiplying with an appropriately encoded identity matrix, resulting in two matrix products per iteration and doubling the multiplicative depth.
To address this, we use a simple optimization. Instead of maintaining one encoding of $R$, we store three: $R_1[\frac{1}{t_1},\frac{b}{t_2},\frac{b}{t_3}]$, $R_2[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}]$, and $R_3[\frac{b}{t_1},\frac{b}{t_2},\frac{1}{t_3}]$, all derived from $R$ before the loop. Within the loop, these encodings are used pairwise to compute the updates, e.g., $R_1[\frac{1?}{t_1},\frac{b}{t_2},\frac{b}{t_3}] = sum(R_2[\frac{b}{t_1},\frac{1}{t_2},\frac{b}{t_3}]*R_3[\frac{b}{t_1},\frac{b}{t_2},\frac{1}{t_3}], 1)$, and similarly for $R_2$ and $R_3$. This keeps the multiplicative depth per iteration at 2 (equivalent to a single matrix multiplication), at the cost of four matrix multiplications per iteration instead of two. This trade-off is justified, as the extra computation costs less than the savings from the reduced depth. The total depth is 26 (iterations) * 2 + 1 (scaling) + 4 (computing the $R_i$) + 4 (reshaping the result) = 61.
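A cleartext emulation of one $R$ update shows why the three encodings line up; $R$ here is a small symmetric stand-in for $I - AA^\top/\mathrm{norm}_A^2$.

import numpy as np

b = 8  # small illustrative size
M = np.random.randn(b, b)
R = (M + M.T) / (4 * b)  # symmetric stand-in for I - AA^T / norm_A**2

R2 = R.reshape(b, 1, b)  # R_2[b/t1, 1/t2, b/t3]
R3 = R.reshape(b, b, 1)  # R_3[b/t1, b/t2, 1/t3]

R2d = np.broadcast_to(R2, (b, b, b))  # duplicate(R_2, 2)
R3d = np.broadcast_to(R3, (b, b, b))  # duplicate(R_3, 3)

# One ciphertext-ciphertext multiply plus clear(): depth 2 per update.
R1_new = (R2d * R3d).sum(axis=0, keepdims=True)  # R_1[1?/t1, b/t2, b/t3]

# Because R is symmetric, this is exactly R @ R in the R_1 encoding.
assert np.allclose(R1_new.reshape(b, b), R @ R)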
References
[1] Ahn, T.M., Lee, K.H., Yoo, J.S., Yoon, J.W. (2024). Cheap and Fast Iterative Matrix Inverse in Encrypted Domain. In: Tsudik, G., Conti, M., Liang, K., Smaragdakis, G. (eds) Computer Security – ESORICS 2023. Lecture Notes in Computer Science, vol 14344. Springer, Cham.
[2] Aharoni, E., Adir, A., Baruch, M., Drucker, N., Ezov, G., Farkash, A., Greenberg, L., Masalha, R., Moshkowich, G., Murik, D., Shaul, H., Soceanu, O. (2023). HeLayers: A Tile Tensors Framework for Large Neural Networks on Encrypted Data. Privacy Enhancing Technologies Symposium (PETS).
[3] IBM HElayers SDK documentation: https://ibm.github.io/helayers/