FHE Benchmarking: ML Inference Workload

Results - MNIST

Specification

The ML inference workload implements encrypted inference functionality. The input is a collection of 28x28 images from the MNIST dataset representing handwritten digits. The goal is to classify each image as a digit between 0 and 9 by running ML inference under homomorphic encryption.
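To make the functionality concrete, here is a minimal plaintext sketch of what the encrypted pipeline must reproduce: mapping one 28x28 MNIST image to a digit in 0-9. The linear model and its random weights below are placeholders for illustration; actual submissions choose their own model and evaluate it homomorphically.

```python
import numpy as np

# PLACEHOLDER model: random weights, not a trained classifier.
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 784))  # one row of weights per digit class
b = rng.standard_normal(10)         # per-class bias

def classify(image):
    """Return the predicted digit (0-9) for one 28x28 grayscale image."""
    x = np.asarray(image, dtype=np.float64).reshape(784)  # flatten the image
    return int(np.argmax(W @ x + b))                      # highest-scoring class

digit = classify(np.zeros((28, 28)))
```

A submission performs the same computation, except that the image (and possibly the model) is encrypted, and only the final classification is revealed after decryption.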

The workload includes two interfaces for benchmark submitters to implement: single inference, which classifies one encrypted image, and batch inference, which classifies a batch of N encrypted images at once. The batch size N depends on the instance size:

Size      Records (N)
Small     100
Medium    1,000
Large     10,000

Hence, there are a total of four variants of this workload: single inference, plus batch inference at each of the three sizes. Submitters need not implement all four; each submitter can implement and report the results of any subset.

Submissions to the benchmarking suite must set the implementation parameters so as to achieve a security level of at least 128 bits (against a semi-honest server). Submitters must document their choice of parameters and explain why they believe it meets the 128-bit security mandate. (For example, for LWE-based schemes without a sparse key, they can rely on Table 5.2 or Table 5.3 in the HE security guidelines document of Bossuat et al. [BCC+24].)
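One way to document such a parameter choice is to record it alongside a sanity check against a security table. In the sketch below, the scheme name, ring dimension, modulus size, and table entries are all PLACEHOLDERS for illustration only; a real submission must take its bounds from Table 5.2 or 5.3 of [BCC+24] for its scheme and key distribution.

```python
# PLACEHOLDER parameter documentation for a hypothetical submission.
PARAMS = {
    "scheme": "CKKS",      # assumed scheme, for illustration only
    "ring_dim": 32768,     # polynomial ring dimension N
    "total_logq": 800,     # total bits in the ciphertext modulus chain
}

# PLACEHOLDER map: ring dimension -> maximum log2(Q) for >= 128-bit security.
# These are NOT real bounds; consult [BCC+24] Table 5.2 / 5.3 instead.
MAX_LOGQ_128 = {16384: 400, 32768: 850}

def meets_128_bit_security(params):
    """True if the total modulus bits stay within the table's bound
    for the chosen ring dimension."""
    bound = MAX_LOGQ_128.get(params["ring_dim"])
    return bound is not None and params["total_logq"] <= bound
```

The point of the check is that security degrades as the total modulus grows for a fixed ring dimension, so a submission's write-up should show its (N, log Q) pair sitting inside the cited table's 128-bit region.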

Submissions are also required to meet a quality bar: the correct inference result for single inference, and at least 90% accuracy for batch inference.

The ml-inference harness contains a script that runs a submitter's implementation. The script accepts command-line arguments specifying which interface to run and at what instance size:


$ python3 harness/run_submission.py -h
usage: run_submission.py [-h] [--num_runs NUM_RUNS] [--seed SEED] [--count_only] [--remote]
                         {0,1,2,3}

Run the ml-inference FHE benchmark.

positional arguments:
  {0,1,2,3}            Instance size (0-toy/1-small/2-medium/3-large)

options:
  -h, --help           show this help message and exit
  --num_runs NUM_RUNS  Number of times to run steps 4-9 (default: 1)
  --seed SEED          Random seed for dataset and query generation
  --count_only         Only count # of matches, do not return payloads
  --remote             Run example submission in remote backend mode
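The parser sketched below mirrors the help output above, which can be useful for scripting the harness from Python; it is a reconstruction for illustration, while the real parser lives in harness/run_submission.py.

```python
import argparse

def build_parser():
    """Reconstruct a parser matching run_submission.py's documented interface."""
    p = argparse.ArgumentParser(description="Run the ml-inference FHE benchmark.")
    p.add_argument("size", type=int, choices=[0, 1, 2, 3],
                   help="Instance size (0-toy/1-small/2-medium/3-large)")
    p.add_argument("--num_runs", type=int, default=1,
                   help="Number of times to run steps 4-9 (default: 1)")
    p.add_argument("--seed", type=int,
                   help="Random seed for dataset and query generation")
    p.add_argument("--count_only", action="store_true",
                   help="Only count # of matches, do not return payloads")
    p.add_argument("--remote", action="store_true",
                   help="Run example submission in remote backend mode")
    return p

# Example: run the medium instance three times with a fixed seed.
args = build_parser().parse_args(["2", "--num_runs", "3", "--seed", "42"])
```

For instance, the parsed example corresponds to invoking `python3 harness/run_submission.py 2 --num_runs 3 --seed 42` on the command line.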

Note that in the future, this workload will be updated to support more models and more datasets.

You can find more details on the ml-inference GitHub repository.

Bibliography

[BCC+24] Jean-Philippe Bossuat, Rosario Cammarota, Ilaria Chillotti, Benjamin R. Curtis, Wei Dai, Huijing Gong, Erin Hales, Duhyeong Kim, Bryan Kumara, Changmin Lee, Xianhui Lu, Carsten Maple, Alberto Pedrouzo-Ulloa, Rachel Player, Yuriy Polyakov, Luis Antonio Ruiz Lopez, Yongsoo Song, and Donggeon Yhee. Security guidelines for implementing homomorphic encryption. IACR Communications in Cryptology, 1(4):26, 2024.