Zero-Knowledge Proof - Arkworks Usage Guide

This post shows how to use the Arkworks ecosystem to implement zk-SNARK logic in a Rust application.

By Vitali Bestolkau · Explainers

Introduction

Arkworks may not be the most famous project in the zero-knowledge field, but it is definitely one of the most intriguing. This article provides an in-depth explanation of the versatile and robust Arkworks ecosystem. It covers its general usage and provides a guide to ZKP programming with Groth16, supported by code samples that make the implementation easier to understand.

0 Prerequisites

Zero-knowledge is quite a difficult subject by itself, and understanding how to program it can be even harder. That is why the reader is advised to meet the following criteria:

  1. Basic knowledge of crypto primitives such as Collision-Resistant Hashes (CRH), Merkle Trees, Elliptic Curves, and Pedersen Commitments.
  2. Basic understanding of how zk-SNARKs work, including (public) inputs and witnesses, R1CS, and proving and verifying key generation.
  3. Basic Rust coding skills.

Background reading on these topics (listed under Resources):

  1. Collision resistance
  2. Merkle Trees
  3. Elliptic Curves and Pedersen Commitments
  4. General purpose zk-SNARKs
  5. Rust Basics

1 What is Arkworks?

Arkworks is a Rust ecosystem for building zk-SNARKs into applications. Besides the Zero-Knowledge Proof (ZKP) implementation, Arkworks offers crates for algebra, crypto primitives, curves, and more, which makes it far more versatile than many other zero-knowledge tools. This flexibility comes at a cost: it adds a layer of complexity, because developers also need to know how the relevant cryptographic primitives and the underlying algebra work.

But why use the hard-to-understand Arkworks ecosystem for ZKP implementation when there are already tools that do it in a simpler way, such as Circom and bellman? One answer is adoption: according to Arkworks' latest presentation (15-04-2022), projects such as Mina Protocol, Polygon, zkSync, ZCash, and many others build on the Arkworks ecosystem. They don’t use Arkworks as-is; they take some of its code as a backbone and adapt it to their project needs. In addition, Arkworks is recognized by IT giants and is funded by Google and Ethereum (source).

So, the point of this article is to explain how to make use of Arkworks, since unfortunately it does not have good documentation yet. Be wary, however: at the time of writing, the Arkworks repositories are not production-ready.

2 Initial code

Throughout the article there will be small code samples, which are parts of our Proof of Concept.

You can check it for more context, but keep in mind that the code in the article is a cleaner, easier-to-understand version of the code in the repository, so the two are not identical.

If something is unclear at first, don't worry, that is normal. This article explains, step by step, all the techniques used in the initial code.

So, let's begin the deep dive into Arkworks.

3 General

Although Arkworks has many repositories for different purposes, some traits and patterns are very common. Knowing these patterns, you can already understand half of the Arkworks code, and the whole system becomes less confusing.

3.1 <type> and <type>Var

Almost every Arkworks repository at some point introduces certain types. Later in the code, the repository introduces the same types again, and the only visible difference between the two is the suffix “Var” added to the name of the earlier type. But what is the logic behind it?

The first kind is the ordinary type you use while programming in Rust. Let’s call this type <type>.

The second kind is the one that R1CS can accept. It holds the same value as <type>, but it can be operated on within R1CS. This type is usually called <type>Var. So, if the name of the <type> is Name, then the name of the <type>Var will be NameVar.

💡

Quick reminder: R1CS stands for “Rank-1 Constraint System”, a low-level representation of a computation as a list of constraints, which is what a zk-SNARK operates on. That’s why all inputs have to be converted into their R1CS representation before being used in the SNARK system. Learn more from Vitalik Buterin's article.
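
To make the reminder more concrete, here is a minimal sketch of a rank-1 constraint check in plain Rust. It does not use Arkworks: the tiny modulus 97 and all names (Constraint, dot, is_satisfied, multiplication_circuit) are assumptions made only for illustration. Each constraint has the form (a · z) * (b · z) = (c · z), where z is the assignment vector [1, x, y, out].

```rust
// Toy R1CS check over a small prime field; not Arkworks code.
// All names here (P, Constraint, dot, is_satisfied) are hypothetical.
const P: u64 = 97;

/// One rank-1 constraint: (a . z) * (b . z) == (c . z)  (mod P).
struct Constraint {
    a: Vec<u64>,
    b: Vec<u64>,
    c: Vec<u64>,
}

/// Dot product of a coefficient row with the assignment vector, mod P.
/// Assumes all entries are already smaller than P.
fn dot(row: &[u64], z: &[u64]) -> u64 {
    row.iter().zip(z).fold(0, |acc, (r, v)| (acc + r * v) % P)
}

/// Returns true if every constraint holds for the assignment `z`.
fn is_satisfied(constraints: &[Constraint], z: &[u64]) -> bool {
    constraints
        .iter()
        .all(|k| (dot(&k.a, z) * dot(&k.b, z)) % P == dot(&k.c, z))
}

/// Builds the single constraint x * y = out for z = [1, x, y, out].
fn multiplication_circuit() -> Vec<Constraint> {
    vec![Constraint {
        a: vec![0, 1, 0, 0], // selects x
        b: vec![0, 0, 1, 0], // selects y
        c: vec![0, 0, 0, 1], // selects out
    }]
}
```

With this circuit, the assignment [1, 3, 5, 15] satisfies the constraint and [1, 3, 5, 16] does not. A real SNARK works with a much larger field and many such constraints, but the shape of each constraint is the same.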

R1CS compatibility can be achieved using the “native” types specified in the r1cs-std repository. If you want to use your own struct or type in R1CS, that struct has to implement the AllocVar trait. To be able to generate values for the constraint system (R1CS), e.g., witnesses, inputs, and constants, you need to implement the method new_variable, which converts a value of type <type> into its field-based, R1CS-readable representation. This is what it usually looks like:

use ark_bls12_381::Fr;
use ark_r1cs_std::bits::uint64::UInt64;
use ark_r1cs_std::prelude::*;
use ark_relations::r1cs::{Namespace, SynthesisError};
use std::borrow::Borrow;

// Native type: a plain Rust value.
pub struct Number(pub u64);

// R1CS counterpart of Number.
pub struct NumberVar(pub UInt64<Fr>);

impl AllocVar<Number, Fr> for NumberVar {
    #[tracing::instrument(target = "r1cs", skip(cs, f, mode))]
    fn new_variable<T: Borrow<Number>>(
        cs: impl Into<Namespace<Fr>>,
        f: impl FnOnce() -> Result<T, SynthesisError>,
        mode: AllocationMode,
    ) -> Result<Self, SynthesisError> {
        // Allocate the inner u64 in the constraint system and wrap the result.
        UInt64::new_variable(cs.into(), || f().map(|u| u.borrow().0), mode).map(Self)
    }
}
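
The idea behind the pattern can also be mimicked without any Arkworks dependency. The following is only a sketch under our own assumptions: ToyCs, ToyMode, and the method names are hypothetical stand-ins, not the Arkworks API. It shows that the native type carries a value, while the “Var” type is merely a handle to a value that has been allocated in a constraint system under a certain mode.

```rust
// Toy model of the <type> / <type>Var pattern; hypothetical names, not Arkworks.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyMode {
    Constant,
    Input,
    Witness,
}

/// Records every allocated value together with how it was allocated.
#[derive(Default)]
struct ToyCs {
    variables: Vec<(ToyMode, u64)>,
}

/// Native type: an ordinary Rust value.
struct Number(u64);

/// "Var" type: only an index pointing into the constraint system.
struct NumberVar(usize);

impl NumberVar {
    /// Allocates `value` in `cs` under `mode` and returns a handle to it.
    fn new_variable(cs: &mut ToyCs, value: &Number, mode: ToyMode) -> NumberVar {
        cs.variables.push((mode, value.0));
        NumberVar(cs.variables.len() - 1)
    }

    /// Looks the value up again; it lives in the system, not in the handle.
    fn value(&self, cs: &ToyCs) -> u64 {
        cs.variables[self.0].1
    }
}
```

In this toy model, allocating Number(7) as an input and Number(9) as a witness leaves two entries in the constraint system, each tagged with its mode. That tag is what later distinguishes public inputs from the secret witness.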

💡

To better understand the new items, check the following links:

  1. SynthesisError → Error handling section
  2. Fr → a particular Finite Field implementation. More about fields in Arkworks here.
  3. UInt64 → an R1CS equivalent of u64.
  4. Namespace → we are not entirely sure about the details of this struct, but it appears to wrap a reference to the constraint system together with a label for the allocation. You need to use it as shown in the example when allocating a variable in R1CS.
  5. Tracing annotation → can be seen as a backtrace for R1CS, as mentioned here.

3.2 <trait> and <trait>Gadget

Sometimes you may encounter traits whose names differ only by the word “Gadget” at the end. This is very similar to the example with the types.

The trait <trait> is used for general computation, and <trait>Gadget is used for computation within R1CS. Moreover, in Arkworks, the structs that implement <trait> work with <type> values, while <trait>Gadget structs work with <type>Var values.

For example, the traits CRHScheme and TwoToOneCRHScheme use the types Input, Output, and Parameters. The traits CRHSchemeGadget and TwoToOneCRHSchemeGadget use the types InputVar, OutputVar, and ParametersVar. A more elaborate example is presented later in the article.

3.3 Error Handling

When writing functions, you need to know which error type is returned so that you can put it in the return type Result<T, E>. Luckily, right now almost all Arkworks error handling in constraint code comes down to a single error type: SynthesisError. It is an enum that also implements the Display trait, which makes it possible to print a more elaborate explanation of the error in the console during debugging. These messages are not as helpful as they could be, but the names of the SynthesisError variants are self-explanatory.
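
As a rough sketch of why a single error enum is convenient, the example below uses a hypothetical ToyError as a stand-in for SynthesisError. The two variants and their messages are our own assumptions for illustration and are not copied from Arkworks. Because every function returns the same error type, the ? operator can forward errors without any conversion.

```rust
use std::fmt;

// Hypothetical stand-in for an error enum such as SynthesisError.
#[derive(Debug, PartialEq)]
enum ToyError {
    AssignmentMissing,
    Unsatisfiable,
}

impl fmt::Display for ToyError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ToyError::AssignmentMissing => write!(f, "an assignment for a variable is missing"),
            ToyError::Unsatisfiable => write!(f, "the constraints cannot be satisfied"),
        }
    }
}

/// Returns the stored value or reports that it was never assigned.
fn read_assignment(value: Option<u64>) -> Result<u64, ToyError> {
    value.ok_or(ToyError::AssignmentMissing)
}

/// Checks a == b; `?` forwards the first error to the caller unchanged.
fn check_equal(a: Option<u64>, b: Option<u64>) -> Result<(), ToyError> {
    let a = read_assignment(a)?;
    let b = read_assignment(b)?;
    if a == b {
        Ok(())
    } else {
        Err(ToyError::Unsatisfiable)
    }
}
```

Calling check_equal(None, Some(4)) returns Err(ToyError::AssignmentMissing), and printing that error with {} shows the Display message instead of the variant name.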

3.4 Algebra

In certain cases it is better to use the Algebra crates, which provide Finite Fields, Elliptic Curves, and Pairings.

For example, in our Proof of Concept they are used to calculate with big numbers. Multiplying two big numbers with Rust's built-in integer types usually fails because the result overflows. The BigInt crate was not accurate enough for our use case, but Finite Fields worked perfectly.

Although these crates are very useful and easy to use, they have some limitations. One of them is that division does not behave like ordinary integer or decimal division. A finite field has no fractions: dividing a by b means multiplying a by the multiplicative inverse of b, so the result is always another field element and never a number with decimal digits.
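
The following sketch illustrates field arithmetic on a toy scale. It is plain Rust with an assumed modulus of 101 and hypothetical function names (mul, pow, inverse, div), so it shows the arithmetic idea only, not the Arkworks API. Multiplication wraps around the modulus instead of overflowing, and “division” is multiplication by the inverse, computed here with Fermat's little theorem.

```rust
// Toy prime field with modulus 101; not the Arkworks API.
const P: u64 = 101;

/// Multiplies two field elements, reducing the result modulo P.
fn mul(a: u64, b: u64) -> u64 {
    (a % P) * (b % P) % P
}

/// Raises `base` to `exp` by square-and-multiply.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Multiplicative inverse via Fermat's little theorem: a^(P-2) mod P.
/// Returns None for zero, which has no inverse.
fn inverse(a: u64) -> Option<u64> {
    if a % P == 0 {
        None
    } else {
        Some(pow(a, P - 2))
    }
}

/// "Division" in a field: a / b = a * b^(-1).
fn div(a: u64, b: u64) -> Option<u64> {
    inverse(b).map(|b_inv| mul(a, b_inv))
}
```

In this toy field, div(1, 2) is Some(51), because 2 * 51 = 102, which is 1 modulo 101. The result looks nothing like 0.5, yet multiplying it by 2 gives back 1 exactly, which is the property a circuit relies on.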

To learn more about the Arkworks Algebra crate and how to use it, check the README file in this repository.

4 Crypto Primitives

Crypto Primitives is an Arkworks repository that provides implementations of different cryptographic concepts, from Merkle Trees to Signatures. All these primitives have their own purpose and can be used according to your needs. In a SNARK implementation they generally serve as public inputs and witnesses, and they are manipulated in the constraint system to form the circuit logic, as shown later in the article.

As mentioned, the Crypto Primitives repository implements many different concepts, and it would be very difficult to cover them all in one article. So here we will focus only on the Collision-Resistant Hash (CRH) implementation.

4.1 Collision-Resistant Hash (CRH)

CRHs are mostly used when you want to be sure that a value maps to a hash that, in practice, no other value shares. The same value always results in the same hash, and it is computationally infeasible to find a different value that results in that hash. A password management tool is a prime use case.

In Arkworks, CRH has several traits that users can implement with their custom logic. At the same time, there are already Arkworks implementations of the CRH logic, such as Pedersen Commitments, SHA256, and others.

CRH has four traits that can be implemented:

  1. CRHScheme
  2. TwoToOneCRHScheme
  3. CRHSchemeGadget
  4. TwoToOneCRHSchemeGadget

We will go through all of them.

CRHScheme

The CRHScheme trait is used to create a common CRH, which is used in all the cases except one: generating parent nodes from their leaves in Merkle Trees. This is the responsibility of the TwoToOneCRHScheme trait.

The meaning behind its values and functions will be explained below. The TwoToOneCRHScheme explanation will follow the same structure.

  1. Firstly, CRHScheme has three internal types:

    • Input. Represents the input from which the hash is generated.
    • Output. Represents the output of the hash generation function.
    • Parameters. Parameters required for the hash generation.
  2. Also CRHScheme has two methods to implement:

a. Setup

fn setup<R: Rng>(r: &mut R) -> Result<Self::Parameters, Error>;

This method requires a random number generator. In their tests, Arkworks usually use their own RNG implementation, which is not secure, as they mention themselves. Instead, an Rng from the rand crate can be used like this:

let mut rng = rand::thread_rng();

As an output, the setup method returns a Result, which is then unwrapped to get the Parameters value. It looks like this:

let mut rng = rand::thread_rng();
let params = MyCRH::setup(&mut rng).unwrap();

💡

Note: from here on, the Result part will be omitted. Whenever a value of type Result<T, E> is returned, the text will say that a value of type T is returned. It should be assumed that the Result was handled appropriately.

b. Evaluate

fn evaluate<T: Borrow<Self::Input>>(
    parameters: &Self::Parameters,
    input: T,
) -> Result<Self::Output, Error>;

This function has two parameters: one of type Parameters and one of type Input. The first is the value received from the setup() method mentioned earlier. Input is usually an array of bytes. Most types already implement a method similar to to_bytes(), so you can use it to get the needed Input value.

💡

Note: in Arkworks, as well as in some other Rust crates, there are two byte orders: Big-Endian (bytes_be) and Little-Endian (bytes_le). (For detailed information about the difference, check the source.) Usually, Arkworks specifies which byte order a function expects, but sometimes the parameter is just called bytes. This most likely means that the byte order doesn’t matter, but we are not sure. In cases where the byte order is not specified, we would suggest using Big-Endian bytes.
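
The difference between the two byte orders is easy to see with the Rust standard library alone. The sketch below uses only std methods on u64; the helper names both_orders and decode_be_as_le are hypothetical and exist just for this illustration. It also shows what goes wrong when bytes written in one order are read back in the other.

```rust
/// Encodes a u64 in both byte orders: (big-endian, little-endian).
fn both_orders(value: u64) -> ([u8; 8], [u8; 8]) {
    (value.to_be_bytes(), value.to_le_bytes())
}

/// Decodes big-endian bytes as if they were little-endian, on purpose,
/// to show that mixing up the byte order silently changes the number.
fn decode_be_as_le(value: u64) -> u64 {
    u64::from_le_bytes(value.to_be_bytes())
}
```

For the value 1, the big-endian encoding ends with the byte 1 and the little-endian encoding starts with it. Reading the big-endian bytes as little-endian turns 1 into 2^56, so the producer and the consumer of the bytes have to agree on the order.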

The Output value received from the evaluate() function can be considered an actual CRH.

TwoToOneCRHScheme

As mentioned, the only purpose of this trait is to generate parent nodes from their children in Merkle Trees. The types and functions of this trait and their purposes are shown below.

  1. Like CRHScheme, TwoToOneCRHScheme has three internal types:

    • Input. Represents the input from which the hash is generated.
    • Output. Represents the output of the hash generation function.
    • Parameters. Parameters required for the hash generation.
  2. Similar to CRHScheme, TwoToOneCRHScheme has setup() and evaluate() functions, but it also has an additional compress() function.

a. Setup

The setup function acts precisely the same as in the CRHScheme trait.

b. Evaluate and Compress

Both functions calculate a parent node from two leaves/child nodes. This is what the trait definition looks like:

fn evaluate<T: Borrow<Self::Input>>(
    parameters: &Self::Parameters,
    left_input: T,
    right_input: T,
) -> Result<Self::Output, Error>;

fn compress<T: Borrow<Self::Output>>(
    parameters: &Self::Parameters,
    left_input: T,
    right_input: T,
) -> Result<Self::Output, Error>;

Both methods need three parameters: the familiar Parameters value, left_input, which represents the left child, and right_input, which represents the right child.

The only difference between the functions is the type of input. In evaluate(), both left_input and right_input are of the Input type. In compress(), they are of the Output type.

This may mean that evaluate() calculates a node from the bytes of the leaves, while compress() calculates a node hash from the given node hashes. That is possibly why Arkworks, in their Merkle Tree membership verification implementation, use evaluate() only for the first two leaves and then use only compress() until the Root is calculated.
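
The split between the two functions can be sketched with a toy four-leaf tree. This is not the Arkworks Merkle Tree: the hash is std's DefaultHasher, which is not a cryptographic CRH and is used only to keep the example dependency-free, the Parameters argument is left out, and applying evaluate to every pair on the leaf level is our own simplifying assumption.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-ins for Input and Output. DefaultHasher is NOT collision-resistant
// in the cryptographic sense; it only keeps this sketch std-only.
type Input = [u8];
type Output = u64;

/// Hashes two raw leaves into their parent (plays the role of `evaluate`).
fn evaluate(left: &Input, right: &Input) -> Output {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Hashes two existing digests into their parent (plays the role of `compress`).
fn compress(left: Output, right: Output) -> Output {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Root of a four-leaf tree: `evaluate` on the leaf level, `compress` above it.
fn root(leaves: [&Input; 4]) -> Output {
    let left = evaluate(leaves[0], leaves[1]);
    let right = evaluate(leaves[2], leaves[3]);
    compress(left, right)
}
```

The type signatures carry the point: evaluate consumes raw bytes, while compress consumes digests that were already produced, so only compress can be applied repeatedly on the way up to the root.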

CRHSchemeGadget and TwoToOneCRHSchemeGadget

These traits implement the same functions, which work the same as in the CRHScheme and TwoToOneCRHScheme, respectively. The only difference is that Gadget traits use <type>Var and are used for R1CS.

4.2 CRH Variants

In Arkworks, CRH is just an interface. However, Arkworks already has implementations of all the CRH traits based on Pedersen commitments, SHA256, Poseidon, and others. These CRH variants are ready for both general-purpose and R1CS-compatible usage.

When using a CRH variant, there is a lot to define: a specific curve, a window, etc. To make life easier, it is suggested to initialize the CRHs the way Arkworks does in their tests. For example, for Pedersen commitments it looks like this:

use ark_crypto_primitives::crh::pedersen;
use ark_ed_on_bls12_381::{constraints::EdwardsVar, EdwardsProjective as JubJub};

#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Window;

impl pedersen::Window for Window {
    const WINDOW_SIZE: usize = 128;
    const NUM_WINDOWS: usize = 8;
}

// Native CRH and its R1CS counterpart, following the <trait> / <trait>Gadget pattern.
pub type MyCRH = pedersen::CRH<JubJub, Window>;
pub type MyCRHGadget = pedersen::constraints::CRHGadget<JubJub, EdwardsVar, Window>;

5 ZKP Implementation

Now that all the “basics” are covered, it’s time to get serious. In this section we explain all the phases of the full SNARK implementation process, from building a circuit to verifying a proof. Every explanation is followed by the corresponding code to make the implementation even clearer.

5.1 ConstraintSynthesizer

Before any zk-SNARK computation, a constraint system is required together with a circuit. The constraint system holds the variables (e.g. witnesses, (public) inputs, constants), and the circuit defines the logic. All the checks are also handled in the circuit.

The ConstraintSynthesizer trait is responsible for both. It has a single function, generate_constraints(), in which all the variables are allocated and the circuit logic is defined. This means that in this function you should use <type>Var::new_variable() and similar functions, like new_input(), new_witness(), and new_constant(), and describe which calculations and checks have to be executed for the witness to be verified.

Usually, the ConstraintSynthesizer trait is implemented by a struct that stores all the values that will be used in the constraint system and the circuit. The implementation looks like this:

use ark_ff::PrimeField;
use ark_r1cs_std::prelude::*;
use ark_relations::r1cs::{ConstraintSynthesizer, ConstraintSystemRef, SynthesisError};

pub struct HashDataVar {
    data: HashData,
    params: DataParams,
    public_hash_commitment: Option<Commitment>,
}

impl ConstraintSynthesizer<ScalarField> for HashDataVar {
    #[tracing::instrument(target = "r1cs", skip(self, cs))]
    fn generate_constraints(
        self,
        cs: ConstraintSystemRef<ScalarField>,
    ) -> Result<(), SynthesisError> {
        // Convert the native values into field elements.
        let wallet_address_var_scalar =
            ScalarField::from_be_bytes_mod_order(&self.data.wallet_address.to_be_bytes());
        let first_half_var_scalar =
            ScalarField::from_be_bytes_mod_order(&self.data.first_pass_half.to_be_bytes());
        let second_half_var_scalar =
            ScalarField::from_be_bytes_mod_order(&self.data.second_pass_half.to_be_bytes());

        // Combine them into a single scalar and turn it into a commitment.
        let final_scalar =
            wallet_address_var_scalar * first_half_var_scalar - second_half_var_scalar;
        let final_commitment = HashDataVar::scalar_to_commitment(final_scalar, &self.params);

        // Public input: the commitment that the verifier already knows.
        let pub_hash_commitment_var = CommitmentVar::new_input(
            ark_relations::ns!(cs, "The commitment of the public hash"),
            || Ok(self.public_hash_commitment.unwrap()),
        )?;

        // Witness: the commitment computed from the secret data.
        let final_commitment_var = CommitmentVar::new_witness(
            ark_relations::ns!(cs, "The commitment of wallet address, first half of the password and the second half of the password"),
            || Ok(final_commitment),
        )?;

        // Constrain both commitments to be equal.
        pub_hash_commitment_var.enforce_equal(&final_commitment_var)?;

        Ok(())
    }
}

💡

Explanation of new structs

To begin with, all the new structs except ScalarField are custom structs that are not part of the Arkworks ecosystem. So, the explanation:

  1. HashData → the struct that holds additional data: wallet_address, first_pass_half and second_pass_half. They are used at the beginning of the generate_constraints() function.
  2. Commitment → the Output type from the Pedersen Commitment implementation of the CRHScheme trait.
  3. CommitmentVar → the OutputVar type from the Pedersen Commitment implementation of the CRHSchemeGadget trait.
  4. DataParams → the Parameters type from the Pedersen Commitment implementation of the CRHScheme trait.
  5. ScalarField → the Fr type mentioned in section 3.1, an Arkworks field type used here to compute with big numbers.

Explanation of new functions

There are two new methods. Although they are self-explanatory, here is a brief description:

  1. HashDataVar::scalar_to_commitment(ScalarField, &DataParams) → converts a ScalarField value into a Commitment. The DataParams value ensures that all Commitments generated in this constraint system use the same parameters, so that if scalar_field_1 == scalar_field_2, the generated commitments are also equal. With different parameters this guarantee is lost: if scalar_field_1 == scalar_field_2 && params_1 != params_2, then commitment_1 != commitment_2.
  2. enforce_equal() → adds a constraint that the two values are equal. If they are not, the constraint system is not satisfied and no valid proof can be produced.
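
The role of the parameters can be illustrated with a toy Pedersen-style commitment. The sketch below works in the multiplicative group modulo 23, which is far too small to be secure. The names ToyParams, pow_mod, and commit, as well as the explicit randomness argument, are assumptions made for this illustration and are unrelated to the Arkworks types.

```rust
// Toy Pedersen-style commitment in the multiplicative group modulo 23.
// Far too small to be secure; all names are hypothetical.
const P: u64 = 23;

/// The "parameters": two group elements that every party must share.
#[derive(Clone, Copy)]
struct ToyParams {
    g: u64,
    h: u64,
}

/// Computes base^exp mod P by repeated multiplication.
fn pow_mod(base: u64, exp: u64) -> u64 {
    (0..exp).fold(1, |acc, _| acc * base % P)
}

/// commit(m, r) = g^m * h^r mod P.
fn commit(params: ToyParams, message: u64, randomness: u64) -> u64 {
    pow_mod(params.g, message) * pow_mod(params.h, randomness) % P
}
```

With the parameters (g, h) = (4, 9), committing to message 3 with randomness 5 gives 6. The same message and randomness under (g, h) = (2, 3) give 12. An equality check between two commitments is therefore only meaningful when both were produced with the same parameters.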

If you want to check the whole code and understand what it is trying to do, check this repository. The Arkworks-related code is in this file only.

5.2 Groth16

Finally, the Zero-Knowledge Proof itself. Luckily, the Groth16 implementation is quite easy to use. All the actual SNARK logic is done under the hood. Mainly, we just need to make sure that we provide the functions with proper parameters.

It is worth mentioning that there is also a Marlin implementation, which uses a universal setup. We didn’t use it because a universal setup doesn’t change anything if you have only one circuit, but it is worth checking if you want an extensive application with numerous circuits.

The Arkworks Groth16 implementation has three phases:

  1. Key generation phase
  2. Proof generation phase
  3. Proof verification phase.

We will go through each of them and show the proper usage techniques.

Key Generation

The key generation phase generates a proving key pk and a verifying key vk, both of which are tied to the provided circuit:

let circuit_defining_cs = HashDataVar::new(HashDataVar::default());

let mut rng = rand::thread_rng();
let (pk, vk) =
    Groth16::<Bls12_381>::circuit_specific_setup(circuit_defining_cs, &mut rng)?;

The function HashDataVar::default() provides hardcoded dummy data, because in this phase the actual values are irrelevant. The pk and vk are bound only to the structure of the circuit, and the values are not considered during this binding. The variables (witnesses, inputs, etc.) must be of the same types and there must be the same number of them as in the actual circuit, but they need not have the same values. The only requirement for the values is that they are correct, meaning that they pass the checks in the circuit. Otherwise, an Error will be thrown from the circuit itself.
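
To illustrate what “bound only to the structure” means, here is a toy model in plain Rust. ToyCs, Shape, and synthesize are hypothetical names, and this is not how Groth16 represents a circuit internally. The sketch only shows that running the same circuit definition with dummy values and with real values yields the same numbers of inputs, witnesses, and constraints.

```rust
// Toy constraint system that keeps structure and values apart.
// Hypothetical names; this mimics the idea only, not the Arkworks API.
#[derive(Default)]
struct ToyCs {
    inputs: Vec<u64>,
    witnesses: Vec<u64>,
    // Each entry (i, j) demands inputs[i] == witnesses[j].
    equalities: Vec<(usize, usize)>,
}

/// The "shape" a setup phase would depend on: counts only, no values.
#[derive(Debug, PartialEq)]
struct Shape {
    num_inputs: usize,
    num_witnesses: usize,
    num_constraints: usize,
}

impl ToyCs {
    fn shape(&self) -> Shape {
        Shape {
            num_inputs: self.inputs.len(),
            num_witnesses: self.witnesses.len(),
            num_constraints: self.equalities.len(),
        }
    }

    /// Checks every recorded equality against the stored values.
    fn is_satisfied(&self) -> bool {
        self.equalities
            .iter()
            .all(|&(i, j)| self.inputs[i] == self.witnesses[j])
    }
}

/// One circuit definition, usable with dummy data as well as real data.
fn synthesize(public_value: u64, secret_value: u64) -> ToyCs {
    let mut cs = ToyCs::default();
    cs.inputs.push(public_value);
    cs.witnesses.push(secret_value);
    cs.equalities.push((0, 0));
    cs
}
```

Here synthesize(0, 0) and synthesize(42, 42) have the same shape even though their values differ, which is why dummy data is enough for a setup step that depends on the shape alone.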

Proof Generation

After creating the keys, we generate a proof of the fact that the witness is known:

let proof = Groth16::prove(&pk, data, &mut rng)?;

Here data is the circuit that should be verified. To be more specific, data is a value of the type HashDataVar that contains the actual data, not the dummy data used in the Key Generation case.

Arkworks also checks whether the provided data is correct while generating the proof, meaning that the data passes all the checks in the circuit. If the data is incorrect, Arkworks will throw a SynthesisError in this step and will not move to the Proof Verification phase.

Proof Verification

When the proof is generated, all that is left is to verify it. Before that, the public inputs have to be defined, because the proof itself contains neither the public inputs nor the witness, so the verifier must be given the public inputs separately.

let public_input = [
    data.public_hash_commitment.unwrap().x,
    data.public_hash_commitment.unwrap().y,
];

let valid_proof = Groth16::verify(&vk, &public_input, &proof)?;

In our case, the public input is a Pedersen Commitment. Arkworks doesn’t accept the plain Commitment value as public input, but it accepts the coordinates of the Pedersen Commitment (a Pedersen Commitment can be represented as a point on an Elliptic Curve, which is why we can get the coordinates x and y for it).

During the verification, Arkworks also checks whether the vk is paired with this proof. If not, a SynthesisError is thrown. An Error is also thrown if the provided public inputs are incorrect. If all the checks pass, the verify function returns true.

All Together

This is how everything looks together, with some comments to make certain parts clear:

use ark_bls12_381::Bls12_381;
use ark_groth16::Groth16;
use ark_snark::{CircuitSpecificSetupSNARK, SNARK};

pub fn prove_with_zkp(data: HashDataVar) -> Result<bool, SynthesisError> {
    // The public input must be present before anything can be proven.
    if data.public_hash_commitment.is_none() {
        return Err(SynthesisError::AssignmentMissing);
    }

    // Use a dummy circuit only to describe the structure of the circuit.
    // It tells the SNARK the setup of the circuit that we are going to verify.
    // From it the SNARK generates a proving key (pk) and a verifying key (vk) and
    // "connects" them (meaning that only this vk can verify proofs made with this pk,
    // and proofs made with this pk can only be verified by this vk).
    let circuit_defining_cs = HashDataVar::new(HashDataVar::default());

    let mut rng = rand::thread_rng();
    let (pk, vk) =
        Groth16::<Bls12_381>::circuit_specific_setup(circuit_defining_cs, &mut rng)?;

    // Read the public input before `data` is moved into `prove`.
    let public_input = [
        data.public_hash_commitment.unwrap().x,
        data.public_hash_commitment.unwrap().y,
    ];

    let proof = Groth16::prove(&pk, data, &mut rng)?;
    let valid_proof = Groth16::verify(&vk, &public_input, &proof)?;

    Ok(valid_proof)
}

The only new thing here is the check at the beginning. It verifies that the provided data already has the commitment that serves as public input. If not, SynthesisError::AssignmentMissing is returned.

6 Final Tips

Here are some more tips you might find helpful while working with Arkworks and ZKP.

  1. Check Arkworks’ rollup tutorial for more examples of ZKP usage with Arkworks. It does not explain the types, traits, and their implementations in much detail, but with the information from this article it will be easier to understand how everything works. However, be careful when using code from the tutorial, because it apparently needs to be updated to match the latest versions of the dependencies.
  2. For SNARK verification it is best to have multiple witnesses, two public inputs (representing the initial and the final state of the same value), and a complex verification process in the circuit. This is also how the SNARK part works in the mentioned rollup tutorial.
    1. Multiple witnesses make brute-forcing them more difficult.
    2. The two public inputs are the same value in different states: initial and final. This is a common usage for SNARKs, as with a Merkle Tree, where the public inputs are the initial Root of the tree and the final Root. Of course, there is always room for creativity. Still, for beginners, we would suggest this workflow with the initial and final value states as public inputs, and a witness that alters the initial value to get the final one.
    3. Complex circuit logic is also advised to decrease the likelihood of brute-forcing.
  3. To better understand how to implement any Arkworks “elements” (traits, types, implementations, etc.), always check the corresponding tests. The tests usually show the basic implementation of the needed code, so when you start using some of these elements, try the code from the tests first.

Conclusion

This article provides a detailed insight into both Arkworks’ general usage and the usage of their ZKP implementation. After clarifying the crucial difference between <type> and <type>Var and between <trait> and <trait>Gadget, the article explains the CRH structure and how to work with CRH traits. In the end, the blog shows the whole ZKP workflow: from the constraint system and circuit generation with the ConstraintSynthesizer to the Proof verification. It summarizes everything with helpful tips for Arkworks development.

Resources

Prerequisites

  1. Collision resistance
  2. Merkle Trees
  3. Elliptic Curves and Pedersen Commitments
  4. General purpose zk-SNARKs
  5. Rust Basics

Why Arkworks

  1. Bellman
  2. Circom
  3. Arkworks presentation

Arkworks repositories mentioned

  1. Arkworks
  2. R1CS-Tutorials
  3. Algebra
  4. Groth16
  5. Crypto Primitives
  6. R1CS-std
  7. SNARK
  8. Ed on BLS12 381
  9. Relations
  10. Marlin

Rank 1 Constraint System (R1CS)

  1. Vitalik Buterin's article

Byte types

  1. Big-endian vs Little-endian

Code resource

  1. Proof of Concept: Zero-Knowledge Password Manager