Tofn (t-of-n): a threshold cryptography library in Rust

Tofn provides the following:

Tofn (t-of-n): a threshold cryptography library in Rust

Tofn provides the following:

  • An implementation of the GG20 threshold-ECDSA protocol.
  • A general-purpose SDK (software development kit) to facilitate the development and use of threshold cryptography protocols such as GG20.

Setup

  • Get the latest version of Rust stable (currently 1.53.0).
  • Clone this repo.
  • You might need to install the GMP library. On MacOS:

On Ubuntu:

Demo and tests

Tofn integration tests (in tests/integration) serve to demo use of the library. (These demos would be in the examples directory but we want them to run automatically as part of the normal test sequence.)

Demo test hieararchy:

Multi-threaded tests

Tests in multi_thread are a more accurate reflection of typical use than those in single_thread.

Threshold cryptography protocols are multi-party computation protocols: parties exchange messages with one another and perform local computations in order to reach consensus on an output of the protocol.

The multi_thread tests aim to simulate such an environment by spawning an independent thread for each party. These tests simulate network communication by using concurrency primitives from Rust's standard library (sync::mpsc) to pass messages between threads.

Run all multi-threaded integration tests:

Single-threaded tests

Multi-threaded code is inherently more difficult to write, test, and debug. For development purposes it is useful to have a single-threaded reference implementation so as to eliminate concurrency as a source of bugs. Most examples of tofn functionality occur in single_thread.

Some tests illustrate the fault protection and identification properties of the GG20 protocol. These tests require one or more parties to act maliciously during protocol execution. Malicious behaviour is enabled in tofn via the malicious crate feature---see Malicious crate feature below.

Run all single-threaded integration tests with only honest parties:

Run all single-threaded integration tests, including those with malicious parties:

Tests using malicious display extensive log messages to the terminal. For example:

Two types of tofn user

The tofn SDK supports two types of users:

  1. Library users
  2. Protocol implementers

Library users

A typical library user should need code from only the following tofn modules:

  • tofn::sdk::api and tofn::collections for generic tofn SDK.
  • tofn::gg20 for protocol-specific code for the GG20 protocol.

See Demo and tests for working code to illustrate use of tofn.

See the Tofnd crate for usage of tofn in production code.

The core of the API is a generic Protocol type:

where

  • F is the type of the final output of the protocol. (Examples: keygen produces a secret key share, sign produces a signature.)
  • K, P are marker types for typed collection indices. See Typed collection indices below.

Specific protocol implementations provide constructors that create a new concrete Protocol instance. Examples:

  • tofn::gg20::keygen::new_keygen returns a new keygen protocol
  • tofn::gg20::sign::new_sign returns a new sign protocol

Each party in the protocol has its own Protocol instance. A Protocol can be either Done or NotDone. The Done variant has ProtocolOutput data defined like so:

where the ProtocolOutput variants are:

  • Ok(output): the protocol completed in happy path, producing output of type F (eg. secret key share or signature).
  • Err(faulters): the protocol completed in sad path. An output of type F could not be produced because one or more parties was malicious. faulters is a list of malicious parties detected by this Protocol instance during execution. ProtocolFaulters is a custom collection type describing the faulty parties.

The NotDone variant has Round data with several core methods used to exchange messages and progress the protocol to the next round:

  • bcast_out, p2ps_out: outgoing messages to be sent over the network to other parties.
  • msg_in: incoming messages received over the network from other parties.
  • expecting_more_msgs_this_round: have we received all the incoming messages we expect for this round? Library users use this method to determine whether it's safe to progress to the next round.
  • execute_next_round: proceed to the next round of the protocol with whatever messages we have received so far. Consumes self and returns a new Protocol instance for the next round.
    • If a message from party A is missing then A is flagged as a faulter. This is how tofn facilitates timeout faults.

Protocol implementers

A typical protocol implementer would use code from the following tofn modules:

  • tofn::sdk::api
  • tofn::sdk::implementer_api
  • tofn::collections

See the tofn::gg20 module for an example of a protocol built using the tofn SDK.

The intent of the implementer API is to allow the protocol implementer to concentrate only on the stateless math functions for each round that map

The implementer does not need to worry about generic work such as collecting incoming messages, deserializing data, identifying generic faults (timeout, message corruption), etc.

Concretely, protocol implementers must supply:

  • A constructor for a party in the protocol. Examples: gg20::keygen::new_keygen, gg20::sign::new_sign.
  • For each round of the protocol: a struct implementing the Executer trait from one of the modules no_messages, bcast_only, p2p_only, bcast_and_p2p according to which types of messages are expected in this round.

All messages delivered to all parties

Tofn requires that all messages be delivered to all parties. Specifically:

  • p2p: Any p2p message from A to B should also be delivered to all other parties C.
  • self-delivery: A party A treats a missing message from any party P the same way, even if P=A: party P is declared as a faulter.

Support for multiple shares per party

Tofn protocols may allow one party to have multiple shares in the protocol. For example, keygen could be invoked with 5 parties having share counts 2,3,6,2,1 for a total of 14 shares.

Protocol implementers need not concern themselves with the distinction between parties and shares. Indeed, the tofn SDK does not even expose information about parties in the Executer trait implemented by each protocol round.

Each Protocol instance corresponds to a share, not a party. Rounds that produce outgoing p2p messages must produce one message for each other share.

Incoming messages indicate only the party from whom the message is sent and not the individual share controlled by that party. The tofn SDK automatically bundles metadata into each message so that incoming messages can be routed to the appropriate share.

Protocol implementers identify faulty shares but the tofn API attributes faults only to a party, not a share. The tofn SDK automatically translates share faults provided by the protocol implementer into party faults consumed by tofn users.

Avoid panic: TofnResult is for fatal errors only

Tofn strives to avoid panics. (Currently most but not all potential panic points have been eliminated from tofn.)

Instead, tofn has a TofnResult type reserved only for fatal errors:

A library user who encounters a TofnFatal should gracefully terminate the protocol according to the context of the specific application. Any Protocol method that returns TofnFatal should be viewed as a confession of fault by that party.

Protocol implementers should not return TofnFatal when malicious behaviour is detected. Instead, move the protocol to Done state and return a faulters list Ok(Done(Err(faulters))) that is processed by the tofn SDK---see gg20 protocol implementation for details.

Malicious crate feature

Enabling the malicious crate feature allows the user to specify that a given party should behave maliciously at a certain point during the protocol.

Example

With malicious enabled the new_keygen function takes an additional argument of type Behaviour.

The following code instructs the party to corrupt her commitment to the elliptic curve point y_i computed during round 1 of the GG20 keygen protocol:

Tofn collection types

The module tofn::collections provides several custom collection types such as VecMap, FillVecMap, HoleVecMap, etc. These collection types are especially useful for threshold cryptography. They build on the Vec collection type from Rust's standard library.

TODO: more to come.

Typed collection indices

The VecMap collection type is a wrapper for Vec from Rust's standard library except that items in the collection are not indexed by usize. Instead items are indexed by a wrapper type TypedUsize defined as follows:

Users create their own marker type like so:

and then specify the index type of the collection:

Several other crates exist for this purpose but none of them has the unique combination of features we desire in tofn. See the typed_index_collections crate for a list of similar cates.

The purpose of TypedUsize is to avoid accidental misuse of indices as a source of bugs.

Example

Threshold signature schemes consist of two protocols: keygen and sign. A group of 6 parties will participate in keygen in order to produce secret shares of an ECDSA public key PK. Let us label these parties 0,1,2,3,4,5.

Later a subset 1,3,5 of those parties participates in sign in order to sign a message under PK. These sign participants are each assigned a new label for the duration of the sign protocol: 0,1,2. So each sign party label has an associated keygen party label: 0->1, 1->3, 2->5.

Some collection types used in the sign implementation index only over sign parties, whereas others index over all keygen parties. If both of these index types are usize then it is easy to confuse them, creating bugs. TypedUsize allows us to leverage Rust's type system to eliminate index confusion at compile time.

To complicate matters further, tofn supports multiple shares per party. Each keygen protocol specifies both the number of parties in the protocol and the number of shares allocated to each party.

For example, keygen could be invoked with 5 parties having share counts 2,3,6,2,1 for a total of 14 shares. We thus have four distinct index types: keygen parties, keygen shares, sign parties, and sign shares.

Security notes

  • In our security model, we don't guarantee security if the attacker has access to the device.

Message authenticity and integrity

Protocol messages from other parties are delivered via the msg_in API call:

where

  • from is an identifier for the party A from whom the message is received. (Party identifiers in tofn are implemented as a usize.)
  • bytes is the serialized payload of A's message.

It is assumed that these messages have both authenticity and integrity: if party B receives a call to msg_in containing a message bytes from party A then that message really did come from A, it really was intended for B, and the bytes are exactly what A intended.

As such, if bytes is malformed or malicious then B will accuse A of faulty behaviour.

Message ordering

  • We assume that an honest party's Round x message is sent before Round x + i.
  • We also assume that no party receives a Round x + i message from any other party before their Round x message.

License

All crates licensed under either of

  • Apache License, Version 2.0
  • MIT license

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Issues

Collection of the latest Issues

leontiad

leontiad

0

Would be more elegant in terms of testing to compress the assignment of total shares per party to an agreed convention instead of expanding the entire vector with indices. The conversion for testing purposes can be 1 share per physical entity, e.g: to instantiate a local network with 20 shares and 1 share per participant with threshold equals 12 - meaning 13 shares are needed - code looks likes that:

https://github.com/axelarnetwork/tofn/blob/main/tests/integration/single_thread/mod.rs#L42:

that could be compressed in pseudocode in sth like that:

that is sth minor

naughtyfox

naughtyfox

1

Hey guys! We'd like to use tofn library, but it has no releases or version tags. How about adding ones? Thank you for awesome product!

milapsheth

milapsheth

2

To prevent further message replayability of GG20 protocol messages, we should add the session nonce as another layer of domain separation to the challenge computation of all the zk proofs. This could be done after switching to the new NIZK trait from #163 to avoid passing session nonce to every zk proof as an argument.

ggutoski

ggutoski

3

Currently we take an ad hoc approach to domain separation. Sometimes we take a single u8 tag. Sometimes we layer multiple u8 tags. None of these approaches avoids the problem that you need a central repository of all tags in order to avoid tag re-use.

Perhaps a better solution is to use hard-coded randomly-selected 32-byte blobs for tags.

Originally posted by @ggutoski in https://github.com/axelarnetwork/tofn/pull/182#discussion_r728471667

ggutoski

ggutoski

low priority
3

Serde does not automatically validate deserialized structs. This is annoying because a struct instance that is otherwise guaranteed to be valid by construction can be made invalid via serde round trip. Examples in tofn include GroupPublicInfo and SharePublicInfo.

I spent a long time digging for ergonomic ways to validate serde-deserialized data. :disappointed: Seems like the best solution is to use deserialize_with as described here https://github.com/serde-rs/serde/pull/1858#pullrequestreview-575089757. Documentation on deserialize_with is sparse---this is all I could find: Field attributes · Serde

If we do go this route then presumably the clean way to do it is for each struct would have a validate() method that is called in two places:

  1. at initial construction (eg. in new())
  2. in deserialize_with

That's a pretty generic pattern---might even be worth a Validator trait or something. Note that this crate exists https://github.com/Keats/validator but at first glance it doesn't seem to meet our needs.

Not sure yet whether we have a strong need for this fix but I can envision a time when we do.

ggutoski

ggutoski

2

The byte length of GG20 keygen round 2 bcast message is proportional to the threshold of the key being generated. That's bad---we want all message byte lengths to be constant.

https://github.com/axelarnetwork/tofn/blob/07dc66353ab175790a33d5903074ca4daae71104/src/gg20/keygen/r2.rs#L22-L25

vss::Commit is a commitment to a verifiable secret share. It's size is proportional to the threshold of the key being generated. In the past we broke up large bcast messages by splitting them into constant-size p2ps, but that works cleanly only when the message size scales with the number of shares. Because the size of this message grows with the threshold and not the number of shares, there is no clean way to implement the p2p solution. (It could be done, but it would be ugly.)

My rough estimate is that this message will become the largest keygen message at threshold ~100. Beyond that point this message will exceed the current maximum limit for message byte length for keygen. (This limit is selected to accommodate the largest keygen message as described in #169 .) If and when that happens we can simply raise the limit to accommodate the threshold we need. The raised limit might be significantly higher than all other keygen messages but it would still be an effective guard against a long-message attack.

Conclusion: it sucks but it's not a clear and present danger.

Question: should we pre-emptively raise the limit now to avoid the need to change it in the future? The limit is currently 9kB. A 100kB limit should suffice for threshold 1000.

milapsheth

milapsheth

3

tofn's integration tests and some of the unit tests (including malicious tests) are only run for a specific choice of parameters (specific keygen, sign party count and threshold). We can improve the test coverage by having a set of test cases that use varying parameter combinations, including various edge cases that all tests can reuse. The malicious tests might require some changes to work with this.

One concern is that this will significantly increase testing time for local testing and PR reviews. Few potential solutions to look into are:

  1. Add a feature to enable the long running tests and locally run the tests every now and then.
  2. Configure the github actions to run the full suite of tests in a nightly build, while the PR reviews can still just the most important tests.
  3. Reduce the unsafe keygen RSA modulus size to 1024/512 bits for faster operations and testing.
  4. Take the hit in testing time. Maybe it's not too bad?
ggutoski

ggutoski

1

Looks like we're missing tests for the following code paths:

I vaguely remember from long ago that it was difficult to make a malicious party hit these code paths without also triggering any earlier sad paths.

milapsheth

milapsheth

0

GMP's release notes state that:

Issues with GMP 6.2.1: While we added support for Apple's new Arm based computers, our support has a problem. The problem is that Apple reserves CPU register x18, but GMP's mpn/arm64 assembly code uses that register. While GMP runs fine in our tests, we expect things to go awry in some execution situation. (Apple has not been kind enough to specify how they use x18. Therefore, we don't know what the consequences of using x18 might be.)

While we haven't observed any failures in practice, we should keep an eye out for any issues and new GMP releases that resolve this.

milapsheth

milapsheth

4

Our safe prime generation implementation independently finds a prime q and then checks whether 2q + 1 is also prime. It's known that this is not optimal since various primes q lead to 2q + 1 being divisible by a small prime (for e.g. if q = 1 mod 3). A combined sieve as described here can optimize this: https://eprint.iacr.org/2003/186.pdf The main issue to implement this is that we currently rely on nextprime function provided by GMP, so this optimization would require modifying GMP or rewriting parts of it in Rust.

ggutoski

ggutoski

4

Based on a previous PR comment.

Two tasks:

  1. Executer API should give incoming messages as a VecMap of enum (bcast_and_p2p, bcast_only, p2p_only) for each player.
  2. Rounds should be able to specify at compile-time the subset of possible message types they can accept. eg. bcast_and_p2p or p2p_only but not bcast_only.
milapsheth

milapsheth

0

For tests, we don't need to use safe primes since it takes quite long to generate them. As a result, we have a function called new_keygen_unsafe that does keygen using non-safe primes. But because we need this for integration tests, we can't hide this behind a #[cfg(test)] gate, making it exposed in the API. @sdaveas suggested to use #[cfg(feature="unsafe")].

cgorenflo

cgorenflo

low priority
0

To recover the correct public key from an Ethereum signature, we need R, S and a recovery bit V. V can be cheaply computed during the sig creation or retrieved by brute-forcing all solutions (4), which we currently do. Not having to try all possible solutions to verify the signature's correctness would be preferable

Information - Updated Jun 20, 2022

Stars: 66
Forks: 8
Issues: 26

The arkworks ecosystem consist of Rust libraries for designing and working with zero knowledge succinct...

This library is released under the MIT License and the Apache v2 License (see License)

The arkworks ecosystem consist of Rust libraries for designing and working with zero knowledge succinct...
Http

740

A general purpose library of common HTTP types

More information about this crate can be found in the MIT license (LICENSE-MIT or

A general purpose library of common HTTP types

A library providing asynchronous, multiplexed tailing for (namely log) files

Also available is the underlying file event-stream (driven by MIT license (LICENSE-MIT or

A library providing asynchronous, multiplexed tailing for (namely log) files

Threshold Secret Sharing

Efficient pure-Rust library for MIT license (LICENSE-MIT or

Threshold Secret Sharing

unrust / uni-snd

This library is a part of MIT license (LICENSE-MIT or

unrust / uni-snd

CBOR Event library

MIT license (LICENSE-MIT or

CBOR Event library

Library for traversing & reading GameCube and Wii disc images

Based on the C++ library MIT license (LICENSE-MIT or

Library for traversing & reading GameCube and Wii disc images

Bindings to libssh

Bindings to GNU Library (or: Lesser) General Public License

Bindings to libssh

SocketCAN based experimental library that implements proposed (PR) embedded-hal CAN traits

SocketCAN based experimental library that implements proposed (MIT license (LICENSE-MIT or socketcan-rs

SocketCAN based experimental library that implements proposed (PR) embedded-hal CAN traits

daemonize is a library for writing system daemons

Inspired by the Python library MIT license (LICENSE-MIT or

daemonize is a library for writing system daemons

Rust library for low-level abstraction of MIPS32 processors

This project is licensed under the terms of the MIT license

Rust library for low-level abstraction of MIPS32 processors

A pure Rust library for reading/writing Windows

A pure Rust library for reading/writing License

A pure Rust library for reading/writing Windows
Facebook Instagram Twitter GitHub Dribbble
Privacy