Crypt4GH utility

Introduction

Bob wants to send Alice a message containing sensitive data. Bob uses Crypt4GH, the Global Alliance approved secure method for sharing human genetic data.

crypt4gh is a Rust tool to encrypt, decrypt or re-encrypt files according to the GA4GH encryption file format.

Basic example

Alice and Bob each generate a pair of public/private keys.

crypt4gh keygen --sk alice.sec --pk alice.pub
crypt4gh keygen --sk bob.sec --pk bob.pub

Bob encrypts a file for Alice:

crypt4gh encrypt --sk bob.sec --recipient_pk alice.pub < file > file.c4gh

Alice decrypts the encrypted file:

crypt4gh decrypt --sk alice.sec < file.c4gh

Installation

Requirements

You need to install Rust in order to compile the source code.

To build from source on Windows, you must first install the MSVC Build Tools.

Linux, MacOS or another Unix-like OS

To download Rustup and install Rust, run the following in your terminal, then follow the on-screen instructions.

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Windows

Download and run the following executable: rustup-init.exe

Other ways to install Rust

If you prefer not to use the shell script, you may directly download rustup-init for the platform of your choice here.

Building from source (recommended)

Once Rust is installed, you can install crypt4gh by executing the following in your terminal:

cargo install crypt4gh

Standalone binaries

Compiled binaries are available on the releases page.

Issues

If you run into any problem with the installation, please create an issue on GitHub.

Encryption Algorithm - Crypt4GH

Encryption Format

A random session key (of 256 bits) is generated to seed a ChaCha20 engine, with Poly1305 authentication mode. The original file is split into segments of at most 64 kB of data. For each segment, a nonce is randomly generated, the segment is encrypted using the session key and that nonce, and the nonce is prepended to the encrypted segment.
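As a rough illustration of the segmenting described above, the following Rust sketch computes how many segments a plaintext of a given length produces and how large the encrypted data portion becomes. Several values are assumptions not stated in this document: 64 kB is taken to mean 65536 bytes, the nonce is taken to be 12 bytes and the Poly1305 tag 16 bytes (the usual ChaCha20-Poly1305 sizes). The function names are hypothetical, and the header is not counted.

```rust
/// Maximum plaintext bytes per segment (assumption: "64 kB" means 65536 bytes).
const SEGMENT_SIZE: u64 = 65_536;
/// Assumed size of the random nonce prepended to each segment.
const NONCE_SIZE: u64 = 12;
/// Assumed size of the Poly1305 authentication tag added to each segment.
const MAC_SIZE: u64 = 16;

/// Number of segments needed for `plaintext_len` bytes (ceiling division).
fn segment_count(plaintext_len: u64) -> u64 {
    (plaintext_len + SEGMENT_SIZE - 1) / SEGMENT_SIZE
}

/// Size of the encrypted data portion, excluding the header.
fn encrypted_body_len(plaintext_len: u64) -> u64 {
    plaintext_len + segment_count(plaintext_len) * (NONCE_SIZE + MAC_SIZE)
}
```

Under these assumptions, a 65537-byte file needs two segments, so its encrypted body is 65537 + 2 × 28 = 65593 bytes.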

The header is prepended to the encrypted data.

Informally, the header contains the word crypt4gh, the format version (currently 1), the number of header packets, and the sequence of header packets.

A header packet is a length followed by its content. The content can be a data encryption packet or an edit list packet.

All packets are encrypted using Curve25519-based encryption.
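To make the start of the header concrete, here is a minimal Rust sketch that checks the magic word and reads the version and the number of header packets. The field widths and byte order are assumptions taken from the GA4GH specification rather than from this document: both integers are treated as 4-byte little-endian values. The `Preamble` type and `parse_preamble` function are hypothetical names, and the header packets themselves are not parsed.

```rust
/// Fixed-size start of a Crypt4GH header (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct Preamble {
    version: u32,
    packet_count: u32,
}

/// Parses the first 16 bytes of a header.
/// Assumption: version and packet count are 4-byte little-endian integers.
fn parse_preamble(bytes: &[u8]) -> Result<Preamble, String> {
    if bytes.len() < 16 {
        return Err("header too short".to_string());
    }
    if !bytes.starts_with(b"crypt4gh") {
        return Err("missing crypt4gh magic word".to_string());
    }
    let version = u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]);
    let packet_count = u32::from_le_bytes([bytes[12], bytes[13], bytes[14], bytes[15]]);
    if version != 1 {
        return Err(format!("unsupported version {}", version));
    }
    Ok(Preamble { version, packet_count })
}
```

Rejecting unknown versions early keeps the rest of a parser from misreading packets laid out by a future format revision.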

Features

The advantages of the format are, among others:

  • Re-encrypting the file for another user only requires decrypting the header and encrypting it with that user’s public key.
  • Header packets can be encrypted for multiple recipients.
  • Re-arranging the file to extract a portion only requires decrypting the header, re-encrypting it with an edit list, and selecting the cipher segments surrounding the portion. The file itself is not decrypted and re-encrypted.

Crypt4GH Key Format

This utility supports the OpenSSH key format (version 6.5 or above) if the key was generated with type ed25519 (i.e. with ssh-keygen -t ed25519 ...). Otherwise, this utility can generate keys in the following format:

Keys

A key is stored in the following PEM format:

-----BEGIN CRYPT4GH <type> KEY-----
BASE64-ENCODED DATA
-----END CRYPT4GH <type> KEY-----

where <type> is either PUBLIC or PRIVATE.
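The sketch below shows one way to split such a key file into its type and its Base64 body in Rust. It is an illustration only: `split_key_pem` is a hypothetical helper, not part of the crypt4gh library, and the Base64 body is returned as text rather than decoded, since decoding would need an external crate or a hand-written decoder.

```rust
/// Splits a Crypt4GH PEM-style key into its type ("PUBLIC" or "PRIVATE")
/// and its Base64 body. The body is returned as text; it is not decoded here.
fn split_key_pem(pem: &str) -> Option<(String, String)> {
    // Ignore surrounding whitespace and blank lines.
    let mut lines = pem.lines().map(str::trim).filter(|l| !l.is_empty());
    let first = lines.next()?;
    // Extract <type> from "-----BEGIN CRYPT4GH <type> KEY-----".
    let key_type = first
        .strip_prefix("-----BEGIN CRYPT4GH ")?
        .strip_suffix(" KEY-----")?;
    if key_type != "PUBLIC" && key_type != "PRIVATE" {
        return None;
    }
    let footer = format!("-----END CRYPT4GH {} KEY-----", key_type);
    let mut body = String::new();
    for line in lines {
        if line == footer {
            return Some((key_type.to_string(), body));
        }
        body.push_str(line);
    }
    None // the matching footer line was never found
}
```

Requiring the footer to carry the same type as the header catches truncated files and mismatched delimiters before any key material is used.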

Public key data

For a public key, the key data is the byte representation of the plaintext key material.

Private key data

For a private key, we use the following encoding format.

byte[]  MAGIC_WORD
string  kdfname
string  (rounds || salt)     # included if kdfname is not "none"
string  ciphername
string  private blob         # Key material encrypted or not
string  comment              # Optional

  1. The MAGIC_WORD is the byte-representation of the ASCII word "c4gh-v1".

    Every string consists of a length n (encoded as 2 big-endian bytes) followed by a sequence of n bytes (e.g. the string "hello" is encoded as \x00\x05hello).

  2. The kdfname is the name of the Key Derivation Function. We support either "scrypt", "pbkdf2_hmac_sha256", "bcrypt", or "none". The Rust implementation uses scrypt when available, and defaults to bcrypt for generating keys.

  3. The rounds field is a 4-byte big-endian representation of the number of iterations used in the KDF.

  4. The ciphername describes which symmetric algorithm is used to encrypt the private key data. The only supported values (so far) are "chacha20_poly1305" and "none".

    When kdfname is none, so should the ciphername be (and vice-versa), and the (rounds || salt) string is not included. This is used when the key material is not encrypted.

  5. If the key material is encrypted, the KDF is used to derive a secret from a user-supplied passphrase. A nonce is randomly generated and used together with the secret to encrypt the private key with ChaCha20, authenticated with Poly1305. The nonce is prepended to the encrypted data.

  6. Finally, an optional comment can be used at the end of the encoded format.
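The layout above can be sketched in Rust for the simplest case, an unencrypted key where both kdfname and ciphername are "none". This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, the MAGIC_WORD is written as raw bytes without a length prefix, and the private blob is assumed to be the raw key bytes. The result would still need Base64 encoding and the PEM delimiters.

```rust
/// Encodes a "string": a 2-byte big-endian length followed by the bytes.
/// Returns None if the data is too long for a 2-byte length.
fn encode_string(data: &[u8]) -> Option<Vec<u8>> {
    if data.len() > u16::MAX as usize {
        return None;
    }
    let mut out = (data.len() as u16).to_be_bytes().to_vec();
    out.extend_from_slice(data);
    Some(out)
}

/// Builds the byte layout of an unencrypted private key
/// (kdfname and ciphername are both "none").
fn encode_unencrypted_private_key(key: &[u8], comment: Option<&str>) -> Option<Vec<u8>> {
    let mut out = b"c4gh-v1".to_vec(); // MAGIC_WORD (assumed: no length prefix)
    out.extend(encode_string(b"none")?); // kdfname
    // (rounds || salt) is omitted because kdfname is "none"
    out.extend(encode_string(b"none")?); // ciphername
    out.extend(encode_string(key)?); // private blob, not encrypted
    if let Some(c) = comment {
        out.extend(encode_string(c.as_bytes())?); // optional comment
    }
    Some(out)
}
```

For instance, `encode_string(b"hello")` produces the bytes `\x00\x05hello`, matching the example in item 1.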

Examples

Crypt4GH Key generation

crypt4gh keygen --sk user.sec --pk user.pub

OpenSSH Key generation

ssh-keygen -t ed25519 -f <output_filepath> -N <passphrase>

Usage & Examples

The usual --help flag shows you the different options that the tool accepts.

$ crypt4gh --help

Utility for the cryptographic GA4GH standard, reading from stdin and outputting to stdout.

USAGE:
    crypt4gh [FLAGS] [SUBCOMMAND]

FLAGS:
    -h, --help       Prints help information
    -v, --verbose    Sets the level of verbosity
    -V, --version    Prints version information

SUBCOMMANDS:
    decrypt      Decrypts the input using your secret key and the (optional) public key of the sender.
    encrypt      Encrypts the input using your (optional) secret key and the public key of the recipient.
    help         Prints this message or the help of the given subcommand(s)
    keygen       Utility to create Crypt4GH-formatted keys.
    rearrange    Rearranges the input according to the edit list packet.
    reencrypt    Decrypts the input using your (optional) secret key and then it reencrypts it using the
                 public key of the recipient.

Keygen

$ crypt4gh keygen --help

crypt4gh-keygen
Utility to create Crypt4GH-formatted keys.

USAGE:
    crypt4gh keygen [FLAGS] [OPTIONS]

FLAGS:
    -f               Overwrite the destination files
    -h, --help       Prints help information
        --nocrypt    Do not encrypt the private key. Otherwise it is encrypted in the Crypt4GH key
                     format (See https://crypt4gh.readthedocs.io/en/latest/keys.html)
    -V, --version    Prints version information

OPTIONS:
    -C, --comment <comment>    Key's Comment
        --pk <keyfile>         Curve25519-based Public key [env: C4GH_PUBLIC_KEY] [default:
                               ~/.c4gh/key.pub]
        --sk <keyfile>         Curve25519-based Private key [env: C4GH_SECRET_KEY] [default:
                               ~/.c4gh/key]

Generate a Crypt4GH key pair for Alice and one for Bob:

crypt4gh keygen --sk alice.sec --pk alice.pub
crypt4gh keygen --sk bob.sec --pk bob.pub

Encrypt

$ crypt4gh encrypt --help

crypt4gh-encrypt
Encrypts the input using your (optional) secret key and the public key of the recipient.

USAGE:
    crypt4gh encrypt [OPTIONS] --recipient_pk <path>...

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --range <start-end>         Byte-range either as  <start-end> or just <start> (Start
                                    included, End excluded)
        --recipient_pk <path>...    Recipient's Curve25519-based Public key
        --sk <path>                 Curve25519-based Private key [env: C4GH_SECRET_KEY]

Alice encrypts the file original_file.txt for Bob:

crypt4gh encrypt --sk alice.sec --recipient_pk bob.pub < original_file.txt > encrypted_file.c4gh

Decrypt

$ crypt4gh decrypt --help

crypt4gh-decrypt
Decrypts the input using your secret key and the (optional) public key of the sender.

USAGE:
    crypt4gh decrypt [OPTIONS]

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --sender_pk <path>    Peer's Curve25519-based Public key to verify provenance (akin to
                              signature)
        --sk <path>           Curve25519-based Private key. [env: C4GH_SECRET_KEY]

Bob decrypts an encrypted file:

crypt4gh decrypt --sk bob.sec < encrypted_file.c4gh > decrypted_file.txt

If Bob wants to, optionally, verify that the message indeed comes from Alice, he needs to fetch Alice's public key via another trusted channel. He can then decrypt and check the provenance of the file with:

crypt4gh decrypt --sk bob.sec --sender_pk alice.pub < encrypted_file.c4gh > decrypted_file.txt

Reencrypt

$ crypt4gh reencrypt --help

crypt4gh-reencrypt
Decrypts the input using your (optional) secret key and then it reencrypts it using the public key
of the recipient.

USAGE:
    crypt4gh reencrypt [FLAGS] [OPTIONS] --recipient_pk <path>...

FLAGS:
    -h, --help       Prints help information
    -t, --trim       Keep only header packets that you can decrypt
    -V, --version    Prints version information

OPTIONS:
        --recipient_pk <path>...    Recipient's Curve25519-based Public key
        --sk <path>                 Curve25519-based Private key [env: C4GH_SECRET_KEY]

Bob reencrypts a file for Alice and for himself:

crypt4gh reencrypt --sk bob.sec --recipient_pk alice.pub bob.pub < encrypted_file.c4gh > reencrypted_file.c4gh

Rearrange

$ crypt4gh rearrange --help

crypt4gh-rearrange
Rearranges the input according to the edit list packet.

USAGE:
    crypt4gh rearrange [OPTIONS] --range <start-end>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

OPTIONS:
        --range <start-end>    Byte-range either as  <start-end> or just <start> (Start included,
                               End excluded)
        --sk <path>            Curve25519-based Private key [env: C4GH_SECRET_KEY]

Bob rearranges an encrypted file to keep only the bytes from 65535 (included) to 131074 (excluded):

crypt4gh rearrange --sk bob.sec --range 65535-131074 < encrypted_file.c4gh > rearranged_file.c4gh
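To see why a byte range maps onto whole cipher segments, the following Rust sketch computes which segments contain a given plaintext range. It assumes that a segment holds 65536 plaintext bytes and that segments are numbered from 0. `covering_segments` is a hypothetical helper, and how the tool itself selects segments is not shown here.

```rust
use std::ops::RangeInclusive;

/// Assumed plaintext size of a full segment (64 kB taken as 65536 bytes).
const SEGMENT_SIZE: u64 = 65_536;

/// Indices of the segments that contain plaintext bytes `start..end`
/// (start included, end excluded). Returns None for an empty range.
fn covering_segments(start: u64, end: u64) -> Option<RangeInclusive<u64>> {
    if start >= end {
        return None;
    }
    let first = start / SEGMENT_SIZE;
    // The last byte actually included is `end - 1`.
    let last = (end - 1) / SEGMENT_SIZE;
    Some(first..=last)
}
```

With the range used above, `covering_segments(65535, 131074)` yields segments 0 through 2: byte 65535 is the last byte of segment 0, and byte 131073 is the second byte of segment 2.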

Rust Library

You can check the documentation of the Rust library on docs.rs.