Crypt4GH utility
Introduction
Bob wants to send Alice a message containing sensitive data. Bob uses Crypt4GH, the Global Alliance approved secure method for sharing human genetic data.
crypt4gh is a Rust tool to encrypt, decrypt, or re-encrypt files according to the GA4GH encryption file format.
Basic example
Alice and Bob each generate a public/private key pair.
crypt4gh keygen --sk alice.sec --pk alice.pub
crypt4gh keygen --sk bob.sec --pk bob.pub
Bob encrypts a file for Alice:
crypt4gh encrypt --sk bob.sec --recipient_pk alice.pub < file > file.c4gh
Alice decrypts the encrypted file:
crypt4gh decrypt --sk alice.sec < file.c4gh
Installation
Requirements
You need to install Rust in order to compile the source code.
To build from source on Windows, you must first install the MSVC Build Tools.
Linux, MacOS or another Unix-like OS
To download Rustup and install Rust, run the following in your terminal, then follow the on-screen instructions.
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
Windows
Download and run the following executable: rustup-init.exe
Other ways to install Rust
If you prefer not to use the shell script, you may directly download rustup-init for the platform of your choice here.
Building from source (recommended)
Once Rust is installed, you can install crypt4gh by executing the following in your terminal:
cargo install crypt4gh
Standalone binaries
You can find precompiled binaries on the releases page.
Issues
If you have any problem with the installation, please create an issue on GitHub.
Encryption Algorithm - Crypt4GH
Encryption Format
A random session key (of 256 bits) is generated to seed a ChaCha20 engine, with Poly1305 authentication mode. The original file is split into segments of at most 64 kB of data. For each segment, a nonce is randomly generated; the segment is encrypted using the session key and that nonce, and the nonce is prepended to the encrypted segment.
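The size arithmetic implied by this segmentation can be sketched as follows. This is a minimal illustration, not the tool's code: the 12-byte nonce and 16-byte Poly1305 tag are assumptions taken from the usual ChaCha20-Poly1305 construction rather than from the text above, and the function names are hypothetical.

```rust
/// Plaintext bytes per segment (64 kB, taken here as 65,536 bytes).
const SEGMENT_SIZE: u64 = 65_536;
/// Assumed nonce length prepended to each encrypted segment.
const NONCE_LEN: u64 = 12;
/// Assumed Poly1305 authentication tag length appended to each segment.
const MAC_LEN: u64 = 16;

/// Number of segments needed to hold `plaintext_len` bytes.
/// The last segment may be shorter than SEGMENT_SIZE.
fn segment_count(plaintext_len: u64) -> u64 {
    let full = plaintext_len / SEGMENT_SIZE;
    let partial = if plaintext_len % SEGMENT_SIZE != 0 { 1 } else { 0 };
    full + partial
}

/// Size of the encrypted data, excluding the header:
/// every segment grows by one nonce and one authentication tag.
fn encrypted_body_len(plaintext_len: u64) -> u64 {
    plaintext_len + segment_count(plaintext_len) * (NONCE_LEN + MAC_LEN)
}
```

Under these assumptions, `encrypted_body_len(65_536)` evaluates to 65,564: one full segment plus 28 bytes of overhead.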
The header is prepended to the encrypted data.
Informally, the header contains the word crypt4gh, the format version (currently 1), the number of header packets, and the sequence of header packets.
A header packet is a length followed by its content. The content can be a data encryption packet or an edit list packet.
All packets are encrypted using Curve25519-based encryption.
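To make the header layout concrete, here is a minimal sketch that reads the fixed-size start of a header: the word crypt4gh, the version, and the packet count. The field widths (4 bytes each) and the little-endian byte order are assumptions based on the GA4GH specification, not stated in the text above; `HeaderPreamble` and `parse_preamble` are hypothetical names and are not part of the crypt4gh library.

```rust
/// Fixed-size start of a Crypt4GH header (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
struct HeaderPreamble {
    version: u32,
    packet_count: u32,
}

/// Reads a 4-byte little-endian integer at `offset` (byte order is an assumption).
fn read_u32_le(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Parses the magic word, format version, and header packet count.
fn parse_preamble(bytes: &[u8]) -> Result<HeaderPreamble, String> {
    // 8 bytes of magic word + 4 bytes of version + 4 bytes of packet count.
    if bytes.len() < 16 {
        return Err("header too short".to_string());
    }
    if !bytes.starts_with(b"crypt4gh") {
        return Err("missing crypt4gh magic word".to_string());
    }
    let version = read_u32_le(bytes, 8);
    if version != 1 {
        return Err(format!("unsupported format version {}", version));
    }
    let packet_count = read_u32_le(bytes, 12);
    Ok(HeaderPreamble { version, packet_count })
}
```

The header packets themselves follow this preamble. Decrypting them requires the Curve25519-based scheme and is outside the scope of this sketch.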
Features
The advantages of the format are, among others:
- Re-encrypting the file for another user only requires decrypting the header and re-encrypting it with that user's public key.
- Header packets can be encrypted for multiple recipients.
- Re-arranging the file to extract a portion only requires decrypting the header, re-encrypting it with an edit list, and selecting the cipher segments surrounding the portion. The file itself is not decrypted and re-encrypted.
Crypt4GH Key Format
This utility supports the OpenSSH key format (version 6.5 or above) if the key was generated with type ed25519 (i.e. with ssh-keygen -t ed25519 ...). Otherwise, this utility can generate keys in the following format:
Keys
A key is stored in the following PEM format:
-----BEGIN CRYPT4GH <type> KEY-----
BASE64-ENCODED DATA
-----END CRYPT4GH <type> KEY-----
where <type> is either PUBLIC or PRIVATE.
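As an illustration of this armor, the sketch below splits a key file into its type and its Base64 payload. `split_armor` is a hypothetical helper, not part of the crypt4gh library. It accepts only the two types named above and leaves the Base64 decoding to a dedicated crate.

```rust
/// Splits a Crypt4GH key file into (type, Base64 payload).
/// Hypothetical helper for illustration; the payload is returned undecoded.
fn split_armor(pem: &str) -> Result<(String, String), String> {
    // Ignore surrounding whitespace and blank lines.
    let lines: Vec<&str> = pem
        .lines()
        .map(|l| l.trim())
        .filter(|l| !l.is_empty())
        .collect();
    if lines.len() < 3 {
        return Err("expected BEGIN line, payload, and END line".to_string());
    }

    // Extract <type> from the BEGIN line.
    let key_type = lines[0]
        .strip_prefix("-----BEGIN CRYPT4GH ")
        .and_then(|rest| rest.strip_suffix(" KEY-----"))
        .ok_or_else(|| "malformed BEGIN line".to_string())?;
    if key_type != "PUBLIC" && key_type != "PRIVATE" {
        return Err(format!("unknown key type {}", key_type));
    }

    // The END line must name the same type.
    let expected_end = format!("-----END CRYPT4GH {} KEY-----", key_type);
    if lines[lines.len() - 1] != expected_end {
        return Err("END line does not match BEGIN line".to_string());
    }

    // Everything in between is the Base64-encoded data, possibly wrapped.
    let payload = lines[1..lines.len() - 1].concat();
    Ok((key_type.to_string(), payload))
}
```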
Public key data
For a public key, the key data is the byte representation of the plaintext key material.
Private key data
For a private key, we use the following encoding format.
byte[] MAGIC_WORD
string kdfname
string (rounds || salt) # included if kdfname is not "none"
string ciphername
string private blob # Key material encrypted or not
string comment # Optional
- The MAGIC_WORD is the byte representation of the ASCII word "c4gh-v1".
- Every string consists of a length n (encoded as 2 big-endian bytes) followed by a sequence of n bytes (e.g. the string "hello" is encoded as \x00\x05hello).
- The kdfname is the name of the Key Derivation Function. We support either "scrypt", "pbkdf2_hmac_sha256", "bcrypt", or "none". The Rust implementation uses scrypt when available, and defaults to bcrypt for generating keys.
- The rounds is a 4-byte big-endian representation of the number of iterations used in the KDF.
- The ciphername describes which symmetric algorithm is used to encrypt the key material. The only supported cipher so far is "chacha20_poly1305", or "none". When kdfname is "none", the ciphername must also be "none" (and vice versa), and the (rounds || salt) string is not included. This is used when the key material is not encrypted.
- In case the key material is encrypted, the KDF is used to derive a secret from a user-supplied passphrase. A nonce is randomly generated and used in conjunction with the secret to encrypt the private key, using ChaCha20 and authenticated with Poly1305. The nonce is prepended to the encrypted data.
- Finally, an optional comment can be appended at the end of the encoded format.
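The string encoding and the unencrypted layout described above can be sketched as follows. This is a minimal illustration with hypothetical function names. It covers only the case where kdfname and ciphername are both "none", so no KDF or cipher is involved.

```rust
/// ASCII magic word that starts every private key blob.
const MAGIC_WORD: &[u8] = b"c4gh-v1";

/// Encodes a `string` field: 2-byte big-endian length, then the bytes.
fn encode_string(data: &[u8]) -> Result<Vec<u8>, String> {
    if data.len() > u16::MAX as usize {
        return Err("string longer than 65535 bytes".to_string());
    }
    let mut out = Vec::with_capacity(2 + data.len());
    out.extend_from_slice(&(data.len() as u16).to_be_bytes());
    out.extend_from_slice(data);
    Ok(out)
}

/// Reads one `string` field from the front of `input`.
/// Returns the field contents and the remaining bytes.
fn decode_string(input: &[u8]) -> Option<(&[u8], &[u8])> {
    if input.len() < 2 {
        return None;
    }
    let n = u16::from_be_bytes([input[0], input[1]]) as usize;
    let rest = &input[2..];
    if rest.len() < n {
        return None;
    }
    Some(rest.split_at(n))
}

/// Lays out a private key whose key material is NOT encrypted.
fn encode_unencrypted_private_key(
    key_material: &[u8],
    comment: Option<&str>,
) -> Result<Vec<u8>, String> {
    let mut out = MAGIC_WORD.to_vec();
    out.extend(encode_string(b"none")?); // kdfname
    // (rounds || salt) is omitted because kdfname is "none".
    out.extend(encode_string(b"none")?); // ciphername
    out.extend(encode_string(key_material)?); // private blob, in the clear
    if let Some(c) = comment {
        out.extend(encode_string(c.as_bytes())?); // optional comment
    }
    Ok(out)
}
```

The resulting bytes would then be Base64-encoded and wrapped in the PRIVATE key armor. An encrypted key additionally carries the (rounds || salt) string and a nonce-prefixed ciphertext as its private blob.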
Examples
Crypt4GH Key generation
crypt4gh keygen --sk user.sec --pk user.pub
OpenSSH Key generation
ssh-keygen -t ed25519 -f <output_filepath> -N <passphrase>
Usage & Examples
The usual --help flag shows you the different options that the tool accepts.
$ crypt4gh --help
Utility for the cryptographic GA4GH standard, reading from stdin and outputting to stdout.
USAGE:
crypt4gh [FLAGS] [SUBCOMMAND]
FLAGS:
-h, --help Prints help information
-v, --verbose Sets the level of verbosity
-V, --version Prints version information
SUBCOMMANDS:
decrypt Decrypts the input using your secret key and the (optional) public key of the sender.
encrypt Encrypts the input using your (optional) secret key and the public key of the recipient.
help Prints this message or the help of the given subcommand(s)
keygen Utility to create Crypt4GH-formatted keys.
rearrange Rearranges the input according to the edit list packet.
reencrypt Decrypts the input using your (optional) secret key and then it reencrypts it using the
public key of the recipient.
Keygen
$ crypt4gh keygen --help
crypt4gh-keygen
Utility to create Crypt4GH-formatted keys.
USAGE:
crypt4gh keygen [FLAGS] [OPTIONS]
FLAGS:
-f Overwrite the destination files
-h, --help Prints help information
--nocrypt Do not encrypt the private key. Otherwise it is encrypted in the Crypt4GH key
format (See https://crypt4gh.readthedocs.io/en/latest/keys.html)
-V, --version Prints version information
OPTIONS:
-C, --comment <comment> Key's Comment
--pk <keyfile> Curve25519-based Public key [env: C4GH_PUBLIC_KEY] [default:
~/.c4gh/key.pub]
--sk <keyfile> Curve25519-based Private key [env: C4GH_SECRET_KEY] [default:
~/.c4gh/key]
Generate Crypt4GH keys for Alice and Bob:
crypt4gh keygen --sk alice.sec --pk alice.pub
crypt4gh keygen --sk bob.sec --pk bob.pub
Encrypt
$ crypt4gh encrypt --help
crypt4gh-encrypt
Encrypts the input using your (optional) secret key and the public key of the recipient.
USAGE:
crypt4gh encrypt [OPTIONS] --recipient_pk <path>...
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
--range <start-end> Byte-range either as <start-end> or just <start> (Start
included, End excluded)
--recipient_pk <path>... Recipient's Curve25519-based Public key
--sk <path> Curve25519-based Private key [env: C4GH_SECRET_KEY]
Alice encrypts a file original_file.txt for Bob:
crypt4gh encrypt --sk alice.sec --recipient_pk bob.pub < original_file.txt > encrypted_file.c4gh
Decrypt
$ crypt4gh decrypt --help
crypt4gh-decrypt
Decrypts the input using your secret key and the (optional) public key of the sender.
USAGE:
crypt4gh decrypt [OPTIONS]
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
--sender_pk <path> Peer's Curve25519-based Public key to verify provenance (akin to
signature)
--sk <path> Curve25519-based Private key. [env: C4GH_SECRET_KEY]
Bob decrypts an encrypted file:
crypt4gh decrypt --sk bob.sec < encrypted_file.c4gh > decrypted_file.txt
If Bob wants to, optionally, verify that the message indeed comes from Alice, he needs to fetch Alice's public key via another trusted channel. He can then decrypt and check the provenance of the file with:
crypt4gh decrypt --sk bob.sec --sender_pk alice.pub < encrypted_file.c4gh > decrypted_file.txt
Reencrypt
$ crypt4gh reencrypt --help
crypt4gh-reencrypt
Decrypts the input using your (optional) secret key and then it reencrypts it using the public key
of the recipient.
USAGE:
crypt4gh reencrypt [FLAGS] [OPTIONS] --recipient_pk <path>...
FLAGS:
-h, --help Prints help information
-t, --trim Keep only header packets that you can decrypt
-V, --version Prints version information
OPTIONS:
--recipient_pk <path>... Recipient's Curve25519-based Public key
--sk <path> Curve25519-based Private key [env: C4GH_SECRET_KEY]
Bob reencrypts a file for Alice and for himself:
crypt4gh reencrypt --sk bob.sec --recipient_pk alice.pub bob.pub < encrypted_file.c4gh > reencrypted_file.c4gh
Rearrange
$ crypt4gh rearrange --help
crypt4gh-rearrange
Rearranges the input according to the edit list packet.
USAGE:
crypt4gh rearrange [OPTIONS] --range <start-end>
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
--range <start-end> Byte-range either as <start-end> or just <start> (Start included,
End excluded)
--sk <path> Curve25519-based Private key [env: C4GH_SECRET_KEY]
Bob rearranges an encrypted file to keep only the bytes from 65535 (included) to 131074 (excluded):
crypt4gh rearrange --sk bob.sec --range 65535-131074 < encrypted_file.c4gh > rearranged_file.c4gh
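To see which encrypted segments a range like this touches, the following sketch parses the `<start-end>` syntax and maps the byte range onto 64 kB segments. The function names are hypothetical and the tool's actual parsing rules may differ. In particular, rejecting an end that is not greater than the start is a choice made by this sketch.

```rust
/// Plaintext bytes per segment (64 kB, taken here as 65,536 bytes).
const SEGMENT_SIZE: u64 = 65_536;

/// Parses `<start-end>` or just `<start>`; the end is exclusive.
fn parse_range(s: &str) -> Result<(u64, Option<u64>), String> {
    let mut parts = s.splitn(2, '-');
    let start = parts
        .next()
        .unwrap_or("")
        .parse::<u64>()
        .map_err(|err| err.to_string())?;
    let end = match parts.next() {
        Some(text) => Some(text.parse::<u64>().map_err(|err| err.to_string())?),
        None => None,
    };
    if let Some(e) = end {
        if e <= start {
            return Err("end must be greater than start".to_string());
        }
    }
    Ok((start, end))
}

/// Indices of the first and last segments overlapping [start, end).
/// Requires end > start, which parse_range guarantees.
fn covering_segments(start: u64, end: u64) -> (u64, u64) {
    (start / SEGMENT_SIZE, (end - 1) / SEGMENT_SIZE)
}
```

For the range 65535-131074, `covering_segments` returns `(0, 2)`: byte 65535 is the last byte of segment 0 and byte 131073 falls in segment 2, so three cipher segments are kept and the edit list trims the surplus bytes.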
Rust Library
You can check the documentation of the Rust library on docs.rs.