Note: These definitions were taken from https://github.com/w3f/polkadot-spec/blob/main/docs. They are provided here for reference. All credits to respective authors.

SCALE

The Polkadot Host uses the "Simple Concatenated Aggregate Little-Endian" (SCALE) codec to encode byte arrays as well as other data structures. SCALE provides a canonical encoding to produce consistent hash values across their implementation, including the Merkle hash proof for the State Storage.

Decoding

refers to the decoding of a blob of data. Since the SCALE codec is not self-describing, it’s up to the decoder to validate whether the blob of data can be deserialized into the given type or data structure.

It’s accepted behavior for the decoder to partially decode the blob of data. This means any additional data that does not fit into a data structure can be ignored.

caution Considering that the decoded data is never larger than the encoded message, this information can serve as a way to validate values that can vary in size, such as sequences. The decoder should strictly use the size of the encoded data as an upper bound when decoding in order to prevent denial of service attacks.

Notation

Let be the set of all byte sequences.
Let , denotes the -th byte of , and denotes the -th bit of the -th byte of .

Definitions

Little Endian

By the little-endian representation of a non-negative integer, , represented as

in base 256, we refer to a byte array such that

Accordingly, we define the function :

Scale Types

Fixed Length Integers

The SCALE codec, , for fixed length integers not defined here otherwise, is equal to the little-endian encoding of those values.

Tuple

The SCALE codec for Tuple, , such that:

Where ’s are values of different types, is defined as:

In the case of a tuple (or a structure), the knowledge of the shape of data is not encoded even though it is necessary for decoding. The decoder needs to derive that information from the context where the encoding/decoding is happening.

Varying Data Type

This library does not provide means for encoding/decoding varying data types, but the definitions are provided here for completeness and reference. The implementation is left to the user of the library.

We define a varying data type to be an ordered set of data types.

A value of varying data type is a pair where for some and is its value of type , which can be empty. We define , unless it is explicitly defined as another value in the definition of a particular varying data type.

The SCALE codec for value of varying data type , formally referred to as is defined as follows:

The SCALE codec does not encode the correspondence between the value and the data type it represents; the decoder needs prior knowledge of such correspondence to decode the data.

Boolean

The SCALE codec for a boolean value defined as a byte as follows:

Compact

SCALE Length encoding , also known as a compact encoding, of a non-negative number is defined as follows:

and the rest of the bits of store the value of in little-endian format in base-2 as follows:

such that:

Note that denotes the length of the original integer being encoded and does not include the extra byte describing the length. The encoding can be used for integers up to: $$2^{(63+4)8} -1 = 2^{536} -1$$

Sequence

The SCALE codec for sequence such that:

where ’s are values of the same type (and the decoder is unable to infer value of from the context) is defined as:

where is defined here.

In some cases, the length indicator is omitted if the length of the sequence is fixed and known by the decoder upfront. Such cases are explicitly stated by the definition of the corresponding type.

String

The SCALE codec for a string value is an encoded sequence consisting of UTF-8 encoded bytes.

This can be achieved via encoding the UTF-8 sequence as a uint8[] or bytes, which is supported by this library.

Option Type

The Option type is a varying data type of which indicates if data of type is available (referred to as some state) or not (referred to as empty, none or null state). The presence of type none, indicated by , implies that the data corresponding to type is not available and contains no additional data. Whereas the presence of type indicated by implies that the data is available.

Result Type

The Result type is a varying data type of which is used to indicate if a certain operation or function was executed successfully (referred to as "ok" state) or not (referred to as "error" state). implies success, implies failure. Both types can either contain additional data or are defined as empty types otherwise.

Dictionaries, Hashtables, Maps

SCALE codec for dictionary or hashtable D with key-value pairs , such that:

is defined as the SCALE codec of as a sequence of key-value pairs (as tuples):

where is encoded the same way as but argument refers to the number of key-value pairs rather than the length.

SCALE ​

Decoding ​

Notation ​

Definitions ​

Little Endian ​

Scale Types ​

Fixed Length Integers ​

Tuple ​

Varying Data Type ​

Boolean ​

Compact ​

Sequence ​

String ​

Option Type ​

Result Type ​

Dictionaries, Hashtables, Maps ​