Syntax
Each line of a TL-B file is either a TL-B scheme (i.e. a type declaration), a comment, or a blank line.
TL-B Scheme
A TL-B scheme describes how to serialize a certain algebraic data structure into a binary format. Each scheme has the following structure:
- Constructor, which consists of
  - optional constructor name;
  - tag: empty, $ or #;
  - prefix code or _.
- Field definitions, each of which consists of
  - optional field name (ident);
  - type expression (type-expr).
- Constraints: optional expressions that restrict the values of fields of the Nat type.
- Parameter declarations: declare fields of type # (natural numbers) or Type (types of types) that may be used as parameters for parameterized types. They are always framed by curly brackets {}.
- Combinator name: the right-hand side of a TL-B scheme, which gives the name of the combinator being defined. It may be parameterized.
Comments
The comments follow the same conventions as in C++.
Libraries import
You can use TL-B libraries to extend your documents and to avoid writing repetitive schemes. In TL-B libraries there is no concept of cyclic import. Just indicate the dependency on another document (library) at the top of the document with the keyword dependson.
Semantics
From a high-level perspective, the right-hand side of each scheme is a type, either simple (such as Bit or True) or parametrized (such as Hashmap n X), and the left-hand side describes a way to define, or even to serialize, a value of the type indicated in the right-hand side.
Below, we gradually describe each component of TL-B schemes.
Constructors
Constructors define a combinator’s type, including its state during serialization. Each constructor begins with a (possibly empty, written as _) name, such as message or bool_true, immediately followed by an optional constructor tag, such as #_ or $10, which describes the bitstring used to encode (serialize) the constructor in question.
Tags may be given in either binary (after a dollar sign) or hexadecimal notation (after a hash sign).
If a tag is not explicitly provided, the TL-B parser must compute a default 32-bit constructor tag: it hashes the text of the scheme that defines this constructor, prepared in a certain fashion, with the CRC32 algorithm and then applies | 0x80000000 to the result. Therefore, empty tags must be explicitly provided by #_ or $_.
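The hash-then-OR step of the default tag can be sketched in Python. This is only an illustration: the text says the scheme is hashed "in a certain fashion", and that normalization is not reproduced here, so the hypothetical helper default_tag assumes it receives scheme text that is already in the form the parser hashes.

```python
import zlib


def default_tag(normalized_scheme: str) -> int:
    """Return a 32-bit tag: CRC32 of the text with the top bit forced to 1.

    Assumption: `normalized_scheme` is already in the canonical form that
    the TL-B parser hashes; that normalization step is not shown here.
    """
    crc = zlib.crc32(normalized_scheme.encode("utf-8")) & 0xFFFFFFFF
    return crc | 0x80000000


# CRC32 of the standard check string "123456789" is 0xCBF43926,
# which already has its top bit set, so the OR leaves it unchanged.
print(hex(default_tag("123456789")))  # prints 0xcbf43926
```

Because of the OR with 0x80000000, a computed tag always has its highest bit set, whatever the hash value is.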
All constructor names must be distinct, and constructor tags for the same combinator must constitute a prefix code
(otherwise the deserialization would not be unique); i.e. no tag can be a prefix of any other.
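The prefix-code requirement can be checked mechanically. The following Python sketch uses a hypothetical helper, is_prefix_code, and assumes that tags are given as strings of 0 and 1 characters.

```python
def is_prefix_code(tags: list[str]) -> bool:
    """Return True if no tag is a prefix of (or equal to) another tag."""
    for i, a in enumerate(tags):
        for j, b in enumerate(tags):
            if i != j and b.startswith(a):
                return False
    return True


print(is_prefix_code(["00", "01", "10", "11"]))  # prints True
print(is_prefix_code(["0", "01"]))               # prints False
```

In the second call, a parser that has read the bit 0 cannot tell whether the tag is complete, which is exactly the ambiguity the rule prevents.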
Also, there are size limitations:
- maximum number of constructors per type: 64;
- maximum number of bits for a tag: 63.
For example, when parsing a bitstring 10... that should be an instance of the MsgAddress combinator, the parser extracts the initial two bits, which determine the tag.
It then understands that this address is further serialized as addr_std and continues to parse the string according to the fields defined in this constructor.
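The tag-driven dispatch described above can be sketched in Python. The tag table is an assumption recalled from block.tlb: only the pairing of 10 with the standard address constructor comes from the text, and the helper name pick_constructor is hypothetical.

```python
# Assumed 2-bit tags of the MsgAddress alternatives (recalled from block.tlb).
# Only the "10" entry is stated in the surrounding text.
TAGS = {
    "00": "addr_none",
    "01": "addr_extern",
    "10": "addr_std",
    "11": "addr_var",
}


def pick_constructor(bits: str) -> tuple[str, str]:
    """Read the 2-bit tag and return (constructor name, remaining bits)."""
    if len(bits) < 2:
        raise ValueError("need at least two bits for the tag")
    tag, rest = bits[:2], bits[2:]
    return TAGS[tag], rest


name, rest = pick_constructor("10" + "0101")
print(name, rest)  # prints addr_std 0101
```

The remaining bits would then be parsed according to the fields of the chosen constructor.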
All main variations of constructors are presented in the following table:
Constructor | Serialization |
---|---|
some#3f5476ca | A 32-bit uint is serialized from a hex value. |
some#5fe | A 12-bit uint is serialized from a hex value. |
some$0101 or _$0101 | Serialize the 0101 raw bits. |
some or some# | Serialize crc32(equation) \| 0x80000000. |
some#_ or some$_ or _ | Serialize nothing. |
A hexadecimal tag may also end with the _ character.
This indicates that the tag should be interpreted as the hexadecimal value with the least significant bit (LSB) removed.
For example, consider a schema that represents a stack integer value and whose tag is written as #0201_. The tag is not simply 0x0201. To compute the actual tag, remove the LSB from the binary representation of 0x0201, which gives 0b000000100000000.
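The 0x0201 computation can be reproduced with a few lines of Python. The helper name underscore_tag_bits is hypothetical, and the sketch follows the rule exactly as stated here (drop the LSB), assuming, as in the example, that the hexadecimal value ends in a 1 bit.

```python
def underscore_tag_bits(hex_digits: str) -> str:
    """Return the tag bits of a hex tag written with a trailing underscore.

    Assumption: the last bit of the hex value is 1, as in 0x0201; other
    cases are not covered by this sketch.
    """
    width = 4 * len(hex_digits)  # each hex digit contributes 4 bits
    bits = format(int(hex_digits, 16), f"0{width}b")
    if not bits.endswith("1"):
        raise ValueError("sketch only covers values whose last bit is 1")
    return bits[:-1]  # drop the least significant bit


print(underscore_tag_bits("0201"))  # prints 000000100000000
```

The result is 15 bits long, one bit shorter than the 16 bits of the four hexadecimal digits.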
Field definitions
Field definitions follow each constructor and its optional tag. A field definition has the format ident:type-expr, where:
- ident is the field’s name. If you don’t want to assign a specific name to the field, just leave it as _.
- type-expr is the field’s type. It can be a simple type, a parameterized type with appropriate arguments, or a more complex expression.
Note that the fields of a scheme must fit into a cell, which holds at most 1023 bits and 4 references.
TL-B schemes define types. At the same time, the previously defined types can be used in other schemes in fields.
Therefore, in order to properly understand what types can be assigned to fields, we need to simultaneously figure out how to define the types themselves.
Types
Simple
Fields with simple types are instances of previously defined or built-in types. They do not contain parameterization or any conditions. For example, Tick and Tock transactions are designated for special system smart contracts that must be automatically invoked in every block. Tick transactions are executed at the start of each masterchain block, while Tock transactions are initiated at the end. In their TL-B representation, is_tock, storage_ph, compute_ph, aborted, and destroyed are fields with simple types.
Below are all the built-in types that can be used in defining fields:
- #: 32-bit unsigned integer;
- ## x: unsigned integer with x bits;
- #< x: unsigned integer less than x, stored as lenBits(x - 1) bits (up to 31 bits);
- #<= x: unsigned integer less than or equal to x, stored as lenBits(x) bits (up to 32 bits);
- Any or Cell: remaining bits and references;
- Int: 257 bits;
- UInt: 256 bits;
- Bits: 1023 bits;
- uint1 to uint256: 1 to 256 bits;
- int1 to int257: 1 to 257 bits;
- bits1 to bits1023: 1 to 1023 bits.
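The storage widths of the two bounded integer types can be worked out with a short Python sketch. It assumes that lenBits(n) is the number of bits in the binary representation of n, which is what int.bit_length returns in Python; the helper names are hypothetical.

```python
def len_bits(n: int) -> int:
    """Assumed meaning of lenBits: bits needed to write n in binary."""
    return n.bit_length()


def width_less_than(x: int) -> int:
    """Storage width of a field of type #< x."""
    return len_bits(x - 1)


def width_at_most(x: int) -> int:
    """Storage width of a field of type #<= x."""
    return len_bits(x)


print(width_at_most(30))    # prints 5
print(width_less_than(32))  # prints 5
print(width_at_most(32))    # prints 6
```

For instance, a field of type #<= 30 needs 5 bits because the largest allowed value, 30, is 11110 in binary.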
Contained complex expressions
- Multiplicative expression for tuple creation. The expression x * T creates a tuple of the natural length x, where each element is of type T.
- Serialization in the ref cell: ^[ ... ] means that the fields inside the brackets are serialized in a separate cell, which is referenced from the current cell. If each of several fields (a, b, c) is stored in a separate cell, the result is a chain of three referenced cells.
Some expressions are defined for the Nat type only. The Nat type is a built-in type that represents natural numbers.
The types #, ## x, #< x, and #<= x together constitute the Nat type. In TL-B schemes, the + and * operations can be performed on Nat.
- Constraints: Nat = Nat | Nat <= Nat | Nat < Nat | Nat >= Nat | Nat > Nat. Each constraint must be enclosed in curly braces {}, and the variables used inside must be defined earlier.
- Condition operator: Nat?Type means that if the natural number is positive, then the field has the type Type. Otherwise, the field is omitted.
In the Example type, the field b is serialized only if the a field is equal to 1.
- Bit selector: the expression E . B means to take bit B from the Nat value E.
In the CondExample type, the variable b is serialized only if the second bit of a is 1.
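The bit selector and the condition operator combine naturally, which a small Python sketch can show. It assumes that bits are numbered from the least significant bit starting at 0 (the text does not spell this out), and it models a cell as a string of 0 and 1 characters. The helper names bit and load_optional are hypothetical.

```python
from typing import Optional


def bit(e: int, b: int) -> int:
    """Bit selector E . B, assuming bit 0 is the least significant bit."""
    return (e >> b) & 1


def load_optional(cond: int, bits: str, width: int) -> tuple[Optional[int], str]:
    """Condition operator: read a `width`-bit field only if cond is positive."""
    if cond > 0:
        if len(bits) < width:
            raise ValueError("not enough bits")
        return int(bits[:width], 2), bits[width:]
    return None, bits  # field omitted, nothing is consumed


# Bit 2 of 0b100 is set, so the 4-bit field is read.
print(load_optional(bit(0b100, 2), "11110", 4))  # prints (15, '0')
# Bit 2 of 0b011 is clear, so the field is skipped.
print(load_optional(bit(0b011, 2), "11110", 4))  # prints (None, '11110')
```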
For a real-world example, consider the McStateExtra combinator, which describes data stored in each masterchain block.
Parameterized
Parameterized types are patterns in which other types are parameters. Such parameters are declared in curly brackets {} or must be declared previously as a combinator’s field. Only identifiers of the Nat and Type types can be parameters.
A simple example of a parameterized type is a type A that is parameterized by a natural number x.
When deserializing My32UintValue, the parser fetches a 32-bit unsigned integer, as specified by the natural parameter 32 in the A type.
Let’s consider another example, where a combinator A is parameterized by a type variable X.
In that example, the Bit type is passed to A as a parameter.
Partial application is also possible with such parameterized types.
Field sizes can also depend on other fields: for example, the size of the b field can be stored inside the a field. When deserializing type A, we first load the 8-bit unsigned integer from the a field and then use this value to determine the size of the b field.
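The dependency between the two fields can be sketched in Python. The sketch models a cell as a string of 0 and 1 characters, which is a simplification, and the helper names load_uint and load_a_then_b are hypothetical.

```python
def load_uint(bits: str, width: int) -> tuple[int, str]:
    """Read an unsigned integer of `width` bits; return (value, remaining bits)."""
    if len(bits) < width:
        raise ValueError("not enough bits")
    value = int(bits[:width], 2) if width else 0
    return value, bits[width:]


def load_a_then_b(bits: str) -> tuple[int, int, str]:
    """Load the 8-bit field a, then a field b whose width is the value of a."""
    a, rest = load_uint(bits, 8)
    b, rest = load_uint(rest, a)
    return a, b, rest


# a = 3 (00000011), so b is the next 3 bits (101 = 5); "11" is left over.
print(load_a_then_b("00000011" + "101" + "11"))  # prints (3, 5, '11')
```

The width of b is not known until a has been read, so the two loads cannot be reordered.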
This strategy also works for parameterized types.
Special
Currently, TVM allows the following types of cells:
- Ordinary
- PrunedBranch
- Library
- MerkleProof
- MerkleUpdate
By default, all cells are Ordinary. This applies to all cells described in TL-B as well.
To enable the loading of special types in the constructor, prepend ! before the constructor.
This technique allows displaying SPECIAL cells when printing a structure and ensures proper validation of structures with special cells.
Implicit fields and the negate operator (~)
Some fields may be implicit. These fields are defined within curly brackets {}, as constraints and parameters of the parametrized types, indicating that they are not directly serialized. Instead, their values must be deduced from other data, usually the parameters of the type being serialized.
Some occurrences of the indicators already defined earlier in a scheme are prefixed by a tilde ~. This indicates that the indicator’s occurrence is used in the opposite way from the default behavior. On the left-hand side of the equation, it means that the indicator is deduced (computed) based on this occurrence, rather than substituting its type’s defined value. Conversely, on the right-hand side, the indicator is not deduced from the serialized type but instead computed during the deserialization process. In other words, a ~ transforms an input argument into an output argument or vice versa.
A simple example of the negate operator is the definition of the implicit indicator b based on another indicator a.
Given a, the value of b is computed as a + 100. After this definition, you can use the new indicator as input for Nat types.
The size of example_dynamic_var is computed at runtime: we load a and use its value to determine the size of example_dynamic_var.
Alternatively, it can be applied to other types:
Negate operator (~) in type definition
Consider a type Define ~n m that takes m and computes n by loading it from an m-bit unsigned integer.
In the Example type, we store the variable computed by the Define type into n_from_define. We also know it’s an 8-bit unsigned integer because we apply the Define type as Define ~n_from_define 8. Now, we can use the n_from_define variable in other types to determine the serialization process.
This technique leads to more complex type definitions such as Unions that represent dynamic chains of some type.
References
- A description of an older version of TL;
- block.tlb: the main TL-B file that describes all basic TON blockchain structures.