“All problems in computer science can be solved by another level of indirection.” – David Wheeler
Computer science professor David Wheeler once famously stated that every problem in computer science can be solved, or made a lot easier to solve, by adding one more level of indirection.
Indirection is an abstract concept we often employ in software engineering. It allows us to write code that can be easily substituted for a different implementation, like a Lego building block.
Let’s take a look at why indirection is relevant to our discussion of Tezos abstract blockchains.
Adapting to Change
You’ve built a nice system for your employer.
It works fine for years but, all of a sudden, your boss calls and says the company is cancelling all closed source / commercial software contracts.
You must now port the entire system to an open source database system like PostgreSQL or MySQL. All those stored procedures and transactions written in a vendor-specific language must now be coded again in a different language.
What if this new system doesn’t work and you need to write yet another version, for a different RDBMS? You’d need to rewrite everything again.
Indirection
This is where software indirection comes in.
If you add a layer between your software and the underlying database management system, you can write your software to communicate with that layer instead of sending commands directly to the RDBMS.
That way, when you change the database system, you only need to modify or create one layer. Everything else stays the same.
This is the main idea behind hardware drivers, layered network protocols, database drivers and even web services.
You create a layer of indirection between your software and the objects you wish to manipulate. That way, when the objects change, only a layer must be changed, leaving the rest of the business logic intact.
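The idea can be sketched in OCaml, the language used later in this article. This is only an illustration: the STORE signature, both backends and the Accounts module are hypothetical names invented for this example, and the two in-memory backends merely stand in for two different database systems.

```ocaml
(* Hypothetical storage interface: the only thing application code sees. *)
module type STORE = sig
  type t
  val empty : t
  val put : t -> string -> string -> t
  val find : t -> string -> string option
end

(* Backend 1: an association list (stand-in for one database system). *)
module List_store : STORE = struct
  type t = (string * string) list
  let empty = []
  let put db k v = (k, v) :: List.remove_assoc k db
  let find db k = List.assoc_opt k db
end

(* Backend 2: a balanced map (stand-in for a different database system). *)
module Map_store : STORE = struct
  module M = Map.Make (String)
  type t = string M.t
  let empty = M.empty
  let put db k v = M.add k v db
  let find db k = M.find_opt k db
end

(* Business logic is written once, against STORE only. *)
module Accounts (S : STORE) = struct
  let fresh = S.empty
  let register db user email = S.put db user email
  let email_of db user = S.find db user
end

module A1 = Accounts (List_store)
module A2 = Accounts (Map_store)

let () =
  let db1 = A1.register A1.fresh "alice" "alice@example.com" in
  let db2 = A2.register A2.fresh "alice" "alice@example.com" in
  (* Same answer from both backends, with no change to Accounts. *)
  assert (A1.email_of db1 "alice" = A2.email_of db2 "alice")
```

Swapping the database here means writing one new module that satisfies STORE; the Accounts code is not touched.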
Abstract Blockchains
What if we adapted the above concept to blockchains? What if instead of building a software system that directly manipulated one particular blockchain, we inserted an indirection layer between our application code and the physical blockchain?
Then we could extract common blockchain operations into a blockchain driver of sorts. Our software would then interact with this driver which, in turn, interacts with the underlying blockchain. This would allow us to change the blockchain being manipulated without making any changes to our application logic!
Let’s use a popular example to illustrate how this would work.
Bitcoin Core
Bitcoin Core, the world’s first successful implementation of a blockchain-based cryptocurrency, applies a certain set of operations on a blockchain. It adds records (blocks) to a blockchain, sends transactions and blocks out to network peers, and so on.
If we took each of the tasks performed by Bitcoin Core and added a layer of indirection between the calling code and the actual blockchain, we’d end up with a blockchain driver. Once all operations are implemented in the indirection layer, you could change the underlying blockchain to Ethereum, Tezos, Litecoin or any other cryptocurrency, and the application code sitting above the driver wouldn’t need to be changed.
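A blockchain driver of this kind might look roughly like the sketch below. Everything in it is hypothetical: CHAIN_DRIVER, the two toy backends and the App module are invented for illustration and are not taken from Bitcoin Core or Tezos.

```ocaml
(* Hypothetical driver interface; the names are illustrative only. *)
module type CHAIN_DRIVER = sig
  type chain
  val name : string
  val genesis : chain
  val append : chain -> string -> chain   (* add one block payload *)
  val height : chain -> int
end

(* Toy backend A: keeps every block payload in a list. *)
module Chain_a : CHAIN_DRIVER = struct
  type chain = string list
  let name = "toy-chain-a"
  let genesis = []
  let append c payload = payload :: c
  let height c = List.length c
end

(* Toy backend B: keeps only a block count. *)
module Chain_b : CHAIN_DRIVER = struct
  type chain = int
  let name = "toy-chain-b"
  let genesis = 0
  let append c _payload = c + 1
  let height c = c
end

(* Application logic: unaware of which chain sits underneath. *)
module App (D : CHAIN_DRIVER) = struct
  let record_all payloads = List.fold_left D.append D.genesis payloads
  let summary payloads =
    Printf.sprintf "%s: %d blocks" D.name (D.height (record_all payloads))
end

module On_a = App (Chain_a)
module On_b = App (Chain_b)

let () =
  print_endline (On_a.summary [ "tx1"; "tx2" ]);
  print_endline (On_b.summary [ "tx1"; "tx2" ])
```

The App functor plays the role of the application code: it is instantiated twice, once per backend, without being edited.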
This is exactly what Tezos does!
Bitcoin Core, on the other hand, manipulates the blockchain directly. Making changes to Bitcoin Core source code is notoriously difficult. When something gets changed on the blockchain itself, several parts of Bitcoin Core must be changed accordingly. There is very tight coupling between Bitcoin business logic and its low level blockchain operations.
Tezos Abstract Blockchain
Tezos leverages the concept of abstract blockchains to separate direct operations on the data store from the higher level business logic.
In software engineering terms, we say Tezos’ business logic is loosely coupled to the lower level blockchain operations. Loose coupling means that changes made to one component do not necessarily propagate to others.
Tezos Blocks
Tezos represents block headers as four generic fields:
type raw_block_header = {
  pred: Block_hash.t;
  header: Bytes.t;
  operations: Operation_hash.t list;
  timestamp: float;
}
This data structure is made very generic on purpose. It does not prescribe how a block should be assembled; it simply tells us a block must specify the hash of the previous block (pred), the actual header bytes (header), a list of operation hashes (operations) and, finally, a timestamp that tells us when the block was minted.
These four fields easily adapt to any known blockchain.
You could implement an Ethereum, Bitcoin, Litecoin or any other cryptocurrency block using this abstract specification. Heck, even a Tezos block could be represented using it!
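As a rough illustration, here is how two different kinds of block might be squeezed into the same record. The Block_hash and Operation_hash modules below are simplified stand-ins (a hash is just a string), and every value is made up; the point is only that chain-specific fields end up serialized inside the opaque header bytes.

```ocaml
(* Stand-ins for the Tezos hash modules: here a hash is just a string. *)
module Block_hash = struct type t = string end
module Operation_hash = struct type t = string end

type raw_block_header = {
  pred : Block_hash.t;
  header : Bytes.t;
  operations : Operation_hash.t list;
  timestamp : float;
}

(* Made-up Bitcoin-style block: version, difficulty bits and nonce are
   packed into the opaque [header] bytes. *)
let bitcoin_like = {
  pred = "00000000aaaa";
  header = Bytes.of_string "version=2;bits=1d00ffff;nonce=42";
  operations = [ "tx-hash-1"; "tx-hash-2" ];
  timestamp = 1_500_000_000.0;
}

(* Made-up Ethereum-style block: different fields, same record shape. *)
let ethereum_like = {
  pred = "0xbbbb";
  header = Bytes.of_string "gas_limit=8000000;state_root=0xcccc";
  operations = [ "0xtx1" ];
  timestamp = 1_500_000_015.0;
}

(* Generic code works on either block without knowing its origin. *)
let describe b =
  Printf.sprintf "pred=%s ops=%d header_bytes=%d"
    b.pred (List.length b.operations) (Bytes.length b.header)

let () =
  List.iter (fun b -> print_endline (describe b))
    [ bitcoin_like; ethereum_like ]
```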
Tezos Block Operations
Having an abstract block definition isn’t very useful unless we can do stuff with it.
Tezos defines an abstract key-value store called a Context. This storage system could be used to store any arbitrary data, but Tezos uses it to manage blocks.
A Context is defined like so:
module Context : sig
  type t
  type key = string list
  val get: t -> key -> Bytes.t option Lwt.t
  val set: t -> key -> Bytes.t -> t Lwt.t
  val del: t -> key -> t Lwt.t
  (*...*)
end
As you can see, the signature does not say how a context is represented. The type t is abstract: we only know what can be done with a value of that type, not what it looks like inside. Keys are lists of strings and values are raw bytes (Bytes.t), so the store can hold any data that can be serialized to bytes, and an implementation is free to keep that data in memory, on disk or anywhere else.
A blockchain is immutable. Therefore, blocks can only be added to it, never removed (unless an exceptional situation occurs, like a forced hard fork).
Despite this, the Context definition includes a del operation which takes a key. When called, it will delete an entry from the storage system. As you can guess, this specific action will not be called on blocks in normal circumstances.
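To make the signature concrete, here is a minimal in-memory sketch of such a store. It is a simplification and not the Tezos implementation: the Lwt.t wrappers are dropped so the example needs only the standard library, and the empty value is an addition that does not appear in the signature above.

```ocaml
(* Simplified, synchronous stand-in for the Context signature:
   same three operations, but without the Lwt.t promise wrapper. *)
module Context : sig
  type t
  type key = string list
  val empty : t   (* assumption: not part of the original signature *)
  val get : t -> key -> Bytes.t option
  val set : t -> key -> Bytes.t -> t
  val del : t -> key -> t
end = struct
  module M = Map.Make (struct
    type t = string list
    let compare = compare
  end)

  type t = Bytes.t M.t
  type key = string list

  let empty = M.empty
  let get ctx k = M.find_opt k ctx
  let set ctx k v = M.add k v ctx
  let del ctx k = M.remove k ctx
end

let () =
  let k = [ "blocks"; "head" ] in
  let c1 = Context.set Context.empty k (Bytes.of_string "block-1") in
  let c2 = Context.del c1 k in
  (* del returned a new context; c1 still holds the entry. *)
  assert (Context.get c1 k = Some (Bytes.of_string "block-1"));
  assert (Context.get c2 k = None)
```

Note that set and del return a new context rather than modifying the old one, which is what the `-> t` at the end of their types suggests. In this sketch, deleting a key does not destroy the earlier version of the store.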
Tezos Protocol
Finally, we put everything together to build a Protocol module.
Here’s the abstract specification of a Tezos protocol:
type score = Bytes.t list
module type PROTOCOL = sig
  type operation
  val parse_block_header : raw_block_header -> block_header option
  val parse_operation : Bytes.t -> operation option
  val apply :
    Context.t ->
    block_header option ->
    (Operation_hash.t * operation) list ->
    Context.t option Lwt.t
  val score : Context.t -> score Lwt.t
  (*...*)
end
This is a blueprint for protocols that can be implemented on the Tezos platform. Why a blueprint? Because everything we’ve discussed until now is abstract. This is a very common pattern in functional programming.
OCaml, the language used in the reference Tezos implementation, is a functional programming language, which means programs are built largely from declarations and expressions rather than sequences of commands. In a functional language you tend to describe what things are rather than spell out, step by step, what the computer should do.
The protocol specification above doesn’t tell the computer what it should do; it tells it what a protocol is.
Self Amending Protocols
Tezos protocols are also able to self amend. The abstraction mechanism not only allows us to implement this system for several different blockchains, it also allows the system to change itself! There is no hard-coded implementation anywhere in the code we’ve seen. The implementation goes elsewhere. (We’ll discuss self amending Tezos protocols in a separate article.)
The Protocol module defines a few subroutines, namely: parse_block_header, parse_operation, apply and score.
None of these operations are implemented in the initial specification. As expected, they’re all abstract: they simply tell implementors what actions a protocol should be able to perform. In a sense, this is a bit like the concept of interfaces in the object oriented paradigm.
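To show what implementing such an interface involves, here is a toy protocol written against a simplified version of the signature. Several things are assumptions made so the sketch compiles on its own: the Lwt.t wrappers are removed, hashes are plain strings, the context is a string map called Store, block_header is declared as an abstract type inside the signature, and Counter_protocol is an invented example, not a real Tezos protocol.

```ocaml
(* Simplified stand-ins so the sketch is self-contained. *)
module Store = Map.Make (String)
type context = string Store.t

type raw_block_header = {
  pred : string;
  header : Bytes.t;
  operations : string list;
  timestamp : float;
}

type score = Bytes.t list

(* Synchronous variant of PROTOCOL (no Lwt.t); block_header is made
   an abstract type of the signature, which is an assumption. *)
module type PROTOCOL = sig
  type block_header
  type operation
  val parse_block_header : raw_block_header -> block_header option
  val parse_operation : Bytes.t -> operation option
  val apply :
    context -> block_header option -> (string * operation) list ->
    context option
  val score : context -> score
end

(* Toy protocol: each operation "+N" adds N to a single counter. *)
module Counter_protocol : PROTOCOL = struct
  type block_header = int          (* the header bytes hold a level *)
  type operation = Add of int

  let parse_block_header raw = int_of_string_opt (Bytes.to_string raw.header)

  let parse_operation b =
    let s = Bytes.to_string b in
    if String.length s > 1 && s.[0] = '+' then
      Option.map (fun n -> Add n)
        (int_of_string_opt (String.sub s 1 (String.length s - 1)))
    else None

  let counter ctx =
    Option.value ~default:0
      (Option.bind (Store.find_opt "counter" ctx) int_of_string_opt)

  let apply ctx header ops =
    match header with
    | None -> None                 (* reject blocks with a bad header *)
    | Some _ ->
      let total =
        List.fold_left (fun acc (_, Add n) -> acc + n) (counter ctx) ops in
      Some (Store.add "counter" (string_of_int total) ctx)

  let score ctx = [ Bytes.of_string (string_of_int (counter ctx)) ]
end

let () =
  let module P = Counter_protocol in
  let raw = { pred = "genesis"; header = Bytes.of_string "1";
              operations = [ "op1"; "op2" ]; timestamp = 0.0 } in
  let ops =
    List.filter_map
      (fun (h, b) ->
        Option.map (fun op -> (h, op)) (P.parse_operation (Bytes.of_string b)))
      [ ("op1", "+2"); ("op2", "+3") ] in
  match P.apply Store.empty (P.parse_block_header raw) ops with
  | Some ctx -> assert (P.score ctx = [ Bytes.of_string "5" ])
  | None -> assert false
```

Code that only knows about PROTOCOL can drive this module without knowing what an operation or a block header looks like inside, which is the loose coupling described earlier.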
Seed Protocol
Tezos comes with one initial implementation of a protocol, called the seed protocol.
As the name suggests, a seed protocol is the starting point for all future Tezos protocols – a bit like the concept of a Genesis Block in the Bitcoin blockchain.
The seed protocol is a concrete implementation of the above Protocol module. Every new protocol must implement this interface.
Once a protocol is implemented, its files are packaged in a tar archive and the archive’s SHA256 hash is computed (much like the checksums published alongside software distributions). That hash is then proposed for storage on the blockchain.
If a majority of XTZ delegators vote for the hash, it is considered accepted and the protocol gets committed on the blockchain. Tezos clients will then refuse any protocol implementation that does not match that SHA256 hash.
This is a very clever construct: although the compiled OCaml code, which actually executes on your computer, does not reside on the blockchain, its SHA256 hash does. Finding two different files with the same SHA256 hash is computationally infeasible (no practical collision attack on SHA256 is known at this time), so the hash effectively identifies one specific piece of software. By verifying the software against this hash, we’ve indirectly referenced executable code on the blockchain without actually storing it there!
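The check a client performs can be sketched in a few lines. One important caveat: OCaml’s standard library does not include SHA256, so this example uses the MD5-based Digest module purely as a stand-in. MD5 is not collision resistant and would not be acceptable in practice; a real check would use a SHA256 implementation from a library. The function names and archive contents are invented for illustration.

```ocaml
(* Stand-in for SHA256: the standard library's Digest module is MD5.
   Used here only so the example runs without extra dependencies. *)
let fingerprint (archive : string) : string =
  Digest.to_hex (Digest.string archive)

(* What the chain stores: only the hash of the accepted archive. *)
let accepted_hash = fingerprint "protocol archive bytes, version 1"

(* What a client does before running a protocol obtained off-chain. *)
let verify ~on_chain_hash archive =
  String.equal (fingerprint archive) on_chain_hash

let () =
  assert (verify ~on_chain_hash:accepted_hash
            "protocol archive bytes, version 1");
  assert (not (verify ~on_chain_hash:accepted_hash
                 "protocol archive bytes, tampered"))
```

The archive itself can travel over any channel; only the short hash has to live on the blockchain for clients to agree on which code is the accepted one.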
Conclusion
I hope this introduction to abstract Tezos blockchains has given you a better idea of how this innovative system works.
By keeping the concepts at a very high level, and only storing metadata about the implementation on the blockchain, the Tezos system is able to adapt itself to any known blockchain system.
The underlying implementation of the blockchain is loosely coupled to the higher level business logic (the shell), which means the lower level code can be altered without requiring changes to other parts of the system.
The Tezos concept of an abstract blockchain, along with its self amending capability, allows the system to be upgraded in production, without taking the system down. Every component in the Tezos system can be modified and transparently committed on the blockchain. This is only possible because changing certain parts of the system does not break others or require them to be modified.