After performing the initial database insert with
toolbox/db/blocks_inserter, new blocks will keep arriving from the network.
To keep the database in sync after the initial bulk insertion,
keep toolbox/db/blocks_updater running in its own terminal.
Differences From blocks_inserter
Although similar in structure, blocks_updater works differently from
blocks_inserter (see the blocks_inserter documentation for details).
For instance, instead of starting from the oldest block file,
blocks_updater constantly refreshes the block file listing and starts from the end of the queue, most recent file first.
blocks_updater will also process the same file multiple times if needed. This often happens with the newest block file, as new blocks are committed to disk.
blocks_inserter, on the other hand, will not process the same file twice, which can leave blocks missing from the most recent block file when the process is restarted. Because
blocks_inserter consumes so many resources, even if there are “holes” (missing blocks, TXs, etc.) in your local DB, leave them as they are and let
blocks_updater fill in the blanks.
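The newest-first, re-scanning loop described above could be sketched as follows. This is an illustration only: the directory layout, the `blk*.dat` naming pattern, and the `process_file` callback are assumptions, not the tool's actual internals.

```python
import glob
import os
import time

def update_loop(blocks_dir, process_file, poll_seconds=10, max_passes=None):
    """Rescan the block file listing on every pass and process files
    newest-first, revisiting files as they grow with new blocks."""
    passes = 0
    while max_passes is None or passes < max_passes:
        # Refresh the listing on each pass so newly created block
        # files are picked up immediately.
        files = sorted(glob.glob(os.path.join(blocks_dir, "blk*.dat")))
        # Start from the end of the queue: most recent file first.
        for path in reversed(files):
            process_file(path)  # the same file may be processed many times
        passes += 1
        time.sleep(poll_seconds)
```

Processing newest-first means the file most likely to contain fresh blocks is handled before the loop revisits older files.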
- blocks_inserter is the bulk insertion tool. It’s OK if it leaves some holes in the DB, even though it’s designed to work well start to finish.
- blocks_updater is the fine-grained DB maintenance tool. It will fill in any missing blocks, TXs, inputs, outputs and address graph entries.
Because blocks_updater checks whether each data structure already exists, it is much slower than blocks_inserter.
While a block can be uniquely identified by a single field such as its block hash, block height or merkle root, other data structures have no single field that is guaranteed unique, so none can serve as a key on its own.
For example, inputs and outputs are identified by a tuple consisting of their index within the transaction and the TX hash. Every single input and output must be checked before it can be inserted or updated.
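The composite-key idea and the per-row existence check could look roughly like the sketch below. The table and column names are made up for illustration and are not the tool's actual schema; the point is that neither `tx_hash` nor `idx` is unique alone, and that every row incurs a uniqueness check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE outputs (
        tx_hash TEXT    NOT NULL,
        idx     INTEGER NOT NULL,   -- position within the transaction
        value   INTEGER NOT NULL,
        PRIMARY KEY (tx_hash, idx)  -- neither field is unique on its own
    )
""")

def upsert_output(tx_hash, idx, value):
    # INSERT OR IGNORE makes re-processing the same block file
    # idempotent: existing rows are left untouched and missing rows
    # ("holes") are filled in. This per-row uniqueness check is what
    # makes an updater-style pass slower than a bulk insert.
    conn.execute(
        "INSERT OR IGNORE INTO outputs (tx_hash, idx, value) VALUES (?, ?, ?)",
        (tx_hash, idx, value),
    )

upsert_output("deadbeef", 0, 5000)
upsert_output("deadbeef", 0, 5000)  # second pass over the same file: no duplicate
```

A bulk inserter can skip this check entirely and stream rows in, which is why it is so much faster but may leave gaps.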
Single Thread Operation
blocks_updater is not meant for bulk inserts; its purpose is to keep the local database up to date, so it needs only a single thread.
Neither Bitcoin nor other popular cryptocurrencies produce blocks fast enough to require more than one CPU for the update process. Any low-cost CPU and disk combination performing local reads will provide far more throughput than most popular chains will require in the near future.