After performing the initial database insert using toolbox/db/blocks_inserter, new blocks will continue to arrive from the network.
To keep the database in sync after the initial bulk insertion, keep toolbox/db/blocks_updater running in its own terminal.
Although similar in structure, blocks_updater works differently from blocks_inserter (go here for details on blocks_inserter).
For instance, instead of starting from the oldest block file, blocks_updater constantly refreshes the block file listing and starts from the end of the queue, most recent file first.
This means blocks_updater will also process the same file multiple times if needed. This happens often with the newest block file, as new blocks get committed to disk. blocks_inserter, on the other hand, never processes the same file twice, so blocks from the most recent block file can be missing when the process is restarted. Because blocks_inserter is so resource-intensive, even if there are “holes” (missing blocks, TXs, etc.) in your local DB, leave them as they are and let blocks_updater fill in the blanks rather than running blocks_inserter again.
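The difference in file ordering can be illustrated with a minimal Python sketch. It is not the toolbox's actual code: the function names are hypothetical, and it assumes the block files follow the blkNNNNN.dat naming used by Bitcoin Core.

```python
import re
from pathlib import Path

# Assumption: block files are named blkNNNNN.dat, as written by Bitcoin Core.
BLOCK_FILE_RE = re.compile(r"^blk(\d+)\.dat$")


def list_block_files(blocks_dir):
    """Return the block files in blocks_dir, oldest first by file number."""
    numbered = []
    for path in Path(blocks_dir).iterdir():
        match = BLOCK_FILE_RE.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


def inserter_order(blocks_dir, already_done):
    """Bulk-insert style: oldest file first, never revisiting a finished file."""
    return [p for p in list_block_files(blocks_dir) if p.name not in already_done]


def updater_order(blocks_dir):
    """Updater style: re-list the directory on every call, newest file first.

    No "already done" set is kept, so the newest file is visited again on
    each pass and blocks appended to it since the last pass are picked up.
    """
    return list(reversed(list_block_files(blocks_dir)))
```

Calling updater_order in a loop would pick up a newly created block file on the next pass, while inserter_order skips any file name already recorded in already_done.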
In short:
Because blocks_updater checks whether each data structure already exists before writing it, it is much slower than blocks_inserter.
While blocks can be uniquely identified by block hash, block height or Merkle root, other data structures have no single field that can serve as a superkey, because no individual field is guaranteed to be unique.
For example, inputs and outputs are identified by a tuple consisting of their sequence number and a TX hash. Every single input and output must be checked before it can be updated.
Since blocks_updater is not meant for bulk inserts but for keeping the local database up to date, it only needs one thread to run.
There is no throughput requirement in Bitcoin or other popular cryptocurrencies that would call for more than one CPU for the update process. Any low-cost CPU and disk combination performing local reads will provide much higher throughput than most popular cryptocurrencies will require in the near future.
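The existence checks described above can be sketched as follows. This is only an illustration under stated assumptions: the schema, the function names and the use of an in-memory SQLite database are all hypothetical and are not taken from the toolbox. It shows a block looked up by a single field (its hash), while an output has to be looked up by the (TX hash, sequence number) pair.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blocks (hash TEXT PRIMARY KEY, height INTEGER NOT NULL);
        CREATE TABLE outputs (
            tx_hash TEXT    NOT NULL,
            seq     INTEGER NOT NULL,
            value   INTEGER NOT NULL,
            PRIMARY KEY (tx_hash, seq)  -- neither column is unique on its own
        );
        """
    )
    return conn


def insert_block_if_missing(conn, block_hash, height):
    """Insert a block unless its hash is already stored; True if inserted."""
    row = conn.execute(
        "SELECT 1 FROM blocks WHERE hash = ?", (block_hash,)
    ).fetchone()
    if row is not None:
        return False
    conn.execute(
        "INSERT INTO blocks (hash, height) VALUES (?, ?)", (block_hash, height)
    )
    return True


def insert_output_if_missing(conn, tx_hash, seq, value):
    """Insert an output unless the (tx_hash, seq) pair exists; True if inserted."""
    row = conn.execute(
        "SELECT 1 FROM outputs WHERE tx_hash = ? AND seq = ?", (tx_hash, seq)
    ).fetchone()
    if row is not None:
        return False
    conn.execute(
        "INSERT INTO outputs (tx_hash, seq, value) VALUES (?, ?, ?)",
        (tx_hash, seq, value),
    )
    return True
```

Every call performs a lookup before the write, which is where the extra cost relative to a blind bulk insert comes from. A single thread running these checks sequentially is enough for an update loop of this kind.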