Downloader

Seed/download historical data

Downloader is the service for seeding and downloading historical data over the BitTorrent protocol. The data is stored as snapshots, which are immutable .seg files.

The ETH core instructs the Downloader component to download (and then seed) specific files from the BitTorrent network. The files are identified by their "info hashes", which are a form of content addressing. The files that the ETH core asks for are block headers and block bodies. The Downloader then interacts with the BitTorrent network to retrieve the files needed by the ETH core.
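To illustrate what content addressing means here: in BitTorrent v1, an info hash is the SHA-1 digest of the bencoded "info" dictionary of a .torrent file, so identical content always maps to the same identifier. The following is a minimal sketch of that idea, not Erigon's code; the bencode helper, the file name and the sizes are illustrative assumptions.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings, lists and dicts with BitTorrent's bencoding."""
    if isinstance(value, bool):
        raise TypeError("bencoding has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        pairs = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in pairs) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """Return the hex SHA-1 of the bencoded info dictionary (BitTorrent v1)."""
    return hashlib.sha1(bencode(info)).hexdigest()


# Hypothetical info dictionary; the file name and sizes are made up.
example_info = {
    "name": "example-headers.seg",
    "length": 1024,
    "piece length": 512,
    "pieces": hashlib.sha1(b"a" * 512).digest() + hashlib.sha1(b"b" * 512).digest(),
}
print(info_hash(example_info))
```

Changing any field of the dictionary, including a single piece hash, produces a different info hash, which is why a list of hashes is enough to pin down exactly which files to fetch.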

Info: While all Erigon components are separable and can be run on different machines, the downloader must run on the same machine as Erigon to be able to share downloaded and seeded files.

Start Erigon with snapshots support

Like many other Erigon components (txpool, sentry, rpc daemon), the downloader can be integrated into Erigon or run as a separate process.

By default, the downloader runs inside Erigon when the --snapshots flag is set:

./build/bin/erigon --snapshots --datadir=<your_datadir>

Running downloader as a separate process

The Downloader can also run as an independent process. In that case, start Erigon with the --snapshots --downloader.api.addr=127.0.0.1:9093 flags.

Before using a separate downloader process the executable must be built:

cd erigon
make downloader

You can then start the downloader:

./build/bin/downloader --downloader.api.addr=127.0.0.1:9093 --torrent.port=42068 --datadir=<your_datadir>

--downloader.api.addr is used for internal communication with Erigon.

--torrent.port=42068 is the public port on which the BitTorrent protocol listens.

You can increase/limit the network usage by adding the following flags:

--torrent.download.rate=512mb --torrent.upload.rate=512mb

The default download speed is 16mb/sec.

On startup, Erigon sends a list of .torrent files to the Downloader and waits for the download to reach 100% completion.

Use --snap.keepblocks=true to keep retired blocks in the DB instead of deleting them.

Any network/chain can start with snapshot sync:

  • the node only downloads snapshots registered in the following repository: https://github.com/erigontech/erigon-snapshot

  • the node moves old blocks from the DB into snapshots of 1K blocks, then merges them into progressively larger ranges up to snapshots of 500K blocks, and automatically starts seeding each new snapshot
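The merge behavior described above can be pictured as covering the retired block range with the largest aligned segments available. The sketch below is only an illustration under stated assumptions: the 1K and 500K sizes come from the text, while the intermediate sizes (10K, 100K), the alignment rule and the function name are hypothetical and may not match what Erigon actually does.

```python
def segment_ranges(retired_blocks, sizes=(500_000, 100_000, 10_000, 1_000)):
    """Cover [0, retired_blocks) with the largest aligned segments that fit.

    `sizes` must be in descending order. Only 1_000 and 500_000 are taken
    from the documentation; the intermediate sizes are assumptions.
    Blocks that do not fill a whole smallest segment stay in the DB.
    """
    ranges = []
    start = 0
    while True:
        for size in sizes:
            # A segment must start on a multiple of its size and fit entirely.
            if start % size == 0 and start + size <= retired_blocks:
                ranges.append((start, start + size))
                start += size
                break
        else:
            # No segment size fits any more.
            break
    return ranges


for first, last in segment_ranges(612_000):
    print(f"blocks [{first}, {last})")
```

With 612,000 retired blocks this layout would contain one 500K segment, one 100K segment, one 10K segment and two 1K segments; the small segments are the ones that later merges would absorb.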

Creation of a new network or bootnode

You may need to create new snapshots and start seeding them.

Creating new snapshots dumps blocks from the database to .seg files.

The .torrent files are then created, and the downloader will seed them automatically. The output format is compatible with https://github.com/erigontech/erigon-snapshot.

You can change the snapshot size by using the following flags:

--from=0 --to=1_000_000 --segment.size=500_000

Start downloader (seeds automatically)

Erigon is not required for seeding snapshots, but Erigon started with --snapshots also seeds them.

Additional info

Creating snapshots does not require a fully synced Erigon; the first few stages are enough. For example:

STOP_AFTER_STAGE=Senders

For security, however, it is better to have a fully synced Erigon.

Erigon can use snapshots only after indexing them. Erigon indexes them automatically, but the indexing can also be run manually (this step is not required for seeding):

Architecture

Downloader works from the <your_datadir>/snapshots/*.torrent files. Such files can be created in four ways:

  • Erigon can make the gRPC call downloader.Download(list_of_hashes), which triggers the creation of .torrent files

  • Erigon can create a new .seg file; the Downloader will scan the .seg file and create a .torrent

  • an operator can manually copy .torrent files (rsync from another server or restore from a backup)

  • an operator can manually copy a .seg file; the Downloader will scan the .seg file and create a .torrent
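As a rough illustration of the "scan .seg files and create .torrent" cases, the sketch below lists the .seg files in a snapshots directory that have no matching .torrent file yet. It is not the Downloader's implementation; the function name is hypothetical, and the assumption that a torrent file is named after its segment with a ".torrent" suffix appended should be checked against your own datadir.

```python
from pathlib import Path


def segments_missing_torrent(snapshots_dir):
    """Return the names of .seg files that have no '<name>.seg.torrent' sibling.

    The naming convention is an assumption made for this sketch.
    """
    snapshots_dir = Path(snapshots_dir)
    return sorted(
        seg.name
        for seg in snapshots_dir.glob("*.seg")
        if not seg.with_name(seg.name + ".torrent").exists()
    )
```

Calling segments_missing_torrent("<your_datadir>/snapshots") after copying .seg files by hand would show which segments still lack a .torrent file.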

Erigon does:

  • connect to the Downloader

  • wait for the download of all snapshots

  • when a .seg file is available, automatically create .idx files (secondary indices, for example to find a block by hash)

  • then switch to normal staged sync (which doesn't require a connection to the Downloader)

  • ensure that snapshot downloading happens only once: even if a new Erigon version includes new pre-verified snapshot hashes, Erigon will not download them (to avoid unpredictable downtime), but it may produce them itself.

Downloader does:

Technical details

  • To prevent attacks, .idx creation uses a random seed, so every node has different .idx files (but the same .seg files)

  • If you add or remove any .seg file manually, you also need to remove the <your_datadir>/snapshots/db folder

How to verify that .seg files have the same checksum as current .torrent files

Use it if you see strange behavior, bugs, bans, hardware problems, etc.
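For background on what such a check involves: a BitTorrent v1 .torrent file stores one 20-byte SHA-1 digest per fixed-size piece of the data, so verification means re-hashing the file piece by piece and comparing. The sketch below shows that idea for a single file. It is not the Downloader's own verification command, and the function names are hypothetical.

```python
import hashlib


def piece_hashes(data: bytes, piece_length: int) -> bytes:
    """Concatenate the SHA-1 digest of every piece, as stored in 'pieces'."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def verify_pieces(data_path, pieces: bytes, piece_length: int):
    """Return the indices of pieces whose SHA-1 does not match `pieces`.

    An index equal to the number of expected pieces means the file has
    extra data after the last expected piece.
    """
    expected = [pieces[i:i + 20] for i in range(0, len(pieces), 20)]
    bad = []
    with open(data_path, "rb") as f:
        for index, digest in enumerate(expected):
            chunk = f.read(piece_length)
            if hashlib.sha1(chunk).digest() != digest:
                bad.append(index)
        if f.read(1):
            bad.append(len(expected))
    return bad
```

An empty result means every piece matches; any returned index points at a corrupted, truncated or extended region of the file.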

Faster rsync

Release details

Start automatic commit of new hashes to branch master

crontab -e
@hourly cd <erigon_source_dir> && ./cmd/downloader/torrent_hashes_update.sh <your_datadir> <network_name> 1>&2 2>> ~/erigon_cron.log

The script pushes to the auto branch; before a release, merge auto into main manually.

Command line options

To display the available options for the downloader, run it with the --help flag.

The --help flag listing is reproduced below for your convenience.

Commands
