The following is a guide on how to run the Ziggurat 3.0 Crawler for Zcash as well as the associated programs Crunchy and P2P-Viz on Ubuntu 22.04 for gathering and visualizing Zcash network information.
The linked video below follows the same process.
The Zcash Crawler lives inside a folder named 'zcash', so it is advisable to create a new parent directory before cloning the crawler (the runziggurat/zcash repo).
From your home directory, run the following commands:
mkdir runziggurat
cd runziggurat
git clone https://github.com/runziggurat/zcash.git
cd zcash
Alternatively, open the readme at
'/runziggurat/zcash/src/tools/crawler/README.md'
for details on specific usage.
$ cargo run --release --features crawler --bin crawler -- --help
OPTIONS:
-c, --crawl-interval <CRAWL_INTERVAL>
The main crawling loop interval in seconds [default: 5]
-h, --help
Print help information
-r, --rpc-addr <RPC_ADDR>
If present, start an RPC server at the specified address
-s, --seed-addrs <SEED_ADDRS>...
A list of initial standalone IP addresses and/or DNS servers to connect to
-n, --node-listening-port <NODE_LISTENING_PORT>
Default port used for connecting to the nodes [default: 8233]
-V, --version
Print version information
'--seed-addrs' is the only required argument; it needs at least one IP address or DNS seed specified for the crawler to run.
Run the command
cargo run --release --features crawler --bin crawler -- --help
This compiles the program, prints the help menu shown above, and confirms everything is working properly.
To run the Crawler, you must add a '--seed-addrs' flag to the start command, containing at least one valid Zcash node IP address. Let the crawler run for a reasonable amount of time to get an accurate result. Some sample node IP addresses can be found at https://zcashblockexplorer.com/nodes .
To get information from the Crawler while it is running, add the '--rpc-addr' flag to the start command. This flag is not required just to run the crawler, but without it the only way to see any information is to stop the crawler (Ctrl+C or SIGKILL), which prints a final report.
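Putting the flags together, a typical start command looks like the following. The seed IP below is a documentation placeholder, not a real node; substitute an address from the node list above, and any free local port works for the RPC server.

```shell
# Start the crawler with one seed node and a local RPC endpoint.
# 203.0.113.10 is a placeholder IP -- replace it with a real Zcash
# node address, e.g. one listed on zcashblockexplorer.com/nodes.
cargo run --release --features crawler --bin crawler -- \
    --seed-addrs 203.0.113.10:8233 \
    --rpc-addr 127.0.0.1:54321
```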
The crawler will begin communicating with the network and gathering network data on each crawling loop (every 5 seconds by default, adjustable with '--crawl-interval').
Information from the Crawler can be displayed by using curl to query the node (jq is required for filtering and displaying that info).
The Crawler RPC address in this example is set to '127.0.0.1:54321'
This will display the currently collected '.protocol_version' data contained within the '.result' field. The '.result' field is very large, so it is useful to query specific portions of it instead. Other useful data fields are '.num_known_nodes', '.num_good_nodes', '.user_agents', etc. See the metrics section of the crawler README for the full list.
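As a sketch of such a query (the JSON-RPC method name 'getmetrics' is taken from the crawler's README; adjust it if your build differs):

```shell
# Query the running crawler's RPC server and show only the
# '.protocol_version' portion of the large '.result' field.
curl -s -H 'content-type: application/json' \
    --data-binary '{"jsonrpc": "2.0", "id": 0, "method": "getmetrics", "params": []}' \
    http://127.0.0.1:54321/ | jq '.result.protocol_version'
```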
To run Crunchy and P2P-Viz, the '.result' field must be piped into a .json file.
This will create a 'latest.json' file in the current directory. This 'latest.json' file will be used with Crunchy.
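A minimal sketch of that step, reusing the same RPC query and redirecting jq's output to a file ('getmetrics' is the method name assumed above):

```shell
# Save the whole '.result' object to latest.json for use with Crunchy.
curl -s -H 'content-type: application/json' \
    --data-binary '{"jsonrpc": "2.0", "id": 0, "method": "getmetrics", "params": []}' \
    http://127.0.0.1:54321/ | jq '.result' > latest.json
```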
At this point, the Crawler may be stopped with Ctrl+C if no more data is required. On exit, the Crawler prints a report of useful information to the terminal.
Crunchy
Crunchy is required to aggregate the output json file for use with P2P-Viz.
To build Crunchy, navigate back to your 'runziggurat' folder and clone the Crunchy repo by running the following commands:
git clone https://github.com/runziggurat/crunchy.git
cd crunchy
Copy and paste the 'latest.json' file into the 'crunchy/testdata/' folder.
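A hedged sketch of the Crunchy run itself. The exact invocation and the default input/output paths here are assumptions, so check Crunchy's own README before relying on them:

```shell
# From the crunchy/ folder, build and run Crunchy in release mode.
# Assumption: Crunchy picks up the crawler dump from testdata/latest.json
# and produces an aggregated state file (state.json) for P2P-Viz.
cd ~/runziggurat/crunchy
cargo run --release
```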
In P2P-Viz, select 'Geolocation' and then select 'Choose state file'.
From the file explorer pop-up, select the 'state.json' file.
The node explorer World Map will populate with the file data. See the P2P-Viz readme for more details on usage options and settings.
TIPS!
You can put the Crawler on a timed crawl simply with the 'timeout' command, which kills a process after a set amount of time. Run 'timeout --help' for more info.
The following command will start and also automatically stop the crawler after 50 mins.
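A sketch of that timed run (the seed IP is a placeholder as before; 'timeout' sends SIGTERM when the duration elapses):

```shell
# Run the crawler and stop it automatically after 50 minutes.
# Replace 203.0.113.10 with a real Zcash node address.
timeout 50m cargo run --release --features crawler --bin crawler -- \
    --seed-addrs 203.0.113.10:8233 \
    --rpc-addr 127.0.0.1:54321
```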
The 'latest.json' can be written directly into the 'crunchy/testdata/' folder so you don't have to copy and paste it manually.
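For example, redirect the jq output straight into Crunchy's folder (the path assumes the 'runziggurat' layout used earlier in this guide, and 'getmetrics' is the method name assumed above):

```shell
# Write the crawler's '.result' straight into Crunchy's testdata folder.
curl -s -H 'content-type: application/json' \
    --data-binary '{"jsonrpc": "2.0", "id": 0, "method": "getmetrics", "params": []}' \
    http://127.0.0.1:54321/ | jq '.result' > ~/runziggurat/crunchy/testdata/latest.json
```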
TIPS!
IP Address information can be gathered from the output and then used to reseed the Crawler at start (--seed-addrs). This will reduce the time required to conduct a full crawl!
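As a sketch of that reseeding trick: the '.node_addrs' field name below is hypothetical, used only for illustration; inspect your own 'latest.json' (e.g. with 'jq keys latest.json') to find where the node addresses actually live.

```shell
# Collect node addresses from a previous crawl into a space-separated
# list, then feed them back in via --seed-addrs.
# NOTE: '.node_addrs' is a hypothetical field name for illustration.
SEEDS="$(jq -r '.node_addrs[]' latest.json | tr '\n' ' ')"
cargo run --release --features crawler --bin crawler -- \
    --seed-addrs $SEEDS \
    --rpc-addr 127.0.0.1:54321
```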