Polygon Metrics Node

This software is intended to be run by Polygon PoS validators to share health metrics of their nodes. This enables the Polygon community to build early warning systems, dashboards, and other tooling that benefits both the validators themselves and the broader audience by increasing transparency. There are currently a bit over 100 validators in the Polygon PoS network.

Polygon PoS validators run two types of nodes to validate the chain: Bor and Heimdall. Both ship with an HTTP API that outputs health metrics in Prometheus format. Additionally, a validator setup includes two similarly configured machines: a Sentry machine and the actual Validator machine. In total there are 4 such metrics endpoints (Bor and Heimdall on both Sentry and Validator).

The Metrics node periodically calls those endpoints, formats the metrics to a more readable JSON format, and publishes the metrics over the decentralized Streamr protocol. The result is a firehose of health metrics from the validators, which anyone can subscribe to and build on.
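To make the "Prometheus text in, JSON out" step concrete, here is a minimal shell sketch. It is not the Metrics node's actual implementation, and the JSON it prints is not the format published on the streams. The function name prom_to_json is hypothetical. The sketch only handles samples without labels, and it assumes plain numeric values (values such as NaN or +Inf would not yield valid JSON).

```shell
# Hypothetical helper, for illustration only: read Prometheus text
# exposition on stdin and print a flat JSON object of name/value pairs.
prom_to_json() {
  awk '
    /^#/ || NF < 2 { next }   # skip HELP/TYPE comments and blank or malformed lines
    $1 ~ /[{]/     { next }   # skip labelled samples in this sketch
    { printf "%s\"%s\":%s", (n++ ? "," : "{"), $1, $2 }
    END { print (n ? "}" : "{}") }
  '
}
```

For example, piping the output of one of the metrics endpoints through prom_to_json gives a rough, single-line JSON view of the label-free metrics.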

The idea of the Metrics node is introduced in this Polygon governance proposal draft.

Installing the Metrics node

The Metrics node is available as a Docker image to make it easy to download and run regardless of platform, or to plug into orchestration frameworks like Kubernetes. These step-by-step instructions are for trying out the image using the docker command line tool, but if you use Kubernetes or a hosted cloud platform for Docker containers, then please refer to their respective documentation on how to run Docker containers.

  1. Check that the Prometheus API is enabled on both Bor and Heimdall, on both Validator and Sentry machines:
    • Heimdall: in your config.toml (usually located at /var/lib/heimdall/config/config.toml), you need to have prometheus = true. (See Polygon docs)
    • Bor: it's on by default, but you can check your config.toml (usually located at /var/lib/bor/config.toml) in which you need to have this.
  2. Configure the firewall on Validator and Sentry machines to accept connections to ports 26660 and 7071 from the Metrics node machine
  3. Create a new Ethereum address and private key using your wallet/tool of choice (MetaMask, Vanity address generator, etc.)
  4. Send the above Ethereum address (NOT the private key!) to the onboarding person for the Metrics network (currently ping @henri#1016 on #pos-discussion on Polygon Discord)
  5. Install Docker if you don't have it
  6. Use the docker command-line tool to download and start the image:
docker run hpihkala/polygon-metrics-node

The above command should start the container, which then exits with the following error:

Error: The following env variables are required: 
METRICS_PRIVATE_KEY
VALIDATOR_NAME

That's a very good sign! The program started successfully but quit because you didn't supply any configuration via environment variables. Let's do that now!

  • Create a file called env.list
  • Paste the following template into that file and fill the values according to your setup, replacing the "XXX" with your actual values:
# Your Metrics private key. The corresponding address must be whitelisted to publish on the metrics streams.
METRICS_PRIVATE_KEY=XXX

# The name of your validator node as shown in the Polygon Staking UI: https://staking.polygon.technology
VALIDATOR_NAME=XXX

# URL to the Prometheus endpoint on your Validator Heimdall
VALIDATOR_HEIMDALL=http://XXX:26660/metrics

# URL to the Prometheus endpoint on your Validator Bor
VALIDATOR_BOR=http://XXX:7071/debug/metrics/prometheus

# URL to the Prometheus endpoint on your Sentry Heimdall
# If you run multiple sentry nodes, you can give several URLs separated by commas
SENTRY_HEIMDALL=http://XXX:26660/metrics

# URL to the Prometheus endpoint on your Sentry Bor
# If you run multiple sentry nodes, you can give several URLs separated by commas
SENTRY_BOR=http://XXX:7071/debug/metrics/prometheus

# Optional: You can give custom names to your multiple sentry nodes, separated by commas.
# These correspond to the URLs passed and the number of entries must be the same.
# If not given and there are multiple URLs configured, the VALIDATOR_NAME will be used with "-1", "-2", etc. appended.
# SENTRY_BOR_NAMES=MyFirstName,MySecondName
# SENTRY_HEIMDALL_NAMES=MyFirstName,MySecondName

# Optional: How often to read and publish the metrics, in seconds. Default: `60` seconds
# POLL_INTERVAL_SECONDS=60

# Optional: How soon to time out if the endpoint doesn't respond. Default: `10` seconds
# REQUEST_TIMEOUT_SECONDS=10
  • Save your changes to the file.
  • Test your config with:
docker run --env-file env.list hpihkala/polygon-metrics-node --test-config
  • If there's an error, see the Troubleshooting section below for help.
  • If the above run ends successfully with Everything seems fine!, you're good to go. Start the Metrics node in the background (-d) and configure it to start on reboot (--restart unless-stopped) with:
docker run -d --restart unless-stopped --env-file env.list hpihkala/polygon-metrics-node

To view the log for troubleshooting, use docker ps to find the ID of the container, and then docker logs -f [ID] to see the logs.

For further information about running, stopping, and updating containers see the Docker docs.
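As an optional complement to --test-config, a quick local check of env.list can catch simple mistakes before the container is started. The sketch below is not part of the Metrics node. The function names env_value, count_entries, and check_env_file are hypothetical. It checks only what this README states: the two required variables, leftover "XXX" placeholders, and that custom sentry names (if given) match the number of URLs.

```shell
# Hypothetical helper: print the value assigned to KEY ($2) in env file ($1).
# Commented-out lines are ignored; the last assignment wins.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Hypothetical helper: count comma-separated entries (empty string counts as 0).
count_entries() {
  [ -z "$1" ] && { echo 0; return; }
  printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# Hypothetical helper: returns non-zero and prints a reason for each problem found.
check_env_file() {
  file=$1
  status=0
  # The two variables the image reports as required when run without configuration.
  for key in METRICS_PRIVATE_KEY VALIDATOR_NAME; do
    if [ -z "$(env_value "$file" "$key")" ]; then
      echo "missing: $key"; status=1
    fi
  done
  # Template placeholders that were never filled in.
  if grep -Eq '^[A-Z_]+=.*XXX' "$file"; then
    echo "placeholder XXX still present"; status=1
  fi
  # Custom names are optional, but their count must match the URL count.
  for kind in BOR HEIMDALL; do
    names=$(env_value "$file" "SENTRY_${kind}_NAMES")
    urls=$(env_value "$file" "SENTRY_${kind}")
    if [ -n "$names" ] && [ "$(count_entries "$names")" -ne "$(count_entries "$urls")" ]; then
      echo "SENTRY_${kind}: number of names does not match number of URLs"; status=1
    fi
  done
  return $status
}
```

Usage: after defining the functions in your shell, run check_env_file env.list. No output and a zero exit status mean none of these simple problems were found; the image's own --test-config remains the authoritative check.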

Updating the Metrics node image

  • Pull the newest image with docker pull hpihkala/polygon-metrics-node
  • Use docker ps to find the ID of the currently running container
  • docker stop [ID]
  • docker rm [ID]
  • Restart using the usual start command docker run -d --restart unless-stopped --env-file env.list hpihkala/polygon-metrics-node

Troubleshooting

Error: Your address 0x... does not have permission to publish to polygon-validators.eth/...!
  • Get in touch with the onboarding person for the Metrics network (currently ping @henri#1016 on #pos-discussion on Polygon Discord)
Error: Couldn't successfully retrieve metrics for [node] from [url].
  • Check that the given URL points to the right machine
  • Check your firewall settings on that machine: the Prometheus metrics API ports (by default 26660 and 7071) must be allowed from the Metrics machine
  • Check that the Prometheus metrics API is enabled on Bor and Heimdall (see the first step in the installation instructions above)
Error: Number of Sentry Bor URLs doesn't match the number of names!
  • Check that you pass in the correct number of explicit node names. For example, if you pass in 3 URLs in SENTRY_BOR and want to assign custom names for them, you need to pass in 3 names in SENTRY_BOR_NAMES.
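When chasing the "Couldn't successfully retrieve metrics" error, it can help to probe each endpoint by hand from the Metrics machine. The following is a small sketch using curl, not something shipped with the Metrics node. The function names looks_like_prometheus and probe_metrics are hypothetical, and the check assumes the endpoint's response contains the usual "# HELP" or "# TYPE" comment lines of the Prometheus text format.

```shell
# Hypothetical helper: succeeds if stdin looks like Prometheus text exposition.
looks_like_prometheus() {
  grep -Eq '^# (HELP|TYPE) '
}

# Hypothetical helper: fetch URL ($1) with a timeout in seconds ($2, default 10,
# mirroring the REQUEST_TIMEOUT_SECONDS default) and check the response body.
probe_metrics() {
  url=$1
  timeout=${2:-10}
  if body=$(curl -fsS --max-time "$timeout" "$url") &&
     printf '%s\n' "$body" | looks_like_prometheus; then
    echo "OK   $url"
  else
    echo "FAIL $url"
    return 1
  fi
}
```

Usage, with XXX replaced by your host as in env.list: probe_metrics http://XXX:26660/metrics for Heimdall and probe_metrics http://XXX:7071/debug/metrics/prometheus for Bor. A FAIL preceded by a curl connection error points to the URL or firewall; a FAIL without one suggests the endpoint answered but the Prometheus API is not enabled.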

Subscribing to the data

The data is published to the following four stream IDs, one for each node type:

Builders seeking to use the data can easily subscribe to the above streams using one of the following:

Example using the CLI tool:

streamr stream subscribe polygon-validators.eth/sentry/heimdall

Data format and content

Examples of the data format and content can be found here:

You can of course also subscribe to the streams to see current metrics content published by validators.

The metric types GAUGE, COUNTER, SUMMARY, and HISTOGRAM and their corresponding values are as defined in the Prometheus docs.