How to Monitor Your Eth2 Validator and Analyze Your P&L

by Coogan Brennan, January 15, 2021
My Journey to Becoming a Validator on Ethereum 2.0, Part 3

This is the third article in a four-part series on how to run your own Eth2 validator. If you are new to this series, be sure to check out Part 1: Getting Started, Part 2: Setting Up Your Client and Part 4: Safely Migrating Your Eth2 Node.

Also, you all should be checking Ben Edgington’s Eth2.News newsletter for essential updates, bug fixes, and news on the forthcoming roadmap. Our Eth2 Knowledge Base is helpful if you need more of a background on key terms, phases, and ConsenSys’ Eth2 products.

Intro 

It’s been a month and a half since Ethereum 2.0 Beacon chain genesis kicked off. Already, 2,515,170 ETH has been staked (about $2.9 billion at current market rates) with 61,561 unique validators, and another 16,687 waiting in queue. Despite the tremendous interest in staking, it’s actually been a pretty uneventful month and a half: There have been no major disruptions, only a few slashings, and validator participation at around 98% most of the time. Now’s a good time to take a breath and take stock of what we’ve done so far.

In this blog post I will be covering monitoring and financial analysis of your Eth2 validator. I provide an overview of how to access Teku metrics, set up Beaconcha.in notifications, and query the node. I also share my current P&L breakdown. In the final installment of this series, I will discuss how to safely and (hopefully) successfully migrate a Teku node from one server to another.

Monitoring

In this section, I’m going to walk through how to read your validator node’s metrics. Running an Ethereum 2.0 validator is running infrastructure for a distributed system. A crucial part of maintaining infrastructure is being able to see what’s going on. Luckily, Teku comes with a great suite of monitoring tools, enabled with the “--metrics-enabled” flag in our start-up command, shown below:

ExecStart=/home/ubuntu/teku-20.11.1/bin/teku --network=mainnet --eth1-endpoint=INFURA_ETH1_HTTP_ENDPOINT_GOES_HERE --validator-keys=/home/ubuntu/validator_key_info/KEYSTORE-M_123456_789_ABCD.json:/home/ubuntu/validator_key_info/validator_keys/KEYSTORE-M_123456_789_ABCD.txt --rest-api-enabled=true --rest-api-docs-enabled=true --metrics-enabled --validators-keystore-locking-enabled=false --data-base-path=/var/lib/teku

We do have to follow a few steps before being able to read the data.

For those not running a Teku client: First, why? Second, you can see the minimum metrics provided by all clients in Ethereum 2.0 specs here.

Installing Prometheus

First, we need to install Prometheus, an open source monitoring program, and Grafana, an open-source analytics and interactive visualization web app. Prometheus pulls the data and Grafana displays it.

On your Ubuntu command line, download the latest stable Prometheus:

curl -JLO https://github.com/prometheus/prometheus/releases/download/v2.23.0/prometheus-2.23.0.linux-amd64.tar.gz

Decompress the file like so:

tar -zxvf prometheus-2.23.0.linux-amd64.tar.gz

Move the binary so it’s available from command line:

cd prometheus-2.23.0.linux-amd64
sudo mv prometheus promtool /usr/local/bin/

Check to make sure it’s been installed correctly:

prometheus --version
promtool --version

Create a prometheus YML configuration file:

sudo nano prometheus.yml

Paste these parameters to the configuration file:

global:
  scrape_interval: 15s
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "teku-dev"
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets: ["localhost:8008"]

This instructs Prometheus to poll your Teku node every 15 seconds (with a 10-second timeout) on port 8008. Press Ctrl+X, then Y, to save the buffer.
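Before pointing Prometheus at Teku, it can help to confirm that the metrics endpoint is actually serving data. The sketch below is a rough helper, not part of Teku or Prometheus: the function name metric_value is made up for this example, and the metric name beacon_head_slot is an assumption based on the Eth2 metrics spec, so it may differ in your client version. It only handles plain, unlabelled metrics in the Prometheus text format.

```shell
# Print the value of one unlabelled metric from Prometheus text-format
# input on stdin. HELP/TYPE comment lines start with "#", so their first
# field never matches the metric name and they are skipped.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example (assumes Teku is running with --metrics-enabled on port 8008,
# and that beacon_head_slot exists in your version):
#   curl -s http://localhost:8008/metrics | metric_value beacon_head_slot
```

If the command prints nothing, either the endpoint is not reachable or the metric name is different on your node; running the curl part on its own shows the full list.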

Now, let’s create a directory to put our Prometheus config file:

sudo mkdir /etc/prometheus
sudo mv prometheus.yml /etc/prometheus/prometheus.yml

We’re going to make one other directory for other Prometheus files and move the consoles and console_libraries directories to /etc/prometheus:

sudo mkdir /var/lib/prometheus
sudo mv consoles/ console_libraries/ /etc/prometheus/

We’ll create a prometheus user to run a systemd service, like we did for Teku (read more here about how roles-based user access is best practice for server security) and give it access to appropriate files:

sudo useradd --no-create-home --shell /bin/false prometheus
sudo chown -R prometheus:prometheus /var/lib/prometheus
sudo chown -R prometheus:prometheus /etc/prometheus
sudo chown -R prometheus:prometheus /usr/local/bin/

Last, create a systemd service that can run in the background and restart itself if it fails:

sudo nano /etc/systemd/system/prometheus.service

In this file (which should be empty), we’re going to put in a series of directives for systemd to follow when we start the service. Copy the following into the text editor:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=prometheus
Group=prometheus
Restart=always
RestartSec=5
ExecStart=/usr/local/bin/prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/var/lib/prometheus \
  --web.console.templates=/etc/prometheus/consoles \
  --web.console.libraries=/etc/prometheus/console_libraries \
  --web.listen-address=0.0.0.0:9090

[Install]
WantedBy=multi-user.target

Press Ctrl+X, then type “Y” to save your changes.

We have to reload systemd so it picks up the new service:

sudo systemctl daemon-reload

Start the service:

sudo systemctl start prometheus

Check to make sure it’s running okay:

sudo systemctl status prometheus

If you see any errors, get more details by running:

sudo journalctl -f -u prometheus.service

You can stop the Prometheus service by running:

sudo systemctl stop prometheus

Install Grafana

We’re going to use the APT package manager for Linux to install Grafana. This will save us a good amount of work and give us what we need. We’ll follow the steps from the Grafana installation page:

sudo apt-get install -y apt-transport-https
sudo apt-get install -y software-properties-common wget
wget -q -O - https://packages.grafana.com/gpg.key | sudo apt-key add -

We add the stable Grafana repository for updates:

echo "deb https://packages.grafana.com/oss/deb stable main" | sudo tee -a /etc/apt/sources.list.d/grafana.list

Then we run APT:

sudo apt-get update
sudo apt-get install grafana

The package sets up a systemd service for us (including a user grafana) so we just need to run:

sudo service grafana-server start
sudo service grafana-server status
sudo update-rc.d grafana-server defaults

SSH Tunnelling

Grafana creates a very slick dashboard where we can view our metrics. That dashboard is typically available in the browser, but since we’re running the server version of Ubuntu 20.04, it’s all command-line. So how do we access Grafana?

Enter SSH tunnelling. It’s the same protocol we use to access AWS from our command-line, but we’re going to set it up so we create a mirror port on our local computer that connects to a specific port on our AWS instance. That way, when we call up the port locally, say by opening the browser to http://localhost:3000, we are actually looking at the 3000 port on our AWS instance.

To do this properly, you’ll need your SSH key for AWS and the AWS IP information. You also need to know what port you’d like to connect to. In this case, we know our Grafana instance is running on port 3000, so the command-line instructions will have this generic structure:

ssh -N -L 3000:localhost:3000 -i "PATH_TO_AWS_KEYPAIR.pem" ubuntu@INSTANCE_IDENTIFIER.compute-ZONE.amazonaws.com

This allows us to go to http://localhost:3000 on our local machine and see our Grafana dashboard. But we don’t have one yet, so we need to do the following:

Add Prometheus as data source:

Go to “add new data source”

Click “Prometheus” from the drop-down

Click “Save and Test”

Click + on left-hand menu and select “import dashboard”

Add Teku Grafana ID: 13457

And, bada-bing! We have our dashboard, visible from the comfort of our own browser:

Beaconcha.in App

The Grafana dashboard is excellent and Prometheus is storing information for us. However, there are other options to check validator status.

I’ve been using Beaconcha.in Dashboard mobile app for Android. It’s a simple interface, which is fine because it’s not my primary monitoring service. It allows me to quickly glance at my phone to check validator status and provides notifications if something’s wrong with the validator.

You enter the validator address you’d like to watch and that’s pretty much it! Again, not heavy-duty monitoring (that’s what the Grafana Teku feed provides). But it’s fine as a secondary service and a binary “is the validator functioning or not” check:

Querying the Node

Another way to “monitor” our Ethereum validator client is to query it! Like an Ethereum 1.0 client, our Ethereum validator client is storing and maintaining a world state. It’s much smaller compared to Ethereum 1.0, but it’s still on-chain data stored and maintained by your validator client. 

This is the same data consumed by the Prometheus / Grafana workflow. We are simply getting closer to the metal (virtually speaking) by querying the node ourselves. Here’s a sample of the available data (full list here):

  • Beacon chain information (genesis block, block headers and root, etc.)
  • Validator information (list of validators, validator balance, validator responsibilities, etc.)
  • Node information (overall health, list of peers, etc.)

cURL

The first way to do this is from the command line. When we started Teku, we added the flag --rest-api-enabled=true. This opens up an API endpoint at the default port of 5051 (you can specify another port by using the flag --rest-api-port=<PORT>). You can double-check your port is open by running sudo lsof -i -P -n | grep LISTEN.

Once you’ve confirmed port 5051 has been opened by Teku, we will use cURL to send REST calls to the Teku API endpoint at http://localhost:5051. For example, here is the way we check the balance of the highest-performing validator (according to Beaconcha.in):

curl -X GET "http://localhost:5051/eth/v1/beacon/states/head/validator_balances?id=0x8538bbc2bdd5310bcc71b1461d48704e36dacd106fa19bb15c918e69adbcc360e5bf98ebc3f558eb4daefe6d6c26dda5"

Here’s the response I got back in mid-January 2021 (in Gwei): 

{"data":[{"index":"4966","balance":"32607646851"}]}
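The balance comes back in Gwei, where 1 ETH equals 1,000,000,000 Gwei. Here is a rough sketch for pulling the balance out of the response and converting it, assuming a single-validator response shaped like the one above. The function names are my own, and for anything beyond a quick check a real JSON parser such as jq would be more robust than sed:

```shell
# Extract the balance (in Gwei) from a single-validator
# validator_balances response read from stdin.
extract_balance_gwei() {
  sed -n 's/.*"balance":"\([0-9]*\)".*/\1/p'
}

# Convert Gwei to ETH (1 ETH = 1e9 Gwei), printed to nine decimal places.
gwei_to_eth() {
  awk -v gwei="$1" 'BEGIN { printf "%.9f\n", gwei / 1e9 }'
}

# Example using the response shown above:
#   echo '{"data":[{"index":"4966","balance":"32607646851"}]}' | extract_balance_gwei
#   gwei_to_eth 32607646851
```

For the sample response, the balance of 32607646851 Gwei works out to 32.607646851 ETH.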

Try out any of the methods on the Teku API doc page using the format at the bottom of this page:

curl -X [REST_METHOD] API_CALL_IN_QUOTES
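As one concrete instance of that format, the standard Eth2 beacon node API defines a health endpoint, /eth/v1/node/health, which reports status through the HTTP response code rather than a JSON body. The sketch below assumes your Teku version serves that endpoint on the default port 5051; the function names are invented for illustration.

```shell
# Translate the status code returned by /eth/v1/node/health into a short
# description. Code meanings follow the Eth2 beacon node API spec.
describe_health() {
  case "$1" in
    200) echo "ready" ;;
    206) echo "syncing" ;;
    503) echo "not initialized or having issues" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Query the node and print only the HTTP status code.
# curl prints 000 if it cannot connect at all.
node_health_code() {
  curl -s -o /dev/null -w '%{http_code}' "${1:-http://localhost:5051}/eth/v1/node/health"
}

# Example:
#   describe_health "$(node_health_code)"
```

Keeping the code-to-text mapping separate from the network call makes it easy to reuse in a cron job or alert script.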

Swagger UI

Teku provides a basic graphical UI for API calls when the flag --rest-api-docs-enabled=true is added to the start-up command. It’s built on swagger-ui and served on port 5051 by default, so we can use SSH tunnelling to access it. Follow the same SSH tunnelling steps from above but with 5051 as the port:

ssh -N -L 5051:localhost:5051 -i "PATH_TO_AWS_KEYPAIR.pem" ubuntu@INSTANCE_IDENTIFIER.compute-ZONE.amazonaws.com

From the browser on our computer, we can then navigate to http://localhost:5051/swagger-ui, which looks like this on my machine:

World state and consensus are emergent in all public blockchains. This means Ethereum 2.0 reaches consensus by all validators storing and updating information. It’s a bit nerdy, but to look into your local state is to peer into a single pane of a much larger structure. A subset of the fractal constantly updating and emerging into something new. Try it!

Financial analysis

In my first post, I sketched out basic material requirements needed:

  • A three-year commitment to staking 32 ETH and maintaining a validator node
  • 32 ETH (plus <1 ETH for gas costs)
  • $717.12 (three-year reserved instance pricing for an m5.xlarge instance) + $120 (one year’s cost of 100 GB of storage, conservatively assuming nearly full storage capacity) = $837.12 paid over the course of the year to AWS
  • MetaMask Extension (free install)
  • Infura Account (free tier)

The AWS costs were for a three-year lock-in, but I mentioned later I wasn’t quite ready to do that. And I’m glad I didn’t! You’ll see why in a moment, but here’s my basic breakdown of costs for the month ending December 31, 2020:

AWS Monthly Costs

  • Data Transfer: $8.52
  • Server: $142.85
  • Storage: $72.50
  • Total: $223.87

Eth2 Validator Rewards 

  • Blocks: 5
  • Attestations: ~6,803
  • ETH Rewards: 0.420097728 ($485.83 USD)
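To make the arithmetic behind these numbers explicit, here is a small sketch that totals the AWS line items and subtracts them from the USD value of the rewards. The function names are hypothetical, and the USD reward figure is taken as given from the list above rather than computed from a price feed.

```shell
# Sum any number of USD amounts, printed to two decimal places.
sum_usd() {
  printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.2f\n", total }'
}

# Profit = rewards (USD) minus costs (USD).
monthly_profit() {
  awk -v rewards="$1" -v costs="$2" 'BEGIN { printf "%.2f\n", rewards - costs }'
}

# December 2020 figures from this post:
#   costs=$(sum_usd 8.52 142.85 72.50)   # data transfer, server, storage
#   monthly_profit 485.83 "$costs"
```

With the figures above, this gives costs of 223.87 and a profit of 261.96, matching the totals in this post.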

As you can probably see, a profit of $261.96 ($485.83 in rewards minus $223.87 in AWS costs) isn’t a great spread for one validator. There are a couple of options: This is a relatively stable cost, so I could stake another 32 ETH. The better option might be to change the VPS I’m using, which I actually mentioned in my first post:

Initially, I was confident AWS was the best virtual platform and it’s the service I’ll use for this post and the next. However, after going through the whole process, I realized AWS might be overkill for the individual developer. AWS’ real strength seems to be its capacity to dynamically scale up to meet demand which comes at a premium cost. This makes economic sense for a large scale, enterprise-level project, but individual Ethereum 2.0 current client requirements do not require such rigor.

I’m going to continue with AWS but am also entertaining the option of running an instance on Digital Ocean, which may be more appropriate for an individual developer. 

I think I can get a much better profit from running on Digital Ocean and still not take a hit on my validator performance. A friend is running a validator instance on a much smaller VPS which costs an order of magnitude less and we have the same validator performance. 

It’s great to experiment with AWS and I don’t regret having the capacity in case something goes sideways on the beacon chain. However, I think it’s really great that Eth 2 devs are delivering on the promise of making validating available from home networks and setups! 

The current price swings also make financial analysis hard, as server costs are fixed in USD but rewards fluctuate in value. Long-term, I’m very confident that my validator rewards will increase in value. It does make cost-benefit analysis tricky!

For the last installment of this series, I will discuss how to safely and (hopefully) successfully migrate a Teku node from one server to another. The major risk is getting slashed, of course. It seems the vast majority of slashings that have taken place are due to this very issue. We’ll see how it goes…