Setting Up a Flow Node

First you'll need to provision a machine or virtual machine to run your node software.

Hardware Requirements

The hardware your Node will need varies depending on the role your Node will play in the Flow network. For an overview of the differences see the Node Roles Overview.

Node Type      CPU        Memory   Disk     Example GCP Instance
Collection     4 cores    32 GB    200 GB   n2-highmem-4
Consensus      2 cores    16 GB    200 GB   n2-standard-4
Execution      64 cores   800 GB   9 TB     n2-highmem-128
Verification   2 cores    16 GB    200 GB   n2-highmem-2
Access         16 cores   64 GB    900 GB   n2-standard-16
Observer       2 cores    4 GB     300 GB   n2-standard-4

Note: The above numbers represent our current best estimate for the state of the network. These will be actively updated as we continue benchmarking the network's performance.
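
As a rough aid, the table above can be encoded as a lookup so that a host can be checked before installation. This is only a sketch: the function names flow_min_spec and has_enough_cores are hypothetical, the figures are copied from the table (with 9 TB written as 9000 GB) and will go stale when the table is updated, and the core count is read with nproc on Linux.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "CORES MEMORY_GB DISK_GB" for a node role,
# using the estimates from the table above.
flow_min_spec() {
  case "$1" in
    collection)   echo "4 32 200" ;;
    consensus)    echo "2 16 200" ;;
    execution)    echo "64 800 9000" ;;
    verification) echo "2 16 200" ;;
    access)       echo "16 64 900" ;;
    observer)     echo "2 4 300" ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed if this host has at least the listed core count.
has_enough_cores() {
  local spec
  spec=$(flow_min_spec "$1") || return 1
  # ${spec%% *} keeps only the first field (the core count).
  [ "$(nproc)" -ge "${spec%% *}" ]
}

# Example:
#   has_enough_cores collection || echo "this host has too few cores"
```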

To run an Observer node, follow this guide.

Networking Requirements

Most of the load on your node will come from messages exchanged with other nodes on the network. Make sure you have a sufficiently fast connection; we recommend at least 1 Gbps, and 5 Gbps is better.

Each node will require either a static IPv4 address or a fixed DNS name. Either works, and we'll refer to this more generally as your 'Node Address' from here on out.

Your Node Address must be a publicly routable IPv4 address or valid DNS name that points to your node. This is how other nodes in the network will communicate with you.

While both a static IPv4 and a domain name are possible, we prefer and recommend that node operators register their node under a domain that they control. This gives the Flow network more options for resiliency and resistance to adverse network conditions.

Crash recovery and denial-of-service attacks are two concerns that operators can mitigate by relying on either DNS indirection or IP routing. The latter requires more involvement.

Running a node behind an operator-controlled hostname (rather than "just" an IP) is a simple and cheap measure that:

  • offers additional technical pathways to let operators improve resiliency and security,
  • lets them opt in to those measures as a reaction to an attack,
  • does not preclude any lower-level IP-based resiliency approaches.

Your firewalls must expose TCP/3569 for Node communication. If you are running an Access Node, you must also expose the gRPC port 9000.
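
One way to confirm the firewall rules is to probe the ports from a machine outside your network. The sketch below is an illustration under assumptions: required_ports and port_open are hypothetical names, node.example.com is a placeholder for your Node Address, and the probe relies on bash's /dev/tcp redirection together with the coreutils timeout command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the TCP ports a role must expose, per the text above.
required_ports() {
  case "$1" in
    access) echo "3569 9000" ;;
    *)      echo "3569" ;;
  esac
}

# Hypothetical helper: succeed if a TCP connection to HOST PORT opens within 3 seconds.
# Inside the child shell, $0 is the host and $1 is the port.
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (node.example.com is a placeholder):
#   for p in $(required_ports access); do
#     port_open node.example.com "$p" && echo "$p reachable" || echo "$p blocked"
#   done
```

A successful connection only shows that the port is reachable, not that the node behind it is healthy.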

Flow Architecture

Operating System Requirements

The Flow node code is distributed as a Linux container image, so your node must be running an OS with a container runtime like docker or containerd.

The bootstrapping scripts we'll use later are compiled binaries targeting the amd64 architecture, so your system must be 64-bit. Some of these scripts are bash-based, so a bash-compatible shell interpreter is also needed.

Flow also provides systemd service and unit files as a template for installation, though systemd is not required to run Flow.

Flow is distributed in such a way that makes it very system agnostic. You are free to build your own orchestration around how you run your nodes and manage your keys.



For the remainder of this guide, we cover the simplest case: a single, hand-deployed node. This should give you a good sense of what's needed, and you can adapt it to your needs from there.



The Flow team has tested running nodes on Ubuntu 18.04 and GCP's Container Optimized OS, which is based on Chromium OS. If you are unsure where to start, those are good choices.

Time synchronization

You should also run time synchronization on the machine hosting the container, to avoid clock drift. In practice, this means configuring an NTP client and making sure it runs as a daemon. ntpd is one recommended example. To configure it, you only have to point it to an NTP server to query periodically. A default from your Linux distribution or cloud operator may already be set, and in the interest of decentralization, we recommend using it unless you have a specific reason to do otherwise.

  • Leap-smearing: Leap-smearing time servers and non-leap-smearing time servers are both acceptable for the magnitude of our time precision requirements - though considering very few providers offer leap smearing time servers, a "regular" time server helps ensure our pool of time providers is more diverse.

  • Why not do it in the container itself? Why do we need to do this?: Without special privileges and in all major container runtimes, a container will not run with the CAP_SYS_TIME capability. For Flow, this means that the node software itself cannot change the time of the host machine, making the in-container use of standard time synchronization protocols ineffective.

  • Why does time matter in Flow?: Time information comes up in consensus and in smart contracts. The consensus algorithm of Flow allows nodes to exit the influence of a corrupt or ineffective "leader" node by collectively deciding to switch to the next "phase" of the protocol at the right time. The smart contract language also allows developer access to block time stamps, which provide an approximation of time. To resist manipulation in each case, honest nodes must compute timing values from an aggregate of the information provided by all nodes. That approach, though resilient, is still sensitive to inaccurate time information. In other words, a node subject to clock drift but otherwise honest will not stop the consensus, but might make it slower.
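
A quick way to confirm that the host clock is being synchronized is to inspect the host's time service. The sketch below assumes a systemd-based host where timedatectl is available; clock_synced is a hypothetical helper name. On hosts running ntpd directly, ntpq -p is the usual way to inspect peers instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `timedatectl status` output on stdin and succeed
# only if it reports a synchronized clock. Older systemd releases print
# "NTP synchronized: yes"; newer ones print "System clock synchronized: yes".
clock_synced() {
  grep -Eq '(NTP|System clock) synchronized: yes'
}

# Example, run on the host rather than inside the container:
#   timedatectl status | clock_synced && echo "clock is synchronized"
```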

Setup Data Directories & Disks

Flow stores protocol state on disk, as well as execution state in the case of execution nodes.

Where the data is stored is up to you. By default, the systemd files that ship with Flow use /var/flow/data. This is where the vast majority of Flow's disk usage comes from, so you may wish to mount this directory on a separate disk from the OS. The performance of this disk IO is also a major bottleneck for certain node types. While all nodes need to make use of this disk, if you are running an execution node, you should make sure this is a high performing SSD.

As a rough benchmark for planning storage capacity, each Flow block will grow the data directory by 3-5KiB.
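
The per-block figure can be turned into a capacity estimate with simple arithmetic. The sketch below is illustrative only: estimate_growth_mib is a hypothetical name, the block count is an arbitrary example, and the result uses integer division, so it rounds down.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the low and high estimate, in MiB, of data
# directory growth for a given number of blocks (3 to 5 KiB per block).
estimate_growth_mib() {
  local blocks=$1
  echo "$(( blocks * 3 / 1024 )) $(( blocks * 5 / 1024 ))"
}

estimate_growth_mib 1000000   # prints "2929 4882"
```

In other words, one million blocks would add roughly 2.9 to 4.8 GiB under this benchmark.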

Confidential Data & Files

Flow stores dynamically generated confidential data in a separate database. We strongly recommend enabling encryption for this database - see this guide for instructions.

Confidential information used by Flow is stored in the private-root-information subtree of the bootstrap folder. In particular:

  • the staking private key (node-info.priv.json)
  • the networking private key (node-info.priv.json)
  • the encryption key for the secrets database (secretsdb-key)
  • (if applicable) the initial random beacon private key (random-beacon.priv.json)

These files contain confidential data, and must be stored and accessed securely.
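
One basic measure is to restrict file permissions on the private subtree. The sketch below is an assumption-laden example rather than an official procedure: restrict_private_dir is a hypothetical name, the bootstrap path is a placeholder, and it assumes the user that runs the node also owns the files, since nobody else will be able to read them afterwards.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make a private key directory accessible only to its owner.
# Directories get mode 700 and regular files get mode 600.
restrict_private_dir() {
  local dir=$1
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  find "$dir" -type d -exec chmod 700 {} +
  find "$dir" -type f -exec chmod 600 {} +
}

# Example (the bootstrap path is a placeholder):
#   restrict_private_dir /path/to/bootstrap/private-root-information
```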

Pull the Flow Images

The flow-go binaries are distributed as container images, and need to be pulled down to your host with your image management tool of choice.

Replace $ROLE with the node type you are planning to run. Valid options are:

  • collection
  • consensus
  • execution
  • verification
  • access

# Docker
docker pull gcr.io/flow-container-registry/${ROLE}:alpha-v0.0.1

# Containerd
ctr images pull gcr.io/flow-container-registry/${ROLE}:alpha-v0.0.1
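
Because a mistyped role produces an image name that does not exist, it can help to validate $ROLE before pulling. The sketch below uses a hypothetical helper, flow_image, that builds the image reference from the role list above; the tag is passed in by the caller and is not checked against the registry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the image reference for a role and tag,
# rejecting roles that are not in the list above.
flow_image() {
  local role=$1 tag=$2
  case "$role" in
    collection|consensus|execution|verification|access) ;;
    *) echo "invalid role: $role" >&2; return 1 ;;
  esac
  [ -n "$tag" ] || { echo "missing tag" >&2; return 1; }
  echo "gcr.io/flow-container-registry/${role}:${tag}"
}

# Example:
#   image=$(flow_image consensus alpha-v0.0.1) && docker pull "$image"
```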

Prepare Your Node to Start

Your nodes will need to boot at startup, and restart if they crash.

If you are running systemd, you can use the service files provided by flow-go. Find them in the flow-go repository.

If you are using a system other than systemd, you need to ensure that the Flow container is started, that the appropriate key directories are mounted into the container, and that the container will automatically restart following a crash.

The systemd files pull runtime settings from /etc/flow/runtime-conf.env and any .env files under /etc/flow/conf.d. Examples of these files are also available in the GitHub repo. You will need to modify the runtime config file later.

Systemd

If you are not using Systemd, you can skip this step

  1. Ensure that you have pulled the latest changes from the flow-go repository via git

     ## Clone the repo if you haven't already done so
     git clone https://github.com/onflow/flow-go

     ## Get latest changes
     cd flow-go
     git pull origin master

  2. Copy your respective systemd unit file to: /etc/systemd/system
  3. Create directory sudo mkdir /etc/flow
  4. Copy the runtime-conf.env file to: /etc/flow/
  5. Enable your service sudo systemctl enable flow-$ROLE.service (replace $ROLE with your node role - eg. collection)
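
The copy and directory-creation steps above can be sketched as a small function. This is a hedged illustration, not an official installer: install_flow_unit is a hypothetical name, the source directory holding the unit files and runtime-conf.env inside your flow-go checkout is left as a parameter because its location is not stated here, and the optional third argument is an assumed convenience for rehearsing into a scratch directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the copy and mkdir steps above.
#   $1: directory holding the unit files and runtime-conf.env (location assumed)
#   $2: node role, e.g. collection
#   $3: optional root prefix, useful for a dry run into a scratch directory
install_flow_unit() {
  local src=$1 role=$2 root=${3:-}
  local unit="flow-${role}.service"
  [ -f "$src/$unit" ] || { echo "missing $src/$unit" >&2; return 1; }
  [ -f "$src/runtime-conf.env" ] || { echo "missing $src/runtime-conf.env" >&2; return 1; }
  mkdir -p "$root/etc/systemd/system" "$root/etc/flow"
  cp "$src/$unit" "$root/etc/systemd/system/"
  cp "$src/runtime-conf.env" "$root/etc/flow/"
}

# Example dry run into /tmp/flow-root (both paths are placeholders):
#   install_flow_unit /path/to/unit-files collection /tmp/flow-root
```

Writing to the real /etc requires root privileges, and the service still has to be enabled with systemctl afterwards.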

Docker Configuration

If you are not using systemd, sample commands for running each Docker container are below. Be sure to replace /path/to/data and /path/to/bootstrap with the appropriate paths you are using.

Do not run your node using the docker run command directly without a mechanism for the node to automatically restart following a crash.
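
Docker's own --restart policies cannot be combined with the --rm flag used in the sample commands, so one possible mechanism is an outer supervising loop. The sketch below is a minimal illustration with a hypothetical helper name, supervise; a process manager such as systemd is generally a more robust choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command, and rerun it whenever it exits,
# up to MAX_RESTARTS times, pausing DELAY seconds between attempts.
#   usage: supervise MAX_RESTARTS DELAY COMMAND [ARGS...]
supervise() {
  local max_restarts=$1 delay=$2 restarts=0
  shift 2
  while true; do
    "$@"
    echo "command exited with status $?" >&2
    if [ "$restarts" -ge "$max_restarts" ]; then
      echo "giving up after $restarts restarts" >&2
      return 1
    fi
    restarts=$(( restarts + 1 ))
    sleep "$delay"
  done
}

# Example: wrap the docker run command for your role, allowing up to
# 1000 restarts with a 5 second pause between them:
#   supervise 1000 5 docker run --rm <arguments for your role>
```

The loop treats every exit as a reason to restart, which matches a long-running node process that is never expected to stop on its own.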

Access

docker run --rm \
  -v /path/to/bootstrap:/bootstrap:ro \
  -v /path/to/data:/data:rw \
  --name flow-go \
  --network host \
  gcr.io/flow-container-registry/access:v0.25.7 \
  --nodeid=${FLOW_GO_NODE_ID} \
  --bootstrapdir=/bootstrap \
  --datadir=/data/protocol \
  --secretsdir=/data/secrets \
  --rpc-addr=0.0.0.0:9000 \
  --http-addr=0.0.0.0:8000 \
  --collection-ingress-port=9000 \
  --script-addr=${FLOW_NETWORK_EXECUTION_NODE} \
  --bind 0.0.0.0:3569 \
  --loglevel=error

Collection

docker run --rm \
  -v /path/to/bootstrap:/bootstrap:ro \
  -v /path/to/data:/data:rw \
  --name flow-go \
  --network host \
  gcr.io/flow-container-registry/collection:v0.25.7 \
  --nodeid=${FLOW_GO_NODE_ID} \
  --bootstrapdir=/bootstrap \
  --datadir=/data/protocol \
  --secretsdir=/data/secrets \
  --ingress-addr=0.0.0.0:9000 \
  --bind 0.0.0.0:3569 \
  --loglevel=error

Consensus

docker run --rm \
  -v /path/to/bootstrap:/bootstrap:ro \
  -v /path/to/data:/data:rw \
  --name flow-go \
  --network host \
  gcr.io/flow-container-registry/consensus:v0.25.7 \
  --nodeid=${FLOW_GO_NODE_ID} \
  --bootstrapdir=/bootstrap \
  --datadir=/data/protocol \
  --secretsdir=/data/secrets \
  --bind 0.0.0.0:3569 \
  --loglevel=error

Execution

docker run --rm \
  -v /path/to/bootstrap:/bootstrap:ro \
  -v /path/to/data:/data:rw \
  --name flow-go \
  --network host \
  gcr.io/flow-container-registry/execution:v0.25.7 \
  --nodeid=${FLOW_GO_NODE_ID} \
  --bootstrapdir=/bootstrap \
  --datadir=/data/protocol \
  --secretsdir=/data/secrets \
  --triedir=/data/execution \
  --rpc-addr=0.0.0.0:9000 \
  --bind 0.0.0.0:3569 \
  --loglevel=error

Verification

docker run --rm \
  -v /path/to/bootstrap:/bootstrap:ro \
  -v /path/to/data:/data:rw \
  --name flow-go \
  --network host \
  gcr.io/flow-container-registry/verification:v0.25.7 \
  --nodeid=${FLOW_GO_NODE_ID} \
  --bootstrapdir=/bootstrap \
  --datadir=/data/protocol \
  --secretsdir=/data/secrets \
  --bind 0.0.0.0:3569 \
  --loglevel=error

Start the Node

Now that your node is provisioned and configured, it can be started.

Before starting your node, ensure it is registered and authorized.

Ensure you start your node at the appropriate time. See Spork Process for when to start up a node following a spork. See Node Bootstrap for when to start up a newly registered node.

Systemd

  1. Check that your runtime-conf.env is at /etc/flow/runtime-conf.env
  2. Update your environment variables: source /etc/flow/runtime-conf.env
  3. Start your service: sudo systemctl start flow-$ROLE (replace $ROLE with your node role, matching the unit you enabled earlier)
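
Before starting the service, a small pre-flight check can catch a missing or incomplete runtime config. The sketch below is an illustration under assumptions: check_env_file is a hypothetical helper, and the variable names in the example are taken from the commands shown in this guide rather than from a confirmed list of what runtime-conf.env must contain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the env file exists and assigns a
# non-empty value to every variable name given after it.
check_env_file() {
  local file=$1 name
  shift
  [ -f "$file" ] || { echo "missing $file" >&2; return 1; }
  for name in "$@"; do
    grep -Eq "^(export[[:space:]]+)?${name}=.+" "$file" \
      || { echo "$name is not set in $file" >&2; return 1; }
  done
}

# Example (the variable names are assumptions based on the commands in this guide):
#   check_env_file /etc/flow/runtime-conf.env FLOW_GO_NODE_ID FLOW_GO_NODE_VERSION
```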

Verify your Node is Running

Here are a few handy commands that you can use to check whether your Flow node is up and running.

Systemd

  • To get Flow logs: sudo journalctl -u flow-YOUR_ROLE
  • To get the status: sudo systemctl status flow-YOUR_ROLE

● flow-verification.service - Flow Access Node running with Docker
   Loaded: loaded (/etc/systemd/system/flow-verification.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-05-20 18:18:13 UTC; 1 day 6h ago
  Process: 3207 ExecStartPre=/usr/bin/docker pull gcr.io/flow-container-registry/verification:${FLOW_GO_NODE_VERSION} (code=exited, status=0/SUCCESS)
 Main PID: 3228 (docker)
    Tasks: 10 (limit: 4915)
   Memory: 33.0M
   CGroup: /system.slice/flow-verification.service
           └─3228 /usr/bin/docker run --rm -v /var/flow/bootstrap:/bootstrap:ro -v /var/flow/data:/data:rw --rm --name flow-go --network host gcr.io/flow-container-registry/verification:candidate8 --nodeid=489f8a4513d5bd8b8b093108fec00327b683db545b37b4ea9153f61b2c0c49dc --bootstrapdir=/bootstrap --datadir=/data/protocol --alpha=1 --bind 0.0.0.0:3569 --loglevel=error

Docker

  • To get Flow logs: sudo docker logs flow-go
  • To get the status: sudo docker ps

$ sudo docker ps
CONTAINER ID   IMAGE                                                    COMMAND                  CREATED        STATUS        PORTS   NAMES
1dc5d43385b6   gcr.io/flow-container-registry/verification:candidate8   "/bin/app --nodeid=4"    30 hours ago   Up 30 hours           flow-go
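
For scripted checks, the container name can be matched exactly instead of reading the table by eye. The sketch below assumes the container is named flow-go, as in the sample commands; container_listed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read container names on stdin, one per line, and
# succeed if the given name is among them (exact, whole-line match).
container_listed() {
  grep -Fxq -- "$1"
}

# Example:
#   sudo docker ps --format '{{.Names}}' | container_listed flow-go \
#     && echo "flow-go is running"
```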

Monitoring and Metrics

This is intended for operators who would like to see what their Flow nodes are currently doing. Head over to Monitoring Node Health to get set up.

Node Status

The node's metrics should give a good overview of its status. For a quick snapshot of whether the node is properly participating in the network, check the consensus_compliance_finalized_height or consensus_compliance_sealed_height metric and ensure that it is non-zero and strictly increasing.

curl localhost:8080/metrics | grep consensus_compliance_sealed_height

# HELP consensus_compliance_sealed_height the last sealed height
# TYPE consensus_compliance_sealed_height gauge
consensus_compliance_sealed_height 1.132054e+06
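
Checking that the height is strictly increasing takes two samples. The sketch below is one way to do that, with hypothetical helper names (metric_value, height_advancing) and an arbitrary 60 second interval; it assumes the metrics endpoint shown above and uses awk to convert the exponent notation to a plain integer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read Prometheus-style metrics text on stdin and print
# the named gauge as a plain integer (for example 1.132054e+06 becomes 1132054).
metric_value() {
  awk -v name="$1" '$1 == name { printf "%d\n", $2 }'
}

# Hypothetical helper: succeed if the first sample is non-zero and the
# second sample is strictly larger than the first.
height_advancing() {
  [ "$1" -gt 0 ] && [ "$2" -gt "$1" ]
}

# Example: take two samples a minute apart and compare them.
#   a=$(curl -s localhost:8080/metrics | metric_value consensus_compliance_sealed_height)
#   sleep 60
#   b=$(curl -s localhost:8080/metrics | metric_value consensus_compliance_sealed_height)
#   height_advancing "$a" "$b" && echo "sealed height is advancing"
```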