I want to start writing down some ideas:
– Each node has a node id and a network id, so other nodes can check that information to decide whether to accept it as a peer
– A node could be configured with a hardcoded list of initial nodes to use as peers
– But it could have another list: a list of special nodes that know other nodes in the network. These nodes are not peers of the first node. They are helpers, node registries, that know other nodes in the network that a new node can use as peers. Usually, this list is not a list of IPs but of machine names, in a DNS controlled by the blockchain network infrastructure.
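To make the first ideas concrete, here is a minimal sketch of a node's identity, its two configured lists, and the peer acceptance check. The names `NodeInfo`, `NodeConfig` and `accepts_peer`, and the example host names, are hypothetical. The rule itself (same network id, and not the node itself) is my assumption; the notes above only say that other nodes check this information.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeInfo:
    node_id: int
    network_id: int


@dataclass
class NodeConfig:
    info: NodeInfo
    # Hardcoded initial peers (addresses are placeholders).
    initial_peers: list = field(default_factory=list)
    # Registry nodes given by DNS name, not by IP (names are placeholders).
    registry_hosts: list = field(default_factory=list)


def accepts_peer(local: NodeInfo, remote: NodeInfo) -> bool:
    """Assumed rule: accept only nodes of the same network, and never ourselves."""
    return remote.network_id == local.network_id and remote.node_id != local.node_id


config = NodeConfig(
    info=NodeInfo(node_id=1, network_id=42),
    initial_peers=["10.0.0.5:30303"],
    registry_hosts=["registry1.example.net", "registry2.example.net"],
)
```

With this, a handshake could exchange `NodeInfo` values and drop the connection when `accepts_peer` returns false.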
When a new node starts running, it announces its existence to this list of peer registry nodes and actively queries them for initial peers.
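The announce-and-query step could look like the following in-memory sketch. `PeerRegistry`, `register` and `query_peers` are hypothetical names, and a real registry would be reached over the network; here it is just an object, which is enough for a first test.

```python
import random


class PeerRegistry:
    """In-memory stand-in for a registry node (an assumption, not a real protocol)."""

    def __init__(self):
        self._nodes = {}  # node_id -> address

    def register(self, node_id, address):
        # A new node announces its existence.
        self._nodes[node_id] = address

    def query_peers(self, requester_id, count, rng=random):
        # Return up to `count` known nodes, never the requester itself.
        candidates = [
            (node_id, address)
            for node_id, address in self._nodes.items()
            if node_id != requester_id
        ]
        return rng.sample(candidates, min(count, len(candidates)))


registry = PeerRegistry()
registry.register(1, "node1.example.net")
registry.register(2, "node2.example.net")
registry.register(3, "node3.example.net")
initial_peers = registry.query_peers(requester_id=1, count=5)
```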
Each node has a maximum number of peers to connect to. When one of these connections drops, or the remote node is not suitable as a peer, the node tries other known peers or asks the registry servers for new ones.
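A minimal sketch of that replacement policy, assuming peers are identified by plain strings and the registry is reached through a callback. `PeerManager`, `fill`, `on_disconnect` and `ask_registry` are hypothetical names; "connecting" is reduced to adding to a set.

```python
class PeerManager:
    def __init__(self, max_peers, known_peers, ask_registry):
        self.max_peers = max_peers
        self.known = list(known_peers)    # candidates not yet tried
        self.connected = set()
        self.ask_registry = ask_registry  # callable returning a list of candidates

    def fill(self):
        # Connect until the maximum is reached: known peers first, registry second.
        while len(self.connected) < self.max_peers:
            if not self.known:
                fresh = [p for p in self.ask_registry() if p not in self.connected]
                if not fresh:
                    break  # nothing new to try right now
                self.known.extend(fresh)
            self.connected.add(self.known.pop(0))

    def on_disconnect(self, peer):
        # Called when a connection drops or the peer turns out to be unsuitable.
        self.connected.discard(peer)
        self.fill()


manager = PeerManager(max_peers=2, known_peers=["a", "b", "c"], ask_registry=lambda: ["d", "e"])
manager.fill()              # connects to "a" and "b"
manager.on_disconnect("a")  # falls back to the known peer "c"
```

The registry is only asked when the local list of known peers is exhausted, which keeps the load on the registry servers low.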
One way to ensure a good distribution of connections is to group the known nodes into zones (maybe node id modulo a small number). When a node in zone 2 needs peers, the registry servers send it peers from zones 1 and 3. In this way, the node becomes connected to more and more peers, but without knowing ALL the peers in the network: only some peers in its adjacent zones. This is for security reasons: if the FULL list of peers is not generally available, it is harder to mount an attack on the whole network.
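The zone idea can be sketched in a few lines. The zone count of 8 and the wrap-around at the ends (zone 0 is adjacent to the last zone) are my assumptions; the function names are hypothetical.

```python
ZONE_COUNT = 8  # assumed "low number" of zones


def zone_of(node_id, zone_count=ZONE_COUNT):
    # Zone is the node id modulo the number of zones.
    return node_id % zone_count


def adjacent_zones(zone, zone_count=ZONE_COUNT):
    # Previous and next zone, wrapping around at both ends (an assumption).
    return {(zone - 1) % zone_count, (zone + 1) % zone_count}


def peers_for(requester_id, known_ids, zone_count=ZONE_COUNT):
    # What a registry would return: only nodes in the requester's adjacent zones.
    wanted = adjacent_zones(zone_of(requester_id, zone_count), zone_count)
    return [node_id for node_id in known_ids if zone_of(node_id, zone_count) in wanted]


# Node 10 is in zone 2, so it only learns about nodes in zones 1 and 3.
candidates = peers_for(10, range(16))
```

A node in zone 2 never receives addresses from, say, zone 5 directly, so no single answer from a registry reveals the whole network.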
As usual, I should design all of this with TDD, guided by simplicity and initial use cases.