Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Seeker autoconnect #7798

Merged

Conversation

endothermicdev
Copy link
Collaborator

Important

24.11 FREEZE NOVEMBER 7TH: Non-bugfix PRs not ready by this date will wait for 25.02.

Checklist

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes.
  • Documentation has been reviewed and updated as needed.
  • Related issues have been listed and linked, including any that this PR closes.

This builds on #7768 to make our gossip seeker more capable at syncing the network graph. In the case of nodes with a small number of peers, it may be impossible to get a reasonably complete enough gossip sync. In this case, choosing a node at random to try to connect and request additional gossip seems reasonable.

Testing from 14 Sept 2024 found that only 30% of 60 nodes tested were able to sync >90% of the current network graph. This should significantly aid small nodes in syncing their gossip.

Changelog-Changed: Gossipd can now request connections to additional nodes for improved gossip sync

@endothermicdev endothermicdev added this to the v24.11 milestone Nov 8, 2024
@endothermicdev endothermicdev force-pushed the seeker-autoconnect branch 4 times, most recently from 5c15ef8 to bb81f52 Compare November 13, 2024 22:24
Copy link
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs sync code. That will probably need a dev command to trigger it manually for testing, too...

Comment on lines +974 to +981
static struct node_and_addrs *get_random_node(const tal_t *ctx,
struct seeker *seeker)
{
struct gossmap *gossmap = gossmap_manage_get_gossmap(seeker->daemon->gm);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is better done by providing a "struct gossmap_node *gossmap_random_node(const struct gossmap *map);" in gossmap.c, which can use nodeidx_htable_pick. Then iterate until NULL, or you find one with an address.

Comment on lines +1036 to +1030
if(!random_node->addrs || tal_count(random_node->addrs) == 0)
status_broken("seeker: random gossip node missing address");

u8 *msg = towire_gossipd_connect_to_peer(NULL, random_node->id,
random_node->addrs);
daemon_conn_send(seeker->daemon->master, take(msg));
tal_free(random_node);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify: don't hand addrs here, since connectd already has that logic internally.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it doesn't yet, that patch is coming.

gossipd/seeker.c Outdated
/* FIXME: store random sync peer and don't select next time. */
status_peer_debug(&random_peer->id,
"seeker: chosen for periodic full sync");
normal_gossip_start(seeker,random_peer, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do a sync, it just starts gossip from this point.

You need to send query_channel_range, then query_short_channel_ids for any we didn't know about in the reply.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ask_for_all flag is set, so we should already get everything without a query. Are you saying we should just be streaming from current and then performing a channel query to fill in gaps? I think the benefit of just asking for all is we backfill any channel updates we may have missed (for channels we already know about.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this will work with LND nodes, but it will with CLN nodes...

@rustyrussell
Copy link
Contributor

I'm going to rebase and fix these nits up, since I have another PR which will clash...

@rustyrussell rustyrussell force-pushed the seeker-autoconnect branch 2 times, most recently from b6fb14b to 9cbe0b7 Compare November 21, 2024 04:52
rustyrussell and others added 9 commits November 22, 2024 14:02
…u1) 12.3.0

```
common/test/run-splice_script.c: In function ‘main’:
common/test/run-splice_script.c:349:17: error: ‘%.*s’ directive argument is null [-Werror=format-overflow=]
  349 |         printf("%.*s\n", (int)len, str);
      |                 ^~~~
cc1: all warnings being treated as errors
make: *** [Makefile:297: common/test/run-splice_script.o] Error 1
make: *** Waiting for unfinished jobs....
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This will let us change the default gossipers at runtime
Gossipd uses this to ask lightningd -> connectd to initiate
a connection to a new gossip peer.  This can be used when
there are insufficient peers already connected to gossip with.

Changelog-Changed: Gossipd can now request connections to additional nodes for improved gossip sync
This does not validate a node announcement and address, but it
does select a node at random from the gossmap and asks lightningd
to attempt a connection to it.
Changelog-added: Added option --autoconnect-seeker-peers, allowing seeker to reach out to new nodes for additional gossip.
It's easy for gossmap, since it has access to the htable.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…ap_manage.

We don't want to to refresh the gossmap internally: this could invalidate the
gossmap held by the current callers.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell
Copy link
Contributor

Rebased on flake fixes

@rustyrussell rustyrussell merged commit d5c0d21 into ElementsProject:master Nov 22, 2024
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants