
Retrieval API Based Filter Configs #1020

Closed
wants to merge 10 commits

Conversation

hannahhoward
Collaborator

@hannahhoward hannahhoward commented Nov 30, 2022

Goal

Per ongoing discussions with @s0nik42, we're updating the endpoint that previously transmitted peer allow/deny lists to send additional agreed upon configuration for retrieval.

The API structure is now as follows:

{
  "AllowDenyList": {
    "Type": "allowlist",  // "allowlist" or "denylist"
    "PeerIDs": ["Qma9T5YraSnpRDZqRR4krcSJabThc8nwZuJV3LercPHufi", "QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N"]
  },
  "UnderMaintenance": true,
  "StorageProviderLimits": {
    "Bitswap": {
      "SimultaneousRequests": 10,
      "SimultaneousRequestsPerPeer": 4,
      "MaxBandwidth": "10mb"  // human readable size metric, per second
    }
  }
}
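
For illustration, here is a minimal sketch of how booster-bitswap might decode this payload in Go; the package, type, and field names below are assumptions that simply mirror the JSON above, not the actual types in this PR:

package remoteconfig // hypothetical package, for illustration only

import "encoding/json"

// RemoteConfig mirrors the JSON structure above (names are assumptions).
type RemoteConfig struct {
	AllowDenyList         AllowDenyList         `json:"AllowDenyList"`
	UnderMaintenance      bool                  `json:"UnderMaintenance"`
	StorageProviderLimits StorageProviderLimits `json:"StorageProviderLimits"`
}

type AllowDenyList struct {
	Type    string   `json:"Type"` // "allowlist" or "denylist"
	PeerIDs []string `json:"PeerIDs"`
}

type StorageProviderLimits struct {
	Bitswap BitswapLimits `json:"Bitswap"`
}

type BitswapLimits struct {
	SimultaneousRequests        uint64 `json:"SimultaneousRequests"`
	SimultaneousRequestsPerPeer uint64 `json:"SimultaneousRequestsPerPeer"`
	MaxBandwidth                string `json:"MaxBandwidth"` // human readable size per second, e.g. "10mb"
}

// ParseRemoteConfig decodes the endpoint's JSON response body.
func ParseRemoteConfig(body []byte) (*RemoteConfig, error) {
	var cfg RemoteConfig
	if err := json.Unmarshal(body, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}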

Implementations

  • uses some newly added bitswap APIs to get the information needed to manage limits
  • implements a bandwidth measurement tool (see the sketch after this list)
  • updates multifilter to pass additional information on to individual filters
  • renames the peer filter to the remote config filter
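
As a rough illustration of the bandwidth measurement tool mentioned above, here is a minimal sketch of a time-windowed meter; the names and the exact windowing strategy are assumptions, not the code in this PR:

package meter // hypothetical package, for illustration only

import (
	"sync"
	"time"
)

// BandwidthMeter records transferred byte counts and reports the average rate
// over a sliding time window.
type BandwidthMeter struct {
	mu      sync.Mutex
	window  time.Duration
	samples []sample
}

type sample struct {
	at    time.Time
	bytes uint64
}

func NewBandwidthMeter(window time.Duration) *BandwidthMeter {
	return &BandwidthMeter{window: window}
}

// Record notes that n bytes were transferred just now.
func (m *BandwidthMeter) Record(n uint64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.samples = append(m.samples, sample{at: time.Now(), bytes: n})
}

// Rate returns bytes per second averaged over the configured window,
// dropping samples that have aged out.
func (m *BandwidthMeter) Rate() float64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	cutoff := time.Now().Add(-m.window)
	var total uint64
	kept := m.samples[:0]
	for _, s := range m.samples {
		if s.at.After(cutoff) {
			total += s.bytes
			kept = append(kept, s)
		}
	}
	m.samples = kept
	return float64(total) / m.window.Seconds()
}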

For Discussion

Blocked on #994, ipfs/go-bitswap#596

provide a simple tool for a time window'd measurement of bandwidth
add additional configs for the peer filter, update interfaces
implement bitswap setup between go-bitswap PR + other changes to support full configs
rename peer filter to remote config filter
@dirkmc
Contributor

dirkmc commented Dec 5, 2022

I'd suggest we add a version field to the config API structure

Remove need for custom bitswap while providing a more correct simultaneous request limit
@hannahhoward
Collaborator Author

@dirkmc for HTTP I tend to expect versioning to be in:

  • URL
  • Query params
  • Accept Headers

Maybe we can discuss this as a separate ticket between now and the final release.

@hannahhoward
Collaborator Author

closed in favor of #1030

@dirkmc
Contributor

dirkmc commented Dec 8, 2022

I'd suggest we keep this pull request open until it's approved. It makes it easier to discuss the PR.

I'm not sure that checking wantlists and sent messages (block / have / dont-have) maps exactly to "simultaneous requests". For example, in certain circumstances, if the server doesn't have a block it simply doesn't respond. The problem is that bitswap is not a request/response protocol.

For this reason it may not make sense to have a limit on the number of simultaneous requests. If the goal is to make sure that one peer doesn't hog all the resources, bitswap is already designed to distribute blocks in a manner that is fair to all peers. So I'd suggest we just drop the simultaneous request limits (but keep the bandwidth limit).

Because bitswap is not a request/response protocol, and because the client can send a request for many blocks in a single message, I'm not sure that it makes sense to reject requests because of bandwidth limits.
For example, if client A asks for 100 blocks, and the bandwidth reaches the limit, and then client B asks for 1 block, it doesn't seem fair to reject client B's request.

I'd suggest that we don't reject requests at all - instead we could limit the data rate at which bitswap serves blocks so that it doesn't hog all the server's resources. As mentioned above, bitswap already has mechanisms to make sure that each peer is served its fair share of data, so this is more about preventing bitswap from crowding out other retrieval endpoints (eg graphsync).

It may make more sense to implement bandwidth rate-limiting at the libp2p level, as libp2p has knowledge of how much data each protocol is sending, without needing to know about the implementation of the protocol itself.
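
Just to make the "limit the data rate" option concrete, here is a minimal sketch using golang.org/x/time/rate; the throttleSend hook and the numbers are assumptions for illustration, not something this PR implements:

package bwlimit // hypothetical package, for illustration only

import (
	"context"

	"golang.org/x/time/rate"
)

// limiter caps outgoing bitswap data at roughly 10 MiB/s with a 2 MiB burst
// (illustrative numbers; in practice they would come from the remote config).
var limiter = rate.NewLimiter(rate.Limit(10<<20), 2<<20)

// throttleSend blocks until sending blockSize more bytes fits within the rate,
// so responses are slowed down rather than rejected. blockSize must not exceed
// the burst size, or WaitN returns an error.
func throttleSend(ctx context.Context, blockSize int) error {
	return limiter.WaitN(ctx, blockSize)
}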

@dirkmc
Contributor

dirkmc commented Dec 8, 2022

With respect to versioning, I'm not sure I was clear in what I suggested. Copying from the PR description, the api structure is now as follows:

{
  "AllowDenyList": {
    "Type": "allowlist",  // "allowlist" or "denylist"
    "PeerIDs": ["Qma9T5YraSnpRDZqRR4krcSJabThc8nwZuJV3LercPHufi", "QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N"]
  },
  "UnderMaintenance": true,
  "StorageProviderLimits": {
    "Bitswap": {
      "SimultaneousRequests": 10,
      "SimultaneousRequestsPerPeer": 4,
      "MaxBandwidth": "10mb"  // human readable size metric, per second
    }
  }
}

I'm suggesting we add a version field to this structure, so that if (for example) we add "Graphsync" to "StorageProviderLimits" we can update the version, and it's easier for the process that parses this structure to understand which fields are expected.
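
For example (a sketch, not part of this PR), the parser could peek at a top-level Version field first and refuse payloads it doesn't understand:

package remoteconfig // hypothetical, continuing the sketch above

import (
	"encoding/json"
	"fmt"
)

// CheckVersion inspects a top-level Version field before full parsing, so
// unknown schema revisions fail loudly instead of being silently misread.
func CheckVersion(body []byte) error {
	var probe struct {
		Version int `json:"Version"`
	}
	if err := json.Unmarshal(body, &probe); err != nil {
		return err
	}
	switch probe.Version {
	case 0, 1: // 0 covers older payloads from before the field existed
		return nil // decode with the current struct
	default:
		return fmt.Errorf("unknown config version %d", probe.Version)
	}
}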

@hannahhoward
Collaborator Author

hannahhoward commented Dec 8, 2022

@dirkmc

  • I looked into limiting bandwidth at the libp2p level. Libp2p has limits for connections and streams, but it actually doesn't have the kind of per-protocol bandwidth limiting you're talking about.

  • If we simply limit outgoing bandwidth without rejecting requests, then we'd end up in a situation where no one gets any responses quickly, with the task queue just growing (though I guess the client might cancel tasks on their own?). It's certainly possible to use the MessageSent hook rather than libp2p as a way to slow down outgoing bandwidth usage if we wanted to go that route, but I think rejecting requests ultimately has better performance implications.

  • Re: request limits -- you raise a good point about not sending blocks at all. I saw the global SendDontHave (set to true on our instance) but I didn't see the entry-based SendDontHave. I guess that's the case where it wouldn't send anything. There are probably ways around that if we need them.

    • The not-so-smart fix is just adding timeout-based deletion of tracked requests in the booster-bitswap code (see the sketch after this list)
    • If we're concerned with "correctness" -- the number of tasks in the peer's task queue is probably the best representation of "simultaneous requests" -- that would require some modifications to that queue but could be done.
  • Re: removing simultaneous limits entirely -- it seems like we should have involved you earlier in the discussions with @s0nik42 and Slingshot Evergreen, and I'm not sure of the best way to proceed. The SPs want these limits: while the peer task queue provides a mechanism for "fairness", it doesn't provide what SPs are actually concerned about, which is hard-limiting their resource usage. Again, while I think we had hoped libp2p could help here, it really just understands connections and streams. Especially because Bitswap isn't request/response, that doesn't help much.
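
To make the timeout-based option above concrete, here is a rough sketch of expiring tracked requests; all names are hypothetical and this is not code from the PR:

package reqtrack // hypothetical package, for illustration only

import (
	"sync"
	"time"
)

// requestTracker approximates "simultaneous requests" by counting entries that
// have been touched within a TTL; stale entries are swept out periodically.
type requestTracker struct {
	mu       sync.Mutex
	ttl      time.Duration
	lastSeen map[string]time.Time // key could be peer ID + CID, for example
}

func newRequestTracker(ttl time.Duration) *requestTracker {
	return &requestTracker{ttl: ttl, lastSeen: make(map[string]time.Time)}
}

// Touch records activity for a request.
func (t *requestTracker) Touch(key string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.lastSeen[key] = time.Now()
}

// Sweep drops entries that haven't been touched within the TTL; call it from a
// time.Ticker loop.
func (t *requestTracker) Sweep() {
	t.mu.Lock()
	defer t.mu.Unlock()
	cutoff := time.Now().Add(-t.ttl)
	for key, seen := range t.lastSeen {
		if seen.Before(cutoff) {
			delete(t.lastSeen, key)
		}
	}
}

// ActiveCount is the current approximation of simultaneous requests.
func (t *requestTracker) ActiveCount() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.lastSeen)
}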

@hannahhoward hannahhoward reopened this Dec 8, 2022
@hannahhoward hannahhoward marked this pull request as ready for review December 8, 2022 17:19
@dirkmc
Contributor

dirkmc commented Dec 8, 2022

If we simply limit outgoing bandwidth without rejecting requests, then we'd end up in a situation where no one gets any responses quickly

Bitswap is designed to handle this use case - because it operates in a peer-to-peer environment, it needs to be resilient to getting spammed by a single peer. Essentially it will distribute blocks evenly amongst peers, so that no one peer can clog up the queue.

it doesn't provide what SPs are actually concerned about, which is hard-limiting their resource usage. Again, while I think we had hoped libp2p could help here, it really just understands connections and streams

I understand that SPs want to limit resource usage, that makes sense 👍
As explained above, I don't think the limit should be applied to the number of concurrent requests, because of the nature of how bitswap works.
On the other hand I do think we should apply a bandwidth limit. If this is not currently available in libp2p I suggest we either add it there, or add a wrapper somewhere in our code or a new library.

I realize that's going to take time, so I suggest we punt on this requirement. Let's start testing booster-bitswap with SPs and work on bandwidth limiting in the meantime.

@s0nik42

s0nik42 commented Dec 8, 2022

To clarify about simultaneous requests, the correct phrasing should be "simultaneous TCP connections". So it's not per request, but rather per simultaneously active TCP connection.
I confirmed that during the SP call yesterday. Does that make a difference on your side?

  • Limiting the bandwidth at the libp2p level is probably good.

What SPs want to protect themselves from:

  • Running out of server resources
  • Running out of network equipment resources (router/firewall)
  • Having no control over how much bandwidth of the physical access link is allocated to retrieval

@dirkmc
Contributor

dirkmc commented Dec 8, 2022

Thanks Julien, that makes sense 👍
We already have a Connection Manager in libp2p that should be used for this purpose.
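
For reference, wiring up go-libp2p's connection manager looks roughly like this; the watermarks and grace period are illustrative numbers, not values agreed in this thread:

package connlimits // hypothetical package, for illustration only

import (
	"time"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/p2p/net/connmgr"
)

func newHostWithConnLimits() (host.Host, error) {
	// Trim back toward 100 connections once we exceed 400, but give new
	// connections a one-minute grace period before they can be pruned.
	cm, err := connmgr.NewConnManager(100, 400, connmgr.WithGracePeriod(time.Minute))
	if err != nil {
		return nil, err
	}
	return libp2p.New(libp2p.ConnectionManager(cm))
}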

@hannahhoward
Collaborator Author

hannahhoward commented Dec 8, 2022

To clarify about simultaneous requests, the correct phrasing should be "simultaneous TCP connections". So it's not per request, but rather per simultaneously active TCP connection.

OK that's helpful, we can certainly use this metric instead, and it's much simpler so that's great.

It's worth noting that the ConnManager currently won't let you change limits after it's constructed, so a change on CIDGravity wouldn't take effect until you reboot booster-bitswap, unless we make a change in go-libp2p's conn manager.

It also won't work the way you expect if you run booster-bitswap proxied through boostd; connection limiting is then just boostd's job.

The same issues apply to the resource manager if we went that route (it only sets limits at startup, and won't work the way you expect in proxied mode).

Maybe we could do this by implementing a ConnectionGater instead, if we want limits that can change while running? There's a rough sketch of that below.
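
For the sake of discussion, a sketch of what a ConnectionGater with a runtime-adjustable inbound connection cap could look like; the type names and the counting strategy are assumptions, not implemented code:

package gater // hypothetical package, for illustration only

import (
	"sync/atomic"

	"github.com/libp2p/go-libp2p"
	"github.com/libp2p/go-libp2p/core/control"
	"github.com/libp2p/go-libp2p/core/host"
	"github.com/libp2p/go-libp2p/core/network"
	"github.com/libp2p/go-libp2p/core/peer"
	ma "github.com/multiformats/go-multiaddr"
)

// dynamicGater refuses new inbound connections once the host has maxConns open.
// The limit can be swapped at runtime when a new remote config arrives.
type dynamicGater struct {
	maxConns int64
	host     host.Host // set once at startup, before the host accepts connections
}

func (g *dynamicGater) SetLimit(n int64) { atomic.StoreInt64(&g.maxConns, n) }

// InterceptAccept runs for every inbound connection before it is upgraded.
func (g *dynamicGater) InterceptAccept(network.ConnMultiaddrs) bool {
	if g.host == nil {
		return true
	}
	return int64(len(g.host.Network().Conns())) < atomic.LoadInt64(&g.maxConns)
}

// The remaining ConnectionGater methods allow everything.
func (g *dynamicGater) InterceptPeerDial(peer.ID) bool               { return true }
func (g *dynamicGater) InterceptAddrDial(peer.ID, ma.Multiaddr) bool { return true }
func (g *dynamicGater) InterceptSecured(network.Direction, peer.ID, network.ConnMultiaddrs) bool {
	return true
}
func (g *dynamicGater) InterceptUpgraded(network.Conn) (bool, control.DisconnectReason) {
	return true, 0
}

func newGatedHost() (host.Host, *dynamicGater, error) {
	g := &dynamicGater{maxConns: 200} // illustrative default
	h, err := libp2p.New(libp2p.ConnectionGater(g))
	if err != nil {
		return nil, nil, err
	}
	g.host = h // simplified: a real implementation would avoid this small race
	return h, g, nil
}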

Limiting the bandwidth at the libp2p level is probably good.

OK, so we can certainly explore doing it through libp2p, but it looks like a longer discussion that will take a while, based on some early convos with the libp2p folks.

The alternative is just to block in the MessageSent hook I'm already using, to limit bandwidth. It's an indirect approach but it has a much shorter implementation path. In fact, we use this exact approach to backpressure selector execution in go-graphsync, so I'm pretty sure there's some code I can copy that will do almost exactly what we want.

@dirkmc
Contributor

dirkmc commented Dec 8, 2022

I think something like the connection gater approach sounds good 👍

Blocking the message sent hook sounds like a bit of a hack...

@hannahhoward
Collaborator Author

I mean, I'm not sure what would be wrong with blocking the message sent hook. It would work. Again, we use these mechanisms in graphsync.

@hannahhoward
Collaborator Author

Another option, though, is to simply tell SPs who care to put boost on a public node and configure bandwidth usage on a firewall.

@hannahhoward
Collaborator Author

Closing pending future discussion

@nonsense nonsense deleted the feat/api-filter-configs branch January 17, 2023 15:29