Release: Booster bitswap #1030

Merged
merged 32 commits into from
Jan 17, 2023

hannahhoward (Collaborator) commented Dec 5, 2022

Goals

This consolidates #1020 #994 and #967 as the new features for the production release of booster bitswap.

Guide To Booster-Bitswap

(this is a v0 of booster-bitswap docs)

booster-bitswap is a new binary that is run alongside the boostd market process in order to serve retrievals over Bitswap.

Currently, there is no payment method in the new binary but we do support a number of tools for managing a production grade Bitswap server for your SP's content.

Booster Bitswap Setup

Booster Bitswap Initialization And Demo

To run booster-bitswap, first build the binary:

make booster-bitswap

Then, initialize booster-bitswap:

booster-bitswap init

Record the peer ID printed by booster-bitswap init; we will need it later.

Collect the boost API Info:

export ENV_BOOST_API_INFO=`boostd auth api-info --perm=admin`

export BOOST_API_INFO=`echo $ENV_BOOST_API_INFO | awk '{split($0,a,"="); print a[2]}'`
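The awk step above keeps only the text between the first and second "=" in the output. As a minimal sketch of the same extraction in Go, assuming the command prints a single NAME=value line: the helper name apiInfoValue and the placeholder sample line are illustrative and not part of boost. Unlike the awk version, this sketch keeps any further "=" characters that appear inside the value.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// apiInfoValue returns everything after the first "=" in a NAME=value line.
// Hypothetical helper: it only mirrors the awk step shown above.
func apiInfoValue(line string) (string, error) {
	_, value, ok := strings.Cut(strings.TrimSpace(line), "=")
	if !ok || value == "" {
		return "", fmt.Errorf("expected NAME=value, got %q", line)
	}
	return value, nil
}

func main() {
	// ENV_BOOST_API_INFO is the variable exported in the step above.
	line := os.Getenv("ENV_BOOST_API_INFO")
	if line == "" {
		// Placeholder so the sketch runs on its own; not a real token or address.
		line = "NAME=example-token:/ip4/127.0.0.1/tcp/1234/http"
	}
	value, err := apiInfoValue(line)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println(value)
}
```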

We can now run booster-bitswap by running:

booster-bitswap run --api-boost=$BOOST_API_INFO

By default, booster-bitswap runs on port 8888.

We can demonstrate fetching over bitswap by running:

booster-bitswap fetch /ip4/127.0.0.1/tcp/8888/p2p/{peerID} {rootCID} outfile.car

Where peerID is the peer ID recorded when you ran booster-bitswap init and rootCID is the root CID of data known to be stored on your SP.

Configuring Boost To Serve Retrievals Publicly

While the above setup demonstrates booster-bitswap working, it provides no mechanism for potential retrieval clients to discover and reach your booster-bitswap instance from the internet in order to retrieve over Bitswap. In order to set this up, we need to configure boostd itself.

There are two primary "modes" for exposing booster-bitswap to the internet:

  1. In "private mode", the booster-bitswap peer ID is not publicly accessible from the internet. Instead, public Bitswap traffic goes to boostd itself, which then acts as a reverse proxy, forwarding that traffic on to booster-bitswap. This is similar to the way one might configure Nginx as a reverse proxy for an otherwise private web server. Private mode is simpler to set up but may put more load on boostd, since boostd acts as a protocol proxy.
  2. In "public mode", you configure your public internet firewall to forward traffic directly to booster-bitswap, then configure boost so it can announce the public address of booster-bitswap to the network indexer. Public mode offers greater flexibility and performance; you can even set up booster-bitswap to run over its own separate internet connection. However, it requires additional configuration and changes to your overall network infrastructure.

The diagram below shows the various ways to set up booster-bitswap:

[Diagram: booster-bitswap]

To set up these modes, you'll need to make some changes to Boost's config file and restart boostd.

First, edit ~/.boost/config.toml and set the peer ID for bitswap:

[DealMaking]
  BitswapPeerID = "{peer ID for booster-bitswap you recorded earlier}"

Setting this peer ID tells boostd that you're going to be running booster-bitswap and that it should start announcing the capability to serve Bitswap retrievals to the network indexer.

If you've decided to run in private mode, you don't need to make any more changes. Restart boostd and then also restart booster-bitswap with the command:
booster-bitswap run --api-boost=$BOOST_API_INFO --proxy={boostd multiaddress}

You can generally get a boostd multiaddress by running boostd net listen and using any of the returned addresses.

If you want to run in public mode, first decide on a port for booster-bitswap and configure your public firewall to forward requests on that port to booster-bitswap. Then set the following additional values in your boost config:

[DealMaking]
  BitswapPeerID = "{peer ID for booster-bitswap you recorded earlier}"
  BitswapPublicAddresses = ["/ip4/{booster-bitswap public IP}/tcp/{booster-bitswap public port}"]
  BitswapPrivKeyFile = "{path to libp2p private key file for booster bitswap}"

The libp2p private key file for booster-bitswap can generally be found at ~/.booster-bitswap/libp2p.key.

The reason boost needs to know the public multiaddresses and libp2p private key for booster-bitswap is so it can properly announce these records to the network indexer.

Now restart boostd. Then restart booster-bitswap, adding the --port argument if you've set up your firewall to forward to something other than the default booster-bitswap port of 8888.
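To make the relationship between the three config keys concrete, here is a small sketch in Go of how the two modes follow from the values described above. The DealMaking struct mirrors only the keys named in this guide, and the Mode method is a hypothetical helper written for illustration; it is not how boostd itself reads its config.

```go
package main

import (
	"errors"
	"fmt"
)

// DealMaking mirrors only the three [DealMaking] keys discussed above.
type DealMaking struct {
	BitswapPeerID          string
	BitswapPublicAddresses []string
	BitswapPrivKeyFile     string
}

// Mode is a hypothetical helper that reports which setup the values describe:
// "disabled" (no peer ID), "private" (peer ID only) or "public"
// (peer ID, public addresses and private key file).
func (d DealMaking) Mode() (string, error) {
	switch {
	case d.BitswapPeerID == "":
		return "disabled", nil
	case len(d.BitswapPublicAddresses) == 0:
		return "private", nil
	case d.BitswapPrivKeyFile == "":
		return "", errors.New("public addresses are set but BitswapPrivKeyFile is empty")
	default:
		return "public", nil
	}
}

func main() {
	// Placeholder values for illustration only.
	cfg := DealMaking{
		BitswapPeerID:          "example-peer-id",
		BitswapPublicAddresses: []string{"/ip4/203.0.113.10/tcp/8888"},
		BitswapPrivKeyFile:     "/home/user/.booster-bitswap/libp2p.key",
	}
	mode, err := cfg.Mode()
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("mode:", mode)
}
```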

Booster Bitswap Management

Booster Bitswap provides a number of performance and safety tools for managing a production grade bitswap server without overloading your infrastructure.

BadBits filtering

Booster-bitswap is automatically set up to deny all requests for CIDs that are on the BadBits Denylist. This is on by default and there is currently no way to turn it off.

Request Filtering

Booster Bitswap provides a number of controls for filtering requests and limiting resource usage. These are expressed in a JSON configuration as follows:

{
  "AllowDenyList": { // list of peers to either deny or allow (denying all others)
    "Type": "allowlist", // "allowlist" or "denylist"
    "PeerIDs": [
      "Qma9T5YraSnpRDZqRR4krcSJabThc8nwZuJV3LercPHufi",
      "QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N"
    ]
  },
  "UnderMaintenance": false, // when set to true, denies all requests
  "StorageProviderLimits": {
    "Bitswap": {
      "SimultaneousRequests": 100, // bitswap block requests served at the same time across peers
      "SimultaneousRequestsPerPeer": 10, // bitswap block requests served at the same time for a single peer
      "MaxBandwidth": "100mb" // human readable size metric, per second
    }
  }
}
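As a rough sketch of how this format could be decoded and applied in Go: the type names (PeerList, BitswapLimits, RetrievalConfig) and the Allows method are hypothetical and only mirror the JSON keys shown above; the real booster-bitswap types may differ. Two assumptions are baked in: an unknown "Type" value is rejected (fail closed), and the input is a comment-free copy of the config, since Go's encoding/json does not accept // comments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical types mirroring the JSON keys shown above.
type PeerList struct {
	Type    string // "allowlist" or "denylist"
	PeerIDs []string
}

type BitswapLimits struct {
	SimultaneousRequests        uint64
	SimultaneousRequestsPerPeer uint64
	MaxBandwidth                string // e.g. "100mb"; unit parsing is out of scope here
}

type RetrievalConfig struct {
	AllowDenyList         *PeerList // nil when the key is absent: no peer filtering
	UnderMaintenance      bool
	StorageProviderLimits struct {
		Bitswap BitswapLimits
	}
}

// ParseConfig decodes a comment-free copy of the config.
func ParseConfig(raw []byte) (RetrievalConfig, error) {
	var cfg RetrievalConfig
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// Allows reports whether a request from peerID passes the maintenance flag
// and the allow/deny list.
func (c RetrievalConfig) Allows(peerID string) bool {
	if c.UnderMaintenance {
		return false
	}
	if c.AllowDenyList == nil {
		return true
	}
	listed := false
	for _, id := range c.AllowDenyList.PeerIDs {
		if id == peerID {
			listed = true
			break
		}
	}
	switch c.AllowDenyList.Type {
	case "allowlist":
		return listed
	case "denylist":
		return !listed
	default:
		return false // assumption of this sketch: fail closed on an unknown Type
	}
}

func main() {
	raw := []byte(`{
	  "AllowDenyList": {"Type": "allowlist", "PeerIDs": ["Qma9T5YraSnpRDZqRR4krcSJabThc8nwZuJV3LercPHufi"]},
	  "UnderMaintenance": false,
	  "StorageProviderLimits": {"Bitswap": {"SimultaneousRequests": 100, "SimultaneousRequestsPerPeer": 10, "MaxBandwidth": "100mb"}}
	}`)
	cfg, err := ParseConfig(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("listed peer allowed:", cfg.Allows("Qma9T5YraSnpRDZqRR4krcSJabThc8nwZuJV3LercPHufi"))
	fmt.Println("other peer allowed:", cfg.Allows("QmYyQSo1c1Ym7orWxLYvCrM2EmxFTANf8wXmmE7DWjhx5N"))
}
```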

There are two ways you can manage this configuration. The first is to create and manually edit ~/.booster-bitswap/retrievalconfig.json. If you do this, you will need to restart booster-bitswap every time you edit the config. Also note that all configs are optional and absent parameters generally default to no filtering at all for the given parameter.

The second, and probably more user-friendly, way to work with your retrieval config is to fetch it from a remote HTTP API, possibly one provided by a third-party configuration tool like CIDGravity. To do this, start booster-bitswap with the --api-filter-endpoint {url} option, where {url} is the HTTP URL for an API serving the above JSON format. Optionally, add --api-filter-auth {authheader} if you need to pass a value for the HTTP Authorization header with your API. When you set up an API endpoint, booster-bitswap will update its local configuration from the API every five minutes, so you won't have to restart booster-bitswap to make a change. Also, be aware that the remote config will overwrite the local config rather than merge with it.

Metrics, Profiling, and Tracing

If you need observability into the performance of your booster-bitswap instance, booster-bitswap can be set up to export pprof data, Prometheus metrics, and Jaeger traces. See the booster-bitswap run CLI documentation for command line arguments used to manage these services.

hannahhoward and others added 25 commits November 22, 2022 20:57
uses index provider families to switch to advertising bitswap as an extended provider -- this allows
us to expose booster-bitswap directly
add peer filter parsing and processing and test combined logic
Co-authored-by: dirkmc <dirkmdev@gmail.com>
provide a simple tool for a time window'd measurement of bandwidth
add additional configs for the peer filter, update interfaces
implement bitswap setup between go-bitswap PR + other changes to support full configs
rename peer filter to remote config filter
Remove need for custom bitswap while providing a more correct simultaneous request limit
Co-authored-by: dirkmc <dirkmdev@gmail.com>
Co-authored-by: dirkmc <dirkmdev@gmail.com>

// FulfillRequest checks if a given peer is in the allow/deny list and decides
// whether to fulfill the request
func (cf *ConfigFilter) FulfillRequest(p peer.ID, c cid.Cid) (bool, error) {
Member

I think we should add some instrumentation around this function in terms of logs and metrics (fulfilled and blocked counters).

I am pretty sure someone will mess up the configuration and then it will be very hard to figure out if/why their node is not serving requests. We should be able to turn on debug logging and see why requests are blocked/passed through.

Contributor

@nonsense could you please add your comments on the original PR (I think we should keep those original PRs open until people have had a chance to comment on them, otherwise it all gets mixed together).

@brendalee
Collaborator

@hannahhoward - do you have handy the tests that you ran for this (even if it's just a list of commands)? Could be a good starting point for @LexLuthr

Comment on lines +41 to +42
func IndexerIngestTopic(netName dtypes.NetworkName) string {

Collaborator

Suggested change
func IndexerIngestTopic(netName dtypes.NetworkName) string {
func IndexerIngestTopic(netName dtypes.NetworkName) string {

var log = logging.Logger("booster-bitswap")

// BadBitsDenyList is the URL for well known bad bits list
const BadBitsDenyList string = "https://badbits.dwebops.pub/denylist.json"
Collaborator

are we able to support this denylist as a user-defined config, rather than a single centralized list?

// PeerListType is either an allow list or a deny list
type PeerListType string

// AllowList is a peer list where only the specified peers are allowed to serve retrievals
Collaborator

Suggested change
// AllowList is a peer list where only the specified peers are allowed to serve retrievals
// AllowList is a peer list where only the specified peers are allowed to retrieve content

// AllowList is a peer list where only the specified peers are allowed to serve retrievals
const AllowList PeerListType = "allowlist"

// DenyList is a peer list where the specified peers cannot serve retrievals, but all others can
Collaborator

Suggested change
// DenyList is a peer list where the specified peers cannot serve retrievals, but all others can
// DenyList is a peer list where the specified peers cannot retrieve content, but all others can

configFetcher = FetcherForHTTPEndpoint(apiFilterEndpoint, apiFilterAuth)
}
filters = append(filters, FilterDefinition{
CacheFile: filepath.Join(cfgDir, "retrievalconfig.json"),
Collaborator

do we expect cfgDir to always be writable already? is there a way for config to specify the volatile directory where these caches live? (e.g. you mount in your config directory with docker or nfs, but want written files to live in /tmp)

// If public addresses are set, Boost will announce the booster-bitswap peer id directly to the indexer as an extended provider.
BitswapPublicAddresses []string

// If operating in public mode, in order to announce booster-bitswap as an extended provider, this value must point to a
Collaborator

include instructions on how to generate such a key?

@LexLuthr
Collaborator

Test results for my functional testing

Steps

  1. Create a custom BadBits list
  2. Rebuild boost on the new shared SP with this branch and custom BadBits list
  3. Use Kubo to fetch the data over Bitswap

Tests

  1. Download data over the proxy - Successful (proxy and non-proxy)
  2. Deny BadBit - Successful (proxy and non-proxy)
  3. Deny when maintenance mode enabled - Successful (proxy and non-proxy)
  4. Deny when allow list is empty - Successful (proxy and non-proxy)
  5. Download when allow list has correct Peer ID - Successful (proxy and non-proxy)
  6. Deny when allow list has incorrect Peer ID - Successful (proxy and non-proxy)
  7. Download when deny list is empty - Successful (proxy and non-proxy)
  8. Deny when deny list has correct Peer ID - Successful (proxy and non-proxy)

I did not test the number of simultaneous connections or the bandwidth per peer, as they fall into the load-testing domain.
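For context on what such a load test would exercise, here is a minimal sketch in Go of a limiter that enforces both a global and a per-peer cap on in-flight requests, corresponding to SimultaneousRequests and SimultaneousRequestsPerPeer in the config. The Limiter type is hypothetical and is not the mechanism used in this PR; in particular, this sketch rejects requests over the limit, whereas a real server might queue them instead, and it treats a limit of zero as "no limit".

```go
package main

import (
	"fmt"
	"sync"
)

// Limiter caps in-flight requests overall and per peer.
// A limit of zero means "no limit" in this sketch.
type Limiter struct {
	mu         sync.Mutex
	total      int
	perPeer    map[string]int
	maxTotal   int
	maxPerPeer int
}

func NewLimiter(maxTotal, maxPerPeer int) *Limiter {
	return &Limiter{perPeer: make(map[string]int), maxTotal: maxTotal, maxPerPeer: maxPerPeer}
}

// TryAcquire reserves a slot for peerID and returns false if either the
// global or the per-peer limit has been reached.
func (l *Limiter) TryAcquire(peerID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.maxTotal > 0 && l.total >= l.maxTotal {
		return false
	}
	if l.maxPerPeer > 0 && l.perPeer[peerID] >= l.maxPerPeer {
		return false
	}
	l.total++
	l.perPeer[peerID]++
	return true
}

// Release frees a slot previously reserved for peerID.
func (l *Limiter) Release(peerID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.perPeer[peerID] == 0 {
		return // nothing to release for this peer
	}
	l.total--
	l.perPeer[peerID]--
	if l.perPeer[peerID] == 0 {
		delete(l.perPeer, peerID)
	}
}

func main() {
	// Tiny limits so the effect is visible: 3 requests overall, 2 per peer.
	lim := NewLimiter(3, 2)
	for _, peer := range []string{"peer-a", "peer-a", "peer-a", "peer-b", "peer-b"} {
		fmt.Printf("%s admitted: %t\n", peer, lim.TryAcquire(peer))
	}
	lim.Release("peer-a")
	fmt.Printf("peer-b admitted after release: %t\n", lim.TryAcquire("peer-b"))
}
```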

@hannahhoward
Collaborator Author

@LexLuthr have we tested indexer announcements yet?

@LexLuthr
Collaborator

> @LexLuthr have we tested indexer announcements yet?

Yes. We have indexer announcements on cid.contact with proxied Bitswap

FOUND AT
Peer Id:
12D3KooWNSRG5wTShNu6EXCPTkoH7dWsphKAPrbvQchHa5arfsDC
Multiaddress:
/ip4/209.94.92.6/tcp/24001
Protocol:
Bitswap

* custom badbits filter

* remove unused field

* don't process if badbits flag empty

* redesign badbits filter input

* take slice input

* fix filter file name
@LexLuthr
Collaborator

#1071 is a blocker for this merge.

hannahhoward force-pushed the release/booster-bitswap branch from 03fe6b5 to 720f1ad on January 14, 2023 03:21
dirkmc force-pushed the release/booster-bitswap branch from 8f82dcc to b21b5ed on January 16, 2023 17:00
dirkmc merged commit 632f923 into main on Jan 17, 2023
dirkmc deleted the release/booster-bitswap branch on January 17, 2023 14:12
6 participants