Skip to content
This repository has been archived by the owner on Sep 4, 2024. It is now read-only.

updates and disables the chain tx if unable to connect to an rpc #33

Conversation

dmikey
Copy link
Contributor

@dmikey dmikey commented Jan 25, 2024

going to smash this one through, but removes the requirement to connect to the app chain when running the node.

{"level":"info","component":"node","request":"1340ceab-d04e-45fa-8849-5d8a537b04a5","function":"bafybeiaugwh3mktzbnudurzk7jcyvmdhyy6jnairiu2phuf2t7c7orowjq","node_count":1,"cluster_size":1,"responded":1,"time":"2024-01-25T17:51:42-06:00","message":"received execution responses"}
{"level":"debug","time":"2024-01-25T17:51:46-06:00","message":"inference results would have been submitted to chain"}

@dmikey dmikey requested review from xmariachi and Maelkum January 25, 2024 23:52
Copy link

linear bot commented Jan 25, 2024

ORA-481 Remove strict dependency between appchain and compute node

Data scientists should not have to be bothered with all the devops involved in running the appchain. They should just have to worry about their model. However our appchain and compute node are currently coupled in our docker-compose. This shouldn't be the case!

Task here is to split these apart! And also communicate it with the team so minimal surprises. See this convo for more.

@dmikey dmikey merged commit 6c51ca3 into main Jan 25, 2024
1 check passed
@dmikey dmikey deleted the derek/ora-481-remove-strict-dependency-between-appchain-and-compute-node-2 branch January 25, 2024 23:57
Copy link
Contributor

@Maelkum Maelkum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving a few comments, though some may be outdated by another PR, which I am going to look at next :)

userHomeDir, err := os.UserHomeDir()
if err != nil {
config.Logger.Warn().Err(err).Msg("could not get home directory for app chain")
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these should wrap error and return, and uppermost function should log. Right now, the function is ran as a goroutine and its return value never checked.

Upper function should be something like:

func (ap *AppChain) start(ctx context.Context) {
	go func() {
		err := ap.startClient(ctx, ap.Config)
		if err != nil {
			ap.Config.Logger.Error().Err(err).Msg("appchain client start failed")
		}
	}()
}

(with logger outside of the config)

}

DefaultNodeHome := filepath.Join(userHomeDir, ".uptd")
client, _ := cosmosclient.New(ctx, cosmosclient.WithAddressPrefix(ap.Config.AddressPrefix), cosmosclient.WithHome(DefaultNodeHome))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's no chain available, this will fail first. Error for me was error while requesting node 'http://localhost:26657': post failed: Post \"http://localhost:26657\": dial tcp 127.0.0.1:26657: connect: connection refused.

Few issues I think:

  1. Right now SubmitTx is never initialized to true? It starts off as and will forever be false.
  2. I think we have some distinct scenarios:
    a) chain being there or not
    b) me wanting to use chain or not

Combination of those two is possible:
a) chain is there but I don't want to use it
b) chain is there, and I do want to use it. If I can't - it's an error
c) chain is not there, and I don't want to use it
d) chain is not there, and I do want to use it - error

Scenarios b) and d) I think are clear errors that should result in stopping the process, not silently continuing. IMO, not even logging and continuing.

client, _ := cosmosclient.New(ctx, cosmosclient.WithAddressPrefix(ap.Config.AddressPrefix), cosmosclient.WithHome(DefaultNodeHome))

// this is terrible, no isConnected as part of this code path
if(len(client.Context().ChainID) < 1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code path doesn't clear the SubmitTx boolean. It is never set in the first place either, but if that is fixed, this should also clear it.

if (!queryIsNodeRegistered(ctx, client, address, config)) {
// not registered, register the node
registerWithBlockchain(ctx, client, account, config)
if(config.SubmitTx) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug because of how this function is used - config is a function parameter, while there's also ap.Config used.

This line should probably use ap.Config.SubmitTx (that's the one that's changed in the upper conditions).

Also, if you invert the function, you can get rid of the nesting:

	if !ap.Config.SubmitTx {
		config.Logger.Warn().Err(err).Msg("could not retrieve allora blockchain address, transactions will not be submitted to chain")
		return nil
	}

	if !queryIsNodeRegistered(ctx, client, address, config) {
		// not registered, register the node
		registerWithBlockchain(ctx, client, account, config)
	}
	

// Might be disabled if so we should log out
if(appChainClient.Config.SubmitTx) {
// don't block the return to the consumer to send these to chain
go sendResultsToChain(ctx, a, appChainClient, req, res)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bug.. You want to do this async, in the background, but you use the echo.Context as a context. In the next function you use ctx.Request().Context(). From the docs:

// For incoming server requests, the context is canceled when the
// client's connection closes, the request is canceled (with HTTP/2),
// or when the ServeHTTP method returns.

So, if the client drops off OR - it simply received our response, is content and closes the connection => this context gets canceled, and we may not complete whatever it is we set out to do. This should be context.Background() if we want it truly to run in the background.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants