[Merged by Bors] - Simplify scrypt in systests and reduce batch size #4995

Closed
wants to merge 4 commits into from


poszu
Contributor

@poszu poszu commented Sep 11, 2023

Motivation

Nodes in systests sporadically hang during POST init. The last message is:

initialization: continue looking for a nonce	{"startPosition": 256, "batchSize": 1048576}

It hangs because the batch size is large and init runs on the CPU (it would eventually finish, but it takes a very long time). This fix changes the batch size to match labelsPerUnit for the fastnet preset (which systests use).

Changes

  • reduce batch size to 128 in the fastnet preset,
  • reduce scrypt difficulty to the lowest possible N = 2 in the fastnet preset.
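
As a rough illustration of the two changes, here is a minimal sketch. The helper names `isValidScryptN` and `unitsPerBatch` are hypothetical and are not taken from go-spacemesh or the post library. It assumes that `labelsPerUnit` is 128 in fastnet (implied by the batch size being set to 128 "to match labelsPerUnit") and relies on the scrypt rule that N must be a power of two greater than 1, which is why N = 2 is the lowest possible value.

```go
package main

import "fmt"

// isValidScryptN reports whether n is an acceptable scrypt cost parameter:
// scrypt requires N to be a power of two greater than 1.
// Hypothetical helper, not part of go-spacemesh.
func isValidScryptN(n uint64) bool {
	return n > 1 && n&(n-1) == 0
}

// unitsPerBatch returns how many space units' worth of labels a single
// batch covers. Hypothetical helper, not part of go-spacemesh.
func unitsPerBatch(batchSize, labelsPerUnit uint64) uint64 {
	if labelsPerUnit == 0 {
		return 0
	}
	return batchSize / labelsPerUnit
}

func main() {
	const labelsPerUnit = 128 // assumption: fastnet value implied by the PR text

	// Batch size seen in the hanging log line vs. the new fastnet value.
	fmt.Println(unitsPerBatch(1048576, labelsPerUnit)) // 8192
	fmt.Println(unitsPerBatch(128, labelsPerUnit))     // 1

	// N = 1 is rejected by scrypt, so N = 2 is the smallest usable value.
	fmt.Println(isValidScryptN(1)) // false
	fmt.Println(isValidScryptN(2)) // true
}
```

With the old batch size, each round of the nonce search computes as many labels as 8192 units would hold, which is slow on a CPU; with a batch of 128 each round covers a single unit.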

Test Plan

Existing tests pass

@codecov

codecov bot commented Sep 11, 2023

Codecov Report

Merging #4995 (f68e6fc) into develop (61d308f) will increase coverage by 0.0%.
Report is 14 commits behind head on develop.
The diff coverage is 94.7%.

@@           Coverage Diff           @@
##           develop   #4995   +/-   ##
=======================================
  Coverage     77.1%   77.1%           
=======================================
  Files          254     254           
  Lines        30282   30356   +74     
=======================================
+ Hits         23356   23434   +78     
+ Misses        5411    5402    -9     
- Partials      1515    1520    +5     
| Files Changed | Coverage Δ |
|---|---|
| activation/nipost.go | 81.5% <ø> (+0.5%) ⬆️ |
| activation/activation.go | 75.9% <87.5%> (+0.2%) ⬆️ |
| activation/validation.go | 86.7% <100.0%> (-0.2%) ⬇️ |
| config/presets/fastnet.go | 100.0% <100.0%> (ø) |
| node/node.go | 63.1% <100.0%> (+<0.1%) ⬆️ |

... and 25 files with indirect coverage changes

@poszu
Contributor Author

poszu commented Sep 11, 2023

bors merge

bors bot pushed a commit that referenced this pull request Sep 11, 2023

bors bot commented Sep 11, 2023

Build failed:

@poszu
Contributor Author

poszu commented Sep 11, 2023

Failed to build a docker image.

bors merge

bors bot pushed a commit that referenced this pull request Sep 11, 2023

bors bot commented Sep 11, 2023

Build failed:

@dshulyak
Contributor

{"L":"ERROR","T":"2023-09-11T21:44:12.883Z","N":"a8349.post","M":"Proof is invalid: MSB value for index: 9 doesn't satisfy difficulty: 248 > 12 (label: [130, 102, 117, 246, 44, 81, 99, 77, 106, 119, 96, 59, 92, 95, 148, 235])","node_id":"a83497fcdd409b180b56907c292cb15a213021745a5d957e7495d11101d52258","module":"post","module":"post::post_impl","file":"ffi/src/post_impl.rs","line":242}

lots of errors like the one above https://grafana.spacemesh.dev/explore?orgId=1&left=%7B%22datasource%22:%22loki-gke-dev%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22test-xwcg%5C%22%7D%20%7C%3D%20%5C%22ERROR%5C%22%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki-gke-dev%22%7D,%22editorMode%22:%22code%22%7D%5D,%22range%22:%7B%22from%22:%22now-12h%22,%22to%22:%22now%22%7D%7D

@poszu
Contributor Author

poszu commented Sep 13, 2023

bors merge

bors bot pushed a commit that referenced this pull request Sep 13, 2023

bors bot commented Sep 13, 2023

Build failed:

@poszu
Contributor Author

poszu commented Sep 13, 2023

bors merge

bors bot pushed a commit that referenced this pull request Sep 13, 2023

bors bot commented Sep 13, 2023

Build failed:

@poszu
Contributor Author

poszu commented Sep 14, 2023

bors try

bors bot added a commit that referenced this pull request Sep 14, 2023
@bors

bors bot commented Sep 14, 2023

try

Build failed:

@poszu
Contributor Author

poszu commented Sep 14, 2023

Flaky #4171

bors try

bors bot added a commit that referenced this pull request Sep 14, 2023
@bors

bors bot commented Sep 14, 2023

try

Build failed:

@poszu
Contributor Author

poszu commented Sep 14, 2023

The unit-test failure seems unrelated (flake?):

=== FAIL: hare/eligibility TestCalcEligibilityWithSpaceUnit/large_network (2.55s)
    oracle_test.go:285: diff=80 (10% of committeeSize)
    oracle_test.go:286: 
        	Error Trace:	/Users/runner/work/go-spacemesh/go-spacemesh/hare/eligibility/oracle_test.go:286
        	Error:      	"80" is not less than "80"
        	Test:       	TestCalcEligibilityWithSpaceUnit/large_network
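
The message `"80" is not less than "80"` suggests a strict less-than check that fails when the measured diff lands exactly on the bound. A minimal sketch of that boundary behaviour follows; `withinTolerance` is a hypothetical helper, and the committee size of 800 is an assumption chosen only so that 10% equals the 80 in the log. This is not the code from `oracle_test.go`.

```go
package main

import "fmt"

// withinTolerance mirrors a strict "less than" assertion:
// it fails when diff is equal to the bound.
// Hypothetical helper, not the actual test code.
func withinTolerance(diff, bound int) bool {
	return diff < bound
}

func main() {
	committeeSize := 800        // assumption: picked so that 10% is 80
	bound := committeeSize / 10 // 10% of committeeSize

	fmt.Println(withinTolerance(79, bound)) // true
	fmt.Println(withinTolerance(80, bound)) // false: 80 is not less than 80
}
```

If the diff varies between runs, a run that hits the bound exactly would fail such a check while most runs pass, which is consistent with a flaky test.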

bors try

bors bot added a commit that referenced this pull request Sep 14, 2023
@bors

bors bot commented Sep 14, 2023

try

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@poszu
Contributor Author

poszu commented Sep 14, 2023

Bors merge

bors bot pushed a commit that referenced this pull request Sep 14, 2023

bors bot commented Sep 14, 2023

Build failed:

@poszu
Contributor Author

poszu commented Sep 14, 2023

Flaky TestAccountMeshDataStream_comprehensive

Bors merge

bors bot pushed a commit that referenced this pull request Sep 14, 2023

bors bot commented Sep 14, 2023

Pull request successfully merged into develop.

Build succeeded!


bors bot changed the title from "Simplify scrypt in systests and reduce batch size" to "[Merged by Bors] - Simplify scrypt in systests and reduce batch size" on Sep 14, 2023
bors bot closed this on Sep 14, 2023
bors bot deleted the fix-occasional-init-hang-in-systest branch on September 14, 2023 17:00
4 participants