Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: only check for db update if local db is old enough #2005

Closed
wants to merge 3 commits into from

Conversation

willmurphyscode
Copy link
Contributor

Previously, grype would check for a database update on every invocation. However, this was causing slow downs in the CDN. Since new databases are typically made available every 24 hours, change grype to, by default, only check if a new database is available if the local database is more than 12 hours old.

See #1731

Previously, grype would check for a database update on every invocation.
However, this was causing slow downs in the CDN. Since new databases are
typically made available every 24 hours, change grype to, by default,
only check if a new database is available if the local database is more
than 12 hours old.

Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
@@ -30,6 +30,8 @@ func DBCheck(app clio.Application) *cobra.Command {
}

func runDBCheck(opts options.Database) error {
// `grype db check` should _always_ check for updates, regardless of config
opts.MinAgeToCheckForUpdate = 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about adding a separate bool "force check" option, or a force check parameter to IsUpdateAvailable and Update. Would that have been preferable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that not changing the behavior of db check is the right call for now from an options-perspective. We can always add more config items later to opt into force.

But, this param should be changed within PreRunE such that the config options are finalized before logging them.

Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
@@ -18,6 +18,7 @@ type Database struct {
AutoUpdate bool `yaml:"auto-update" json:"auto-update" mapstructure:"auto-update"`
ValidateByHashOnStart bool `yaml:"validate-by-hash-on-start" json:"validate-by-hash-on-start" mapstructure:"validate-by-hash-on-start"`
ValidateAge bool `yaml:"validate-age" json:"validate-age" mapstructure:"validate-age"`
MinAgeToCheckForUpdate time.Duration `yaml:"min-age-to-check-for-update" json:"min-age-to-check-for-update" mapstructure:"min-age-to-check-for-update"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming suggestion: what about stale-age with an attached description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

db:
   min-age-to-check-for-update: 12h

vs

db:
   stale-age: 12h

The first one seems clearer to me in terms of what it controls. But I agree the name is pretty wordy. How do we communicate to a user what stale-age does?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to other naming suggestions too. Here's how we attach descriptions to fields https://github.com/anchore/syft/blob/9573f557d12756f8c873b98a6af476835f611055/cmd/syft/internal/options/update_check.go#L15-L17

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to be parallel to MaxAllowedBuiltAge, itself perhaps not an ideal age, since "built" seems a bit of an implementation concern.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about min-age-to-update ?

Copy link
Contributor

@kzantow kzantow Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about something like update-check-interval? And what happens 12 hours after the db is released? With this behavior do we simply move the busy time to download a database by 12 hours? Should we update the metadata to indicate the last time we checked so we can spread this out more evenly in some sort of meaningful way?

Signed-off-by: Will Murphy <willmurphyscode@users.noreply.github.com>
defer f.Close()
err = json.NewEncoder(f).Encode(tt.existingMetadata)
require.NoError(t, err)
getter := newTestGetter(fs, files, dirs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had some issues getting this working in all the scenarios I wanted to test because it seemed as though I couldn't mock every call and the getter code was ultimately calling os.Open, instead of using the afero.FS. If you happen to run into similar issues, I wrote a mocked but functional server to emulate the listing file / db download.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree testing with a local http server is a cleaner solution. I'll wait for that PR to get in and then rebase and change these tests to use that, assuming we still decide we want this behavior.

@willmurphyscode
Copy link
Contributor Author

We will probably close this in favor of #2020, which seems like a much better approach. Thanks @kzantow!

@willmurphyscode willmurphyscode self-assigned this Jul 31, 2024
@willmurphyscode
Copy link
Contributor Author

This is probably the wrong approach, because once the local DB is old enough, grype will still check every time, so from the CDN's perspective, there will still be a window from when the N hours expires until when the new database is downloaded by most clients that will have the same requests per second as today, so it won't increase reliability during that time window.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

3 participants