[feat] - Support S3 Source Resumption #3570

ahrav · 2024-11-07T21:34:08Z

Description:

This PR adds resumption to the S3 source.

Checklist:

Tests passing (make test-community)?
Lint passing (make lint; this requires golangci-lint)?


rgmz · 2024-11-08T18:15:06Z

> This PR adds resumption to the S3 source.

This would be a beneficial capability for other sources, e.g., resuming a large GitHub org scan.

rosecodym

This seems pretty straightforward, although I would like to see a test case for (and handling of) the case where an in-progress bucket is ignored while the scan is stopped (described in an inline comment).

rosecodym · 2024-11-21T15:19:17Z

pkg/sources/s3/s3.go

+		ctx.Logger().Error(err, "failed to get resume point")
+		return


This seems drastic - what do you think of instead restarting the scan from the beginning when we can't get a resume point? I feel like we should err on the side of scanning too much rather than scanning too little.
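The suggested fallback could look roughly like the sketch below. The names `resumePoint`, `loadResumePoint`, and `startingPoint` are hypothetical and are not taken from the PR; the sketch only illustrates treating a failed lookup as "start from the beginning" instead of returning early.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resumePoint is a hypothetical stand-in for whatever the source persists
// between runs. The zero value means "start from the beginning".
type resumePoint struct {
	Bucket   string
	StartKey string
}

// loadResumePoint is a hypothetical decoder. An empty string is treated as
// "no resume info"; malformed input is reported as an error.
func loadResumePoint(encoded string) (resumePoint, error) {
	if encoded == "" {
		return resumePoint{}, nil
	}
	var rp resumePoint
	if err := json.Unmarshal([]byte(encoded), &rp); err != nil {
		return resumePoint{}, err
	}
	return rp, nil
}

// startingPoint falls back to a full scan when the resume point is unusable,
// erring on the side of scanning too much rather than too little.
func startingPoint(encoded string) resumePoint {
	rp, err := loadResumePoint(encoded)
	if err != nil {
		fmt.Printf("failed to get resume point, restarting from the beginning: %v\n", err)
		return resumePoint{}
	}
	return rp
}

func main() {
	fmt.Printf("%+v\n", startingPoint(`{"Bucket":"b2","StartKey":"k9"}`))
	fmt.Printf("%+v\n", startingPoint(`not json`))
}
```

The trade-off is duplicated work after a corrupted checkpoint, which is usually preferable to silently skipping the scan.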

rosecodym · 2024-11-21T15:22:57Z

pkg/sources/s3/s3.go

+			i,
+			len(bucketsToScan),
+			fmt.Sprintf("Bucket: %s", bucket),
+			s.Progress.EncodedResumeInfo,


Why are we re-using the existing resume info? I expected to see something from the progress tracker here.

ahrav · 2024-11-21

I was essentially saying, "Don't modify EncodedResumeInfo; it's already been updated by progressTracker, so just use it as-is." Since progressTracker and s.Progress reference the same underlying object, would it be clearer if we explicitly used s.progressTracker.Progress instead?
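The point about shared state can be illustrated with a small sketch. The `Progress`, `tracker`, and `source` types here are simplified, hypothetical stand-ins rather than the real trufflehog types; the only assumption is that the source and the tracker both refer to one progress value.

```go
package main

import "fmt"

// Progress is a simplified stand-in for the source's progress struct.
type Progress struct {
	EncodedResumeInfo string
}

// tracker holds a pointer to the same Progress value as the source.
type tracker struct {
	progress *Progress
}

// update writes resume info through the shared pointer.
func (t *tracker) update(info string) {
	t.progress.EncodedResumeInfo = info
}

// source embeds Progress and hands its address to the tracker.
type source struct {
	Progress
	progressTracker *tracker
}

func newSource() *source {
	s := &source{}
	s.progressTracker = &tracker{progress: &s.Progress}
	return s
}

func main() {
	s := newSource()
	s.progressTracker.update("arbitrary-resume-info")
	// Both expressions read the same field of the same object.
	fmt.Println(s.Progress.EncodedResumeInfo == s.progressTracker.progress.EncodedResumeInfo) // prints "true"
}
```

Because the two paths alias one value, reading it through the tracker makes the data flow explicit without changing behavior.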


rosecodym · 2024-11-21T15:28:08Z

pkg/sources/s3/s3.go

 ) {
-	for _, obj := range page.Contents {
+	s.progressTracker.Reset()


This doesn't reset entirely, right? It just resets for a new page? I wish I'd been well enough to leave that comment on #3568 :(

ahrav · 2024-11-21

Yeah, this was my mistake. I'm going to rename the tracker so its name is more accurate. It's no longer a progress tracker; it's pretty much just a checkpointer.
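To make the "resets for a new page, not entirely" distinction concrete, here is a minimal sketch of a per-page checkpointer. The type and method names (`pageCheckpointer`, `ResetPage`, `Complete`) are hypothetical and this is not the implementation from the PR; it assumes objects in a page can finish out of order and that the checkpoint may only advance over a contiguous run of completed objects.

```go
package main

import "fmt"

// pageCheckpointer is a hypothetical, simplified checkpointer. It tracks which
// objects in the current page have completed and remembers the key of the
// last object in the contiguous completed prefix.
type pageCheckpointer struct {
	done      []bool // per-page state, cleared by ResetPage
	nextIdx   int    // index of the lowest object not yet completed
	resumeKey string // survives ResetPage
}

// ResetPage clears only the per-page bookkeeping; the resume key is kept.
func (c *pageCheckpointer) ResetPage(pageSize int) {
	c.done = make([]bool, pageSize)
	c.nextIdx = 0
}

// Complete marks object idx as done and advances the resume key across any
// contiguous run of completed objects. keys must hold the page's object keys.
func (c *pageCheckpointer) Complete(idx int, keys []string) {
	c.done[idx] = true
	for c.nextIdx < len(c.done) && c.done[c.nextIdx] {
		c.resumeKey = keys[c.nextIdx]
		c.nextIdx++
	}
}

func main() {
	keys := []string{"a", "b", "c"}
	c := &pageCheckpointer{}
	c.ResetPage(len(keys))
	c.Complete(1, keys) // out of order: "a" is still pending, so no checkpoint yet
	c.Complete(0, keys) // "a" and "b" are now contiguous; checkpoint moves to "b"
	fmt.Println(c.resumeKey)
	c.ResetPage(2) // new page: per-page state cleared, resume key kept
	fmt.Println(c.resumeKey)
}
```

Holding the checkpoint back until earlier objects finish means a resumed scan may redo some work but should not skip an object that was still in flight.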

pkg/sources/s3/s3_integration_test.go

ahrav · 2024-11-21T23:54:08Z

> This seems pretty straightforward, although I would like to see a test case for (and handling of) the case where an in-progress bucket is ignored while the scan is stopped (described in an inline comment)

done
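One way to handle the edge case discussed above (the in-progress bucket disappearing from the scan list between runs) is sketched below. The helper `startIndex` is hypothetical and not taken from the PR; it assumes the resume info names a bucket and that buckets are scanned in list order.

```go
package main

import "fmt"

// startIndex returns the index in buckets at which scanning should begin.
// If resumeBucket is empty or no longer in the list (for example, it was
// added to an ignore list while the scan was stopped), it returns 0 so the
// remaining buckets are scanned from the beginning.
func startIndex(buckets []string, resumeBucket string) int {
	if resumeBucket == "" {
		return 0
	}
	for i, b := range buckets {
		if b == resumeBucket {
			return i
		}
	}
	return 0
}

func main() {
	// The resume bucket is still present: continue from it.
	fmt.Println(startIndex([]string{"a", "b", "c"}, "b")) // prints "1"
	// The resume bucket was ignored while stopped: start over.
	fmt.Println(startIndex([]string{"a", "c"}, "b")) // prints "0"
}
```

Starting over in the missing-bucket case follows the same "scan too much rather than too little" preference raised earlier in the review.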

ahrav and others added 16 commits November 5, 2024 14:03

add config option for s3 resumption

cd3d9d4

updates

edd9cd5

initial progress tracking logic

804da8e

more testing

789bd39

revert s3 source file

5a2820c

UpdateScanProgress tests

454b48b

Merge branch 'main' into feat-s3-source-resumption

389fffb

adjust

8446a72

updates

26c75a9

invert

e895ec7

updates

fd248ca

updates

4ea1583

fix

b4e8b1e

Merge branch 'feat-s3-source-resumption' of github.com:trufflesecurit…

334a4f9

…y/trufflehog into feat-s3-source-resumption

update

7476fa9

adjust test

8740bc6

ahrav force-pushed the s3-progress-tracker branch from 96ae71f to 094815a Compare November 7, 2024 21:39

fix

627ece0

ahrav force-pushed the s3-progress-tracker branch 2 times, most recently from 5ed6639 to b29a571 Compare November 7, 2024 22:01

ahrav requested review from a team November 7, 2024 22:10

ahrav marked this pull request as ready for review November 7, 2024 22:10

ahrav requested a review from a team as a code owner November 7, 2024 22:10

remove progress tracking

0047c62

ahrav force-pushed the s3-progress-tracker branch from b29a571 to f8cc40f Compare November 8, 2024 18:46

cleanup

dd20664

ahrav force-pushed the s3-progress-tracker branch from 660f1cd to dd20664 Compare November 8, 2024 19:21

cleanup

bb578fe

ahrav force-pushed the s3-progress-tracker branch from a7dd656 to aa167b7 Compare November 18, 2024 22:21

ahrav requested review from rosecodym and a team November 18, 2024 22:24

ahrav requested a review from a team as a code owner November 19, 2024 18:11

remove context cancellation logic

fcdd7ab

ahrav force-pushed the s3-progress-tracker branch from 375d817 to fcdd7ab Compare November 19, 2024 18:14

fix comment format

6364445

rosecodym reviewed Nov 21, 2024


ahrav and others added 6 commits November 21, 2024 12:00

make resumption logic more clear

d0ba821

rename

79f90fb

Merge branch 'main' into s3-progress-tracker

4892cf6

Merge branch 'main' into refactor-s3-progress-tracker

b9392e6

merge

beef2bb

fixes

7a8cf26

ahrav changed the base branch from main to refactor-s3-progress-tracker November 21, 2024 23:30

ahrav added 3 commits November 21, 2024 15:31

update

d1375e7

add edge case test

a83627c

merge

d99464e

ahrav requested a review from rosecodym November 21, 2024 23:54

Base automatically changed from refactor-s3-progress-tracker to main November 22, 2024 17:27

ahrav added 4 commits November 22, 2024 09:28

Merge main

a49ceda

remove dupe mu

e250664

add comment

c720543

fix comment

9fcadf7

dustin-decker approved these changes Nov 22, 2024


ahrav merged commit e495661 into main Nov 22, 2024
13 checks passed

ahrav deleted the s3-progress-tracker branch November 22, 2024 21:33

blsaccess mentioned this pull request Nov 23, 2024

Update trufflehog to 3.84.1 blacklanternsecurity/bbot#2014

Merged
