add rate limit per source option #891
The lint error can be fixed by using a custom type as key for the context value. It's enough to create an alias of string, for example as follows:
type ctxarg string

const (
	ctxSourceArg ctxarg = "source"
)
Then use it as:
xxx := context.WithValue(ctx, ctxSourceArg, "xxx")
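As a fuller sketch of the same idea, the snippet below stores and reads back a value with the typed key. The helper names (withSource, sourceFrom, demo) are made up for illustration and are not part of the PR.

```go
package main

import (
	"context"
	"fmt"
)

// ctxarg is a dedicated key type. Because the type differs from plain string,
// keys of this type cannot collide with string keys set by other packages.
type ctxarg string

const ctxSourceArg ctxarg = "source"

// withSource returns a child context carrying the source name (hypothetical helper).
func withSource(ctx context.Context, source string) context.Context {
	return context.WithValue(ctx, ctxSourceArg, source)
}

// sourceFrom reads the source name back; ok is false if it was never set.
func sourceFrom(ctx context.Context) (string, bool) {
	source, ok := ctx.Value(ctxSourceArg).(string)
	return source, ok
}

// demo returns the stored source, whether it was found, and whether a lookup
// with an untyped string key of the same text comes back empty.
func demo() (string, bool, bool) {
	ctx := withSource(context.Background(), "crtsh")
	source, ok := sourceFrom(ctx)
	plainKeyMissing := ctx.Value("source") == nil
	return source, ok, plainKeyMissing
}

func main() {
	fmt.Println(demo())
}
```

The lookup with the untyped "source" key finds nothing because context keys are compared by type as well as by value, which is the collision the linter is warning about.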
The implementation LGTM. That said, I recommend the following change, since the global and custom rate limiters are redundant in both naming and position:
// Wrap both arguments into a custom type
type RateLimitJar struct {
	Global int
	Custom mapsutil.SyncMap[string, int]
}
// Expose Set and GetOrDefault operations
RateLimitJar.GetOrDefault(sourceName)
RateLimitJar.Set(sourceName, limit)
Then this struct can be passed around instead of the two arguments.
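A minimal, self-contained sketch of what such a jar could look like. To keep it runnable with only the standard library, a mutex-guarded map stands in for mapsutil.SyncMap[string, int]; that substitution and the NewRateLimitJar constructor are assumptions, not the code that was merged.

```go
package main

import (
	"fmt"
	"sync"
)

// RateLimitJar bundles the global limit with per-source overrides.
// Assumption: a sync.RWMutex plus a plain map replaces mapsutil.SyncMap here.
type RateLimitJar struct {
	Global int

	mu     sync.RWMutex
	custom map[string]int
}

// NewRateLimitJar is a hypothetical constructor that sets the global default.
func NewRateLimitJar(global int) *RateLimitJar {
	return &RateLimitJar{Global: global, custom: make(map[string]int)}
}

// Set records a custom limit for one source.
func (j *RateLimitJar) Set(sourceName string, limit int) {
	j.mu.Lock()
	defer j.mu.Unlock()
	j.custom[sourceName] = limit
}

// GetOrDefault returns the custom limit for a source, or the global limit
// when no custom value was set.
func (j *RateLimitJar) GetOrDefault(sourceName string) int {
	j.mu.RLock()
	defer j.mu.RUnlock()
	if limit, ok := j.custom[sourceName]; ok {
		return limit
	}
	return j.Global
}

func main() {
	jar := NewRateLimitJar(10)
	jar.Set("crtsh", 1)
	fmt.Println(jar.GetOrDefault("crtsh"))  // custom value
	fmt.Println(jar.GetOrDefault("github")) // falls back to Global
}
```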
Also, I think the parsing from interface{} to int, or the options.RateLimits.AsMap() call, can be moved into options.ParseOptions().
lgtm!
Before:
$ go run . -d hackerone.com -rls crtsh=-1
...
[INF] Found 18 subdomains for hackerone.com in 2 seconds 19 milliseconds
After:
$ go run . -d hackerone.com -rls crtsh=1
...
[INF] Found 18 subdomains for hackerone.com in 8 seconds 327 milliseconds
@ehsandeep @Mzack9999, i think it's a breaking change for library users, if i remember correctly
@tarunKoyalwar I tried to make the change compatible with the older function signature. On a side note, I have to say that I like the new syntax more (
yeah, i also think variadic parameter options are a great idea. even outside subfinder we should use them whenever possible
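For readers unfamiliar with the pattern being discussed, here is a rough sketch of variadic options in Go. The type and function names (sessionConfig, Option, WithRateLimit, WithTimeout, newSessionConfig) are hypothetical and do not reflect subfinder's actual API; the point is only that existing callers that pass no options keep compiling.

```go
package main

import "fmt"

// sessionConfig is a hypothetical settings struct used only for this sketch.
type sessionConfig struct {
	timeoutSeconds int
	rateLimit      int // 0 means "no limit" in this sketch
}

// Option mutates the config; callers pass zero or more of them.
type Option func(*sessionConfig)

// WithRateLimit sets a requests-per-second limit.
func WithRateLimit(perSecond int) Option {
	return func(c *sessionConfig) { c.rateLimit = perSecond }
}

// WithTimeout sets the timeout in seconds.
func WithTimeout(seconds int) Option {
	return func(c *sessionConfig) { c.timeoutSeconds = seconds }
}

// newSessionConfig applies the options on top of defaults. Adding a new option
// later does not change this signature, so older call sites stay valid.
func newSessionConfig(opts ...Option) sessionConfig {
	cfg := sessionConfig{timeoutSeconds: 30}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Printf("%+v\n", newSessionConfig())                 // old-style call, defaults only
	fmt.Printf("%+v\n", newSessionConfig(WithRateLimit(5))) // new-style call
}
```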
suggesting some changes
- add 2 flags, rls and rlsm, similar to all tools, which means
rls => ratelimit source (unit is seconds)
rlsm => ratelimit source minute (unit is minute)
even better, we can consider adding a new flag type in goflags for ratelimit which accepts a ratelimit along with a unit @ehsandeep @Mzack9999
type RateLimit struct {
	MaxTokens uint
	Duration  time.Duration
}
rls := []RateLimit{}
flag.RateLimitVar(&rls, "-rls", nil, "rate limit along with unit ex: 10/ms, 30/m, 8/s")
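To make the proposed "count/unit" syntax concrete, here is a small sketch of how a value such as 10/ms, 30/m or 8/s could be parsed into the struct above. parseRateLimit and the unit table are assumptions for illustration; this is not the goflags implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// RateLimit mirrors the struct proposed in the comment above.
type RateLimit struct {
	MaxTokens uint
	Duration  time.Duration
}

// unitDurations covers only the units used in the examples (ms, s, m).
var unitDurations = map[string]time.Duration{
	"ms": time.Millisecond,
	"s":  time.Second,
	"m":  time.Minute,
}

// parseRateLimit is a hypothetical helper that parses "count/unit".
func parseRateLimit(s string) (RateLimit, error) {
	count, unit, ok := strings.Cut(strings.TrimSpace(s), "/")
	if !ok {
		return RateLimit{}, fmt.Errorf("rate limit %q: expected count/unit", s)
	}
	n, err := strconv.ParseUint(count, 10, 32)
	if err != nil {
		return RateLimit{}, fmt.Errorf("rate limit %q: invalid count: %w", s, err)
	}
	d, ok := unitDurations[unit]
	if !ok {
		return RateLimit{}, fmt.Errorf("rate limit %q: unknown unit %q", s, unit)
	}
	return RateLimit{MaxTokens: uint(n), Duration: d}, nil
}

func main() {
	for _, in := range []string{"10/ms", "30/m", "8/s", "8"} {
		rl, err := parseRateLimit(in)
		fmt.Println(in, rl, err)
	}
}
```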
things to review/reconsider
- i think we should opt for a whitelist instead of a blacklist. subfinder sources have completely different rate limits, and some sources' API docs don't mention a rate limit at all but throttle randomly. currently we apply the rate limit to all sources. instead, i think we should apply it only to sources whose rate limit is either known or specified by the user (all=60), or add a new flag to exclude xyz sources from rate limiting
- implementation looks ok, but maybe we could have abstracted all the rate limit logic inside the session. this is a reference from my other WIP (but backlog) branch https://github.com/projectdiscovery/subfinder/blob/c3d8a02a7b636fa11bb5c76065c952a73fad561e/v2/pkg/subscraping/agent.go
these are some of the rate limits i have collected from sources https://github.com/projectdiscovery/subfinder/blob/c3d8a02a7b636fa11bb5c76065c952a73fad561e/v2/pkg/subscraping/ratelimits.go
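One way to read the whitelist suggestion is sketched below: a source is throttled only if the user set a limit for it, the user set all=N, or a default is known for it. The lookup order, the limitFor name, and the numbers in main are assumptions made for illustration, not values taken from the linked files.

```go
package main

import "fmt"

// limitFor resolves the limit for one source under a whitelist policy.
// Assumed lookup order: per-source user value, user "all" value, known default.
// ok == false means the source is left unthrottled.
func limitFor(source string, user, known map[string]int) (limit int, ok bool) {
	if l, found := user[source]; found {
		return l, true
	}
	if l, found := user["all"]; found {
		return l, true
	}
	if l, found := known[source]; found {
		return l, true
	}
	return 0, false
}

func main() {
	known := map[string]int{"crtsh": 60} // illustrative value, not a real limit
	user := map[string]int{"github": 30} // illustrative value, not a real limit

	for _, s := range []string{"crtsh", "github", "unknownsource"} {
		limit, ok := limitFor(s, user, known)
		fmt.Println(s, limit, ok)
	}
}
```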
@tarunKoyalwar, I like the idea of having default rate limits and defining a flag to make it easy to get the limit with different units of time. I'll update the PR.
Now, it's using the RateLimit flag.
lgtm!
Optional:
- Update readme file with better examples of custom rate limit
- The rls flag accepts arbitrary keys; maybe we should validate them against the list of sources to help the user avoid typos:
$ go run . -rls "a=1/s,censyz=1/s"
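A possible shape for that validation, as a sketch only: collect the keys that do not match any known source and report them. unknownSources and the short source list in main are hypothetical; the real list would come from subfinder's registered sources.

```go
package main

import (
	"fmt"
	"sort"
)

// unknownSources returns the keys of limits that are not in known,
// sorted so the resulting message is stable.
func unknownSources(limits map[string]string, known []string) []string {
	knownSet := make(map[string]struct{}, len(known))
	for _, name := range known {
		knownSet[name] = struct{}{}
	}
	var unknown []string
	for name := range limits {
		if _, ok := knownSet[name]; !ok {
			unknown = append(unknown, name)
		}
	}
	sort.Strings(unknown)
	return unknown
}

func main() {
	known := []string{"censys", "crtsh"} // stand-in for the real source list
	limits := map[string]string{"a": "1/s", "censyz": "1/s"}

	if bad := unknownSources(limits, known); len(bad) > 0 {
		fmt.Println("unknown sources in -rls:", bad)
	}
}
```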
lgtm !
Notes:
this new feature gives subfinder some stability when the input is more than one domain
earlier, when multiple domains were passed as input, subfinder would return different results on every run, because it did not respect the rate limit of each source
it's difficult to benchmark the difference, but the idea is that subfinder will not get 429 responses if proper rate limits are set
With this PR, users now have the capability to configure the rate limit for each individual source. Closes #718.