Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Makes response codes rate limited as well as prints a message when it is hit #6701

Merged

Conversation

AlanConfluent
Copy link
Member

@AlanConfluent AlanConfluent commented Dec 2, 2020

Description

Makes response codes logging rate limited too. Also prints a message when rate limit is hit.

Testing done

Unit tests and manual testing

Reviewer checklist

  • Ensure docs are updated if necessary. (eg. if a user visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

Copy link
Member

@JimGalasyn JimGalasyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with one small suggestion.

Co-authored-by: Jim Galasyn <jim.galasyn@confluent.io>
Copy link
Contributor

@vcrfxia vcrfxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AlanConfluent -- LGTM! The usual comments and questions inline.

A comma-separated list of HTTP response codes to skip during server
request logging. This is useful for ignoring certain 4XX errors that you
might not want to show up in the logs.
A list of `path:rate_limit` pairs, to limit the rate of server request
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify the units of the rate limit (and also for the other config below). Appears to be number of messages per second?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified it's qps and even explicitly write out 10 logs per second as an example.

logging. This is useful for limiting certain 4XX errors that you
might not want to blow up in the logs.
This setting enables seeing the logs when the request rate is low
and dropping them when they go over the threshold.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that a message will be logged (at most once every five seconds) if the threshold is hit? (And same for the other config below.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if (rateLimitedPaths.containsKey(path)) {
final double rateLimit = rateLimitedPaths.get(path);
rateLimiters.computeIfAbsent(path, (k) -> rateLimiterFactory.apply(rateLimit));
return rateLimiters.get(path).tryAcquire();
rateLimitersByPath.computeIfAbsent(path, (k) -> rateLimiterFactory.apply(rateLimit));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to initialize the rate limiters up front, rather than calling computeIfAbsent on each request? I don't have a sense of how large this optimization is, feels minor but I also feel it can't hurt. Feel free to disagree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. The data is static and not large, so why not just make an immutable map instead on initialization?

I was originally thinking it might be dynamic, but it's not currently.

logging. This is useful for limiting certain 4XX errors that you
might not want to blow up in the logs.
This setting enables seeing the logs when the request rate is low
and dropping them when they go over the threshold.

### ksql.logging.server.rate.limited.request.paths
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're currently logging for internal endpoints (used in server-to-server communication for multi-node clusters), right? Would it make sense to disable logging for those by default? (I don't feel strongly one way or the other -- you have more context than I do about when/how often those endpoints are hit.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently we log all requests, regardless of internal/external status. I think this could be useful if we're trying to trace pull queries falling back on standbys after failures. We can see if this ends up being useful and potentially remove internal if not.

@AlanConfluent AlanConfluent merged commit bdec3dd into confluentinc:master Dec 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants