-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose CircuitBreaker current condition (or state) to the metrics recorder interface and Prometheus. #25
base: master
Are you sure you want to change the base?
Expose CircuitBreaker current condition (or state) to the metrics recorder interface and Prometheus. #25
Changes from 7 commits
b5e2dd3
4d29958
fcfae81
57a6f52
8978c16
db7bf6a
2bf49fb
c89af62
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,14 +10,30 @@ import ( | |
"github.com/slok/goresilience/metrics" | ||
) | ||
|
||
type state string | ||
type state uint | ||
|
||
const ( | ||
stateOpen state = "open" | ||
stateHalfOpen state = "halfopen" | ||
stateClosed state = "closed" | ||
stateNew state = iota | ||
stateOpen | ||
stateHalfOpen | ||
stateClosed | ||
) | ||
|
||
var stateStrings = []string { | ||
"new", | ||
"open", | ||
"halfopen", | ||
"closed", | ||
} | ||
|
||
func (st state) label () string { | ||
return stateStrings[st] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 2 things:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll have to read up on the Stringer interface but it feels a more natural fit. Will consider both when refactoring. |
||
} | ||
|
||
func (st state) condition () int { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that we could remove the integer usage for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree, will refactor accordingly. (Am I right in that you're happy with the use of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The |
||
return int(st) | ||
} | ||
|
||
// Config is the configuration of the circuit breaker. | ||
type Config struct { | ||
// ErrorPercentThresholdToOpen is the error percent based on total execution requests | ||
|
@@ -124,7 +140,7 @@ func NewMiddleware(cfg Config) goresilience.Middleware { | |
|
||
return func(next goresilience.Runner) goresilience.Runner { | ||
return &circuitbreaker{ | ||
state: stateClosed, | ||
state: stateNew, | ||
recorder: newBucketWindow(cfg.MetricsSlidingWindowBucketQuantity, cfg.MetricsBucketDuration), | ||
stateStarted: time.Now(), | ||
cfg: cfg, | ||
|
@@ -157,6 +173,9 @@ func (c *circuitbreaker) Run(ctx context.Context, f goresilience.Func) error { | |
func (c *circuitbreaker) preDecideState(metricsRec metrics.Recorder) { | ||
state := c.getState() | ||
switch state { | ||
case stateNew: | ||
// Close the breaker as this is the first time through and generate a statistic. | ||
c.moveState(stateClosed, metricsRec) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As stated before, the initial state of a circuit breaker is closed, so the above change of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. True, although it's coupled to my comment above. |
||
case stateOpen: | ||
// Check if the circuit has been the required time in closed. If yes then | ||
// we move to half open state. | ||
|
@@ -223,7 +242,8 @@ func (c *circuitbreaker) moveState(state state, metricsRec metrics.Recorder) { | |
|
||
// Only change if the state changed. | ||
if c.state != state { | ||
metricsRec.IncCircuitbreakerState(string(state)) | ||
metricsRec.IncCircuitbreakerState(state.label()) | ||
metricsRec.SetCircuitbreakerCurrentCondition(state.condition()) | ||
|
||
c.state = state | ||
c.stateStarted = time.Now() | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -22,6 +22,8 @@ type Recorder interface { | |
IncBulkheadTimeout() | ||
// IncCircuitbreakerState increments the number of state change. | ||
IncCircuitbreakerState(state string) | ||
// SetCircuitbreakerCurrentCondition sets the condition of ciruit breaker. | ||
SetCircuitbreakerCurrentCondition(condition int) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would be better to have uniformity between method signatures:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will make the name change. |
||
// IncChaosInjectedFailure increments the number of times injected failure. | ||
IncChaosInjectedFailure(kind string) | ||
// SetConcurrencyLimitInflightExecutions sets the number of queued and executions at a given moment. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,7 @@ import ( | |
"github.com/prometheus/client_golang/prometheus" | ||
) | ||
|
||
var ( | ||
const ( | ||
promNamespace = "goresilience" | ||
|
||
promCommandSubsystem = "command" | ||
|
@@ -17,8 +17,11 @@ var ( | |
promCBSubsystem = "circuitbreaker" | ||
promChaosSubsystem = "chaos" | ||
promConcurrencyLimitSubsystem = "concurrencylimit" | ||
|
||
) | ||
|
||
var DefaultPrometheusRecorder = NewPrometheusRecorder(prometheus.DefaultRegisterer) | ||
|
||
type prometheusRec struct { | ||
// Metrics. | ||
cmdExecutionDuration *prometheus.HistogramVec | ||
|
@@ -28,6 +31,7 @@ type prometheusRec struct { | |
bulkProcessed *prometheus.CounterVec | ||
bulkTimeouts *prometheus.CounterVec | ||
cbStateChanges *prometheus.CounterVec | ||
cbCurrentCondition *prometheus.GaugeVec | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Uniformity naming: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will make this change |
||
chaosFailureInjections *prometheus.CounterVec | ||
concurrencyLimitInflights *prometheus.GaugeVec | ||
concurrencyLimitExecuting *prometheus.GaugeVec | ||
|
@@ -59,6 +63,7 @@ func (p prometheusRec) WithID(id string) Recorder { | |
bulkProcessed: p.bulkProcessed, | ||
bulkTimeouts: p.bulkTimeouts, | ||
cbStateChanges: p.cbStateChanges, | ||
cbCurrentCondition: p.cbCurrentCondition, | ||
chaosFailureInjections: p.chaosFailureInjections, | ||
concurrencyLimitInflights: p.concurrencyLimitInflights, | ||
concurrencyLimitExecuting: p.concurrencyLimitExecuting, | ||
|
@@ -122,6 +127,13 @@ func (p *prometheusRec) registerMetrics() { | |
Help: "Total number of state changes made by the circuit breaker runner.", | ||
}, []string{"id", "state"}) | ||
|
||
p.cbCurrentCondition = prometheus.NewGaugeVec(prometheus.GaugeOpts{ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same as above There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will make this change. |
||
Namespace: promNamespace, | ||
Subsystem: promCBSubsystem, | ||
Name: "current_condition", | ||
Help: "The current condition of the circuit breaker runner.", | ||
}, []string{"id"}) | ||
|
||
p.chaosFailureInjections = prometheus.NewCounterVec(prometheus.CounterOpts{ | ||
Namespace: promNamespace, | ||
Subsystem: promChaosSubsystem, | ||
|
@@ -172,6 +184,7 @@ func (p *prometheusRec) registerMetrics() { | |
p.bulkProcessed, | ||
p.bulkTimeouts, | ||
p.cbStateChanges, | ||
p.cbCurrentCondition, | ||
p.chaosFailureInjections, | ||
p.concurrencyLimitInflights, | ||
p.concurrencyLimitExecuting, | ||
|
@@ -210,6 +223,10 @@ func (p prometheusRec) IncCircuitbreakerState(state string) { | |
p.cbStateChanges.WithLabelValues(p.id, state).Inc() | ||
} | ||
|
||
func (p prometheusRec) SetCircuitbreakerCurrentCondition(condition int) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The problem with this metric approach is that the user needs to be aware of what is the number when it queries, this is not very natural and could lead to easy misinterpretation of the metrics. To have the state of a resource in a metric what I have seen as a standardized way is that the state is in a label and when one of the states is set to "active" is set to What do you think? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I think that your proposal to follow the conventions you have seen elsewhere is worth adopting. I'll make the necessary refactoring to implement this design. |
||
p.cbCurrentCondition.WithLabelValues(p.id).Set(float64(condition)) | ||
} | ||
|
||
func (p prometheusRec) IncChaosInjectedFailure(kind string) { | ||
p.chaosFailureInjections.WithLabelValues(p.id, kind).Inc() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New state is the same as a closed state because a new circuitbreaker starts in a closed state. Do you have a use case where you need to know the difference between new and closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My choice to implement
stateNew
and the transition tostateClosed
on first traversal provided a mechanism to ensure that the state of the breaker was recorded on first use.Otherwise, I found that there are no state metrics recorded until the first state change occurs.
I might take a few moments to review how I'm using the circuitbreaker instance by my application.