Ensure proper metrics registration and lifecycle management to maintain observability system stability. Handle metric registration/unregistration correctly during configuration changes and avoid common pitfalls that can cause panics or data loss.
Key practices:
Use the AlreadyRegisteredError pattern for shared metrics: When multiple components share the same metrics, handle registration conflicts gracefully by retrieving the existing metric from the registry instead of creating a new one.
Manage registration during config reloads: Calling MustRegister for a metric that is still registered panics, so unregister the old collector before registering a replacement. Be aware that a freshly created counter starts again at zero. Where possible, keep metrics registered and reuse them across reloads.
Register all required collectors: A custom registry created with prometheus.NewRegistry() starts empty. Register the Go runtime and process collectors (GoCollector, ProcessCollector) explicitly, since only the default registry includes them automatically.
Example of proper AlreadyRegisteredError handling:
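import (
	"errors"
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// zookeeperMetrics holds the metrics shared by ZooKeeper discoverers.
type zookeeperMetrics struct {
	failureCounter prometheus.Counter
}

func newDiscovererMetrics(reg prometheus.Registerer) (*zookeeperMetrics, error) {
	m := &zookeeperMetrics{
		failureCounter: prometheus.NewCounter(prometheus.CounterOpts{
			Name: "zookeeper_failures_total",
			Help: "The total number of ZooKeeper failures.",
		}),
	}
	// Another component may already have registered an identical counter.
	if err := reg.Register(m.failureCounter); err != nil {
		var are prometheus.AlreadyRegisteredError
		if !errors.As(err, &are) {
			// Any other registration error is returned to the caller.
			return nil, err
		}
		// Reuse the existing counter instead of the one just created.
		existing, ok := are.ExistingCollector.(prometheus.Counter)
		if !ok {
			return nil, fmt.Errorf("existing collector is %T, not a Counter", are.ExistingCollector)
		}
		m.failureCounter = existing
	}
	return m, nil
}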
This prevents registration panics while ensuring metrics remain functional across component lifecycles and configuration changes.