Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Ignore discovery errors for metrics resources #2009

Merged
merged 1 commit into from
May 6, 2019

Conversation

2opremio
Copy link
Contributor

@2opremio 2opremio commented May 2, 2019

The Metrics API tends to be misconfigured, causing discovery errors which ultimately makes syncs fail.

This change makes Flux ignore those errors.

Fixes #1991

The Metrics API tends to be misconfigured, causing discovery errors which
ultimately makes syncs fail.

This change makes Flux ignore those errors.
@@ -206,6 +206,11 @@ func (c *Cluster) getAllowedResourcesBySelector(selector string) (map[string]*ku
return nil, err
}
for gv, e := range discErr.Groups {
if strings.HasSuffix(gv.Group, "metrics.k8s.io") {
// The Metrics API tends to be misconfigured, causing errors.
// We just ignore them, since it doesn't make sense to sync metrics anyways.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefanprodan can you confirm whether this is true? I am not that familiar with the metrics API.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GKE comes with the metrics server and it registers the metrics.k8s.io at bootstrap. For EKS and AKS where you would install the metrics server with Helm, I don't think it will be an issue and for GKE ignoring it will not cause any problems as far as I can tell.

@2opremio
Copy link
Contributor Author

2opremio commented May 2, 2019

Since this is the third time we have problems with the discovery API (before #1991 we had #1951 and before that #1855) I think this may happen again with other resources in the future. I am tempted to, instead of the proposed solution, ignore the partial list of resources affected by discovery.ErrGroupDiscoveryFailed like ArgoCD does argoproj/argo-cd#526

@hiddeco @squaremo thoughts?

@hiddeco
Copy link
Member

hiddeco commented May 3, 2019

@2opremio your suggestion makes more sense to me as I have the feeling we will otherwise forever be patching specific resources.

@2opremio
Copy link
Contributor Author

2opremio commented May 4, 2019

After further thinking, all the errors we've had were in one way or another related to metrics. So I think it's worth giving it a try as-is (I am not too happy about ignoring errors, unless we are sure about the impact). If it fails again with a different resource, I will be happy to ignore `discovery.ErrGroupDiscoveryFailed altogether.

@2opremio 2opremio merged commit 97022b6 into fluxcd:master May 6, 2019
@2opremio 2opremio deleted the 1991-ignore-metrics-resources branch May 6, 2019 09:01
@hiddeco hiddeco added this to the v1.12.2 milestone May 7, 2019
@sercanacar
Copy link

Temporary fix for this? I've tried removing helmrelease for metrics-server...no luck

@2opremio
Copy link
Contributor Author

2opremio commented May 9, 2019

@sercanacar I don't know what causes the error, sorry. I would assume it's a problem with the metrics server so I would suggest to double-check it's uninstalled.

@stefanprodan any suggestions?

@2opremio
Copy link
Contributor Author

2opremio commented May 9, 2019

@sercanacar you can upgrade to 1.12.2, which contains the fix

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Sync fails due to metrics.k8s.io discovery error
4 participants