
It is impossible to set a custom polling frequency in DNS resolver #1663

Closed
kop opened this issue Nov 10, 2017 · 29 comments
Labels
P2, Status: Working As Intended, Type: Feature (new features or improvements in behavior)

Comments

@kop

kop commented Nov 10, 2017

Please answer these questions before submitting your issue.

What version of gRPC are you using?

1.7.2

What version of Go are you using (go version)?

go1.9.2 darwin/amd64

What did you do?

I was trying out the new Load Balancing APIs for gRPC services running in K8S.

What did you expect to see?

I expected to see an option to configure the default DNS resolver and set a custom frequency for polling the DNS server.

What did you see instead?

The resolver has a hardcoded polling interval of 30 minutes, which is unusable in dynamic environments (like K8S).

@menghanl
Contributor

menghanl commented Nov 10, 2017

Right, with the current code there's no way to set a custom polling frequency for DNS.
An easy workaround would be to add a new function dns.NewBuilderWithFreq(freq time.Duration) (similar to this). Users could create a new DNS builder with it and register it to override the default DNS builder.
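
A minimal sketch of how that workaround would be used, assuming a dns.NewBuilderWithFreq constructor is added to the resolver/dns package as proposed (hypothetical API, not part of a released version):

package setup

import (
    "time"

    "google.golang.org/grpc/resolver"
    "google.golang.org/grpc/resolver/dns"
)

func init() {
    // Hypothetical: NewBuilderWithFreq is the proposed constructor, not
    // (yet) part of the released API. Registering a builder for the "dns"
    // scheme overrides the default DNS builder.
    resolver.Register(dns.NewBuilderWithFreq(30 * time.Second))
}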

Another problem here is, it's very hard to find the perfect polling frequency.

I'm working on a change to make the resolver re-resolve the name whenever a connection goes down. With this, we plan to remove the polling frequency from the resolver.
But the resolver won't re-resolve if all connections are working. Does this work for K8S?

@kop
Author

kop commented Nov 10, 2017

An easy workaround would be to add a new function dns.NewBuilderWithFreq(freq time.Duration)

Yes, this sounds good.

I'm working on a change to make resolver re-resolve the name whenever a connection goes down.
But the resolver won't re-resolve if all connections are working. Does this work for K8S?

I'm not sure it will work great. I mean, it will handle cases when a pod is destroyed/recreated, but it will not handle cases when the service is scaled up and new pods are ready to serve requests, will it?

I think polling frequency should stay. It would be possible to set a much longer polling interval if targets are re-evaluated on every disconnect, but we still need a mechanism to discover new instances of the service.

@menghanl
Contributor

it will not handle cases when the service is scaled up and new pods are ready to serve requests

IMO a better solution for this would be to push updates instead of polling. But that's out of the scope of DNS.

cc @ejona86 for more inputs.

@kop
Author

kop commented Nov 10, 2017

IMO a better solution for this would be to push the updates instead of polling.

Totally agreed.

But that's out of the scope of DNS.

DNS is the native service discovery mechanism in K8S, and I think it would be really great if it worked with gRPC out of the box :)

@ejona86
Member

ejona86 commented Nov 10, 2017

I mean, it will handle cases when a pod is destroyed/recreated, but it will not handle cases when the service is scaled up and new pods are ready to serve requests, will it?

The solution for this is MAX_CONNECTION_AGE. This has the advantage that the service owner controls the configuration. (It also works with L4 proxies where the address doesn't change.)

@dfawley added the P2 and Type: Feature labels Nov 16, 2017
@trusch

trusch commented Nov 29, 2017

DNS in k8s only returns the virtual IP of the service, not all the IPs of the replicas. This will not work when reusing a single client conn.

I saw a k8s resolver implementation that resolves natively via the k8s API.

@kop
Author

kop commented Nov 29, 2017

@trusch But we can use Headless Services: they return multiple A records, which should solve the problem.

@ejona86
Member

ejona86 commented Nov 29, 2017

DNS in k8s only returns the virtual IP of the service, not all the IPs of the replicas. This will not work when reusing a single client conn.

@trusch, what do you mean by "not work?" It should "work" as in function, in a basic sense. It would also load balance multiple clients to multiple servers and rebalance the clients over time (via MAX_CONNECTION_AGE). But I would agree it does not help distributing load from a single client to multiple servers. Is this what you mean?

Our (internal-discussion only?) proposed solution there is to create multiple ClientConns and round-robin over them. To make this easier, we could enhance grpc to create the multiple ClientConns behind a single ClientConn so it becomes mostly transparent. The extra connections would not be guaranteed to get different backends, but that is a normal property of L4 LBs.

The discussions/work to do this stalled, but it can be re-opened if necessary. Either being okay with what you have (because, say, you don't have a single client causing a substantial amount of load) or exposing IPs to your clients to avoid the L4 LB sounds superior in this case.
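
A rough sketch of that multiple-ClientConn idea using only the public grpc-go API (multiConn, dialMulti, and Pick are made-up names for illustration, not a proposed gRPC API):

package multiconn

import (
    "sync/atomic"

    "google.golang.org/grpc"
)

// multiConn holds several ClientConns to the same L4-load-balanced
// target and rotates across them, one per call.
type multiConn struct {
    conns []*grpc.ClientConn
    next  uint64
}

func dialMulti(target string, n int, opts ...grpc.DialOption) (*multiConn, error) {
    mc := &multiConn{}
    for i := 0; i < n; i++ {
        cc, err := grpc.Dial(target, opts...)
        if err != nil {
            mc.Close()
            return nil, err
        }
        mc.conns = append(mc.conns, cc)
    }
    return mc, nil
}

// Pick returns the next ClientConn round-robin; call it per RPC.
func (m *multiConn) Pick() *grpc.ClientConn {
    i := atomic.AddUint64(&m.next, 1)
    return m.conns[i%uint64(len(m.conns))]
}

func (m *multiConn) Close() {
    for _, cc := range m.conns {
        cc.Close()
    }
}

As noted above, the extra connections aren't guaranteed to reach different backends; that's a normal property of L4 LBs.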

@trusch

trusch commented Nov 30, 2017

@ejona86 Yes, I have a setup with gateways and workers and would like to have one ClientConn to the workers that load-balances on a per-call basis. I think using headless services as @kop suggested, together with the new DNS resolver (in combination with MAX_CONNECTION_AGE), will do the job for me.

@dfawley
Member

dfawley commented Dec 14, 2017

Resolving as "working as intended", since we don't intend to add support for custom polling frequencies in the default DNS resolver. Please let us know if you don't think that's appropriate.

@HannesMvW

I agree with the OP; we'd benefit from being able to set the DNS polling interval shorter than 30 minutes.
What is the alternative method to alert clients when new instances have been added behind a Headless service?

@ejona86
Member

ejona86 commented Jan 16, 2018

@HannesMvW, see #1663 (comment). The API is keepalive.ServerParameters.MaxConnectionAge.

@HannesMvW

Thanks @ejona86 - how do I access this API? I'm using the "grpc" package from a Go application.

@ejona86
Member

ejona86 commented Jan 17, 2018

@HannesMvW, use grpc.KeepaliveParams() to create a ServerOption that's passed to grpc.NewServer().
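
For example (the durations here are illustrative, not recommendations):

package server

import (
    "time"

    "google.golang.org/grpc"
    "google.golang.org/grpc/keepalive"
)

func newServer() *grpc.Server {
    return grpc.NewServer(grpc.KeepaliveParams(keepalive.ServerParameters{
        // Gracefully close connections older than this; clients then
        // reconnect (and re-resolve DNS), picking up new backends.
        MaxConnectionAge: 5 * time.Minute,
        // Allow in-flight RPCs this long to finish before a hard close.
        MaxConnectionAgeGrace: 30 * time.Second,
    }))
}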

@HannesMvW

HannesMvW commented Jan 18, 2018

Thanks, @ejona86. My apologies for not being more specific - I'd like my gRPC clients (using client side load balancing) to refresh the list of servers (listed for a Kubernetes service name), via DNS, more frequently than every 30 minutes.

Of course, instead of polling, a prettier solution would be to somehow subscribe to DNS changes, but I haven't yet figured out if or how this is possible with gRPC.

PS. Thanks to your help I got MaxConnectionAge to work, and then realized it keeps killing my gRPC connections, which is not what I intended. I'd like to keep server connections up throughout the lifetime of the server, but also make use of new servers as quickly as possible (which I guess requires DNS polling).

@ejona86
Member

ejona86 commented Jan 18, 2018

@HannesMvW, when the server shuts down the connection due to age, the client will re-resolve DNS. This is very natural with pick-first load balancing. With round-robin it is slightly awkward, but it doesn't seem severe enough to warrant determining yet-another-lb-solution. It's also only a problem when using DNS as other naming systems tend to provide update notifications.

DNS polling is in general a bad idea. In the presence of round-robin DNS servers it would cause actual harm. High-frequency DNS polling can put quite a bit of load on a critical system, which is dangerous. It is also client-side configuration, which generally cannot be updated quickly. I'm right now hoping to kill the 30-minute polling, as Go is the only implementation that does it and there should not be a need.

The k8s documentation itself talks about avoiding these same problems.

I'd like to keep server connections up throughout the lifetime of the server

What harm are you seeing from the re-creation of connections? I know of several possible issues, but I'm interested in what you're experiencing.

@HannesMvW

HannesMvW commented Jan 18, 2018

Great, thanks @ejona86 for your feedback! I understand polling in general is a bad idea. In this case "a few minutes" should be a good enough interval, since we'd want the gRPC client-side load balancing to pick up new servers reasonably quickly.

My application is a small number of client-side load balancers working against a huge set of servers, possibly in the hundreds or thousands. (This is why I want to avoid tearing down gRPC connections unnecessarily.) When scaling up the number of servers, I'd like to see them go into use a bit more quickly than 30 minutes...

I had a look at the gRPC source code, and I see no solution short of writing my own load balancer builder... all just to change "30m0s" to something like "3m0s". Or is there an easier way?

Or, perhaps we can also hook into dockerd to get notifications when new servers are added.

--

On a side note, I've noticed gRPC DNS for service "foo" does this:

  1. Query the DNS SRV record for "_grpclb._tcp.foo"
  2. Query DNS for each target returned from this query (think hundreds of DNS queries here...)
  3. Query DNS for the A record for "foo" (this returns the complete set of server IP addresses)
  4. Query DNS for the TXT record for "foo" (in hope of getting a service config; not supported by K8s)

I'd love to skip steps 1, 2, and 4, since 3 gives us the complete list of servers we're looking for.
Am I misconfiguring/misusing gRPC client-side load balancing somehow?
Although related to the thread above, I realize this is probably better off on a different thread.
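
For reference, a rough standard-library equivalent of the four lookups above (an illustration, not the resolver's actual code):

package dnsdemo

import "net"

func lookups() {
    // Step 1: SRV query for _grpclb._tcp.foo (grpclb balancer addresses).
    _, srvs, _ := net.LookupSRV("grpclb", "tcp", "foo")
    // Step 2: resolve each SRV target (one query per balancer).
    for _, s := range srvs {
        net.LookupHost(s.Target)
    }
    // Step 3: A/AAAA query for foo (the backend addresses).
    net.LookupHost("foo")
    // Step 4: TXT query for foo (service config, if any).
    net.LookupTXT("foo")
}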

@ejona86
Member

ejona86 commented Jan 18, 2018

Ah, I see. Your case is quite pathological. I'll try to have a quick conversation with some other devs today about this.

Worst case, you could write your own name resolver, "my-better-dns", that does exactly what you want. Most of the complexity of the current resolver is to do all those things you don't like anyway, so this probably isn't that bad. (Not to say it should be the solution, just that it isn't a bad "worst case.")

(1) shouldn't return any results in your case, so (2) will be a no-op. Most of the time we wouldn't expect hundreds of LB servers; instead you'd rely on a few virtual IPs or the like for large scale-out. Pick-first is used to connect to the LB.

(4) shouldn't return any results in your case either.

Steps 1, 2, and 4 aren't really intended to be configurable, and it's unfortunate they are always done even when unnecessary, but this is sort of the state of the world. There aren't many options other than "use better service discovery." There's quite a lot in the pipeline to add to service config, so in the future I hope you'd be interested in it, although the "not supported by K8s" part would need a resolution of some sort.

@ejona86
Member

ejona86 commented Jan 18, 2018

Discussed with @dfawley. @HannesMvW, we think writing your own name resolver is the way to go. Should be pretty easy and you can configure it exactly like you want.

You're actually deep enough into this that you may want "serious client-side load balancing." That would mean "gRPC LB." Unfortunately that still lacks a server and so has a high barrier to entry, but it allows you to scale more seriously and adapt to load and changes. I expect you'd be happy enough with making your own resolver, but I thought I should mention that your case is getting into this other realm.
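
A minimal sketch of such a resolver against the resolver.Builder / resolver.Resolver interfaces (the "my-better-dns" scheme and the 3-minute interval are made up for illustration; the interface details have shifted across grpc-go versions, and this targets a recent one):

package betterdns

import (
    "context"
    "net"
    "time"

    "google.golang.org/grpc/resolver"
)

type builder struct{ freq time.Duration }

func (b *builder) Scheme() string { return "my-better-dns" }

func (b *builder) Build(target resolver.Target, cc resolver.ClientConn, opts resolver.BuildOptions) (resolver.Resolver, error) {
    ctx, cancel := context.WithCancel(context.Background())
    r := &pollingResolver{target: target, cc: cc, cancel: cancel}
    go r.watch(ctx, b.freq)
    return r, nil
}

type pollingResolver struct {
    target resolver.Target
    cc     resolver.ClientConn
    cancel context.CancelFunc
}

// watch re-resolves immediately, then on every tick until Close.
func (r *pollingResolver) watch(ctx context.Context, freq time.Duration) {
    ticker := time.NewTicker(freq)
    defer ticker.Stop()
    for {
        r.resolve()
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
        }
    }
}

// resolve does a plain A/AAAA lookup and pushes the result to gRPC,
// skipping the SRV and TXT lookups entirely.
func (r *pollingResolver) resolve() {
    host, port, err := net.SplitHostPort(r.target.Endpoint())
    if err != nil {
        return
    }
    ips, err := net.LookupHost(host)
    if err != nil {
        return
    }
    addrs := make([]resolver.Address, 0, len(ips))
    for _, ip := range ips {
        addrs = append(addrs, resolver.Address{Addr: net.JoinHostPort(ip, port)})
    }
    r.cc.UpdateState(resolver.State{Addresses: addrs})
}

func (r *pollingResolver) ResolveNow(resolver.ResolveNowOptions) { r.resolve() }
func (r *pollingResolver) Close()                                { r.cancel() }

func init() {
    // Registering makes "my-better-dns:///host:port" targets usable.
    resolver.Register(&builder{freq: 3 * time.Minute})
}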

@HannesMvW

Thanks so much for looking into this. Yes, I'm looking for really, really high performance. Currently I've opted not to name my service ports 'grpclb', so the 1st lookup fails, which, yes, skips the 2nd lookup, leaving only the 3rd. The 4th always fails. (I've seen requests for K8s support for TXT records, for gRPC service configs, which I would also be interested in trying.)

@raliste

raliste commented Jun 21, 2018

Just out of curiosity, why is the Go implementation different from the Java implementation? Scaling replicas up with grpc-go fails miserably, as the subconn map is not properly updated. The Java implementation works flawlessly.

Our main issues are:

  • When all subconns are in transient failure, no resolution is requested, so we depend on the 30 minute interval.

@menghanl
Contributor

@raliste Are you using an old version of gRPC? We do try to re-resolve whenever a subconn goes into transient failure.

@raliste

raliste commented Jun 21, 2018

Thank you for your answer @menghanl

We were actually testing with an old version. After upgrading we saw good resolving, but it appears that when all subconns go into transient failure, no re-resolve is done.

@raliste

raliste commented Jun 21, 2018

grpc-java did implement re-resolution when all subconns go into transient failure.

grpc/grpc-java#1591

@menghanl
Contributor

@raliste I'm not sure I understand the problem.
The current behavior is: whenever a subconn goes into transient failure, the resolver does a re-resolve. This should cover the case where all subconns go into transient failure. Can you explain in more detail?

@raliste

raliste commented Jun 22, 2018

@menghanl Thank you. The actual problem is that when all subconns are in transient failure, no re-resolve is performed. I understand that a re-resolve is done whenever a subconn goes into transient failure, but in our case, when all subconns go into transient failure (at once), no re-resolve is performed. The subconn map is only updated after 30 minutes.

rpc error: code = Unavailable desc = all SubConns are in TransientFailure

I think the problem is that you do re-resolve (one time), but you don't keep re-resolving until a new subconn is retrieved.

@menghanl
Contributor

@raliste When a subconn is in transient failure, it will keep retrying. So the state will go back to connecting, then transient failure again, and a re-resolve will happen.

If you still have the problem, can you give a bit more information about what you did? Like what's your environment setup? And are you using the default dns resolver?
It would also be good to file a new issue instead of discussing here. Thanks!

@elvizlai
Contributor

elvizlai commented Jul 2, 2018

@menghanl It looks like this doesn't work well in k8s, because sometimes the client comes up before the server. In that case the DNS record may be empty, and things only start working after 30 minutes.

apiVersion: v1
kind: Namespace
metadata:
  name: test

---

apiVersion: v1
kind: Service
metadata:
  namespace: test
  name: server
spec:
  clusterIP: None
  ports:
    - name: grpc
      port: 1234
  selector:
    app: server

---

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: server
  namespace: test
spec:
  replicas: 5
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: server
        ver: v1
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: server
        image: "sdrzlyz/istio-demo-server:v1"
        imagePullPolicy: IfNotPresent
        ports:
        - name: grpc-port
          containerPort: 1234
        readinessProbe:
          tcpSocket:
            port: grpc-port
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: grpc-port
          initialDelaySeconds: 15
          periodSeconds: 20

---

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: client
  namespace: test
spec:
  replicas: 1
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: client
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: client
        image: "sdrzlyz/istio-demo-client"
        imagePullPolicy: Always
        env:
        - name: SADDR
          value: "dns:///server.test.svc.cluster.local:1234"

@menghanl
Contributor

menghanl commented Jul 2, 2018

@elvizlai What you described is tracked by #1795. Will have a fix shortly.
