
grpc: singleton manager/factory pattern for gRPC async clients. #2393

Merged: 3 commits into envoyproxy:master on Jan 19, 2018

Conversation

@htuch (Member) commented Jan 17, 2018

As part of the switch to envoy::api::v2::GrpcService and preparing the code base for the Google C++
gRPC client, this PR introduces a singleton gRPC async client manager and factory pattern. This
allows for details of TLS and shared state across clients to be hidden from consumers and for
consumers to be agnostic about the type of gRPC client.

This PR also deprecates specifying gRPC services by cluster name; envoy::api::v2::GrpcService should be used going forward.

Risk Level: Medium (some changes to initialization, API cluster validation)
Testing: New unit tests, validated full coverage.
Docs Changes: envoyproxy/data-plane-api#398
Deprecated: Specifying rate limit and API config sources via cluster names; use the grpc_service(s) fields instead.

Signed-off-by: Harvey Tuch <htuch@google.com>
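
For orientation, here is a rough sketch of the consumer-side flow the new manager/factory enables. The type and method names follow the interfaces discussed in this review (AsyncClientManager, AsyncClientFactory), but the exact signatures are assumptions rather than verbatim Envoy code.

// Sketch only: names per the interfaces discussed in this PR; exact
// signatures are assumptions, not verbatim Envoy code.

// Configuration time: resolve the envoy::api::v2::GrpcService proto into a
// factory via the singleton manager. Per the doc comment quoted later in this
// review, validation failures surface here as EnvoyException.
Grpc::AsyncClientFactoryPtr factory =
    cm.grpcAsyncClientManager().factoryForGrpcService(grpc_service, scope);

// On a worker thread: create() returns the async client appropriate for the
// calling thread, hiding whether it is the Envoy (HTTP/2 codec based) or,
// later, the Google C++ gRPC implementation.
Grpc::AsyncClientPtr client = factory->create();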

@mattklein123 (Member) left a comment:

Looks good, great cleanup. One question.

@@ -302,6 +303,7 @@ class ClusterManagerImpl : public ClusterManager, Logger::Loggable<Logger::Id::u
LoadStatsReporterPtr load_stats_reporter_;
// The name of the local cluster of this Envoy instance if defined, else the empty string.
std::string local_cluster_name_;
Grpc::AsyncClientManagerPtr async_client_manager_{};

@mattklein123 (Member):

nit: {} not needed.

* @return AsyncClientFactoryPtr factory for grpc_service.
* @throws EnvoyException when grpc_service validation fails.
*/
virtual AsyncClientFactoryPtr

@mattklein123 (Member):

I'm not really suggesting we do this in this PR, but I think there is an opportunity here that would further massively clean up the amount of code needed at many of the call sites. Basically, like the HTTP async client, if a gRPC client were internally TLS aware, I think many callers would not need any TLS at all. They could just acquire a handle and then create a TLS-local request. I'm pretty sure that if it were implemented this way, TLS in rate limit, access log, and metrics (and soon tracing) would not be needed.

At the very least, I might add some TODOs in this area, or if there are any interface changes that would make this easier later, we might want to consider them. WDYT?

@htuch (Member, Author):

Yeah, as documented, when you create() on the factory, the client you get back (like with CM HTTP async client) is the one for your thread. It may be a cheap operation, if all we are returning is a thin handle (not the current implementation, but maybe in the future).

For the Envoy gRPC client today, Grpc::AsyncClientImpl uses the HTTP async client from CM, so it already is somewhat TLS aware. In the Google gRPC world, we might have things like TLS completion queues that are managed by the TLS for the Google gRPC client.

The question is then: at these various sites that use the gRPC async client, do they have any specific TLS state that is external to the client? Looking at the access log gRPC code, there is the stream_map_; would this remain TLS or become some state that is shared across threads? There's a tradeoff here: introducing more complexity (or contention) on shared structures vs. more streams (one per silo) for the same resource.

@mattklein123 (Member):

> The question is then: at these various sites that use the gRPC async client, do they have any specific TLS state that is external to the client? Looking at the access log gRPC code, there is the stream_map_; would this remain TLS or become some state that is shared across threads? There's a tradeoff here: introducing more complexity (or contention) on shared structures vs. more streams (one per silo) for the same resource.

Yeah, I'm not sure. The main pattern I think would be possible if the client itself were TLS aware is that you could grab a client and do error checking, and then safely use that client across all threads to start streams. In this case, the client itself is TLS aware, but it still effectively needs to be stored in a local TLS slot for use.

It's not a big deal; this is fine as is. We can look at further cleanups later.
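
To make the thread-local wiring in this exchange concrete, here is an illustrative sketch of the per-thread client pattern referred to above; the names ThreadLocalClient and tls_slot_ are hypothetical stand-ins, not code from this PR.

// Hypothetical names, illustrating the pattern only: each worker holds its
// own client (and any per-stream state such as a stream map) in a TLS slot.
struct ThreadLocalClient : public ThreadLocal::ThreadLocalObject {
  ThreadLocalClient(Grpc::AsyncClientFactory& factory) : client_(factory.create()) {}
  Grpc::AsyncClientPtr client_;
};

// The factory is resolved once on the main thread; create() then runs on each
// worker, so every thread gets its own client instance.
tls_slot_->set([&factory](Event::Dispatcher&) -> ThreadLocal::ThreadLocalObjectSharedPtr {
  return std::make_shared<ThreadLocalClient>(factory);
});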

AsyncClientFactoryImpl::AsyncClientFactoryImpl(Upstream::ClusterManager& cm,
const std::string& cluster_name)
: cm_(cm), cluster_name_(cluster_name) {
auto clusters = cm_.clusters();

@mattklein123 (Member):

Does using get() here, as was done in the access log code, not work because TLS clusters are not set up yet in some cases?

@htuch (Member, Author):

That's right; there's a chicken-and-egg situation between ClusterManager initialization, where EDS-based clusters are created, and AsyncClientManager initialization, which takes place in that same constructor.

@mattklein123 (Member):

OK sounds good.
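
Paraphrasing the exchange above in code: the constructor validates against the full cluster map from cm_.clusters() rather than the thread-local get(), because the AsyncClientManager is built inside the ClusterManager constructor before thread-local clusters exist. The check and error message below are approximations, not the PR's literal code.

// Approximate shape of the validation discussed above (not the exact PR code):
// look the cluster up in the full map, since thread-local cluster state is not
// yet populated at this point in ClusterManager construction.
auto clusters = cm_.clusters();
if (clusters.find(cluster_name_) == clusters.end()) {
  throw EnvoyException(fmt::format("Unknown gRPC client cluster '{}'", cluster_name_));
}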

@htuch merged commit 63059e6 into envoyproxy:master on Jan 19, 2018.
@htuch deleted the grpc-async-factory branch on January 19, 2018 at 17:14.