Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify socket passing in hot restart test #805

Merged
merged 14 commits into from
Apr 27, 2017

Conversation

hennna
Copy link
Contributor

@hennna hennna commented Apr 20, 2017

In order to get updated admin and listener address information, the following changes were made:

  1. Add an optional command line flag <-a> <--admin-address-path>. When set, the admin address will be written to the specified file.

  2. Add a /listeners URL admin handler that returns all server listener addresses.

  3. Write a python helper script that either updates the listener addresses in the original config file or checks that the server listener addresses have not changed after envoy startup.

Fixes #654.

@hennna
Copy link
Contributor Author

hennna commented Apr 20, 2017

@htuch for first pass.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, this is well written code, fixes #654 and adds the new listener handle. Most of my comments are about test stuff.

BTW, if you put at the end of your commit mesasge "Fixes #654." GH can autoclose the issue on merge.

@@ -10,6 +10,10 @@ following are the command line options that Envoy supports.

*(required)* The path to the :ref:`JSON configuration file <config>`.

.. option:: -a <path string>, --admin-address-path <path string>

*(optional)* The output file path where the admin address will be written.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

admin address and port

/**
* @return int the total number of listeners
*/
virtual int numListeners() PURE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we strictly need this? A caller could just iterate incrementally on getListenSocketByIndex until a nullptr is returned for the purposes we need in this PR.

@@ -17,6 +18,8 @@ class Factory {
* Constructs a Json Object from a String.
*/
static ObjectPtr LoadFromString(const std::string& json);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an aside, looks like LoadFromString and peers violate the coding style in their naming. A distinct cleanup PR might be nice to make this file consistent.

for (int i = 0; i < server_.numListeners(); ++i) {
listeners.push_back(server_.getListenSocketByIndex(i)->localAddress()->asString());
}
response.add(fmt::format("{}", Json::Factory::listAsJsonString(listeners)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need the fmt::format here.


if (options.adminAddressPath().length() > 0) {
std::ofstream admin_address_file(options.adminAddressPath());
ASSERT(admin_address_file.is_open());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be better error handling here - it's easy for the user to supply a bad path, somewhere not writable, etc. We should log().criticial the failure and return false (add return code). The exception catching in the caller is too focused on the JSON case, there are other reasons we can fail here, so that should also be refactored to reflect that.

def ReplaceListenerAddresses(original_json, admin_address, no_port_change):
# Get original listener addresses
with open(original_json, 'r') as original_json_file:
parsed_json = json.load(original_json_file, object_pairs_hook=OrderedDict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct in thinking we're using OrderedDict here just so we get deterministic output? If so, maybe just add a comment.

admin_response = admin_conn.getresponse()

if not admin_response.status == 200:
admin_conn.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use a try: ... finally: admin_conn.close() block here.

if no_port_change:
return CheckNoChange(original_listeners, updated_listeners)
else:
for updated, original in zip(updated_listeners, original_listeners):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe s/updated_listeners/discovered_listeners/

return CheckNoChange(original_listeners, updated_listeners)
else:
for updated, original in zip(updated_listeners, original_listeners):
original['address'] = "tcp://" + updated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the style guide says we need to have consistent use of "" and ''. Basically, prefer '' over "" unless there is legacy in a file you want to be consistent with.

while not os.path.exists(admin_address_path):
time.sleep(1)
counter += 1
if counter > 20:
Copy link
Member

@htuch htuch Apr 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should probably document that this also waits for the admin address file to appear on the filesystem. Also, best not to use magic numbers like 20, make this a constant at the top of the file, e.g. ADMIN_FILE_WAIT_TIMEOUT_SECS.

BTW, as mentioned in person, the really robust way to do this is with something like inotify, but that's not worth doing for a simple test like this.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking solid, almost ready for external review.

Http::Code AdminImpl::handlerListenerAddresses(const std::string&, Buffer::Instance& response) {
std::list<std::string> listeners;
int listener_index = 0;
while (server_.getListenSocketByIndex(listener_index) != nullptr) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can write this as while (auto listen_socket = server_.getListenSocketByIndex(listener_index) ) { ...

@@ -118,6 +120,7 @@ class AdminImpl : public Admin,
Http::Code handlerServerInfo(const std::string& url, Buffer::Instance& response);
Http::Code handlerStats(const std::string& url, Buffer::Instance& response);
Http::Code handlerQuitQuitQuit(const std::string& url, Buffer::Instance& response);
Http::Code handlerListenerAddresses(const std::string& url, Buffer::Instance& response);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to make this more generic, e.g. handlerListenerInfo, to allow later extension for more than addresses.


std::list<std::string> list4 = {"127.0.0.1:46465", "127.0.0.1:52211", "127.0.0.1:58941"};
EXPECT_STREQ("[\"127.0.0.1:46465\",\"127.0.0.1:52211\",\"127.0.0.1:58941\"]",
Json::Factory::listAsJsonString(list4).c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be more robust to ::listAsJsonString and then parse it back with RapidJSON and make sure you get back an equivalent C++ std::list. That way any changes to the RapidJSON output formatting (e.g. whitespace use) don't cause the test to fail.

tools/BUILD Outdated
@@ -3,4 +3,5 @@ package(default_visibility = ["//visibility:public"])
exports_files([
"gen_git_sha.sh",
"git_sha_rewriter.py",
"socket_passing.py",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now be declared with envoy_py_test_binary (hot off the press, since #823, which is still inflight but will probably be merged before this PR).


echo "Updating original config json listener addresses"
UPDATED_HOT_RESTART_JSON="${TEST_TMPDIR}"/hot_restart_updated.json
tools/socket_passing.py "-o" "${HOT_RESTART_JSON}" "-a" "${ADMIN_ADDRESS_PATH_0}" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be ${TEST_RUNDIR}"/tools/socket_passing.py since #819 (sorry about the churn!).

TestEnvironment::temporaryPath("some/unlikely/bad/path.prof"),
Network::Utility::resolveUrl("tcp://127.0.0.1:0"), server_);
AdminImpl admin_bad_profile_path(
"/dev/null", TestEnvironment::temporaryPath("some/unlikely/bad/path.prof"), "",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this points to making AdminImplOptions something worth considering.

AdminImpl admin_bad_address_out_path(
"/dev/null", cpu_profile_path_, bad_path,
Network::Test::getSomeLoopbackAddress(Network::Address::IpVersion::v4), server_);
EXPECT_FALSE(std::ifstream(bad_path));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to EXPECT on the log output? I'm not sure, but if it's easy to do it might be worth doing that as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I haven't found a way to check log outputs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logger could be mocked and the log statement EXPECTed, but, the way the logger object is brought into existence in many places including in AdminImpl makes this difficult. It's super convenient to be able to inherit in any class from Logger::Loggable and just get the magic log() method but this kills the flexibility of being able to pass in a mocked logger. Some test code added to the logger could provide a special mode where you can check what's been logged, but probably this is nontrivial and best done as a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also just make the error a PANIC() and then do a death test. I don't have a strong preference here.

# exisiting json config file with updated listener addresses. This script is
# currently called in the hot restart integration test to update listener
# addresses bound to port 0 in the intial json config file. With the
# -n or -no_port_change option, this script does not update an exisiting json
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-n is now gone.

import sys
import time

ADMIN_FILE_TIMEOUT_SECS = 20
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment explaining what this means.

return False
discovered_listeners = json.loads(admin_response.read())
except:
sys.stderr.write('Cannot connect to admin.\n')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally it's scary to do a wildcard exception catch in Python and then not propagate further. In this case, it means exceptions will be hard to debug as we lose info. You could do except e: \n sys.stderr.write('Cannot connect to admin: %s' % e) or something like that. Or re-raise with

except e:
  sys.stdout.write(...)
  raise e

or create a custom exception class. Probably the first is fine for this test tool.

@@ -1,6 +1,7 @@
#pragma once

#include <chrono>
#include <fstream>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: shouldn't this be in admin.cc only?

std::vector<Json::ObjectPtr> output3 =
Json::Factory::LoadFromString(Json::Factory::listAsJsonString(list3))->asObjectArray();
EXPECT_EQ(output3.size(), 4);
EXPECT_STREQ(output3[0]->asString().c_str(), "one");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it possible to EXPECT_EQ with std::string? Might be cleaner than dropping to C strings.

std::vector<Json::ObjectPtr> listener_info =
Json::Factory::LoadFromString(response->body())->asObjectArray();
for (std::size_t index = 0; index < listener_info.size(); index++) {
EXPECT_STREQ(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same point on EXPECT_EQ here and elsewhere.

tools/BUILD Outdated
@@ -1,6 +1,15 @@
package(default_visibility = ["//visibility:public"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you switch this to envoy_package since #828, thanks!

import time

# Seconds to wait for the admin address output file to appear. The script exits
# if the file is not found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "exits with failure"

sys.stderr.write('Cannot connect to admin: %s\n' % e)
return False
else:
if not len(discovered_listeners) == len(original_listeners):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A != B is clearer than not A == B. The second is in danger of being read (not A) == B, which isn't what it means, but the casual reader might easily make that mistake.

# if the file is not found.
ADMIN_FILE_TIMEOUT_SECS = 20

def ReplaceListenerAddresses(original_json, admin_address, updated_json):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this name is now legacy, it should probably be PrintListenerAddresses or something you think is better.

with open(admin_address_path, 'r') as admin_address_file:
admin_address = admin_address_file.read()

result = ReplaceListenerAddresses(args.original_json, admin_address, args.updated_json)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/result/success

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready for wider review. @mattklein123

tools/BUILD Outdated
@@ -1,5 +1,6 @@
load(
"//bazel:envoy_build_system.bzl",
"envoy_py_test_binary",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: prefer alphabetical ordering here.

@mattklein123
Copy link
Member

@hennna can we get this passing tests? If you want to wait until cmake removal that's fine. It should be soon.

@mattklein123
Copy link
Member

Nevermind looks like you fixed it. Will look in a bit.

@hennna
Copy link
Contributor Author

hennna commented Apr 25, 2017

The fix is only for the alphabetical ordering comment by @htuch. I'll wait for cmake removal rather than add changes that will need to be reverted.

@mattklein123
Copy link
Member

OK let's just wait until this is passing tests then we can take a quick final pass through.

@mattklein123
Copy link
Member

@hennna please merge master. This should pass tests now.

@hennna
Copy link
Contributor Author

hennna commented Apr 27, 2017

@mattklein123 updated

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, a few small comments.

{"/stats", "print server stats", MAKE_HANDLER(handlerStats)},
{"/listeners", "print listener addresses", MAKE_HANDLER(handlerListenerInfo)}} {

if (address_out_path.length() > 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: !address_out_path.empty()

@@ -10,6 +10,10 @@ following are the command line options that Envoy supports.

*(required)* The path to the :ref:`JSON configuration file <config>`.

.. option:: -a <path string>, --admin-address-path <path string>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we drop the -a short option. This is a very uncommon option and would rather it be fully explicit. (Obviously need to fix tclap stuff below).

@@ -28,12 +28,14 @@ namespace Server {
*/
class TestOptionsImpl : public Options {
public:
TestOptionsImpl(const std::string& config_path) : config_path_(config_path) {}
TestOptionsImpl(const std::string& config_path)
: config_path_(config_path), admin_address_path_("") {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: admin_address_path_("") not needed, as its the default.

AdminImpl admin_bad_address_out_path(
"/dev/null", cpu_profile_path_, bad_path,
Network::Test::getSomeLoopbackAddress(Network::Address::IpVersion::v4), server_);
EXPECT_FALSE(std::ifstream(bad_path));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also just make the error a PANIC() and then do a death test. I don't have a strong preference here.

@hennna
Copy link
Contributor Author

hennna commented Apr 27, 2017

I'll defer the log.critical vs. PANIC change to after the wider discussion on failure modes in the upcoming doc PR. There doesn't seem to be strong feelings for either option in this specific PR.

@mattklein123 mattklein123 merged commit 5d05bc4 into envoyproxy:master Apr 27, 2017
@hennna hennna deleted the socket-passing branch April 27, 2017 22:13
jpsim pushed a commit that referenced this pull request Nov 28, 2022
We don't currently run the Kotlin linter on anything other than `hello_world`, but we should.

This PR fixes all existing violations and enables the linter.

Resolves envoyproxy/envoy-mobile#599.

Signed-off-by: Michael Rebello <me@michaelrebello.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
We don't currently run the Kotlin linter on anything other than `hello_world`, but we should.

This PR fixes all existing violations and enables the linter.

Resolves envoyproxy/envoy-mobile#599.

Signed-off-by: Michael Rebello <me@michaelrebello.com>
Signed-off-by: JP Simard <jp@jpsim.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants