-
Notifications
You must be signed in to change notification settings - Fork 427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libssl.so.3 observability is broken #1736
Comments
@danielkleinstein thanks for the detailed bug report and as a result I was able to reproduce the problem. This logic has existed for quite some time, so I was confident that this situation was handled properly. I need to investigate what changes have happened, since my suspicion is that a regression was introduced at some point. I believe this would impact all of OpenSSL tracing (not just OpenSSL v3) and as a result want to make sure we understand the situation fully. Regarding the |
I think ccb32d5 might be the change that introduce the regression. The FilePathResolver class was mount namespace aware, but it was replaced wholesale with the |
Wow, that was some quick debugging! Two questions:
|
If the rest of the usages are correct, it's possible that this is overkill - it'd be easy to overload This is admittedly a bit of a hacky solution but it seems much smaller in scope than reintroducing the removed class. |
Hi @danielkleinstein and @ddelnano - Could you please take a look at this PR: #1740. I think the issue was that we applied the
But using the namespace pid is just redundant and not really what we intended. If #1740 helps, it is probably the fix we want. |
Hi Pete, your PR fixes the issue 🚀. I see you also modified |
@etep that looks like the correct fix and that confirms that the regression was introduced in ccb32d5. So I believe the impact of this is that dynamically linked OpenSSL tracing for containers within PID namespaces would have been broken since late 2022 or early 2023 (vizier releases weren't published to GitHub around that timeframe, so not sure how to confirm which release it was included in easily). Containers running within the host PID namespace or processes outside of k8s would have be unaffected. Unfortunately our end to end tests heavily rely on running containers with the host PID namespace (source). Converting all the tests to support that might be challenging, so maybe it would be best to have a single use case or |
… `libssl.so` files. (#1740) Summary: Fixes a bug that prevented SSL tracing on dynamically loaded SSL libraries. When probing open ssl libs found from a pod's mapped libraries, we erroneously prefixed library paths with `/proc/<pid>/root` twice, i.e. such that a target library might look like this: ``` /proc/<pid>/root/proc/<pid>/root/<path>/some-lib.so ``` Normal target processes run in their own pid namespace. This meant that the *second* instance of `<pid>` in the path above would not exist in the target process namespace. Hence probing dynamic libraries using this path failed. We also update the open-ssl test case to not use host pid namespace, i.e. such that those tests expose this bug. Previously, this bug was masked in those test cases because the test container processes ran with host pid namespace. Related issues: #1736. Type of change: /kind bug fix. Test Plan: Show that tests fail when we disable use of host pid namespace in SSL test containers, see #1744. Changelog Message: ```release-note Fixed a bug that prevented SSL tracing on dynamically linked SSL libraries. ``` --------- Signed-off-by: Pete Stevenson <jps@pixielabs.ai>
@danielkleinstein apologies that it has taken some time for the fix to make it in a release, but v0.14.8 addresses this issue! We really appreciate your help in surfacing this issue and debugging it with us! |
@ddelnano Sure! Thanks in turn for responding so promptly to the issue. And thanks more broadly for your guys' work on Pixie, I've learned a ton from your project. |
Describe the bug
OpenSSL visibility seems to be broken, at least with
libssl.so.3
.On one of my cluster nodes, I am running a Python 3.10 program that performs outgoing HTTPS traffic against an API endpoint
The relevant PEM's log shows the following line:
To Reproduce
kubectl run ubuntu --image=ubuntu --rm -it -- bash
http-traffic.py
and run./http-traffic.py
(installing dependencies as necessary):Expected behavior
Since the Python program makes continuous requests against an HTTPS endpoint
https://api.sampleapis.com/countries/countries - I expect the requests to show up under the
px/http_data
script (filtered to "ubuntu" in thesource_filter
) - but nothing appears.Screenshots
Logs
Please attach the logs by running the following command:
I ran this command immediately after running
px deploy
and thenhttp-traffic.py
- logs attached.pixie_logs_20231014150647.zip
App information (please complete the following information):
^ Not filling out because I'm not sure and I believe it's irrelevant for the issue.
Additional context
I have created a POC PR here that seemingly solves the issue - #1735. When deploying Pixie with these changes, I can see the HTTPS traffic.
was a bit of a hack - but BCC failed without it with the error:
The text was updated successfully, but these errors were encountered: