Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asyncio DatagramTransport.sendto does not send empty UDP packets #113812

Closed
joudinet opened this issue Jan 8, 2024 · 15 comments
Closed

asyncio DatagramTransport.sendto does not send empty UDP packets #113812

joudinet opened this issue Jan 8, 2024 · 15 comments
Labels
topic-asyncio type-bug An unexpected behavior, bug, or error

Comments

@joudinet
Copy link

joudinet commented Jan 8, 2024

Bug report

Bug description:

Trying to send multicast UDP packets with DatagramTransport.sendto, I notice no error on the python program but no packet is sent (I've checked with tcpdump). Modifying the call to use the underlying socket fixes the issue but I get a warning that I should not use this socket directly.
Is this a known issue?
How should I use asyncio to properly send UDP multicast packets?

CPython versions tested on:

3.10

Operating systems tested on:

Linux

Linked PRs

@joudinet joudinet added the type-bug An unexpected behavior, bug, or error label Jan 8, 2024
@gvanrossum
Copy link
Member

@joudinet This is not enough information for us to help you. There could be a bug in your program. I don't know much about UDP multicast, and there doesn't seem to be any special provision for it in asyncio. (Does it work by using a specific IP address? Maybe the socket needs to have some flag applied to enable multicast.)

Can you narrow the problem down to a program that's small enough to paste into a comment here and simple enough for someone to try running locally?

@joudinet
Copy link
Author

Here is a small enough program to replicate the issue:

#! /usr/bin/env python
import socket
import asyncio
import os
import sys

MULTICAST_GROUP="ff12::4242:2e5e:7:c011:f16"
PORT = 4242
ENABLE_LOOPBACK_TESTING = 0

if os.name == "nt" and not hasattr(socket, "IPPROTO_IPV6"):
    socket.IPPROTO_IPV6 = 41

class MyProtocol(asyncio.DatagramProtocol):
    def __init__(self, ifindex, on_connection_lost):
        self.ifindex = ifindex
        self.on_connection_lost = on_connection_lost
        self.transport = None

    def connection_made(self, transport):
        self.transport = transport
        sock = transport.get_extra_info('socket')
        try:
            sock.setsockopt(socket.IPPROTO_IPV6,
                            socket.IPV6_MULTICAST_IF,
                            self.ifindex)
            sock.setsockopt(socket.IPPROTO_IPV6,
                            socket.IPV6_MULTICAST_LOOP,
                            ENABLE_LOOPBACK_TESTING)
            self.sock = sock
        except (OSError, socket.error):
            # Expected exceptions, simply close the socket.
            self.transport.close()

    def send_packet(self, content):
        if bytes is not str and isinstance(content, str):
            content = content.encode()
        # FIXME: the following transport.sendto call does not work:
        # self.transport.sendto(content,
        #                       (MULTICAST_GROUP, PORT, 0, self.ifindex))
        self.sock.sendto(content,
                         (MULTICAST_GROUP, PORT, 0, self.ifindex))
    def send_probe(self):
        self.send_packet("")

    def writable(self):
        """Not interested in write events"""
        return False

    def datagram_received(self, data, addr):
        """Called when some datagram is received.

        data is a bytes object containing the incoming data.
        addr is the address of the peer sending the data.
        """
        # Skip the received part for this example.
        pass

    def error_received(self, exc):
        print("Error received:", exc)

    def connection_lost(self, exc):
        self.on_connection_lost.set_result(True)

async def main():
    loop = asyncio.get_running_loop()
    connections = set()
    # Bruteforce the interface index, because there is no reliable way
    # to get them
    for ifindex in range(2,1000):
        on_connection_lost = loop.create_future()
        transport, protocol = await loop.create_datagram_endpoint(
            lambda: MyProtocol(ifindex, on_connection_lost),
            family=socket.AF_INET6)
        if not on_connection_lost.done():
            connections.add(
                (ifindex, on_connection_lost, transport, protocol))
    if not connections:
        print("No network interfaces found")
        sys.exit(1)

    first_probe = True
    while connections:
        await asyncio.sleep(1)
        if first_probe:
            print("* Sending probes every second on interface indexes:",
                  ",".join(str(c[0]) for c in connections))
            first_probe = False
        done_connections = set()
        for ifindex, on_conn_lost, transport, protocol in connections:
            if on_conn_lost.done():
                done_connections.add(
                    (ifindex, on_conn_lost, transport, protocol))
            else:
                try:
                    protocol.send_probe()
                except:
                    # Continue to send probes to faulty interfaces. So
                    # this program won't fail if the system reports an
                    # error while the device is restarting.
                    pass
        for conn in done_connections:
            connections.discard(conn)

if __name__=='__main__':
    asyncio.run(main())

This program sends UDP packets to the MULTICAST_GROUP IPv6 address on port 4242 on all network interfaces every second.
The send_packet method uses the underlying socket to send UDP multicast packets. If you run a tcpdump on the same machine, you can see the packets being sent. For example: sudo tcpdump -n -i any udp port 4242.

Now, if you replace self.sock.sendto by self.transport.sendto, the program does not show any error message but no packet is sent.

@gvanrossum
Copy link
Member

I tried running your sample program on macOS and ran the tcpdump command you suggested, but I have no idea what any of these mean, and the tcpdump program didn't give any output. I don't have access to Linux. Maybe someone else with more networking and Linux experience can jump in?

@ordinary-jamie
Copy link
Contributor

ordinary-jamie commented Feb 6, 2024

Hi @joudinet the issue is because asyncio.selector_events._SelectorDatagramTransport will return on sendto if the content is empty. See https://github.com/python/cpython/blob/3.10/Lib/asyncio/selector_events.py#L1041-L1042 (nb. this is the same behaviour in 3.13 today)

If you pass a non-empty string to MyProtocol.send_packet (in your send_probe method) then your tcpdump will show the UDP packets being sent.

Side note:

Bruteforce the interface index, because there is no reliable way
to get them

You can use for idx, name in socket.if_nameindex()

Short example

import socket
import asyncio
import sys
import time


class Protocol(asyncio.DatagramProtocol):
    def __init__(self, multicast_group, multicast_port, if_index):
        # addr, port, flowinfo, scope
        self._addr = (multicast_group, multicast_port, 0, if_index)
        self.transport = None
        self.closed = False

    def connection_made(self, transport):
        self.transport = transport

    def sendto(self, content):
        return self.transport.sendto(content, addr=self._addr)

    def connection_lost(self, _):
        self.closed = True


async def main(multicast_group, multicast_port, interface="en0"):
    def protocol_factory():
        return Protocol(
            multicast_group=multicast_group,
            multicast_port=multicast_port,
            if_index=socket.if_nametoindex(interface),
        )

    loop = asyncio.get_running_loop()
    transport, protocol = await loop.create_datagram_endpoint(
        protocol_factory, family=socket.AF_INET6
    )

    if protocol.closed:
        print("Failed to connect")
        sys.exit(1)

    try:
        print(f"{time.time()}: sending empty")
        protocol.sendto(b"")

        time.sleep(1)
        print(f"{time.time()}: sending 'hello'")
        protocol.sendto(b"hello")
    except Exception as exc:
        print(f"Failed to send: {exc}")
    finally:
        transport.close()


if __name__ == "__main__":
    asyncio.run(
        main(multicast_group="ff12::1234", multicast_port=4242, interface="en0")
    )

tcpdump

➜  cpython git:(main) ✗ sudo tcpdump -n -i any udp port 4242
Password:
tcpdump: data link type PKTAP
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type PKTAP (****), snapshot length 524288 bytes
09:34:15.895835 IP6 2001:db8:ffff:ffff:ffff:ffff:ffff:ffff > ff12::1234.4242: UDP, length 5

Notice there is only one datagram with a payload of length=5 ("hello")

@ordinary-jamie
Copy link
Contributor

ordinary-jamie commented Feb 6, 2024

For what it's worth, uvloop mimics the exact same behaviour in its API (specifically mentioning the asyncio behaviour): https://github.com/MagicStack/uvloop/blob/9f82bd74447aa34abb6fa1b34d1d011d027c4369/uvloop/handles/udp.pyx#L284-L286

I don't believe the sendto system call actually prohibits empty payloads

@joudinet
Copy link
Author

joudinet commented Feb 6, 2024

Hi @ordinary-jamie ,
Thanks for the explanation.
I confirm the sendto system call does not prohibit empty payloads and it is perfectly valid to send UDP packets with an empty payload. I don't understand why asyncio and uvloop prevent this behaviour?

@ordinary-jamie
Copy link
Contributor

perfectly valid to send UDP packets

Yup agreed; Microsoft on Win32 API, Winsock.h recvfrom

For UDP if the packet received contains no data (empty), the return value from the recvfrom function is zero.

However this is probably only the case for unconnected sockets, and may require additional documentation to make clear the differing behaviour. Since for connected sockets, I believe the recv zero-return is to indicate an orderly shutdown/EOF. From man 2 recvfrom:

These calls return the number of bytes received, or -1 if an error occurred. The return value will be 0 when the peer has performed an orderly shutdown.

I am unsure about unix-domain sockets; a quick browse and it appears a zero-length return is ambiguous.

@gvanrossum, what do you think about removing this pattern for just the datagram transports (proactor_events._ProactorDatagramTransport and selector_events._SelectorDatagramTransport)?

@gvanrossum gvanrossum changed the title asyncio DatagramTransport.sendto does not send multicast UDP packets asyncio DatagramTransport.sendto does not send empty multicast UDP packets Feb 7, 2024
@gvanrossum
Copy link
Member

Thanks @ordinary-jamie for that diagnosis! I have changed the title to reflect this.

Now we need to decide whether this is a bug or a feature. Do we even care, now that we know that the OP's problem can be solved by adding some data to the test packets?

I suspect that ignoring sendto() calls with empty data a feature, at least for certain types of sockets, and I don't think it is appropriate to check the socket type at that point.

So, apart from the fact that it's different from the underlying sendto(2) syscall, is there a real use case for sending zero-length packets?

@ordinary-jamie
Copy link
Contributor

ordinary-jamie commented Feb 8, 2024

Do we even care, now that we know that the OP's problem can be solved by adding some data to the test packets?

is there a real use case for sending zero-length packets?

Perhaps this can be rephrased as -- should asyncio be responsible for setting up guardrails in preventing zero-length sends on behalf of application developers? If a zero-length send affects other internal parts of asyncio (or other libraries) then I would say yes, but it appears to me, the internal datagram transports are only used for the create_datagram_endpoint loop method. I am not sure if that loop method is used elsewhere, however.

I don't think it is appropriate to check the socket type at that point.

Agreed. However, if instead we can show that a zero-length sendto for the relevant socket domains with type SOCK_DGRAM (the default type in create_datagram_endpoint) is valid would that be sufficient to justify a change just for the datagram transports?

There is added complexity for when a caller passes their own socket to create_datagram_endpoint as there was a recent change in #114887 which would allow SOCK_RAW and SOCK_SEQPACKET socket types, where a zero-length recv would be problematic for at least the latter -- but in these use-cases, I would think advanced users should be responsible for validating what is sendto/recvfrom in the socket.

@ncoghlan
Copy link
Contributor

ncoghlan commented Feb 8, 2024

We're reasonably sure the SOCK_SEQPACKET socket type is only valid to use in cases where SOCK_STREAM is accepted (based on the Linux socket man pages).

However, without any standard internet protocols that actually use SOCK_SEQPACKET, it was deemed appropriate to use "only exclude SOCK_STREAM" to resolve the SocketCAN breakage detected in #114887 (we're not even sure that SOCK_SEQPACKET works with asyncio in the first place, as even the SOCK_RAW regression went unnoticed for years, and that's actively required for hardware CAN bus support).

All of which adds up to: SOCK_SEQPACKET doesn't need to be considered when deciding on the most appropriate way to resolve this issue.

@joudinet
Copy link
Author

joudinet commented Feb 8, 2024

I let you decide as I don't know enough about asyncio but as an application developer that had to migrate an application from asyncore, I was very surprised that asyncio DatagramTransport.sendto ignores my UDP packets with no payload.
In my opinion, there are many protocols that send such packets. The oldest example I could think of is the Time Protocol (RFC 868), where the client sends an empty datagram to port 37, so the server can reply a datagram containing the 32-bit time value.
If a library forbids such packets, it should have a good reason and it should be clearly stated in its documentation.

@gvanrossum
Copy link
Member

Okay, @joudinet's mention of the Time Protocol has convinced me that there's a valid use case for sending zero-length packets, and after reading through the code I think it is as simple as removing the two lines that check for empty data. Now, there are two implementations (one for Proactor and one for Selector), and both have these lines. There are no docs mentioning a special case for empty packets.

I have a feeling @ordinary-jamie might want to look into submitting a PR, but anyone can give it a try (just leave a comment here if you intend to work on this). I would like the PR to add some tests (for both implementations) and for good measure add a line to the docs (and to the docstring for sendto in transports.py) explicitly allowing the sending of empty packets, and the .rst docs should add a versionchanged note explaining that this was added in 3.13. I don't think we can treat this as a backportable bug (it'll be too confusing to explain exactly which micro versions have the fix).

I will next edit the issue subject line to remove "muticast", since this bug applies to all uses of sendto.

@gvanrossum gvanrossum changed the title asyncio DatagramTransport.sendto does not send empty multicast UDP packets asyncio DatagramTransport.sendto does not send empty UDP packets Feb 8, 2024
@gvanrossum
Copy link
Member

Once this is fixed someone should file a uvloop bug letting them know this is changed in Python 3.13.

@ordinary-jamie
Copy link
Contributor

I have a feeling @ordinary-jamie might want to look into submitting a PR

🤭 Thanks Guido, I'll work on this if there are no takers!

gvanrossum pushed a commit that referenced this issue Feb 17, 2024
Also include the UDP packet header sizes (8 bytes per packet)
in the buffer size reported to the flow control subsystem.
@gvanrossum
Copy link
Member

Fixed by #115199. Can't backport because this feels like a (tiny) new feature.

woodruffw pushed a commit to woodruffw-forks/cpython that referenced this issue Mar 4, 2024
…ython#115199)

Also include the UDP packet header sizes (8 bytes per packet)
in the buffer size reported to the flow control subsystem.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
…ython#115199)

Also include the UDP packet header sizes (8 bytes per packet)
in the buffer size reported to the flow control subsystem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-asyncio type-bug An unexpected behavior, bug, or error
Projects
Status: Done
Development

No branches or pull requests

5 participants