Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timestamped input buffering - prevent stalling and improve timing #77062

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lawnjelly
Copy link
Member

@lawnjelly lawnjelly commented May 14, 2023

Input was resulting in stalling and ANRs on Android and possibly other platforms. This PR separates buffered input into two parts to prevent stalls. It also timestamps input events and applies them on the relevant logical physics tick, rather than attempting to apply them "realtime".

Master version of #76839
(see that PR for more details and demo project)

This PR is quite a significant refactor to input buffering in order to make it threadsafe, store timestamps, and offer accurate agile input on physics ticks on platforms with separate input thread. It also offers a legacy mode (with a project setting) in case of regressions, this can be removed once any bugs have been worked out, which should make the code simpler.

N.B.

This PR was written immediately prior to #76399 being merged, which may have addressed some of the similar concerns for ANRs. This PR is thus more concerned with agile input than fixing ANRs.

Explanation

The fix in #76399 was to deal with the problem that processing input events (which could take an extended amount of time, for say a load or reading a web address etc) was locking the input buffer. It took a simple approach of micro-unlock / relock for each input event.

This PR is a bit more involved, it changes the buffer into a more thread safe structure.

  • Two mirror incoming buffers for incoming events (and events are timestamped as they come in)
  • The active mirror is swapped quickly with a quick lock on flush, so there is no need for microlocking as in Allow concurrent buffering and dispatch of input events #76399
  • The incoming read buffer is copied across to a master buffer, which is thus sorted by timestamp

When flushing the master buffer can thus be flushed up to a desired timestamp. This enables us to flush on physics ticks up to the logical timestamp of the physics tick (rather than realtime of when the tick is processed).

This means input can be spread out correctly over logical time - if a frame has e.g. 8 physics ticks, an on off keypress can be assigned to say ticks 2 and 3, instead of being only applied if the key happens to be pressed at the frame time. This is essentially a (hopefully! 😄 ) more accurate version of @RandomShaper 's agile input.

Accumulating

In AGILE mode, accumulating input (e.g. mouse motion) is deferred until flushing, which means that accumulation can be done accurately on a per tick basis. This can potentially give perfectly accurate input for say first person shooters, while still using accumulation to reduce the number of calls to user code and thus increase efficiency. In FRAME mode (and OS that do not support AGILE), accumulation still takes place correct to the nearest frame, as before.

Input thread

There is one snag, for the input timestamps to be meaningful, we ideally want input to come in free flow from the OS. This so far only occurs on Android, where there is a separate thread for input. On most of the other platforms, the OS message pump is done once per frame, so the end result is the input gets bunched up like the current status quo.

However, if we have the input backend working correctly with timestamps / agile input, that does leave the option open to now improve the input accuracy on other platforms in later PRs.

We can either investigate using a separate input thread (ideally) or perhaps resampling the input queue from the OS multiples times per frame (in single threaded fashion). The latter is not ideal but could lead to some greater input resolution.

Notes

  • Adds a temporary legacy_event_flushing project setting for more backward compatibility (but even so, input is quite different so will need testing).

@lawnjelly lawnjelly added this to the 4.x milestone May 14, 2023
@lawnjelly lawnjelly force-pushed the input_buffer_4 branch 2 times, most recently from 132d3db to a4982fb Compare May 14, 2023 17:10
@lawnjelly lawnjelly marked this pull request as ready for review May 14, 2023 17:31
@lawnjelly lawnjelly requested review from a team as code owners May 14, 2023 17:31
core/input/input.cpp Outdated Show resolved Hide resolved
core/input/input.h Show resolved Hide resolved
core/input/input.h Show resolved Hide resolved
Comment on lines 573 to +574
Input::get_singleton()->set_use_input_buffering(true); // Needed because events will come directly from the UI thread
Input::get_singleton()->set_has_input_thread(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Knowing why these are not exclusive would help me understand the new approach better.

Copy link
Member Author

@lawnjelly lawnjelly May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I can add a note about this.

It is rather confusing but is historical - accumulated input was added as an afterthought, and it overrides the use_input_buffering setting. I did attempt to change this to something more sensible, but it would have been compat breaking. (This is why the unit tests were failing when I first made the PR, so I reverted to logic mirroring the legacy.)

If you look in the _update_buffering_mode() logic you can see that buffering will still be used if use_input_buffering is false (!) if use_accumulated_input is true. This is because accumulated input requires buffering to work. This is how the legacy logic works.

If we are prepared to break compatibility, then we can probably make some simpler exclusive logic.

  • Also note that agile input only currently activates when there is a separate input thread. So input buffering can effectively be on (in frame mode) without a separate input thread.

Copy link
Member Author

@lawnjelly lawnjelly May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have added some comments in the _update_buffering_mode() function to this effect.

Incidentally if we do decide to break compatibility in order to simplify, am wondering if this would be better to do as a followup PR, because changing the logic is more likely to lead to regressions.

// and smoothed time / realtime can get out of sync.
mft.first_physics_tick_logical_time_usecs = current_cpu_ticks_usec;
mft.first_physics_tick_logical_time_usecs -= (mft.physics_steps * mft.usec_per_tick) + leftover_usec;

Copy link
Member Author

@lawnjelly lawnjelly May 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note that the timing calculations for the first logical physics tick are more complex when delta smoothing is being used (as there is time compression and expansion compared to wall clock time). I'm just fixing this up for the 3.x PR, but delta smoothing isn't merged yet in 4.x so will leave the simple version here for now.

Additionally "jitter fix" can significantly change input tick timings. For now I would advise setting jitter fix to 0 when agile input is in use, I will leave fixing this properly to a follow up PR. See #76839 (comment) for details.

core/input/input.h Outdated Show resolved Hide resolved
Comment on lines +102 to +105
incoming.resize(incoming.size() + 1);
Event &e = incoming[incoming.size() - 1];
e.event = p_event;
e.timestamp = p_timestamp;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just call push_event here to reuse code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicating here avoids a redundant extra lock, especially as this will not be a rare path.

Alternatively we could place the push in a separate function from the locks, but that might be overkill for the few lines of code here (Godot codebase tends towards duplication rather more than I would personally, but it's no biggie).

core/input/input.cpp Outdated Show resolved Hide resolved
@lawnjelly lawnjelly marked this pull request as draft May 26, 2023 08:42
Input was resulting in stalling and ANRs on Android and possibly other platforms. This PR separates buffered input into two parts to prevent stalls. It also timestamps input events and applies them on the relevant logical physics tick, rather than attempting to apply them "realtime".
@lawnjelly lawnjelly marked this pull request as ready for review May 26, 2023 14:05
@lawnjelly
Copy link
Member Author

lawnjelly commented May 26, 2023

On reflection I made a change to the scheme:

  • Reinstated flush_buffered_events() to basically do as expected (except it uses the new buffer in this PR)
  • Avoid calling flush_buffered_events() in the main loop on Android, so that agile input works correctly using the new flushing functions in Main::iteration().

I've tested this on Android and agile input is still working, and it should be far simpler backward-compatibility wise.

@akien-mga akien-mga modified the milestones: 4.x, 4.3 Sep 19, 2023
@SirTodd
Copy link

SirTodd commented Oct 11, 2023

I tried cherry picking this to the 4.1 branch for public testing with a large audience, but I could not compile it anymore.
Could we have an update on this PR? The ANR issue is a big deal for the visibility of mobile games.

K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(484): error C2039: 'process_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(500): error C2039: 'physics_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(502): error C2039: 'process_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(885): error C2039: 'physics_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(886): error C2039: 'process_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(1015): error C2039: 'physics_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(1016): error C2039: 'process_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(1028): error C2039: 'physics_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'
core\input\input.cpp(1029): error C2039: 'process_frame': is not a member of 'Input::Action'
K:\godot-4.1.1\godot\core\input\input.h(158): note: see declaration of 'Input::Action'

@jknightdoeswork
Copy link

This PR is quite a significant refactor to input buffering in order to make it threadsafe

Can you point out where it's currently not threadsafe?

Input was resulting in stalling and ANRs on Android and possibly other platforms.

Can you show what exactly is the current 'stall' on Android? By stall, do you mean it's less efficient than it can be? Or do you mean it's deadlocking?

@Sauermann Sauermann modified the milestones: 4.3, 4.4 Jul 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants