Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a few minor bugs to get KeyboardWalk to work reliably. #13

Merged
merged 4 commits into from
Apr 22, 2024

Conversation

Claegtun
Copy link
Contributor

@Claegtun Claegtun commented Apr 22, 2024

This branch is mainly to collect a few small fixes for a few small bugs on the NUSense firmware. After this, KeyboardWalk should be mostly reliable.

Such bugs were:

  • NUSense hitting a hard fault (like a segmentation fault), possibly because of a race-condition with the CDC callback,
  • a servo missing frames because its dirty servo-state was ignored since the dirty-flag was reset after some delay,
  • and the firmware getting stuck on a blocking function with an indefinite timeout for the IMU.

Such fixes are:

  • to disable the OTG_HS IRQ when the ring-buffer for the USB-communications is being handled lest there may be a race-condition*,
  • to reset the dirty-flag of the servo-state straight away before a write-instruction is sent lest a ServoTarget is ignored as the flag is reset afterwards,
  • to change the packet timeout-timer to time single bytes rather than a whole packet being decoded lest the Dynamixel loop is throttled too much†,
  • to make the IMU's polling timeout lest the firmware gets bricked at that one spot‡,
  • and to change the memcpy's in the CDC callback to be for-loops since the former throws away the volatile qualifier.

* This is rough fix since the fault itself was an imprecise or asynchronous bus-fault. Disabling an internal buffer apparently makes it easier to troubleshoot. However, disabling the IRQ seems to work well. No hard-fault comes up when the robot is standing for about half an hour. Furthermore, too much time should not be spent since StreamReactor may have a better approach. Interestingly, this bug only happened with the firmware was fully optimised.
† I am not sure about this since it does allow a more accurate timing of missing bytes, but it may never time out if there is a consistent enough stream of garbage. However, Dex's independent watchdog idea may help this.
‡ We may want to make a more thorough DMA approach; see this task.

…USB; reset the dirty-flag straight away before sending the first write-instruction so that new data is not ignored; changed the timeout-timer to single bytes rather than whole packets; made the polling of the IMU timeout so that there is no blocking that may stop NUSense; changed the memcpy's in the USB callback to for-loops to keep the volatile quailifier.
@Claegtun Claegtun marked this pull request as ready for review April 22, 2024 01:27
@Claegtun Claegtun self-assigned this Apr 22, 2024
Copy link
Member

@ysims ysims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but make sure these ideas are documented somewhere, Jira or GitHub issue

@JohanneMontano
Copy link
Contributor

Should try and send a massive pb message with all 20 servos, just to make sure we're not missing irqs from the read callback. The memcpy was to "guarantee" that the copy routine was fast enough to exit that callback as quick as it could so no interrupts are missed.

@Claegtun Claegtun merged commit 268c0e1 into main Apr 22, 2024
@Claegtun Claegtun deleted the keyboardwalk-working branch April 22, 2024 02:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants