kgo sink: fix read/write race for recBatch.canFailFromLoadErrs #786

Merged
merged 1 commit into from
Jul 29, 2024

@twmb (Owner) commented Jul 21, 2024

When writing a record batch during a request, the batch mutex is locked. This guards against a concurrent failAllRecords, which can be triggered from a metadata update.

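However, canFailFromLoadErrs, the boolean field that prevents buffered records from being failed when doing so is not "safe", was not properly guarded by a mutex. Writing a request locks only the batch, not the owning recBuf, whereas the check for whether the batch could be failed locked only the owning recBuf, not the batch. The two paths therefore held different locks and could read and write the field concurrently.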

This adds locking around the batch when checking if it can be failed, and adds a bool that, if true (due to load failures), ensures the batch is not written.

Closes #785.

@twmb added the patch label Jul 21, 2024
@twmb merged commit 4e14d75 into master Jul 29, 2024
8 checks passed
@twmb deleted the 785 branch July 29, 2024 04:40
Successfully merging this pull request may close these issues.

Data Race: concurrent access to recBatch.canFailFromLoadErrs during retry errors