Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. #1608

Merged
merged 6 commits into from
Nov 25, 2020

Conversation

bharatviswa504
Copy link
Contributor

What changes were proposed in this pull request?

OM sends multiple times the same batch of block deletes to SCM, as keys will not be deleted from delete table, even though they are submitted for the purge.

Reason for this issue:

  1. When KeyDelete Service submits a request KeyPurge from KeyDelete Service, as this is submitted internally from OMServer the callID and ClientID will be passed the same, the subsequent requests are considered as retry, and it will not be processed until ratis clears its retry cache.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4481

How was this patch tested?

Added a test and also tested the fix on the cluster, now KeyDeletion service is working as expected.

Copy link
Contributor

@lokeshj1703 lokeshj1703 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bharatviswa504 Thanks for working on this! The changes look good to me. I have one comment inline.

Copy link
Contributor

@lokeshj1703 lokeshj1703 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bharatviswa504 Thanks for updating the PR! The changes look good to me. +1. Pending CI.

@bharatviswa504 bharatviswa504 merged commit d83ec1a into apache:master Nov 25, 2020
@bharatviswa504
Copy link
Contributor Author

Thank You @lokeshj1703 for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Dec 1, 2020
* HDDS-3698-upgrade:
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  HDDS-4497. Recon File Size Count task throws SQL Exception. (apache#1612)
errose28 added a commit to errose28/ozone that referenced this pull request Dec 1, 2020
* HDDS-3698-upgrade:
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  HDDS-4497. Recon File Size Count task throws SQL Exception. (apache#1612)
errose28 added a commit to errose28/ozone that referenced this pull request Jan 5, 2021
* master: (40 commits)
  HDDS-4473. Reduce number of sortDatanodes RPC calls (apache#1610)
  HDDS-4485. [DOC] add the authentication rules of the Ozone Ranger. (apache#1603)
  HDDS-4528. Upgrade slf4j to 1.7.30 (apache#1639)
  HDDS-4424. Update README with information how to report security issues (apache#1548)
  HDDS-4484. Use RaftServerImpl isLeader instead of periodic leader update logic in OM and isLeaderReady for read/write requests (apache#1638)
  HDDS-4429. Create unit test for SimpleContainerDownloader. (apache#1551)
  HDDS-4461. Reuse compiled binaries in acceptance test (apache#1588)
  HDDS-4511: Avoiding StaleNodeHandler to take effect in TestDeleteWithSlowFollower. (apache#1625)
  HDDS-4510. SCM can avoid creating RetriableDatanodeEventWatcher for deletion command ACK (apache#1626)
  HDDS-3363. Intermittent failure in testContainerImportExport (apache#1618)
  HDDS-4370. Datanode deletion service can avoid storing deleted blocks. (apache#1620)
  HDDS-4512. Remove unused netty3 transitive dependency (apache#1627)
  HDDS-4481. With HA OM can send deletion blocks to SCM multiple times. (apache#1608)
  HDDS-4487. SCM can avoid using RETRIABLE_DATANODE_COMMAND for datanode deletion commands. (apache#1621)
  HDDS-4471. GrpcOutputStream length can overflow (apache#1617)
  HDDS-4308. Fix issue with quota update (apache#1489)
  HDDS-4392. [DOC] Add Recon architecture to docs (apache#1602)
  HDDS-4501. Reload OM State fail should terminate OM for any exceptions. (apache#1622)
  HDDS-4492. CLI flag --quota should default to 'spaceQuota' to preserve backward compatibility. (apache#1609)
  HDDS-3689. Add various profiles to MiniOzoneChaosCluster to run different modes. (apache#1420)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants