Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices #10111

Merged
merged 2 commits into from
Sep 21, 2023

Conversation

sohami
Copy link
Collaborator

@sohami sohami commented Sep 18, 2023

Description

There can be cases where one or more slice may not have timing related information for its leaves in contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice whereas totalNodeTime will reflect the total query time.

Related Issues

Resolves #9815

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Sep 18, 2023

Compatibility status:

Checks if related components are compatible with change 9fb011c

Incompatible components

Incompatible components: [https://github.com/opensearch-project/k-nn.git]

Skipped components

Compatible components

Compatible components: [https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git]

@sohami
Copy link
Collaborator Author

sohami commented Sep 18, 2023

@ticheng-aws @reta Can you please help with review.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sohami
Copy link
Collaborator Author

sohami commented Sep 19, 2023

We need to update CHANGELOG.md as well.

Don't see a need for Changelog as it is a bug fix for unreleased changes.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.smoketest.SmokeTestMultiNodeClientYamlTestSuiteIT.test {yaml=pit/10_basic/Delete all}

…or slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sohami
Copy link
Collaborator Author

sohami commented Sep 21, 2023

Gradle Check (Jenkins) Run Completed with:

All known issues:
#5030
#10133

@sohami sohami merged commit 9f0e017 into opensearch-project:main Sep 21, 2023
10 of 11 checks passed
@reta
Copy link
Collaborator

reta commented Sep 21, 2023

@sohami I understand the frustration regarding failing checks but we should not merge pull requests that have them failed

@reta reta added the backport 2.x Backport to 2.x branch label Sep 21, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/OpenSearch/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/OpenSearch/backport-2.x
# Create a new branch
git switch --create backport/backport-10111-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 9f0e01743de1bf23264945d54d0d5ab20593020b
# Push it to GitHub
git push --set-upstream origin backport/backport-10111-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/OpenSearch/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-10111-to-2.x.

sarthakaggarwal97 pushed a commit to sarthakaggarwal97/OpenSearch that referenced this pull request Sep 24, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
Signed-off-by: Ivan Brusic <ivan.brusic@flocksafety.com>
vikasvb90 pushed a commit to vikasvb90/OpenSearch that referenced this pull request Oct 10, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 24, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 30, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
sohami added a commit to sohami/OpenSearch that referenced this pull request Oct 31, 2023
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
reta pushed a commit that referenced this pull request Oct 31, 2023
…#10898)

* Add support for query profiler with concurrent aggregation (#9248)

* Add support for query profiler with concurrent aggregation (#9248)

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Refactor and work on the PR comments

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Update collectorToLeaves mapping for children breakdowns post profile metric collection and before creating the results

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Refactor logic to compute the slice level breakdown stats and query level breakdown stats for concurrent search case

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Fix QueryProfilePhaseTests and QueryProfileTests, and parameterize QueryProfilerIT with concurrent search enabled

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Handle the case when there are no leaf context to compute the profile stats to return default stats
for all breakdown type along with min/max/avg values. Replace queryStart and queryEnd time with queryNodeTime

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Add UTs for ConcurrentQueryProfileBreakdown

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Add concurrent search stats test into the QueryProfilerIT

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Ticheng Lin <ticheng@amazon.com>
Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
Co-authored-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices (#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Fix flaky query profile phase tests with concurrent search enabled (#10547) (#10547)

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Introduce ConcurrentQueryProfiler to profile query using concurrent segment search path and support concurrency during rewrite and create weight (#10352)

* Fix timer race condition in profile rewrite and create weight for concurrent segment search (#10352)

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

* Refactor and work on the PR comments (#10352)

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

---------

Signed-off-by: Ticheng Lin <ticheng@amazon.com>

---------

Signed-off-by: Ticheng Lin <ticheng@amazon.com>
Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
Co-authored-by: Ticheng Lin <51488860+ticheng-aws@users.noreply.github.com>
@ticheng-aws
Copy link
Contributor

Resolves #9869

shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…or slices (opensearch-project#10111)

* Fix NPE in ConcurrentQueryProfile while computing the breakdown map for slices.

There can be cases where one or more slice may not have timing related information for its leaves in
contexts map. During creation of slice and query level breakdown map it needs to handle such cases by using
default values correctly. Also updating the min/max/avg sliceNodeTime to not include time to create
weight and wait times by slice threads. It will reflect the min/max/avg execution time of each slice
whereas totalNodeTime will reflect the total query time.

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

* Address review comments

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>

---------

Signed-off-by: Sorabh Hamirwasia <sohami.apache@gmail.com>
Signed-off-by: Shivansh Arora <hishiv@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch backport-failed bug Something isn't working flaky-test Random test failure that succeeds on second run skip-changelog
Projects
None yet
3 participants