-
Notifications
You must be signed in to change notification settings - Fork 84
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented a performance data graphing module for batch mode #272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @rayyang29 for this; we should also add the graphs with some notes in a separate doc/wiki page once you have them altogether.
pipelines/batch/src/main/java/org/openmrs/analytics/FhirEtl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bashir2 for the code review! I answered the questions, made style changes and ran the black formatter on the python script. Though please have a look at the format produced by the formatter. I will update the PR with the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Ray for the changes; I think all of the remaining comments are minor/doc/questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bashir2 for your additional comments. I've made some minor changes and addressed all unresolved conversations.
Description of what I changed
I implemented a module for monitoring the resource usage of batch mode with HAPI as the source. This module generates graphs and csv files of resource usage (CPU, memory and I/O) of the HAPI server, postgres database and pipeline over the duration of the batch job. The user is able to specify the number of processes/cores used in the pipeline to assess batch mode's performance on the local machine.
The following files have been added in this PR:
/utils/resource-monitor/graph_pidstat.py
- The driver program for generating graphs and tables of resource usage./utils/resource-monitor/monitor_pipeline.sh
- A bash script that starts monitoring processes for the server, database and pipeline; called by the driver program./utils/resource-monitor/auto.sh
- A bash script that automates the generation of resource usage graphs and csv files with different number of cores used in the batch job./utils/resource-monitor/README.md
- README file for the resource-monitor module./docker/hapi-compose.yml
- A docker .yml file for the latest version HAPI server with postgresql database.Related to issue #266
E2E test
TESTED:
Checklist: I completed these to help reviewers :)
My IDE is configured to follow the code style of this project.
No? Unsure? -> configure your IDE, format the code and add the changes with
git add . && git commit --amend
I am familiar with Google Style Guides for the language I have coded in.
No? Please take some time and review Java and Python style guides. Note, when in conflict, OpenMRS style guide overrules.
I have added tests to cover my changes. (If you refactored existing code that was well tested you do not have to add tests)
No? -> write tests and add them to this commit
git add . && git commit --amend
I ran
mvn clean package
right before creating this pull request and added all formatting changes to my commit.All new and existing tests passed.
No? -> figure out why and add the fix to your commit. It is your responsibility to make sure your code works.
My pull request is based on the latest changes of the master branch.
No? Unsure? -> execute command
git pull --rebase upstream master