Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multithreading & Caching Optimizations for RefreshIG and ExtractMatBundleOperation #481

Closed

Conversation

echicoine-icf
Copy link
Contributor

@echicoine-icf echicoine-icf commented Oct 25, 2023

Applied multithreading to various processes to enhance performance, and implemented some caching optimizations to resolve duplicate processing.

RefreshIG also now persists "tests-" files along with measures when a fhir server url is provided.

ExtratMatBundleOperation's -dir argument can be used against a single folder and all subfolders

  • Github Issue:
  • I've read the contribution guidelines
  • Code compiles without errors
  • Tests are created / updated
  • Documentation is created / updated

By creating this PR you acknowledge that your contribution will be licensed under Apache 2.0

Evan Chicoine added 30 commits October 11, 2023 06:38
…now includes fhir server entry in arguments.
…ientUtils to collect POST calls and execute them at the end.
…ses for thread safety. Declaring certain variables outside of loops for stability.
…needed due to the nature of modifications of each by the process.
…usted IOUtils.copyFile to return false if the file wasn't copied, true if it was, to accurately track how many files get copied during TestCaseProcessor.bundleTestCaseFiles
…actMatBundleOperation with multithreading, updated matching test accordingly
… file count due to old approach to eliminating duplicate resource processing
…y issues in tests. IGBundleProcessor checking fhirUri for null before reference. RefreshIGOperationTest using wiremock to monitor post calls to mock fhir server. pom.xml updated for wiremock. Will possibly try different libraries for similar approach.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pom was updated to include wiremock which lets me prevent actual calls to urls during refreshIG and test the POST calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gives a really simple way to take in a list of tasks, send them off on their own thread, then wait for each to finish.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated bundleLibraryDependencies to be multithreaded for performance enhancement

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MeasureProcessor updated to use multithreading. This class extended BaseProcessor for other functions. Possible refactor should include moving those functions to their own class and let it extend BaseProcessor instead of the Abstract parent class extending BaseProcessor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtractMatBundleOperation modified to use multithreading as well as enhanced the directory processing so all child folders are processed recursively. If only X levels of child folders should be looked at, I can make adjustments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bundleTestCases enhanced with caching for performance improvements and modified to return informational messages about the number of files copied. IOUtils.copyFile is a boolean method and uses its own caching, where if the file is not copied it returns false.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An Abstract class housing the bundle method, and persistBundle method that QuestionnaireProcessor, PlanDefinitionProcessor, and MeasureProcessor all once had. The similarities between the methods in each class were such that a common abstract class could be formed.

Copy link
Contributor Author

@echicoine-icf echicoine-icf Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified bundleValueSets so that if an exception is hit, it's immediately thrown so that failure on this method can have more detailed information for the user during the bundle process. A list of failed items alongside their reason are displayed at the end of the process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified for the ExtractMatBundle process to avoid duplicate resources from being processed (previous approach was unreliable.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class was heavily modified so that as the refreshIG process proceeds, any post calls made in the persist methods are actually stored in a list of tasks to be processed at the very end.

Copy link
Contributor Author

@echicoine-icf echicoine-icf Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copyFile modified to be a boolean method, indicating when the copy was successful or not. A basic counter was added so the user can be informed of the exact number of files copied at the end of the process. copyFile also now uses caching based off a map where the input and output paths are used as a unique key. This improves performance substantially.

methods such as getResourceFileName and getPaths have had caching introduced for performance improvements.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

info method modified to simply output the message using System.out.println. This brings the information back to the console for the user.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified for ExtratMatBundle to avoid processing duplicate resources

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test was failing in Windows. I modified it so it runs the same way in windows as it does in macos, considering the file separator. Now building locally on windows machines works.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the persist modifications, added wiremock to avoid actual calls to the test server and instead count the post calls to ensure they're accurate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Utilizing new method name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed up test based on existing RefreshIG test.

@echicoine-icf
Copy link
Contributor Author

Closing this in favor of #486

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant