Elastic Bulk Document Creation #2772
Conversation
@@ -107,7 +114,8 @@ def evaluate_trailer(datafile, trailer_count, multiple_trailer_errors, is_last_l

 def rollback_records(unsaved_records, datafile):
     """Delete created records in the event of a failure."""
     logger.info("Rolling back created records.")
-    for model in unsaved_records:
+    for document in unsaved_records:
+        model = document.Django.model
         num_deleted, models = model.objects.filter(datafile=datafile).delete()
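For readers skimming the diff, here is a minimal sketch of the django-elasticsearch-dsl convention this change relies on (the class name and import path below are illustrative, not taken from this PR): each Document declares the model it indexes on an inner Django class, which is what document.Django.model resolves to in rollback_records.

```python
# Illustrative sketch; names and import path are hypothetical, not from this PR.
from django_elasticsearch_dsl import Document
from django_elasticsearch_dsl.registries import registry

from tdpservice.search_indexes.models import TANF_T1  # hypothetical import path


@registry.register_document
class TANF_T1DataSubmissionDocument(Document):
    """Elastic document for T1 records (illustrative)."""

    class Index:
        # Index name taken from the curl count check later in this thread.
        name = "tanf_t1_submissions"

    class Django:
        # This is what `document.Django.model` resolves to in rollback_records,
        # letting the rollback loop filter and delete the partially created rows.
        model = TANF_T1
```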
Out of curiosity I ran the big fake_rollback file and noticed the delete requests going crazy:
Screen.Recording.2023-12-20.at.10.39.15.AM.mov
but the db and elastic eventually both end up in a consistent state.
This may be a topic for another ticket, but implementing some sort of bulk_delete may be helpful. Existing solutions might not honor the signals needed for elastic, though: https://stackoverflow.com/a/36935536
Agreed, our transaction-less rollback is a pain point for sure. It would be interesting to test out raw_delete.
Have to be careful with raw_delete, though!
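For context, a rough sketch of the trade-off being discussed. Treat the APIs as assumptions: _raw_delete is a private Django queryset method, and Document.update(..., action='delete') is how I understand django-elasticsearch-dsl's bulk helper to work; neither is confirmed by this PR.

```python
# Sketch of the two rollback strategies discussed above; `document` is a
# django-elasticsearch-dsl Document as in rollback_records.

def rollback_with_signals(document, datafile):
    """Current approach: queryset.delete() fires post_delete signals, which the
    elastic signal processor turns into one delete request per document (the
    storm in the recording), but db and index end up consistent on their own."""
    model = document.Django.model
    model.objects.filter(datafile=datafile).delete()


def rollback_bulk(document, datafile):
    """Possible bulk alternative: remove the docs from the index explicitly,
    then bypass signals with Django's private _raw_delete. Faster, but nothing
    else (signals, cascades) hears about the delete, hence the caution above."""
    model = document.Django.model
    qs = model.objects.filter(datafile=datafile)
    # Assumed django-elasticsearch-dsl API: bulk-delete the matching docs.
    document.update(qs, action="delete")
    # Private/undocumented Django API: single DELETE, no signals, no cascades.
    qs._raw_delete(qs.db)
```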
):
    self.model = model
    self.document = document
This change makes me nervous because of the potential side effects, but you've covered all the cases I've thought of so far.
Good catch and nice fix!
LGTM
This reverts commit 26455b4.
datafile = fields.ObjectField(properties={
    'pk': fields.IntegerField(),
    'id': fields.IntegerField(),
Only thing I would say here is that id is not always the pk, and we've run into cases in the past where this assumption has broken down. Everything works in this case, though.
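To illustrate the pk/id concern with a minimal, hypothetical example (not code from this repo): Django only creates the implicit id AutoField when a model declares no primary key of its own, so pk and id are not interchangeable on models with a custom primary key.

```python
from django.db import models


class SubmissionWithDefaultPk(models.Model):
    """Django adds an implicit `id` AutoField; here `pk` and `id` are the same column."""


class SubmissionWithCustomPk(models.Model):
    """Hypothetical model with an explicit primary key: `pk` aliases
    `record_number` and there is no `id` field at all, so an index mapping
    that assumes both exist breaks down for models shaped like this."""

    record_number = models.CharField(max_length=20, primary_key=True)
```

In short, instance.pk always resolves to the primary key, while instance.id only exists on models that rely on the implicit AutoField.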
…elastic-bulk-doc-creation
@elipe17 test results below. I'm not seeing the large file fully parse locally or in the dev environment, so I'm unable to check those bulk create results. I assume that's beyond the scope of this ticket, so no reason to hold this up!
apennington@HHSLBDSWL73 MINGW64 ~/GitHub/RAFT/TANF-app (elastic-bulk-doc-creation)
$ curl -H "Content-Type: application/json" -X GET "localhost:9200/tanf_t1_submissions/_count"
{"count":1,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0}}
# large file parsing timed out locally and apps crashed, potentially due to memory issues
dev env results for large file submission
2024-01-11T16:17:00.42-0500 [APP/PROC/WEB/0] ERR [2024-01-11 21:17:00,425: ERROR/MainProcess] Process 'ForkPoolWorker-2' pid:173 exited with 'signal 9 (SIGKILL)'
2024-01-11T16:17:00.43-0500 [APP/PROC/WEB/0] ERR [2024-01-11 21:17:00,437: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL) Job: 2.')
2024-01-11T16:17:00.43-0500 [APP/PROC/WEB/0] ERR Traceback (most recent call last):
2024-01-11T16:17:00.43-0500 [APP/PROC/WEB/0] ERR File "/home/vcap/deps/1/python/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
2024-01-11T16:17:00.43-0500 [APP/PROC/WEB/0] ERR raise WorkerLostError(
2024-01-11T16:17:00.43-0500 [APP/PROC/WEB/0] ERR billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL) Job: 2.
Summary of Changes
How to Test
curl -H "Content-Type: application/json" -X GET "localhost:9200/<INDEX_NAME>/_count"
where <INDEX_NAME> can be tanf_t1_submissions.
Submit the large file ADS.E2J.NDM1.TS53_fake.txt and verify that it takes minutes and not hours to parse/validate. Should be around 3 - 5 minutes.
Deliverables
More details on how the deliverables herein are assessed are included here.
Deliverable 1: Accepted Features
Checklist of ACs:
adpennington approved
Deliverable 2: Tested Code
CodeCov Report comment in PR)
CodeCov Report comment in PR)
Deliverable 3: Properly Styled Code
Deliverable 4: Accessible
iamjolly and ttran-hub
using Accessibility Insights reveal any errors introduced in this PR?
Deliverable 5: Deployed
Deliverable 6: Documented
Deliverable 7: Secure
Deliverable 8: User Research
Research product(s) clearly articulate(s):