
Can I use this library to work with Django's native bulk_update to update the modified field? #5

Closed
simkimsia opened this issue May 11, 2020 · 2 comments

simkimsia commented May 11, 2020

Related to your comment about updating modified, but only when values actually changed: #3 (comment)

In that issue, I gave an example of updating many records with the same set of values.

In this issue, I am talking about updating many records where each record may have a different set of new values, which we will not know beforehand.

My code currently looks like this:

changes = [
    {
        'id': 1,
        'field1': 'new_value11',
        'field2': 'new_value12',
    },
    {
        'id': 2,
        'field1': 'new_value21',
        'field2': 'new_value22',
    },
]

to_be_updated = []
to_be_affected = Book.objects.filter(**some_filter_params)

# Loop through to_be_affected and changes; when the ids match, have
# the instance adopt the values in `changes`, then append it to the
# list `to_be_updated`.
changes_by_id = {change['id']: change for change in changes}
for book in to_be_affected:
    change = changes_by_id.get(book.id)
    if change is not None:
        book.field1 = change['field1']
        book.field2 = change['field2']
        to_be_updated.append(book)

Book.objects.bulk_update(to_be_updated, ['field1', 'field2'])  # I use the native bulk_update

This works well so far, but my modified field is not updated.

What if I stick to bulk_update?

If I continue to use bulk_update, I can add the modified field as well and set them all to datetime.now().

But then modified will be changed even if the values are actually unchanged.

I can of course write more code to do the checks as well, but I think it's not elegant.
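For illustration, the extra-checks version would look roughly like this (a sketch; it assumes modified is a DateTimeField I can assign to directly):

from django.utils import timezone

# Compare each field before assigning, and only touch `modified`
# when something actually changed.
changes_by_id = {change['id']: change for change in changes}
to_be_updated = []
for book in to_be_affected:
    change = changes_by_id.get(book.id)
    if change is None:
        continue
    if book.field1 != change['field1'] or book.field2 != change['field2']:
        book.field1 = change['field1']
        book.field2 = change['field2']
        book.modified = timezone.now()
        to_be_updated.append(book)

Book.objects.bulk_update(to_be_updated, ['field1', 'field2', 'modified'])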

Can I use model-values in some way?

Because of your previous comment, I realize it might be possible to do bulk changes but selectively update the modified field depending on whether the values themselves changed.

I also don't mind dropping bulk_update if that helps.

coady (Owner) commented May 14, 2020

There are a few different options here.

bulk_changed will efficiently find which rows have changed, but it only handles one field at a time. There could be a multi-field variant, but it would be an ugly interface.

for field in ('field1', 'field2'):
    data = {row['id']: row[field] for row in changes}
    diff = qs.bulk_changed(field, data)
    # bulk update keys from diff
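For the elided update step, something along these lines could work (a sketch; it assumes bulk_changed returns a mapping keyed on pk, and stamps modified in the same pass so it only moves when a value really changed):

from django.utils import timezone

for field in ('field1', 'field2'):
    data = {row['id']: row[field] for row in changes}
    diff = qs.bulk_changed(field, data)  # assumed: mapping keyed on pk
    # Restrict to the rows that actually differ before writing.
    books = list(qs.filter(pk__in=diff))
    for book in books:
        setattr(book, field, data[book.pk])
        book.modified = timezone.now()
    Book.objects.bulk_update(books, [field, 'modified'])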

update is extended to translate a dict into a case statement. So it's similar to the built-in bulk_update, but with a data-oriented interface. By itself it would over-count the rows, but it works nicely with bulk_changed.
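To illustrate what that translation amounts to (a plain-Django sketch of the idea, not the library's internals), a mapping of pks to values becomes a single conditional UPDATE:

from django.db.models import Case, CharField, Value, When

data = {1: 'new_value11', 2: 'new_value21'}  # {pk: new value}
Book.objects.filter(pk__in=list(data)).update(
    field1=Case(
        *(When(pk=pk, then=Value(value)) for pk, value in data.items()),
        output_field=CharField(),
    )
)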

Finally, there's bulk_change, which efficiently updates and returns the modified counts. It also works one field at a time but, more importantly, works by inverting the data to use pk__in queries. This is much more efficient if the number of unique values is low, as with an enum or bool.
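The inversion idea, roughly (a sketch of the concept rather than the library's implementation): group pks by their new value, then issue one UPDATE per distinct value instead of one per row:

import collections

data = {1: 'a', 2: 'a', 3: 'b'}  # {pk: new value}, few unique values
pks_by_value = collections.defaultdict(list)
for pk, value in data.items():
    pks_by_value[value].append(pk)

for value, pks in pks_by_value.items():
    Book.objects.filter(pk__in=pks).update(field1=value)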

I would start with bulk_changed and go from there.

simkimsia (Author) commented

Thank you!
