Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #373 - Refactor MultiColSource. #7

Merged
merged 3 commits into from
Jul 18, 2024

Conversation

csirmazbendeguz
Copy link
Owner

Trac ticket number

ticket-373

Branch description

Checklist

  • This PR targets the main branch.
  • The commit message is written in past tense, mentions the ticket number, and ends with a period.
  • I have checked the "Has patch" ticket flag in the Trac system.
  • I have added or updated relevant tests.
  • I have added or updated relevant docs, including release notes if applicable.
  • I have attached screenshots in both light and dark modes for any UI changes.

@csirmazbendeguz csirmazbendeguz force-pushed the ticket_373_multicolsource branch from 0052f19 to aa95616 Compare July 16, 2024 17:12
Copy link

@charettes charettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my understanding the tuple notation works on all tuple operations for SQLite, MySQL, and Postgres.

On oracle, only exact, and in works. The others could be emulated by adding a as_oracle method on them.


Otherwise I like where this is going. I think there could be further refinement in a single a Tuple(Expression) that could be used for Col and other expressions but I'm not sure if we still need Cols for the use case in build_filter. I suspect we don't.

django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
django/db/models/fields/related_lookups.py Show resolved Hide resolved
django/db/models/fields/related_lookups.py Outdated Show resolved Hide resolved
if not self.rhs:
raise EmptyResultSet

# e.g.: (a, b, c) in [(x1, y1, z1), (x2, y2, z2)] as SQL:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one seems to work even on Oracle?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is the only one that will work on Oracle.

django/db/models/expressions.py Outdated Show resolved Hide resolved
cols_sql.append(sql)
cols_params.extend(params)

return ", ".join(cols_sql), cols_params

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this wrap in parenthesizes?

Suggested change
return ", ".join(cols_sql), cols_params
return "(%s)" % ", ".join(cols_sql), cols_params

Copy link
Owner Author

@csirmazbendeguz csirmazbendeguz Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would result in SELECT (a, b) FROM c , which I don't think is standard SQL

class Cols(Expression):
def __init__(self, alias, targets, sources, output_field):
super().__init__(output_field=output_field)
self.alias, self.targets, self.sources = alias, targets, sources

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how you managed to combine both Cols and MultiColSource source 🎉

I'm not sure Cols is good name as it's practically a pair. Maybe ColsPair would be a better name?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about going with 2 classes (as explained below)?

col = MultiColSource(
alias, targets, join_info.targets, join_info.final_field
)
col = Cols(alias, targets, join_info.targets, join_info.final_field)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know under which circumstances targets != join_info.targets? This seems to be the only usage where sources and targets might be different. If it never happens we could likely greatly simplify Cols.

Copy link
Owner Author

@csirmazbendeguz csirmazbendeguz Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is used by ForeignObject, where sources = from_fields and targets = to_fields. So here it will always be different.

When Cols is created by the composite field, it will always be the same.

This does sound like we shouldn't combine MultiColSource and Cols. At the same time, both expressions resolve into tuple operations, so it would be nice to combine them.

The other solution I thought about is to have 2 classes:

  1. MultiColSource = base class, used by ForeignObject
  2. Cols(MultiColSource) = sets targets and sources to the same, used by composite fields

This way in the code we could keep checking if it's an instance of MultiColSource.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that eludes here though is that in the case of a single target (the len(targets) == 1 branch above) returning a single Col is enough for the resolving to work fine.

Knowing that, why wouldn't a Tuple of calls to Tuple(self._get_col(target, join_info.final_field, alias) for target in targets) be enough for the related lookups isinstance(self.lhs, Tuple) to take the right decisions? What makes SubqueryConstraint so special?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a great question, I haven't dived deep into SubqueryConstraint, let me read into this...

Copy link
Owner Author

@csirmazbendeguz csirmazbendeguz Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SubqueryConstraint is a little convoluted.

Whenever doing a __in lookup on a related field which has multiple columns, first the related table A is joined. Then, the same table A as B is queried in a sub-select.

In MySQL, it does (A.col1, A.col2) IN (SELECT B.col1, B.col2 FROM A as B).
In other backends, it does EXISTS (SELECT B.col1, B.col2 FROM A as B WHERE A.col1 = B.col1 AND A.col2 = B.col2).

It seems we do need special logic to perform a subquery on multiple fields, and this is handled by SubqueryConstraint.

@csirmazbendeguz csirmazbendeguz merged commit 3d2d528 into ticket_373 Jul 18, 2024
9 checks passed
@csirmazbendeguz
Copy link
Owner Author

Since @charettes confirmed this is a good direction, I've merged it to the main branch - I'll address the rest of the issues there.

@csirmazbendeguz
Copy link
Owner Author

csirmazbendeguz commented Jul 18, 2024

On oracle, only exact, and in works.

exact doesn't work, it produces ORA-00907: missing right parenthesis error. I'll check in tomorrow (it should work afaik).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants