[red-knot] Use Unknown | T_inferred for undeclared public symbols #15674

Open
wants to merge 9 commits into base: main
@@ -36,7 +36,7 @@ def f():
reveal_type(a7) # revealed: None
reveal_type(a8) # revealed: Literal[1]
# TODO: This should be Color.RED
reveal_type(b1) # revealed: Literal[0]
reveal_type(b1) # revealed: Unknown | Literal[0]

# error: [invalid-type-form]
invalid1: Literal[3 + 4]
@@ -79,7 +79,7 @@ def _(flag: bool):
# that `Foo.__iadd__` may be unbound as additional context.
f += "Hello, world!"

reveal_type(f) # revealed: int | Unknown
reveal_type(f) # revealed: Unknown | int
```

## Partially bound with `__add__`
@@ -96,7 +96,7 @@ def _(flag: bool):
f = Foo()
f += "Hello, world!"

reveal_type(f) # revealed: int | str
reveal_type(f) # revealed: Unknown | int | str
Contributor

I'm not sure I understand where the new `Unknown` is coming from here? We are revealing the type of `f` locally within the scope, so it shouldn't be due to modifiability of `f`. And both `__add__` and `__iadd__` are declared in the body of `Foo` (because function definition statements are declarations), so I don't think either of those should be unioned with `Unknown`?

Contributor Author

`__iadd__` is possibly undeclared.

I'm not 100% clear on the semantics of `__add__` and `__iadd__`, but I suspect your point is that this possibly-undeclaredness should be absorbed by the definite-declaredness of `__add__`?

I'll look into that tomorrow.

Contributor

Hmm. I said off the top of my head that we should add `Unknown` to the union for possibly-undeclared, because the "declared type" in the undeclared path is `Unknown`. But now I'm wondering if that's just wrong. An external assignment that violates the possibly-declared type would effectively be creating a conflicting-declarations situation, and we currently error on conflicting declarations, so it seems like we should also error on that assignment, which we would do if we didn't union with `Unknown` for possibly-undeclared.

Contributor Author

That makes sense to me. To make it more concrete, my understanding is that we would always want to show an error in the last line here, because it is possible that `flag` is `True`, and in that case the assignment would violate the declaration:

def _(flag: bool):
    class C:
        if flag:
            var: int

        var = 1

    C.var = "a"

I'll revert f4ca4b8 and add a new test.

```
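
A runtime sketch of the possibly-undeclared case from the thread above may help. It is not part of the PR, and `make_class` is a hypothetical helper introduced only for illustration. It shows that the `var: int` declaration is recorded only on the `flag=True` path, which is why a checker has to assume the declaration may be in effect when it sees `C.var = "a"`.

```python
from typing import get_type_hints


def make_class(flag: bool) -> type:
    """Hypothetical helper: build the class from the review comment."""

    class C:
        if flag:
            var: int  # the declaration only exists on this path

        var = 1  # the binding exists on every path

    return C


declared = make_class(True)
undeclared = make_class(False)

# The annotation is only recorded when the `if flag:` branch ran.
hints_when_declared = get_type_hints(declared)
hints_when_undeclared = get_type_hints(undeclared)

# Python never enforces the declaration, so both assignments succeed at
# runtime; only a static checker can flag the first one.
declared.var = "a"
undeclared.var = "a"
```

Because the runtime accepts both assignments, the question in the thread is purely about what the checker should report, not about observable behavior.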

## Partially bound target union
@@ -116,7 +116,7 @@ def _(flag1: bool, flag2: bool):
f = 42.0
f += 12

reveal_type(f) # revealed: int | str | float
reveal_type(f) # revealed: Unknown | int | str | float
Contributor

Same question; why does Unknown show up here now?

```

## Target union
@@ -160,5 +160,5 @@ def f(flag: bool, flag2: bool):
f = Bar()
f += 12

reveal_type(f) # revealed: int | str | float
reveal_type(f) # revealed: Unknown | int | str | float
Contributor

And same question here.

```
17 changes: 8 additions & 9 deletions crates/red_knot_python_semantic/resources/mdtest/attributes.md
@@ -175,7 +175,7 @@ class C:

reveal_type(C.pure_class_variable1) # revealed: str

# TODO: this should be `Literal[1]`, or `Unknown | Literal[1]`.
# TODO: Should be `Unknown | Literal[1]`.
reveal_type(C.pure_class_variable2) # revealed: Unknown

c_instance = C()
@@ -252,8 +252,7 @@ class C:

reveal_type(C.variable_with_class_default1) # revealed: str

# TODO: this should be `Unknown | Literal[1]`.
reveal_type(C.variable_with_class_default2) # revealed: Literal[1]
sharkdp marked this conversation as resolved.
reveal_type(C.variable_with_class_default2) # revealed: Unknown | Literal[1]

c_instance = C()

@@ -296,8 +295,8 @@ def _(flag: bool):
else:
x = 4

reveal_type(C1.x) # revealed: Literal[1, 2]
reveal_type(C2.x) # revealed: Literal[3, 4]
reveal_type(C1.x) # revealed: Unknown | Literal[1, 2]
reveal_type(C2.x) # revealed: Unknown | Literal[3, 4]
```

## Inherited class attributes
@@ -311,7 +310,7 @@ class A:
class B(A): ...
class C(B): ...

reveal_type(C.X) # revealed: Literal["foo"]
reveal_type(C.X) # revealed: Unknown | Literal["foo"]
```

### Multiple inheritance
@@ -334,7 +333,7 @@ class A(B, C): ...
reveal_type(A.__mro__)

# `E` is earlier in the MRO than `F`, so we should use the type of `E.X`
reveal_type(A.X) # revealed: Literal[42]
reveal_type(A.X) # revealed: Unknown | Literal[42]
```

## Unions with possibly unbound paths
@@ -356,7 +355,7 @@ def _(flag1: bool, flag2: bool):
C = C1 if flag1 else C2 if flag2 else C3

# error: [possibly-unbound-attribute] "Attribute `x` on type `Literal[C1, C2, C3]` is possibly unbound"
reveal_type(C.x) # revealed: Literal[1, 3]
reveal_type(C.x) # revealed: Unknown | Literal[1, 3]
```

### Possibly-unbound within a class
@@ -379,7 +378,7 @@ def _(flag: bool, flag1: bool, flag2: bool):
C = C1 if flag1 else C2 if flag2 else C3

# error: [possibly-unbound-attribute] "Attribute `x` on type `Literal[C1, C2, C3]` is possibly unbound"
reveal_type(C.x) # revealed: Literal[1, 2, 3]
reveal_type(C.x) # revealed: Unknown | Literal[1, 2, 3]
```

### Unions with all paths unbound
@@ -262,7 +262,8 @@ class A:
class B:
__add__ = A()

reveal_type(B() + B()) # revealed: int
# TODO: this could be `int` if we declare `B.__add__` using a `Callable` type
reveal_type(B() + B()) # revealed: Unknown | int
Comment on lines +265 to +266
Contributor Author

This is one example of a particularly annoying case that also comes up with `__iter__ = …`, `__bool__ = …`, `__enter__ = …`, `__exit__ = …`, etc.

The symbol `__add__` is not declared here, so we end up inferring `Unknown | A` for it, which results in `Unknown | int` when called.

Contributor

Not too worried about this, because almost always these will be normal methods defined within the class (which is a declaration), not assigned like this.

Member

It would be nice if there was an `Infer` annotation we could use for "just infer the type for me and use it as the public type" :/ `Callable` annotations in Python are, um, not ergonomic :(

However, realistically: it's very unusual to use non-function instances as methods like this. This kind of thing is quite common:

class Foo:
    def __or__(self, other) -> int:
        return 42

    __ror__ = __or__

But I think that will still be fine with this change, since functions have the same declared type as their inferred type?

Contributor

No, I think we'd have a problem with `__ror__` in that example, because it's not declared (`__or__` is).

Contributor Author

> No, I think we'd have a problem with `__ror__` in that example, because it's not declared (`__or__` is).

Yes. It would be the exact same case as with `__add__` here.

Member

Yeah, I see, thanks. This seems bad :( I just don't know how we'd explain this to users

```
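
The two patterns discussed in this thread can be put side by side. The sketch below is illustrative only: it runs as plain Python, and the comments about public types restate what the review comments say rather than checker output. The `DeclaredAlias` class and its `Callable` annotation are an assumption about how a user might opt in to a declared type; whether red-knot accepts that annotation at this point is not established by the PR.

```python
from typing import Callable


class A:
    def __call__(self, other: object) -> int:
        return 1


class B:
    # Undeclared binding: per the thread, the public type of `B.__add__`
    # becomes `Unknown | A`, so `B() + B()` is revealed as `Unknown | int`.
    __add__ = A()


class Foo:
    def __or__(self, other: object) -> int:
        return 42

    # `__or__` is declared (a `def` is a declaration), but this alias is not.
    __ror__ = __or__


class DeclaredAlias:
    def __or__(self, other: object) -> int:
        return 42

    # Hypothetical opt-in: an explicit annotation makes the alias declared.
    __ror__: Callable[["DeclaredAlias", object], int] = __or__


add_result = B() + B()
ror_result = 1 | Foo()
declared_ror_result = 1 | DeclaredAlias()
```

All three expressions behave identically at runtime; the difference the thread worries about is only in which of these bindings counts as a declaration.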

## Integration test: numbers from typeshed
@@ -5,18 +5,23 @@ that is, a use of a symbol from another scope. If a symbol has a declared type i
(e.g. `int`), we use that as the symbol's "public type" (the type of the symbol from the perspective
of other scopes) even if there is a more precise local inferred type for the symbol (`Literal[1]`).

If a symbol has no declared type, we use the union of `Unknown` with the inferred type as the public
type. If there is no declaration, then the symbol can be reassigned to any type from another scope;
the union with `Unknown` reflects that its type must at least be as large as the type of the
assigned value, but could be arbitrarily larger.

We test the whole matrix of possible boundness and declaredness states. The current behavior is
summarized in the following table, while the tests below demonstrate each case. Note that some of
this behavior is questionable and might change in the future. See the TODOs in `symbol_by_id`
(`types.rs`) and [this issue](https://github.com/astral-sh/ruff/issues/14297) for more information.
In particular, we should raise errors in the "possibly-undeclared-and-unbound" as well as the
"undeclared-and-possibly-unbound" cases (marked with a "?").

| **Public type** | declared | possibly-undeclared | undeclared |
| ---------------- | ------------ | -------------------------- | ------------ |
| bound | `T_declared` | `T_declared \| T_inferred` | `T_inferred` |
| possibly-unbound | `T_declared` | `T_declared \| T_inferred` | `T_inferred` |
| unbound | `T_declared` | `T_declared` | `Unknown` |
| **Public type** | declared | possibly-undeclared | undeclared |
| ---------------- | ------------ | ------------------------------------- | ----------------------- |
| bound | `T_declared` | `Unknown \| T_declared \| T_inferred` | `Unknown \| T_inferred` |
| possibly-unbound | `T_declared` | `Unknown \| T_declared \| T_inferred` | `Unknown \| T_inferred` |
| unbound | `T_declared` | `Unknown \| T_declared` | `Unknown` |
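
Read as a function, the updated table in this revision of the diff is small. The sketch below is a plain-Python model of it, not red-knot's implementation; the function name `public_type` and the string labels are made up for illustration.

```python
def public_type(declaredness: str, boundness: str) -> str:
    """Model of the updated table; all names here are hypothetical.

    declaredness: "declared" | "possibly-undeclared" | "undeclared"
    boundness:    "bound" | "possibly-unbound" | "unbound"
    """
    if declaredness == "declared":
        # A definite declaration always wins, regardless of bindings.
        return "T_declared"

    # Without a definite declaration, other scopes may assign anything.
    parts = ["Unknown"]
    if declaredness == "possibly-undeclared":
        parts.append("T_declared")
    if boundness != "unbound":
        parts.append("T_inferred")
    return " | ".join(parts)


undeclared_bound = public_type("undeclared", "bound")
possibly_undeclared_unbound = public_type("possibly-undeclared", "unbound")
```

Note that boundness only decides whether `T_inferred` participates; it never affects the `declared` column.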

| **Diagnostic** | declared | possibly-undeclared | undeclared |
| ---------------- | -------- | ------------------------- | ------------------- |
@@ -106,8 +111,8 @@ if flag():
```py
from mod import x, y

reveal_type(x) # revealed: Literal[1] | Any
reveal_type(y) # revealed: Literal[2] | Unknown
reveal_type(x) # revealed: Unknown | Literal[1] | Any
Contributor Author

We should probably cancel one of the dynamic types here? We might have a ticket for that. I'll note it down as a follow-up.

Member

I think we have a TODO in `Type::is_gradual_equivalent_to` pointing out that not doing this causes problems... but I don't know that we have a ticket

Contributor

I agree that we should do that, but it will also make it less clear what this test is supposed to be testing. The reason we use `x: Any` here is that it a) doesn't disappear in union with `Literal[1]` but b) also doesn't cause an `invalid-declaration` diagnostic.

But we can discuss how to handle this in the follow-up PR.

reveal_type(y) # revealed: Unknown | Literal[2]
```

### Possibly undeclared and possibly unbound
@@ -132,8 +137,8 @@ else:
# error: [possibly-unbound-import]
from mod import x, y

reveal_type(x) # revealed: Literal[1] | Any
reveal_type(y) # revealed: Literal[2] | str
reveal_type(x) # revealed: Unknown | Literal[1] | Any
reveal_type(y) # revealed: Unknown | Literal[2] | str
```

### Possibly undeclared and unbound
@@ -153,23 +158,21 @@ if flag():
# on top of this document.
from mod import x

reveal_type(x) # revealed: int
reveal_type(x) # revealed: Unknown | int
```

## Undeclared

### Undeclared but bound

We use the inferred type as the public type, if a symbol has no declared type.

```py path=mod.py
x = 1
```

```py
from mod import x

reveal_type(x) # revealed: Literal[1]
reveal_type(x) # revealed: Unknown | Literal[1]
```

### Undeclared and possibly unbound
@@ -189,7 +192,7 @@ if flag:
# on top of this document.
from mod import x

reveal_type(x) # revealed: Literal[1]
reveal_type(x) # revealed: Unknown | Literal[1]
```

### Undeclared and unbound
@@ -29,7 +29,7 @@ def _(flag: bool):

a = PossiblyNotCallable()
result = a() # error: "Object of type `PossiblyNotCallable` is not callable (possibly unbound `__call__` method)"
reveal_type(result) # revealed: int
reveal_type(result) # revealed: Unknown | int
```

## Possibly unbound callable
@@ -52,7 +52,7 @@ class NonCallable:
__call__ = 1

a = NonCallable()
# error: "Object of type `NonCallable` is not callable"
# error: "Object of type `Unknown | Literal[1]` is not callable (due to union element `Literal[1]`)"
Contributor Author

Another instance of the callable problem. Notice how the usefulness of the error message degrades.

Contributor

Yes, this is really begging for nested diagnostics (so you get both "Object of type `NonCallable` is not callable" and nested details explaining why).

We could easily adjust the handling so you still get "Object of type `NonCallable` is not callable" here, but then you lose the details of which union element failed callability.

Even without nested diagnostics we could still provide all of the information here in a new error; it just requires more and more bespoke error-case handling in the `__call__` lookup code. (We would need to match on the case that callability of `__call__` failed due to a union, and then write a new custom error message for that case which mentions both the outer `NonCallable` type and the details of why its `__call__` isn't callable.) Nested diagnostics would just let us handle this kind of situation more generically, with less special casing.

cc @BurntSushi @MichaReiser re diagnostics considerations

reveal_type(a()) # revealed: Unknown
```

@@ -67,7 +67,7 @@ def _(flag: bool):
def __call__(self) -> int: ...

a = NonCallable()
# error: "Object of type `Literal[1] | Literal[__call__]` is not callable (due to union element `Literal[1]`)"
# error: "Object of type `Unknown | Literal[1] | Literal[__call__]` is not callable (due to union element `Literal[1]`)"
reveal_type(a()) # revealed: Unknown | int
```
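
To illustrate the nested-diagnostics idea from the callable thread above: the outer message names the type the user wrote, and child entries carry the detail about which union element failed. This is a hypothetical data shape sketched in Python, not red-knot's diagnostics API; `Diagnostic` and `render` are invented for this sketch, and the message wording is only an example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Diagnostic:
    """Hypothetical nested diagnostic: a message plus explanatory children."""

    message: str
    children: list[Diagnostic] = field(default_factory=list)

    def render(self, indent: int = 0) -> str:
        # Indent each nesting level by two spaces.
        lines = ["  " * indent + self.message]
        for child in self.children:
            lines.append(child.render(indent + 1))
        return "\n".join(lines)


diag = Diagnostic(
    "Object of type `NonCallable` is not callable",
    [
        Diagnostic(
            "`__call__` has type `Unknown | Literal[1]`",
            [Diagnostic("union element `Literal[1]` is not callable")],
        )
    ],
)
rendered = diag.render()
```

With a shape like this, the generic lookup code would only need to attach a child at each level where a check fails, instead of composing one bespoke message per combination.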

@@ -43,7 +43,8 @@ class IntIterable:
def __iter__(self) -> IntIterator:
return IntIterator()

# revealed: tuple[int, int]
# TODO: This could be a `tuple[int, int]` if we model that `y` can not be modified in the outer comprehension scope
# revealed: tuple[int, Unknown | int]
Contributor Author

every comprehension has its own scope, so we also look up y as a public symbol 🫤

Member

I believe this is something @AlexWaygood wanted to look into, if I'm not mistaken (after outcome)

Contributor

Yes, for nested scopes that are known to execute eagerly (list/set/dict comprehensions, class bodies), and for ones that probably execute eagerly (generator expressions), we should model the name lookup as a "use" in the outer scope, at that point in control flow of the outer scope, rather than as a public type lookup. Alex has WIP on this, and it should fix this TODO and the one below.

[[reveal_type((x, y)) for x in IntIterable()] for y in IntIterable()]
```

@@ -66,7 +67,8 @@ class IterableOfIterables:
def __iter__(self) -> IteratorOfIterables:
return IteratorOfIterables()

# revealed: tuple[int, IntIterable]
# TODO: This could be a `tuple[int, int]` (see above)
# revealed: tuple[int, Unknown | IntIterable]
[[reveal_type((x, y)) for x in y] for y in IterableOfIterables()]
```
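
The distinction drawn in the thread above, between scopes that execute eagerly and scopes whose bodies run later, can be seen directly at runtime. The following is a minimal sketch in plain Python with made-up names (`value`, `eager`, `deferred`); it says nothing about how red-knot will model this, only why a public-type lookup is more conservative than necessary for comprehensions.

```python
value = 1

# A comprehension body runs immediately, at this point in the enclosing
# scope's control flow, so it only ever sees the current binding.
eager = [value for _ in range(1)]


def deferred() -> object:
    # A function body runs later, after arbitrary rebinding may have happened.
    return value


value = "rebound"

eager_result = eager
deferred_result = deferred()
```

The list built by the comprehension still holds the original value, while the function observes the later rebinding, which is the case the union with `Unknown` is meant to cover.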

@@ -5,22 +5,29 @@
```py
def _(flag: bool):
class A:
always_bound = 1
always_bound: int = 1

if flag:
union = 1
else:
union = "abc"

if flag:
possibly_unbound = "abc"
union_declared: int = 1
else:
union_declared: str = "abc"

if flag:
possibly_unbound: str = "abc"

reveal_type(A.always_bound) # revealed: int

reveal_type(A.always_bound) # revealed: Literal[1]
reveal_type(A.union) # revealed: Unknown | Literal[1, "abc"]

reveal_type(A.union) # revealed: Literal[1, "abc"]
reveal_type(A.union_declared) # revealed: int | str

# error: [possibly-unbound-attribute] "Attribute `possibly_unbound` on type `Literal[A]` is possibly unbound"
reveal_type(A.possibly_unbound) # revealed: Literal["abc"]
reveal_type(A.possibly_unbound) # revealed: Unknown | str

# error: [unresolved-attribute] "Type `Literal[A]` has no attribute `non_existent`"
reveal_type(A.non_existent) # revealed: Unknown
@@ -55,7 +55,7 @@ reveal_type("x" or "y" and "") # revealed: Literal["x"]
## Evaluates to builtin

```py path=a.py
redefined_builtin_bool = bool
redefined_builtin_bool: type[bool] = bool
sharkdp marked this conversation as resolved.

def my_bool(x) -> bool:
return True
@@ -172,10 +172,10 @@ class IntUnion:
def __len__(self) -> Literal[SomeEnum.INT, SomeEnum.INT_2]: ...

reveal_type(len(Auto())) # revealed: int
reveal_type(len(Int())) # revealed: Literal[2]
reveal_type(len(Int())) # revealed: int
reveal_type(len(Str())) # revealed: int
reveal_type(len(Tuple())) # revealed: int
reveal_type(len(IntUnion())) # revealed: Literal[2, 32]
reveal_type(len(IntUnion())) # revealed: int
Comment on lines 174 to +178
Member

wow, um... it's not your fault, but I think this test should definitely have some comments next to it saying how we don't really support enums at all yet 😄 I think I can see why the result changes here, but it seems incorrect that we allow `Literal[SomeEnum.INT]` at all as an annotation given that we don't yet infer the correct type for `SomeEnum.INT`

Contributor Author

Yes, I also seem to remember that we said in the original PR that those test cases are not particularly important, so I didn't bother to fix them / add a TODO. Let me know if you think otherwise.

```

### Negative integers
@@ -20,7 +20,7 @@ wrong_innards: MyBox[int] = MyBox("five")
# TODO reveal int, do not leak the typevar
reveal_type(box.data) # revealed: T

reveal_type(MyBox.box_model_number) # revealed: Literal[695]
reveal_type(MyBox.box_model_number) # revealed: Unknown | Literal[695]
```

## Subclassing
@@ -23,8 +23,8 @@ reveal_type(y)
# error: [possibly-unbound-import] "Member `y` of module `maybe_unbound` is possibly unbound"
from maybe_unbound import x, y

reveal_type(x) # revealed: Literal[3]
reveal_type(y) # revealed: Literal[3]
reveal_type(x) # revealed: Unknown | Literal[3]
reveal_type(y) # revealed: Unknown | Literal[3]
```

## Maybe unbound annotated
@@ -52,8 +52,8 @@ Importing an annotated name prefers the declared type over the inferred type:
# error: [possibly-unbound-import] "Member `y` of module `maybe_unbound_annotated` is possibly unbound"
from maybe_unbound_annotated import x, y

reveal_type(x) # revealed: Literal[3]
reveal_type(y) # revealed: int
reveal_type(x) # revealed: Unknown | Literal[3]
reveal_type(y) # revealed: Unknown | int
```

## Maybe undeclared
@@ -71,7 +71,7 @@ if coinflip():
```py
from maybe_undeclared import x

reveal_type(x) # revealed: int
reveal_type(x) # revealed: Unknown | int
```

## Reimport
@@ -119,5 +119,5 @@ else:
```py
from b import x

reveal_type(x) # revealed: int
reveal_type(x) # revealed: Unknown | int
```
@@ -109,9 +109,9 @@ reveal_type(x)
def _(flag: bool):
class NotIterable:
if flag:
__iter__ = 1
__iter__: int = 1
else:
__iter__ = None
__iter__: None = None

for x in NotIterable(): # error: "Object of type `NotIterable` is not iterable"
pass
@@ -135,7 +135,7 @@ for x in nonsense: # error: "Object of type `Literal[123]` is not iterable"
class NotIterable:
def __getitem__(self, key: int) -> int:
return 42
__iter__ = None
__iter__: None = None

for x in NotIterable(): # error: "Object of type `NotIterable` is not iterable"
pass
@@ -99,9 +99,9 @@ def _(x: str | int):
class A: ...
class B: ...

alias_for_type = type
Contributor Author (@sharkdp, Jan 22, 2025)

I cannot annotate this alias as `Literal[type]`, as that can't be spelled (except using `knot_extensions.TypeOf`, which I didn't want to use here).

`type[type]` is too broad for the current narrowing code, which pattern-matches on `ClassLiteral("type")`.


def _(x: A | B):
alias_for_type = type

if alias_for_type(x) is A:
reveal_type(x) # revealed: A
```