[red-knot] Use `Unknown | T_inferred` for undeclared public symbols #15674

sharkdp · 2025-01-22T15:39:14Z

Summary

Use `Unknown | T_inferred` as the type for undeclared public symbols.

Test Plan

Updated tests.

sharkdp · 2025-01-22T15:48:03Z

crates/red_knot_python_semantic/resources/mdtest/unary/not.md

@@ -139,7 +139,7 @@ reveal_type(not AlwaysFalse())

 # We don't get into a cycle if someone sets their `__bool__` method to the `bool` builtin:
 class BoolIsBool:
-    __bool__ = bool
+    __bool__: type[bool] = bool


This is currently a workaround to avoid running into #15672.

Maybe that information could be recorded in a TODO comment here?

sharkdp · 2025-01-22T15:49:54Z

crates/red_knot_python_semantic/resources/mdtest/attributes.md

-# TODO: this should be `Unknown | Literal[1]`.
-reveal_type(C.variable_with_class_default2)  # revealed: Literal[1]


This is the actual TODO I attempted to resolve!

sharkdp · 2025-01-22T15:52:03Z

crates/red_knot_python_semantic/resources/mdtest/binary/instances.md

+# TODO: this could be `int` if we declare `B.__add__` using a `Callable` type
+reveal_type(B() + B())  # revealed: Unknown | int


This is one example of a particularly annoying case that also comes up with __iter__ = …, __bool__ = …, __enter__ = …, __exit__ = …, etc.

The symbol __add__ is not declared here, so we end up inferring Unknown | A for it, which results in Unknown | int when called.

Not too worried about this, because almost always these will be normal methods defined within the class (which is a declaration), not assigned like this.

It would be nice if there were an Infer annotation we could use for "just infer the type for me and use it as the public type" :/ Callable annotations in Python are, um, not ergonomic :(

However, realistically: it's very unusual to use non-function instances as methods like this. This kind of thing is quite common:

class Foo:
    def __or__(self, other) -> int:
        return 42

    __ror__ = __or__

But I think that will still be fine with this change, since functions have the same declared type as their inferred type?

No, I think we'd have a problem with __ror__ in that example, because it's not declared (__or__ is).

Yes. It would be the exact same case as with __add__ here.

Yeah, I see, thanks. This seems bad :( I just don't know how we'd explain this to users
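For reference, a small runnable sketch of the two spellings discussed in this thread. `FooAlias` and `FooDef` are hypothetical names, and the comments about declaredness restate the discussion above rather than verified red-knot output.

```python
class FooAlias:
    def __or__(self, other) -> int:
        return 42

    # Plain assignment without an annotation: a binding, not a declaration.
    # Per the discussion above, this is the case that would pick up `Unknown`.
    __ror__ = __or__


class FooDef:
    def __or__(self, other) -> int:
        return 42

    # A `def` statement is a declaration, so the concern above should not apply.
    def __ror__(self, other) -> int:
        return self.__or__(other)


# Both spellings behave the same at runtime, including the reflected operator.
alias_results = (FooAlias() | 1, 1 | FooAlias())
def_results = (FooDef() | 1, 1 | FooDef())
```

The explicit `def __ror__` is more verbose, which is exactly the ergonomics concern raised here; it is shown only to make the declared/undeclared distinction concrete.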

sharkdp · 2025-01-22T15:53:06Z

crates/red_knot_python_semantic/resources/mdtest/call/callable_instance.md

@@ -52,7 +52,7 @@ class NonCallable:
    __call__ = 1

 a = NonCallable()
-# error: "Object of type `NonCallable` is not callable"
+# error: "Object of type `Unknown | Literal[1]` is not callable (due to union element `Literal[1]`)"


Another instance of the callable problem. Notice how the usefulness of the error message degrades.

Yes, this is really begging for nested diagnostics (so you get both "Object of type NonCallable is not callable" and nested details explaining why.)

We could easily adjust the handling so you still get "Object of type NonCallable is not callable" here, but then you lose the details of which union element failed callability.

Even without nested diagnostics we could still provide all of the information here in a new error, it just requires more and more bespoke error-cases handling in the __call__ lookup code. (We would need to match on the case that callability of __call__ failed due to a union, and then write a new custom error message for that case which mentions both the outer NonCallable type and the details of why its __call__ isn't callable.) Nested diagnostics would just let us handle this kind of situation more generically, with less special casing.

cc @BurntSushi @MichaReiser re diagnostics considerations
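To make the "nested diagnostics" idea concrete, here is a minimal sketch of a diagnostic that carries child diagnostics. The `Diagnostic` class, its fields, and the message wording are all hypothetical and do not reflect red-knot's actual diagnostics API.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnostic:
    """Hypothetical diagnostic: a message plus optional nested details."""

    message: str
    children: list["Diagnostic"] = field(default_factory=list)

    def render(self, indent: int = 0) -> str:
        # One line per diagnostic, with children indented below their parent.
        lines = ["  " * indent + self.message]
        for child in self.children:
            lines.append(child.render(indent + 1))
        return "\n".join(lines)


# The outer message names the user-facing type; the nested ones explain why.
diag = Diagnostic(
    "Object of type `NonCallable` is not callable",
    [
        Diagnostic(
            "its `__call__` attribute has type `Unknown | Literal[1]`",
            [Diagnostic("union element `Literal[1]` is not callable")],
        )
    ],
)
rendered = diag.render()
```

With a structure like this, the `__call__` lookup code would only need to attach the union-callability failure as a child instead of composing a bespoke message for each combination.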

crates/red_knot_python_semantic/resources/mdtest/expression/boolean.md

sharkdp · 2025-01-22T15:57:02Z

crates/red_knot_python_semantic/resources/mdtest/narrow/type.md

@@ -99,9 +99,9 @@ def _(x: str | int):
 class A: ...
 class B: ...

-alias_for_type = type


I cannot annotate this alias as Literal[type], because that can't be spelled (except with knot_extensions.TypeOf, which I didn't want to use here).

type[type] is too broad for the current narrowing code, which pattern-matches on ClassLiteral("type").

crates/red_knot_python_semantic/src/types.rs

carljm · 2025-01-22T16:55:36Z

crates/red_knot_python_semantic/resources/mdtest/binary/instances.md

+# TODO: this could be `int` if we declare `B.__add__` using a `Callable` type
+reveal_type(B() + B())  # revealed: Unknown | int


Not too worried about this, because almost always these will be normal methods defined within the class (which is a declaration), not assigned like this.

crates/red_knot_python_semantic/resources/mdtest/boundness_declaredness/public.md

carljm · 2025-01-22T16:59:56Z

crates/red_knot_python_semantic/resources/mdtest/call/callable_instance.md

@@ -52,7 +52,7 @@ class NonCallable:
    __call__ = 1

 a = NonCallable()
-# error: "Object of type `NonCallable` is not callable"
+# error: "Object of type `Unknown | Literal[1]` is not callable (due to union element `Literal[1]`)"


Yes, this is really begging for nested diagnostics (so you get both "Object of type NonCallable is not callable" and nested details explaining why.)

We could easily adjust the handling so you still get "Object of type NonCallable is not callable" here, but then you lose the details of which union element failed callability.

Even without nested diagnostics we could still provide all of the information here in a new error, it just requires more and more bespoke error-cases handling in the __call__ lookup code. (We would need to match on the case that callability of __call__ failed due to a union, and then write a new custom error message for that case which mentions both the outer NonCallable type and the details of why its __call__ isn't callable.) Nested diagnostics would just let us handle this kind of situation more generically, with less special casing.

cc @BurntSushi @MichaReiser re diagnostics considerations

crates/red_knot_python_semantic/resources/mdtest/expression/boolean.md

AlexWaygood

Thank you for this! This makes it clear what the impact of this change would be. I agree that the most unintuitive thing is that we'd infer Unknown unions when local scopes reference types from enclosing scopes:

X = 1

def foo():
    reveal_type(X)  # Unknown | Literal[1]  :(

This is probably more correct, though? Because X might be mutated by other modules "monkey-patching" this module's globals.
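A runnable sketch of the monkey-patching scenario, using `types.ModuleType` as a stand-in for a separate module file. The module name `mod` and the patched value are made up for illustration.

```python
import types

# Stand-in for a module whose global `X` is bound but never declared.
mod = types.ModuleType("mod")
exec(
    "X = 1\n"
    "def foo():\n"
    "    return X\n",
    mod.__dict__,
)

before = mod.foo()  # `foo` sees the original binding

# Some other module rebinds the global from the outside.
mod.X = "patched"

after = mod.foo()  # `foo` now sees a `str`, which `Literal[1]` alone would not allow
```

Nothing inside `mod` changed between the two calls, which is why the lookup of `X` from within `foo` cannot rely on the inferred type alone.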

crates/red_knot_python_semantic/resources/mdtest/binary/booleans.md

AlexWaygood · 2025-01-22T17:23:47Z

crates/red_knot_python_semantic/resources/mdtest/binary/instances.md

+# TODO: this could be `int` if we declare `B.__add__` using a `Callable` type
+reveal_type(B() + B())  # revealed: Unknown | int


It would be nice if there were an Infer annotation we could use for "just infer the type for me and use it as the public type" :/ Callable annotations in Python are, um, not ergonomic :(

However, realistically: it's very unusual to use non-function instances as methods like this. This kind of thing is quite common:

class Foo:
    def __or__(self, other) -> int:
        return 42

    __ror__ = __or__

But I think that will still be fine with this change, since functions have the same declared type as their inferred type?

AlexWaygood · 2025-01-22T17:29:28Z

crates/red_knot_python_semantic/resources/mdtest/expression/len.md

 reveal_type(len(Auto()))  # revealed: int
-reveal_type(len(Int()))  # revealed: Literal[2]
+reveal_type(len(Int()))  # revealed: int
 reveal_type(len(Str()))  # revealed: int
 reveal_type(len(Tuple()))  # revealed: int
-reveal_type(len(IntUnion()))  # revealed: Literal[2, 32]
+reveal_type(len(IntUnion()))  # revealed: int


Wow, um... it's not your fault, but I think this test should definitely have some comments next to it saying that we don't really support enums at all yet 😄 I think I can see why the result changes here, but it seems incorrect that we allow Literal[SomeEnum.INT] as an annotation at all, given that we don't yet infer the correct type for SomeEnum.INT.

Yes, I also seem to remember that we said in the original PR that those test cases are not particularly important, so I didn't bother to fix them / add a TODO. Let me know if you think otherwise.

…ference (#15690)

## Summary

Make the remaining `infer.rs` unit tests independent from public symbol type inference decisions (see upcoming change in #15674).

## Test Plan

- Made sure that the unit tests actually fail if one of the `assert_type` assertions is changed.

…ence (#15691)

## Summary

Another small PR to focus #15674 solely on the relevant changes. This makes our Markdown tests less dependent on precise types of public symbols, without actually changing anything semantically in these tests.

Best reviewed using ignore-whitespace-mode.

## Test Plan

Tested these changes on `main` and on the branch from #15674.

sharkdp · 2025-01-23T14:10:13Z

crates/red_knot_python_semantic/resources/mdtest/comprehensions/basic.md

@@ -43,7 +43,7 @@ class IntIterable:
    def __iter__(self) -> IntIterator:
        return IntIterator()

-# revealed: tuple[int, int]
+# revealed: tuple[int, Unknown | int]


Every comprehension has its own scope, so we also look up y as a public symbol 🫤

I believe this is something @AlexWaygood wanted to look into, if I'm not mistaken (after outcome)

Yes, for nested scopes that are known to execute eagerly (list/set/dict comprehensions, class bodies), and for ones that probably execute eagerly (generator expressions), we should model the name lookup as a "use" in the outer scope, at that point in control flow of the outer scope, rather than as a public type lookup. Alex has WIP on this, and it should fix this TODO and the one below.
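The runtime difference between eagerly and lazily executed nested scopes can be seen in a small sketch. The function `demo` and all names inside it are hypothetical; it only illustrates Python's execution order, not red-knot's implementation.

```python
def demo():
    x = 1

    # List comprehension: the body runs right here, so it sees x == 1.
    eager_list = [x for _ in range(1)]

    # Class body: also runs right here.
    class Eager:
        seen = x

    # Generator expression: the body runs only when the generator is consumed.
    lazy_gen = (x for _ in range(1))

    # Function: the body runs only when called.
    def lazy_func():
        return x

    x = "rebound"

    return eager_list, Eager.seen, list(lazy_gen), lazy_func()


results = demo()
```

The generator expression is the "probably eager" case: it is usually consumed immediately, but when it is not, as here, it observes the later rebinding just like the function does.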

sharkdp · 2025-01-23T15:13:10Z

crates/red_knot_python_semantic/src/types.rs

+    // We special-case known-instance types here since symbols like `typing.Any` are typically
+    // not declared in the stubs (e.g. `Any = object()`), but we still want to treat them as
+    // such.
+    let is_known_instance = inferred
+        .ignore_possibly_unbound()
+        .is_some_and(|ty| matches!(ty, Type::KnownInstance(_)));


This is probably a questionable choice. I don't know what else to do though. The problem appears with typing.Any, typing.List, typing.Union, …, knot_extensions.Unknown, knot_extensions.AlwaysTrue.
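As a rough mental model of the rule plus this special case, here is a toy sketch in which types are plain strings. The function `public_type` and its parameters are hypothetical, and the sketch assumes a declared symbol simply uses its declared type; it ignores possibly-unbound and possibly-undeclared symbols entirely.

```python
from typing import Optional


def public_type(
    declared: Optional[str],
    inferred: str,
    is_known_instance: bool = False,
) -> str:
    """Toy model only; `declared` is None for an undeclared symbol."""
    if declared is not None:
        # Assumption: a declared symbol exposes its declared type.
        return declared
    if is_known_instance:
        # Special case from the snippet above: e.g. `Any = object()` in a stub.
        return inferred
    # Undeclared: external code may rebind the symbol to anything.
    return f"Unknown | {inferred}"


undeclared = public_type(None, "Literal[1]")
declared = public_type("int", "Literal[1]")
known = public_type(None, "typing.Any", is_known_instance=True)
```

Without the `is_known_instance` branch, a name such as `typing.Any` would itself become `Unknown | ...`, which is the problem described above.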

sharkdp · 2025-01-23T20:46:24Z

crates/red_knot_python_semantic/resources/mdtest/boundness_declaredness/public.md

@@ -106,8 +111,8 @@ if flag():
 ```py
 from mod import x, y

-reveal_type(x)  # revealed: Literal[1] | Any
-reveal_type(y)  # revealed: Literal[2] | Unknown
+reveal_type(x)  # revealed: Unknown | Literal[1] | Any


We should probably cancel one of the dynamic types here? We might have a ticket for that. I'll note it down as a follow up.

I think we have a TODO in Type::is_gradual_equivalent_to pointing out that not doing this causes problems... but I don't know that we have a ticket

I agree that we should do that, but it will also make this test less clearly testing what it is supposed to test. The reason we use x: Any here is that it a) doesn't disappear in union with Literal[1] but b) also doesn't cause an invalid-declaration diagnostic.

But we can discuss how to handle this in the follow up PR.

carljm

Thank you!

I would like to understand where Unknown is coming from in the cases I commented inline; otherwise this looks good to go.

carljm · 2025-01-23T21:32:29Z

crates/red_knot_python_semantic/resources/mdtest/assignment/augmented.md

@@ -96,7 +96,7 @@ def _(flag: bool):
    f = Foo()
    f += "Hello, world!"

-    reveal_type(f)  # revealed: int | str
+    reveal_type(f)  # revealed: Unknown | int | str


I'm not sure I understand where the new Unknown is coming from here? We are revealing the type of f locally within the scope, so it shouldn't be due to modifiability of f. And both __add__ and __iadd__ are declared in the body of Foo (because function definition statements are declarations), so I don't think either of those should be unioned with Unknown?

__iadd__ is possibly undeclared.

I'm not 100% clear on the semantics of __add__ and __iadd__, so I suspect that you think that this possibly-undeclaredness should be absorbed by the definite-declaredness of __add__?

I'll look into that tomorrow.

Hmm. I said off the top of my head that we should add Unknown to the union for possibly-undeclared, because the "declared type" in the undeclared path is Unknown. But now I'm wondering if that's just wrong. An external assignment that violates the possibly-declared type would effectively be creating a conflicting-declarations situation, and we currently error on conflicting declarations, so it seems like we should also error on that assignment. Which we would do, if we didn't union with Unknown for possibly-undeclared.
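For readers unsure about the runtime semantics in question: `f += x` calls `__iadd__` when the class defines it and otherwise falls back to `__add__`. The sketch below is an assumed reconstruction of the kind of class in this test (inferred from the revealed `int | str`), not a copy of the test file; `make_foo` and `augmented` are hypothetical helpers.

```python
def make_foo(flag: bool):
    class Foo:
        def __add__(self, other: str) -> str:
            return "Hello, world!"

        if flag:
            # Defined on only one path, so `__iadd__` is possibly undeclared.
            def __iadd__(self, other: str) -> int:
                return 42

    return Foo()


def augmented(flag: bool):
    f = make_foo(flag)
    f += "Hello, world!"  # uses `__iadd__` if present, else `__add__`
    return f


with_iadd = augmented(True)
without_iadd = augmented(False)
```

Both outcomes come from declared methods, which is why the extra `Unknown` in the revealed type prompted the question above.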

carljm · 2025-01-23T21:32:59Z

crates/red_knot_python_semantic/resources/mdtest/assignment/augmented.md

@@ -116,7 +116,7 @@ def _(flag1: bool, flag2: bool):
        f = 42.0
    f += 12

-    reveal_type(f)  # revealed: int | str | float
+    reveal_type(f)  # revealed: Unknown | int | str | float


Same question; why does Unknown show up here now?

carljm · 2025-01-23T21:33:14Z

crates/red_knot_python_semantic/resources/mdtest/assignment/augmented.md

@@ -160,5 +160,5 @@ def f(flag: bool, flag2: bool):
        f = Bar()
    f += 12

-    reveal_type(f)  # revealed: int | str | float
+    reveal_type(f)  # revealed: Unknown | int | str | float


And same question here.

carljm · 2025-01-23T21:38:12Z

crates/red_knot_python_semantic/resources/mdtest/boundness_declaredness/public.md

@@ -106,8 +111,8 @@ if flag():
 ```py
 from mod import x, y

-reveal_type(x)  # revealed: Literal[1] | Any
-reveal_type(y)  # revealed: Literal[2] | Unknown
+reveal_type(x)  # revealed: Unknown | Literal[1] | Any


I agree that we should do that, but it will also make this test less clearly testing what it is supposed to test. The reason we use x: Any here is that it a) doesn't disappear in union with Literal[1] but b) also doesn't cause an invalid-declaration diagnostic.

But we can discuss how to handle this in the follow up PR.

carljm · 2025-01-23T21:42:02Z

crates/red_knot_python_semantic/resources/mdtest/comprehensions/basic.md

@@ -43,7 +43,7 @@ class IntIterable:
    def __iter__(self) -> IntIterator:
        return IntIterator()

-# revealed: tuple[int, int]
+# revealed: tuple[int, Unknown | int]


Yes, for nested scopes that are known to execute eagerly (list/set/dict comprehensions, class bodies), and for ones that probably execute eagerly (generator expressions), we should model the name lookup as a "use" in the outer scope, at that point in control flow of the outer scope, rather than as a public type lookup. Alex has WIP on this, and it should fix this TODO and the one below.

carljm · 2025-01-23T21:45:14Z

crates/red_knot_python_semantic/resources/mdtest/unary/not.md

@@ -139,7 +139,7 @@ reveal_type(not AlwaysFalse())

 # We don't get into a cycle if someone sets their `__bool__` method to the `bool` builtin:
 class BoolIsBool:
-    __bool__ = bool
+    __bool__: type[bool] = bool


Maybe that information could be recorded in a TODO comment here?

sharkdp added the red-knot Multi-file analysis & type inference label Jan 22, 2025

sharkdp commented Jan 22, 2025

crates/red_knot_python_semantic/resources/mdtest/expression/boolean.md

sharkdp commented Jan 22, 2025

crates/red_knot_python_semantic/src/types.rs (outdated)

carljm reviewed Jan 22, 2025

AlexWaygood reviewed Jan 22, 2025

sharkdp force-pushed the david/union-of-unknown-and-inferred-type branch from cd9230f to c19648a Compare January 23, 2025 12:52

sharkdp mentioned this pull request Jan 23, 2025

[red-knot] Make infer.rs unit tests independent of public symbol inference #15690

Merged

sharkdp force-pushed the david/union-of-unknown-and-inferred-type branch from c19648a to ded8625 Compare January 23, 2025 13:32

sharkdp mentioned this pull request Jan 23, 2025

[red-knot] MDTests: Do not depend on precise public-symbol type inference #15691

Merged

sharkdp force-pushed the david/union-of-unknown-and-inferred-type branch from ded8625 to 026dbd3 Compare January 23, 2025 13:54

sharkdp added 3 commits January 23, 2025 15:00

[red-knot] Use Unknown | T_inferred for undeclared public symbols

f7f3d92

Extend description

9bf84e5

Adapt 'Public type' table

7a035b7

sharkdp commented Jan 23, 2025

Use helper to apply the same restrictions to instance_member

5bf3c9b

sharkdp force-pushed the david/union-of-unknown-and-inferred-type branch from 026dbd3 to 5bf3c9b Compare January 23, 2025 15:05

Add TODO comment

49bcf53

sharkdp marked this pull request as ready for review January 23, 2025 15:10

sharkdp requested a review from MichaReiser as a code owner January 23, 2025 15:10

Minor change in imports

f29add5

sharkdp commented Jan 23, 2025

Add new test for '__slots__' modifications

36b3af6

sharkdp added 2 commits January 23, 2025 21:39

Union with Unknown for *possibly* undeclared types

f4ca4b8

Revert change in symbol.rs

a1c5647

sharkdp commented Jan 23, 2025

carljm approved these changes Jan 23, 2025
