-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Handle public symbols declared as Unknown
like undeclared symbols
#15715
Conversation
reveal_type(e) # revealed: @Todo(full tuple[...] support) | tuple[()] | ||
reveal_type(f) # revealed: @Todo(full tuple[...] support) | tuple[Literal["42"], Literal[b"42"]] | ||
reveal_type(g) # revealed: @Todo(full tuple[...] support) | tuple[Literal["42"], Literal[b"42"]] | ||
|
||
# TODO: support more kinds of type expressions in annotations | ||
reveal_type(h) # revealed: @Todo(full tuple[...] support) | ||
reveal_type(h) # revealed: @Todo(full tuple[...] support) | tuple[list, list] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes look a bit strange, but they will all go away when the TODO is resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. For f
, and g
once we understand the annotation we will infer the type as being
tuple[str, *tuple[int, ...], bytes] | tuple[Literal["42"], Literal[b"42"]]
Which, since the second element in the union is a subtype of the first element, simplifies to
tuple[str, *tuple[int, ...], bytes]
which is just what they have as their annotation already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And that reasoning is true for every fully-static declared+inferred type. If we did not issue a diagnostic on the definition x: T_declared = <expr of type T_inferred>
, it means that T_inferred
is assignable to T_declared
. And if T_declared
and T_inferred
are fully-static, that means T_inferred <: T_declared
, i.e. T_declared | T_inferred = T_declared
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any tests where the assigned value is inferred as having a non-fully-static type but the declared type is fully static? If not, could we add some?
E.g. for this:
from typing_extensions import Any, reveal_type
def f(x: Any):
y: int = x
def g():
reveal_type(y)
We reveal int
on main
. What do we reveal with this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do we reveal with this PR?
int | Any
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's a reasonable outcome. The assignment y: int = x
is only valid because x
has a not fully static type that can materialize to something that's a subtype of int
. I don't think after that assignment we should be anymore accounting for a possibility that y
has a type wider than int
. If anything it seems like the type of y
(and maybe also x
!) here should be int & Any
, but I don't think that's quite right either, it corresponds to effectively ignoring annotated types on valid assignments, which is what we do in local inference but is not right for public types.
Overall I'm not sold on this PR yet, would like to think it over a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that's a good point that I hadn't thought about. And I'm not sure I like the behavior. In the x: Any = 1
case, we make the resulting public type more precise. But in the case x: int = returns_any()
here, we make the public type of x
less precise. I'm not sure if that's desirable. Maybe we want (T_declared | T_inferred) & T_declared
? 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If anything it seems like this example should introduce a narrowing constraint on x
: x: Any & int
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remember that this is about public uses of x
. Someone could have modified x
in the meantime. This is why we also want a lower bound, I think. So (int | Any) & int = int | Any & int = int
for this case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should document this by adding a test, since this PR changes behaviour for things like this!
Done.
@@ -18,7 +18,7 @@ def f(*args: Unpack[Ts]) -> tuple[Unpack[Ts]]: | |||
# TODO: should understand the annotation | |||
reveal_type(args) # revealed: tuple | |||
|
|||
reveal_type(Alias) # revealed: @Todo(Unsupported or invalid type in a type expression) | |||
reveal_type(Alias) # revealed: @Todo(Unsupported or invalid type in a type expression) | Literal[int] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the other comment: This change look a bit strange, but it will go away when the TODO is resolved.
fde7f20
to
5bb5ca5
Compare
T_declared | T_inferred
for declared public symbolsUnknown
like undeclared symbols
5bb5ca5
to
a5e3304
Compare
a5e3304
to
a3d730b
Compare
symbol, not just declarations), we use `T_decl | T_decl & T_inf` as the public type, which | ||
simplifies to `T_decl` for `T_inf = Unknown` (the unbound case). | ||
|
||
[TODO: more explanation] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO for me.
Summary
For definitely declared symbols, we currently rely on the declared type only, even if we also see bindings for that symbol. This means that we handle the two cases (1) a symbol is undeclared and (2) a symbol is declared with type
Any
/Unknown
differently, which seems inconsistent.One potential solution to this problem is the following: Instead of returning
T_declared
as the public type of a declared symbol, we returnT_declared | T_declared & T_inferred
instead. This represents a type that is no smaller than and no larger thanT_declared
. For fully static types, this construct simplifies toT_declared
. But for non-fully-static types, this can lead to more precise types. For example, if the declared type isAny
(orUnknown
), we now useAny | Any & T
which can be simplified toAny | T
. This represents the fact that the public type for this symbol must be at least as large as the inferred typeT
. This allows us to issue diagnostics in cases like the following:Red knot (main), mypy and pyright issue no diagnostic in the last line here.
In the example given above, it looks like the intersection is not necessary. However, consider the following scenario. We still want to see
int
as the public type here. This is possible becauseint | int & Any = int
:The same behavior should also apply to possibly-declared symbols.
Why
(T_declared | T_declared & T_inferred)
?The nice property of this expression is that it works for other cases in the matrix below as well.
Undeclared (
T_declared = Unknown
)If a symbol is definitely undeclared, we can set
T_declared = Unknown
. In this case, the expressionT_declared | T_declared & T_inferred = Unknown | Unknown & T_inferred
simplifies toUnknown | T_inferred
. This is exactly what we have in the last column of the matrix below (where the entry in the last row further simplifies becauseUnknown | Unknown = Unknown
).Unbound (
T_inferred = Unknown
)If a symbol is definitely unbound, we can set
T_inferred = Unknown
. In this case, the expressionT_declared | T_declared & T_inferred = T_declared | T_declared & Unknown
simplifies toT_declared
. This is exactly what we have in the last row of the matrix (where the entry in the last column simplifies becauseT_declared
isUnknown
in this case as well)Type inference matrix
(changes affect the first column only)
Before
T_decl
T_decl | T_inf
Unknown | T_inf
T_decl
T_decl | T_inf
Unknown | T_inf
T_decl
T_decl
Unknown
After
T_decl | T_decl & T_inf
T_decl | T_inf
Unknown | T_inf
T_decl | T_decl & T_inf
T_decl | T_inf
Unknown | T_inf
T_decl
T_decl
Unknown
What about possibly-undeclared?
It would be great if we could replace
T_declared -> T_declared | Unknown
for this case (and similar for possibly unbound symbols). This is doable, but leads to lots of unions/intersections withUnknown
, something that we don't fully simplify at the moment, so I have not fully looked into this yet. As it stands, our handling of possibly-undeclared symbols is at least debatable. For example, should we issue a diagnostic in the following example (we currently do not).Test Plan
Updated existing tests.