Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type inference #1010

Merged
merged 1 commit into from
Mar 1, 2024
Merged

Type inference #1010

merged 1 commit into from
Mar 1, 2024

Conversation

chriseth
Copy link
Member

@chriseth chriseth commented Feb 2, 2024

TODO:

  • documentation
  • add inferred types to declarations.

@chriseth chriseth changed the base branch from main to arrays February 2, 2024 15:50
@chriseth chriseth force-pushed the type_inference branch 2 times, most recently from bf0967e to 24399d9 Compare February 5, 2024 11:54
Base automatically changed from arrays to main February 6, 2024 10:36
@chriseth chriseth changed the base branch from main to introduce_identity February 7, 2024 16:37
Base automatically changed from introduce_identity to main February 7, 2024 16:40
@chriseth chriseth force-pushed the type_inference branch 10 times, most recently from 6d1bcbd to ceee377 Compare February 14, 2024 15:54
@chriseth chriseth force-pushed the type_inference branch 7 times, most recently from 38d5d59 to 2edfffb Compare February 22, 2024 07:14
@chriseth chriseth changed the base branch from main to extend_types February 22, 2024 07:14
@chriseth chriseth marked this pull request as ready for review February 28, 2024 16:17
@chriseth
Copy link
Member Author

Ok it seems we have to implement some of the optimization ideas here before we can merge this :)

namespace F(N);
let id = |i| i;
let inv = |i| N - i;
let a: int -> int = |i| [0, 1, 0, 1, 2, 1][i];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, or if you declare the array with only 5 elements, i guess its caught in the evaluator also?

let y: col[2] = [x(0), x(1)];
namespace std::convert(4);
let fe = || fe();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this namespace and definition doing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::convert::fe is a built-in symbol. We still have to declare it (because I don't think it's a good idea to also clutter name resolution with builtins), but the actual semantics is replaced during type-checking and execution.
We can maybe find a better solution at some point.

}
if let Some(called) = graph.get(name) {
for c in called {
let n = graph.get_key_value(c.as_str()).unwrap().0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't c.as_str() be used directly in place of n?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is a lifetime transfer thing....

@chriseth
Copy link
Member Author

I got tremendous speed up by avoiding creating a type variable for each literal number. This keeps the substitution hashmap small and I think this is sufficient for now.

ty.as_ref().cloned().unwrap_or_else(|| TypeName::Int)
};
Ok(match ty {
TypeName::Fe => Value::FieldElement(n.to_arbitrary_integer().into()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't n already a field element?

0 => 0,
1 => 1,
_ => fib(i - 1) + fib(i - 2),
};
let result = fib(20);
let result: int = fib(20);
Copy link
Collaborator

@pacheco pacheco Feb 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this int be inferred from fib's type?

.iter_mut()
.map(|(name, (symbol, value))| {
(
name.clone(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if having a couple intermediate variables here would make the identation more readable:
let name = name.clone();
let type_expr = ...;
(name, type_expr)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it a bit.

@@ -383,7 +463,7 @@ namespace N(16);
}

#[test]
#[should_panic = "Operator - not supported on types"]
#[should_panic = "Type expr[] is required to satisfy trait Sub, but does not"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just Type expr[] does not satisfy trait Sub is clearer i think

@@ -594,4 +680,20 @@ namespace N(16);
let formatted = analyze_string::<GoldilocksField>(input).to_string();
assert_eq!(formatted, input);
}

#[test]
#[should_panic = "Set of declared and used type variables are not the same"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this message include the set of undeclared type variables?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extended it a little and added another test.

}

#[test]
#[should_panic = "Set of declared and used type variables are not the same"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for this, but for unused type variables?

pol fixed ISLAST(i) { if i == %last_row { 1 } else { 0 } };
pol fixed ISALMOSTLAST(i) { if i == %last_row - 1 { 1 } else { 0 } };
pol fixed ISLAST(i) { if i == last_row { 1 } else { 0 } };
pol fixed ISALMOSTLAST(i) { if i == last_row - 1 { 1 } else { 0 } };
pol fixed ISFIRST = [ 1, 0 ] + [0]*;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this be [1] + [0]*?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, I don't know why it was done like that.

// This is a function that takes an expression as input and returns
// a constraint enforcing this expression increments by a certain value
// between rows.
// The type will be inferred here because `'` is only valid on `expr`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i thought ' was only valid for columns, can we do something like (x+y)' = x+y+1; to say one must increment but not both?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle yes, the next operator is "linear" in that (5*x + y + 1)' = 5*(x') + y' + 1, but I think the condenser would have to do that transformation and it currently does not do it (although I think it would be simple), but I might be wrong.

let<T: FromLiteral + Mul + Add> bn: T, T, T, T, T, T, T, T -> T = |a1, a2, a3, a4, a5, a6, a7, a8| array::fold(
[a1, a2, a3, a4, a5, a6, a7, a8],
0,
|acc, el| acc * 0x100000000 + el
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't << 32 clearer?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be, but << is only implemented on integers, here we also have to support expr.

@@ -7,18 +7,21 @@ from the type of the symbol and the way the symbol is used.
Symbols can be declared using ``let <name>;`` and they can be declared and defined
using ``let <name> = <value>;``, where ``<value>`` is an expression. The [type](./types.md) of the symbol
can be explicitly specified using ``let <name>: <type>;`` and ``let <name>: <type> = <value>;``.
It is even possible to define generic symbols using ``let<TV1, TV1, ..> <name>: <type> = <value>;``,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generic type symbols can be defined using...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

- Symbols without type but with a value that is a function with a single parameter are fixed columns.
- Symbols defined with a value and type ``col`` (or ``col[]``) are fixed columns (or arrays of fixed columns).
- Symbols without a value are witness columns or arrays of witness columns. Their type can be omitted. If it is given, it must be ``col`` or ``col[k]``.
- Symbols with type ``fe`` are constants.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't every definition constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"constant" is a concept from polygon-pil. Maybe we should just not mention it, since it does not really matter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea agree

The integer value is converted to a field element during evaluation, but it has to be non-negative and less than
the field modulus.

If you reference such a symbol, the type of the reference is `expr`, which means it is only symbolically referenced.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does symbolically referenced mean?

let<T: FromLiteral + Add> add_one: T -> T = |i| i + 1;
```

## Types of Symbols
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call this section Declaring and referencing columns or sth in those lines?

A symbol declared to have type `col` (or `col[k]`) is a bit special:

If you assign it a value, that value is expected to have type `int -> fe` or `int -> int` (or an array thereof).
This allows the simple declaration of a column `let byte = |i| i & 0xff;` without complicated conversions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need the explicit : col for columns?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha true :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still needs update

the field modulus.

If you reference such a symbol, the type of the reference is `expr`, which means it is only symbolically referenced.
A byte constraint is as easy as `{ X } in { byte }` now, since the expected types in plookup columns is `expr`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove now?

If you reference such a symbol, the type of the reference is `expr`, which means it is only symbolically referenced.
A byte constraint is as easy as `{ X } in { byte }` now, since the expected types in plookup columns is `expr`.
The downside is that you cannot evaluate columns as functions. If you want to do that, you have to assign
a copy to an `int -> int` symbol: `let byte_f = |i| i & 0xff; let byte = byte_f;`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't byte_f and byte be explicitly typed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, byte should be, byte_f does not need to be, but I can add types for clarity.

// The signature of `y` could be seen as fixing the type of `x`,
// but in order to do that, we have to unify the derived type of
// `y` with the declared and this would maybe create problems in that
// we would not derive the most generic type for generic functions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not that knowledgeable with how type unification/inference works, can you elaborate why y needs to be "unified" for x to take the int return type from it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way the type checker infers generic types is that it completely ignores the declared type and instead creates a new and fully unconstrained type variable. Then it runs the type inference algorithm which more and more constrains that type variable. In the end, some type expression is obtained. If that type expression is equal to the declared type, everything is fine.

Now what happens in this test is that it derives the following types:

x: <T: FromLiteral>, -> T,
y: <T1, T2: FromLiteral>, T1 -> T2

This is an error because the type of x is not a simple type, and the type of y does not match the declaration.

We could compare / unify <T1, T2: FromLiteral>, T1 -> T2 and <T> T -> int and would maybe arrive at the correct types, but I think the current way is better.

}

#[test]
fn type_check_arrays() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this test checking?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it verifies that type checking is performed inside RepeatedArrays

if f1.params.len() != f2.params.len() {
Err(format!(
"Function types have different number of parameters: {f1} and {f2}"
))?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think its clearer to return the errors

pub fn ensure_bound(&mut self, ty: &Type, bound: String) -> Result<(), String> {
if let Type::TypeVar(n) = ty {
if let Some(sub) = self.substitutions.get(n) {
self.ensure_bound(&sub.clone(), bound)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the .clone() needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because it is a ref to self and ensure_bound needs a &mut to self. Changed it to avoid the clone nevertheless.

}

pub fn unify_types(&mut self, mut inner: Type, mut expected: Type) -> Result<(), String> {
if let (Type::TypeVar(n1), Type::TypeVar(n2)) = (&inner, &expected) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't the check just be inner == expected here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid the more complicated recursive comparison. I think I can also just remove the whole thing.

// We have to check type schemes last, because only at this point do we know
// that other types that should be concrete do not occur as type variables in the
// inferred type scheme any more.
let type_var_mapping = self.inferred_types.iter().map(|(name, inferred_type)| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems inferred_types doesn't need to be a member of Self, as it's only used inside this function. As a variable, you could into_iter here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, great observation! Will change!

Ok(type_var_mapping)
}

/// Updates generic arguments and literal annotations with the proper resolved types.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's a literal annotation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second element of Expression::Number that specifies which type the literal has.

expressions: &mut [(&mut Expression<T>, ExpectedType)],
) -> Result<(), String> {
for (e, expected_type) in expressions {
if expected_type.allow_array {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you clarify a bit why .allow_array is needed and where it comes from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be documented at the type: It says if arrays of the expected type are allowed as well. This is mainly used at statement context where we expect either constr or constr[].

})
}

fn process_function_call<'b, T: FieldElement>(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

infer_function_call_type?


/// Creates a type scheme out of a type by making all unsubstituted
/// type variables generic.
/// TODO this is wrong for mutually recursive generic functions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this mean that for now those would fail type check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually haven't tried what happens...
Maybe it works because we specify the type schemes explicitly.

Copy link
Member

@leonardoalt leonardoalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review for all crates except pil-analyzer and book.

@@ -826,6 +871,9 @@ pub struct PolynomialReference {
/// Optional because it is filled in in a second stage of analysis.
/// TODO make this non-optional
pub poly_id: Option<PolyID>,
/// The type arguments if the symbol is generic.
/// Might be empty before type checking is completed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds a bit sketchy. Shouldn't it use a different type then? Or be a different type completely, pre and post type checking?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had that before and it was a big mess especially with the assembly that is still not properly integrated. Would prefer to keep it an option for now.

let y: col = |i| i + 20;
let x = |i| y(i) + 1;
let y = |i| i + 20;
let X: col = x;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think declaring a variable twice should be allowed. Or rather: we should decide if the language is supposed to remain declarative or not

@@ -163,7 +163,7 @@ pub fn evaluate_function<'a, T: FieldElement>(
arguments: Vec<Rc<evaluator::Value<'a, T, evaluator::NoCustom>>>,
) -> evaluator::Value<'a, T, evaluator::NoCustom> {
let symbols = evaluator::Definitions(&analyzed.definitions);
let function = symbols.lookup(function).unwrap();
let function = symbols.lookup(function, &[]).unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not about this PR but: why is this test in Pipeline actually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used by the powdr_std tests.


/// Evaluates to the array [f(0), f(1), ..., f(length - 1)].
let new = |length, f| std::utils::fold(length, f, [], |acc, e| (acc + [e]));
let<T> new: int, (int -> T) -> T[] = |length, f| std::utils::fold(length, f, [], |acc, e| (acc + [e]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

let sum: int[] -> int = |arr| fold(arr, 0, |a, b| a + b);
// TODO: Should make use of the Default or Zero trait instead of FromLiteral (then we can also
// use this function to flatten an array of arrays.
let<T: Add + FromLiteral> sum: T[] -> T = |arr| fold(arr, 0, |a, b| a + b);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fancy

@@ -1,9 +1,10 @@
use std::check;

machine Empty {
let line: col = |i| i - 7;
let line = |i| i - 7;
let line_col: col = line;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the previous case is not valid anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, unfortunately not.

// Add %offset in a single step of computation
constant %offset = 1;
// Add offset in a single step of computation
let offset = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Member

@leonardoalt leonardoalt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review for docs

@@ -7,18 +7,21 @@ from the type of the symbol and the way the symbol is used.
Symbols can be declared using ``let <name>;`` and they can be declared and defined
using ``let <name> = <value>;``, where ``<value>`` is an expression. The [type](./types.md) of the symbol
can be explicitly specified using ``let <name>: <type>;`` and ``let <name>: <type> = <value>;``.
It is even possible to define generic symbols using ``let<TV1, TV1, ..> <name>: <type> = <value>;``,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

- Symbols without type but with a value that is a function with a single parameter are fixed columns.
- Symbols defined with a value and type ``col`` (or ``col[]``) are fixed columns (or arrays of fixed columns).
- Symbols without a value are witness columns or arrays of witness columns. Their type can be omitted. If it is given, it must be ``col`` or ``col[k]``.
- Symbols with type ``fe`` are constants.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea agree

- Everything else is a "generic symbol" that is not a column or constant.

> Note that the type ``col`` is the same as ``int -> fe``, so ``let w: int -> fe`` also declares a witness column.
> Note that a symbol with declared type ``int -> fe`` is not considered a column.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea I think it starts getting complicated to explain columns stuff since it's an arithmetization specific concept that almost becomes a lower level language inside pil itself

- `constr` (constraint)
- `!` ("bottom" or "unreachable" type)

> The `col` type is special in that it is not a proper type, but can be used for declaring symbols.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned this in another comment, maybe it's worth a chat discussion: it's almost like the col and wit business is a lower level language inside pil that's arithmetization specific.

> The `col` type is special in that it is not a proper type, but can be used for declaring symbols.
> See [Types of Symbols](#types-of-symbols) for details.

Powdr-pil does perform a Hindley-Milner-kind type checking. After type checking, for all symbols
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

A symbol declared to have type `col` (or `col[k]`) is a bit special:

If you assign it a value, that value is expected to have type `int -> fe` or `int -> int` (or an array thereof).
This allows the simple declaration of a column `let byte = |i| i & 0xff;` without complicated conversions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still needs update

the field modulus.

If you reference such a symbol, the type of the reference is `expr`, which means it is only symbolically referenced.
A byte constraint is as easy as `{ X } in { byte }` now, since the expected types in plookup columns is `expr`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A byte constraint is as easy as `{ X } in { byte }` now, since the expected types in plookup columns is `expr`.
A byte constraint is as easy as `{ X } in { byte }` now, since the expected type in plookup columns is `expr`.

book/src/pil/types.md Outdated Show resolved Hide resolved
If `f` is called in a different context (for example inside a function that defines a constant column),
then the column references `x` and `y` are interpreted to have the type `int -> fe` and thus
adding them results in a type error.
Because of that, `g` is inferred to have type `-> expr`, which is compatible with the literal `7`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

#[test]
fn assignment() {
// This should derive a concrete type for x due to how it is used by y.
let input = "let x = [|i| i]; let y: int[] = [x[0](2)];";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you make this an array on purpose or do we still need the trick of putting lambdas in arrays?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not needed any more. Maybe it was when I wrote the test. But regardless, it makes the test a bit more complicated so probably still good.

let g = || g();
let x = |a, b| if g() { a } else { b + 2 };
let c: int = 2;
let y = [|i| x(c, i)];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curiosity: what's missing after this to be able to do just let y = |i| x(c, i);?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing

("g", "", "-> bool"),
("x", "", "int, int -> int"),
("c", "", "int"),
("y", "", "(int -> int)[]"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

let y: col;
let set_equal: expr, expr -> constr = |a, b| a = b;
let<T1, T2> array_map: int, T1[], (T1 -> T2) -> T2[] = |n, a, f| if n == 0 { [] } else { array_map(n - 1, a, f) + [f(a[n - 1])] };
// This does not type check, because based on `x`, it instantiates array_map
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't the test fail then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an outdated comment about a bug that I fixed in the meantime. Will remove.

@@ -179,6 +179,10 @@ impl<T, D: AnalysisDriver<T>> ExpressionProcessor<T, D> {
PolynomialReference {
name: self.driver.resolve_ref(path),
poly_id: None,
// These will be filled by the type checker.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I don't really like this approach of having an empty type that gets filled out in the same data structure, but we've talked about this before. I understand it's convenient to keep the same type and just change this in a mutable pass. Can we at least make this an Option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, will make it an option.

TypeChecker::default().infer_types(definitions, expressions)
}

/// A type to expect and a flag that says if arrays of that type are also fine.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain a bit more why this is needed/useful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used for all the places that are not definitions in the form of let x = y. This includes constraints at the statement level, but also array elements for the RepeatedArrays. The type checker is called with a list of definitions and a (much much longer) list of expressions that appear not as a value of a declaration and the corresponding expected type.

Since we allow either constr or constr[] at statement level, we have the allow_array flag to support these two types.

/// Types for local variables, might contain type variables.
local_var_types: Vec<Type>,
/// Declared types for symbols. Type scheme for polymorphic symbols
/// and unquantified type variables for symbols without type.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's unquantified?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In let<T> x: (T, K) = (1,2), T is quantified, K is unquantified. Of course, in the source code, unquantified type variables are illegal but they appear during type checking until they are substituted by something else that does hopefully not contain any unquantified type variables any more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also write "new unused" if that is easier to understand?

Copy link
Collaborator

@pacheco pacheco Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "undeclared"? but if quantified is the proper terminology better leave it there and add the clarification in parenthesis

// symbol is referenced). Unifying the declared type with the inferred
// type is done at the end.
for name in names {
// Ignore builtins (removed from definitions) and definitions without value.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

};

let declared_type = self.declared_types[&name].clone();
(if declared_type.vars.is_empty() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's this ( closing? Can you rearrange this code a bit? It's kinda hard to read atm

base,
length: Some(_),
}) if base.as_ref() == &Type::Expr => {
// An array of intermediate columns with fixed length. We ignore the length.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you know here that this is an array of intermediate columns?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type is expr[k], that's how you declare intermediate columns.

- `!` ("bottom" or "unreachable" type)

> The `col` type is special in that it is only used for declaring columns, but cannot appear as the type of an expression.
> See [Types of Symbols](#declaring-and-referencing-columns) for details.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name of the section has changed

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, I updated the link but forget the text :)

Ok(())
}

/// Process and expresison and return the type of the expression.
Copy link
Collaborator

@pacheco pacheco Mar 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an expression

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's still a typo in expression :P

}
}

// We check type schemes last, because only at this point do we know
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this last part could be a check_type_schemes method that takes in the inferred_types?

@@ -874,8 +874,8 @@ pub struct PolynomialReference {
/// TODO make this non-optional
pub poly_id: Option<PolyID>,
/// The type arguments if the symbol is generic.
/// Might be empty before type checking is completed.
pub generic_args: Vec<Type>,
/// Guarenteed to be Some(_) after type checking is completed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's guaranteed maybe it should be a GuaranteedGenericArgs type :D

Copy link
Collaborator

@pacheco pacheco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@chriseth chriseth enabled auto-merge March 1, 2024 16:37
@chriseth chriseth added this pull request to the merge queue Mar 1, 2024
Merged via the queue into main with commit f21a691 Mar 1, 2024
5 checks passed
@chriseth chriseth deleted the type_inference branch March 1, 2024 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants