Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Having types as possible module arguments #40

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

samsa1
Copy link

@samsa1 samsa1 commented Jun 5, 2024

A proposed design for having types as module arguments.

Rendered version

In summary :

module List (type a) = struct
    type t = a list
    let size : t -> int = List.length
    let fold : ('acc -> a -> 'acc) -> 'acc -> t -> 'acc = List.fold_left

module M = List(type int)

let f (x : List(type int).t) = ...

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC is lacking the part where you argue why this feature is desirable/useful. Do you have compelling examples? (I know from private communication that there are some examples coming from modular implicits, and I think that it would be interesting to explain them, but maybe there are also compelling examples without them today?)

One way to look for examples would be to look for uses of sig type t end in OCaml codebases in the wild. Here is for example a Github search URL, I have not looked at the results much.

This RFC proposes to extend a bit the module language by adding the possibility
of giving types arguments to modules.

The goal of this feature is to simplify replace cases where the user might want
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: to simplify cases

```


## Important restraints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restrictions

Only type paths should be authorized when doing an application inside
a type path.

In other words, `M(type <m: int>).t` should be rejected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the wording of this paragraph obscure. I'm not fully sure what you mean by "type path", and "inside a type path" is also not clear. I believe that you are talking about type-functor applications that show up inside type expressions: M(type foo).t; but can you give more details on the restriction on foo, for example by listing the allowed constructions?

@gasche
Copy link
Member

gasche commented Jun 5, 2024

(cc @yallop who may be interested in this feature)

@clementblaudeau
Copy link

clementblaudeau commented Jun 5, 2024

You can look at example also via this sherlocode search
Regarding applicativity, I should stress that modules taking types as arguments are indeed applicative functors : no generative call or unpacking of first class module should be allowed in their body.
[EDIT] *unpacking of first class module is not allowed only when it creates fresh types

@diremy
Copy link
Collaborator

diremy commented Jun 5, 2024

The counter example

module F (type a) = struct let r : a option ref = ref None end

is indeed a counter example if the compilation would not be compiled as a function.
Still, this functor should not be defined as applicative, since it allocates a reference.
Hence, two successive applications to the same type still returns structures that are not equal. Hence, it should rather be make generative and defined as

module F (type a) () = struct let r : a option ref = ref None end

@diremy
Copy link
Collaborator

diremy commented Jun 5, 2024

I asume that the encoding is :

1:    module F (type a) = BODY [a]
2:    F (type P) 

stand for

1:      module F (A : Type)  = BODY [A.t]
2:     F (struct type t = P end)

where Type is either predefined or inlined as sig type t end.
Shouldn't this be added to the RFC as It shows that the proposal does not add expressiveness, just conciseness. And removes any possible ambiguity for its meaning.

@yallop
Copy link
Member

yallop commented Jun 6, 2024

If we could avoid giving the type arguments at runtime for modules like with functions we would have soundness issues.

This isn't clear to me. Have you considered the alternative of using a value restriction?

In current OCaml, type arguments to functions, like this

let f (type a) = ...

don't have run-time representations, and type arguments to constructors, like this

let f (T (type a) (x : a)) = ...

don't have run-time representations, so it's a bit non-uniform for type arguments to modules to have run-time representations. Type arguments to functions don't delay evaluation

let f (type a) = begin print_endline "ok" end (* prints "ok" *)

so it's rather surprising if type arguments to modules do delay evaluation:

module M (type a) = struct print_endline "ok" end (* prints nothing *)

For type arguments to functions, the usual value restriction applies, so in the following, a is left weak (ungeneralized):

let f (type a) =
   let r : a option ref = ref None in
   ...

Could a similar approach work for type arguments to modules?

Continuing the analogy, type arguments to functions don't need to be supplied explicitly (though perhaps it'd be useful to have the option to do so). Could a similar approach work for type arguments to modules?

@gasche
Copy link
Member

gasche commented Jun 6, 2024

Side-note: I find it surprising that (type a) does not force generalizability, and I would prefer the a option ref example to be rejected with an error.

@diremy
Copy link
Collaborator

diremy commented Jun 6, 2024

I was actually wondering the same as Jeremy. I understand that the proposal is just syntactic sugar for the encoding I proposed, hence the actual semantics. However, it might be more interesting to not make it syntactic sugar and restric the body to be a value form. If we chose this sematics and restriction, the actual behavior when the body is not a value form can be recovered by passing an extra empty structure strut end to make it applicative or () to make it generative... or using the encoding :-)
The question is then for which kinds of examples would the construct be used in practice? If in most cases, the body is a value, it might be worth considering this as a new construct with a new semantics.
The State example cannot be treated as a value since it generates a new effect whose type depend on the type parameter.
I think this should deserve should more thoughts.

@samsa1
Copy link
Author

samsa1 commented Jun 7, 2024

This isn't clear to me. Have you considered the alternative of using a value restriction?

In the case where we we want to keep the descripted semantic with explicit application by the user, we can remove the application if the functor is pure. Because all applications will give the same result thus it can be pre-computed.

I think the proposal from @diremy of not handling this as syntactic sugar and restricting the body to be a value seems to be a good idea. The biggest argument being that we can still obtain the actual behavior by adding the argument that is currently added at runtime in the proposal.

Continuing the analogy, type arguments to functions don't need to be supplied explicitly (though perhaps it'd be useful to have the option to do so). Could a similar approach work for type arguments to modules?

I think we might have cases where being able to explicitly give the type argument is still beneficial. I agree that sometimes not writing that argument could increase concision (such as M.f instead of M(type t).f. However sometimes it is still better to be able to write explicitly the type argument (such as inside type annotations). Thus treating this as an optional argument might be the best approach.

@gasche
Copy link
Member

gasche commented Jun 7, 2024

I don't think that handling this as an optional argument works well. Consider the case where you abstract over several type arguments in sequence, there is no way at application time to indicate which of the arguments you have in mind on explicit application.

I'm of two minds regarding application expliciteness:

  1. @yallop makes a good point that all existing instances of the (type a) abstraction have implicit application, so doing this is more consistent. But I expect that users will often want to specify the argument explicitly. In the case of GADT constructors we tried to give a syntax for explicit binding at use-site on GADT patterns : K (type a) ((foo, bar) : a * a baz) -> ..., which was somewhat awkward. We don't have a good, consistent solution to how to pass (type a) arguments explicitly yet preserve their unnamed and order-independent nature expected in ML.

  2. I think this may be related to syntactic questions about modular explicits and implicits. Is there a cute syntax to turn an explicit module argument into an implicit module argument, can we reuse the same here? If you use (module A : S) for explicit arguments and {module A : S} for implicit arguments, we could have {type a} here. (But this is inconsistent with existing (type a) constructs.) If you use a keyword, we could say have both (explicit module A : S) and (implicit module A : S), with (module A : S) a suitable default, and then we could have both (explicit type a) and (implicit type a), and possibly decide that the default meaning of (type a) depends on the language construct.
    (I don't remember, is there a proposed syntax to explicitly apply implicit module arguments? Can we reuse the same for types?)

@samsa1
Copy link
Author

samsa1 commented Jun 7, 2024

(I don't remember, is there a proposed syntax to explicitly apply implicit module arguments? Can we reuse the same for types?)

The propose syntax was to write f {M} if we wanted to give explicitly the first implicit argument. However, this was the proposal given in the modular implicits paper and I haven't heard of a proposal to distinguish modular explicits from modular implicits (but I think this should be discussed).

@gasche
Copy link
Member

gasche commented Jun 7, 2024

I think that this is a question that is weighting on @diremy's mind.

Note: Coq/Rocq allows the application syntax f (a := foo) to explicit apply the implicit argument a of f (manual > Explicit applications). Maybe we should seriously consider something like this for OCaml. In particular we could enable this syntax for all (type a) abstractions that exist today, in a compatible and consistent way.

@samsa1
Copy link
Author

samsa1 commented Nov 26, 2024

After some thinking I arrived at the conclusion that using value restriction was indeed the best direction to go. This could allow optimization with reduced module allocation and functor application. But also you could still have a computing semantic using a module instead of a type.

However some question would need some decisions about what we want :

  • is value restriction or weak value restriction better ?
module M (type a) = struct let r = ref None end

Should be rejected of accepted with signatrue sig val r : '_weak1 option ref end ?
Same for :

module M2 (type a) = struct
  let () = print_endline "Here" 
  • should we accept unit-typed expressions ?
module M3 (type a) = struct
   ...
   print_endline "Here";;
   ...
end

@gasche
Copy link
Member

gasche commented Nov 26, 2024

It is always easier to restrict things at first, and allow them later when a good use-case comes. I would therefore answer "reject" to both of your questions.

@diremy
Copy link
Collaborator

diremy commented Nov 26, 2024

It is always easier to restrict things at first, and allow them later when a good use-case comes. I would therefore answer "reject" to both of your questions.

I am not so convinced. The question is what specification do we give to the construct, and I see two options.

OPTION 1

We just say that type abstraction does not block the evaluation, then

module  M(type a) =  struct  let  r =  ref  None  end
module  M(type a) =  struct print_endline "Here" end

should not be different from

module  M =  struct  let  r =  ref  None  end
module  M =  struct print_endline "Here" end

which are currently both accepted. Then, why should we make an exception to the behavior and fail in the former and not in the later?

OPTION 2

Perhaps, we may add in the specification that type abstraction should only be used in front of pure modules and that it is the user's responsability to ensure that, although the typechecker will help and fail, when it can tell that
this is not the case? Then, we could fail in both cases.

@samsa1
Copy link
Author

samsa1 commented Nov 27, 2024

The main problem is that if we have a non-blocking lambda, then the values M(type t1).v and M(type t2).v are physically equal.

As such when computing the type of r in :

module M(type a) = struct   let r = ref None   end

We need to use weak value restriction to lower the scope of the unification variable 'a to the toplevel scope (above M) as done with:

module M2 = struct   let r = ref None   end

However the difficulty with M comes from the fact that the type a is defined in this scope.
Thus, when doing this lifting of unification variables in contravariant positions we would need to handle more cases than currently because we would not only need to lower unification variable but also check that some type do not escape their scope. For example, r1 and r2 should be both rejected in the example bellow :

module M(type a) = struct
  let r1 : a option ref = ref None
  type t = A of a
  let r2 : t option ref = ref None
end

As such I propose to split the proposal in two step :

  • a first one that uses strict value restriction to reject many code examples including the ones above,
  • then improve weak value restriction such that lower_contravariant may detect type escaping their scope like in the example above to reduce the syntactic and semantic restriction of such arguments

@diremy
Copy link
Collaborator

diremy commented Nov 27, 2024

In this case (I missed these details in my previous comment) I agree that we should go for the two-step process, starting with the strict value restriction, and only see later if we need to relax it and use the weak value restriction.

@craff
Copy link

craff commented Nov 28, 2024

Nice, I can't wait to have this to experiment with float/rational/gmp as argument and see if I can have both a common definition for all functions and recover the performance of float.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants