Skip to content

[asl] maintain a type annotation for every typechecked expression#1870

Draft
Roman-Manevich wants to merge 1 commit into
masterfrom
asl-910-maintain-types
Draft

[asl] maintain a type annotation for every typechecked expression#1870
Roman-Manevich wants to merge 1 commit into
masterfrom
asl-910-maintain-types

Conversation

@Roman-Manevich

Copy link
Copy Markdown
Collaborator

This PR enables maintaining the type inferred for each expression during typechecking as an annotation attached to the expression. The intention is to make this available for tools built on top of aslref.

Comment thread asllib/AST.mli Outdated
Comment thread asllib/ASTUtils.ml

let integer_exact' e = well_constrained' [ Constraint_Exact e ]
let integer_exact' e =
well_constrained' [ Constraint_Exact (e |> with_integer_ty) ]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I'm a bit surprised that this is editing the type of e

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This replaces e with a copy that has the integer type as its type annotation

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we not expect e to be type-annotated already? If not, does the integer type annotation add anything?

Comment thread asllib/ASTUtils.ml

let qualifier_equal (q1 : func_qualifier option) q2 = Option.equal ( = ) q1 q2
let var_ x = E_Var x |> add_dummy_annotation
let int_expr_of_var x = E_Var x |> add_dummy_annotation |> with_integer_ty

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the usage of with_integer_ty seems suspect to me - how do we know whether it should be (un)constrained, parameterised? I think we should be relying on type-checking to fill this in wherever possible, and avoiding overwriting types as we do several times in this file

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is needed for the cases where we are constructing typed AST expressions without later annotating them. In such cases, we need to establish the types ourselves.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that we may not have annotated these, but how can we know that integer is the right type?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it is possible that we could type-check those expressions, even if it's a bit useless.

Comment thread asllib/ASTUtils.ml
List.map plug_one

(** Checks that all expressions have [opt_type = Some t] *)
module CheckTypeAnnotations = struct

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we performing this check? This is just additional information, it's not mandated?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used it for testing and then turned it off. We can probably remove it, but we may want to preserve it until we have more validation by some actual use cases.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove for the merge of this PR.

we may want to preserve it until we have more validation by some actual use cases.

I think the opposite actually - this is speculating a need for exhaustivity that we don't yet have supported by usage. I suggest we start with a basic level of annotation and expand as necessary

Comment thread asllib/Typing.ml Outdated
Comment thread asllib/Typing.ml
@Roman-Manevich Roman-Manevich force-pushed the asl-910-maintain-types branch from a8097cc to 124b7b9 Compare June 18, 2026 12:56
@Roman-Manevich Roman-Manevich requested a review from hrutvik June 18, 2026 12:57

@HadrienRenaud HadrienRenaud left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be more carefully documented. For example one of the important thing that is not said is that none of those type informations are used for type-checking.

Another important choice made here is not to use dependent types for some constructed expressions. This needs to be justified. I can see other options:

  1. Do not put any type annotations for those.
  2. Put only integer, as is done here
  3. Use recursive values to provide full precision types for literal values (this seems dangerous to me).

Maybe this should be hidden behind a command line flag not to impact the reference interpreter.
In the long term, we should probably write a typed AST which would incorporate those informations, and document what precision is given by those.

Comment thread asllib/AST.mli
Comment on lines +142 to +148
type 'a annotated = {
desc : 'a;
pos_start : position;
pos_end : position;
version : version;
opt_type : ty option;
}

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would keep it separated with the following:

Suggested change
type 'a annotated = {
desc : 'a;
pos_start : position;
pos_end : position;
version : version;
opt_type : ty option;
}
type ('term, 'ty) annotated = {
desc : 'term;
pos_start : position;
pos_end : position;
version : version;
opt_type : 'ty option;
}

And then declare expr as (expr_desc, ty) annotated, while for example stmt could be (stmt_desc, unit) annotated

Comment thread asllib/ASTUtils.ml

let integer_exact' e = well_constrained' [ Constraint_Exact e ]
let integer_exact' e =
well_constrained' [ Constraint_Exact (e |> with_integer_ty) ]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this? At the very least, the choice of typing those as integer needs to be documented and justified. Also what needs to be justified is that this does not break anything, as now a naive implementation of type_of would find those of type integer and not integer {e}, and maybe this breaks the invariant that all expressions in types are bounded.

IMHO if it is possible, we would leave those without a type.

Naive implementation of type_of for reference.

let type_of e =
  match e.type_opt with
  | Some t -> t
  | None -> Typing.annotate_expr e |> snd

Note that here type_of one_expr != Typing.annotate_expr one_expr |> snd

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO if it is possible, we would leave those without a type.

Agreed with this - I think any type annotations we introduce in the first instance should be coming out of the existing type-checking

Comment thread asllib/Native.ml

let primitives =
let e_var x = E_Var x |> add_dummy_annotation in
let e_int_var x = E_Var x |> add_dummy_annotation |> with_integer_ty in

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to be justified.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure in what sense this needs to be justified as this is a construction function. Each of the usages indeed requires an integer-typed variable (for bitwidth, for an integer exponentiation operation, and for the integer variable N.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However it doesn't justify why this shouldn't be a constrained or parameterised integer. For example I see it used in a T_Bits, which usually needs a constrained or parameterised integer right?

Comment thread asllib/Typing.ml
Comment on lines 1935 to 1945
and annotate_expr env (e : expr) : ty * expr * SES.t =
let t, e', ses = annotate_expr_impl env e in
let e_with_type = { e' with opt_type = Some t } in
(t, e_with_type, ses)

(** [annotate_expr_impl] annotates the expression [e] in [env] without setting
a type annotation. *)
and annotate_expr_impl env (e : expr) : ty * expr * SES.t =
let () = if false then Format.eprintf "@[Annotating %a@]@." PP.pp_expr e in
let here x = add_pos_from ~loc:e x and loc = to_pos e in
match e.desc with

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can avoid putting a new function in this very recursive definition with the following:

Suggested change
and annotate_expr env (e : expr) : ty * expr * SES.t =
let t, e', ses = annotate_expr_impl env e in
let e_with_type = { e' with opt_type = Some t } in
(t, e_with_type, ses)
(** [annotate_expr_impl] annotates the expression [e] in [env] without setting
a type annotation. *)
and annotate_expr_impl env (e : expr) : ty * expr * SES.t =
let () = if false then Format.eprintf "@[Annotating %a@]@." PP.pp_expr e in
let here x = add_pos_from ~loc:e x and loc = to_pos e in
match e.desc with
(** [annotate_expr] annotates the expression [e] in [env]. *)
and annotate_expr env (e : expr) : ty * expr * SES.t =
let () = if false then Format.eprintf "@[Annotating %a@]@." PP.pp_expr e in
let here x = add_pos_from ~loc:e x and loc = to_pos e in
set_type3
@@
match e.desc with

Maybe in ASTUtils.ml

let set_type3: 'ty * ('val, 'ty) annotated * 'data -> 'ty * ('val, 'ty) annotated * 'data =
  fun (t, e, d) -> (t, with_ty_annot t e, d)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I went with set_expr_type_annotation instead of set_type3.

Comment thread asllib/ASTUtils.mli
If both arguments are physically equal, then the result is also physically
equal. *)

val with_ty_annot : AST.ty -> 'a annotated -> 'a annotated

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With my proposal for the definition of annotated, it can have a more flexible definition, without losing the type constraints:

Suggested change
val with_ty_annot : AST.ty -> 'a annotated -> 'a annotated
val with_ty_annot : 'ty -> ('a, 'ty) annotated -> ('a, 'ty) annotated

(with a slight change in how with_ty_annot is written, we could also transform it to 'ty -> ('a, 'b) annotated -> ('a, 'ty) annotated)

Comment thread asllib/ASTUtils.mli
Comment thread asllib/Typing.ml
Comment on lines +4474 to +4477
match C.check with
| TypeCheck ->
false (* set to true if debugging missing annotations is needed *)
| _ -> false

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you not add a command line flag for this? I'm wondering if adding the type information should also be a command line flag.

@Roman-Manevich Roman-Manevich force-pushed the asl-910-maintain-types branch from 124b7b9 to df82466 Compare June 18, 2026 13:01

@hrutvik hrutvik left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to reduce the scope of this PR. I think it should contain only:

  • Mechanism for storing type annotations - whether the existing version or Hadrien's proposal.
  • Storing type annotations as calculated during type-checking already (i.e. wrap annotate_{expr,lexpr}, base_value)

I.e. the additional type annotations not provided by the type system, the exhaustivity checks should be removed. In future we can discuss whether we consider "missing" type annotations as truly missing (and if so whether they should be inserted post-hoc or the type system should be changed to calculate them appropriately), and how this could all be encoded into a truly typed AST.

Please let me know if you disagree.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants