# Research

## Current work

Since September 2022, I have been working at OCamlPro on Flambda2, an optimising middle-end for the OCaml compiler. It compiles code from Lambda, a high-level, untyped language that is the output of OCaml’s frontend, down to Cmm, a low-level C-like language with explicit tagging and boxing operations. Among other optimisations, it performs inlining, value analysis, unboxing and interprocedural liveness analysis.
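To give a flavour of the tagging that Cmm makes explicit: OCaml represents an integer n as the machine word 2n + 1, so that its low bit distinguishes it from aligned pointers. The sketch below is illustrative Python, not compiler code; it shows the encoding and one arithmetic simplification a middle-end can exploit on tagged values.

```python
# OCaml's uniform value representation, made explicit at the Cmm level:
# an integer n is stored as the tagged word 2n + 1 (low bit set), while
# pointers are word-aligned (low bit clear). Illustrative sketch only.

def tag_int(n):
    return 2 * n + 1

def untag_int(w):
    return w >> 1

def add_tagged(a, b):
    # (2x + 1) + (2y + 1) - 1 = 2(x + y) + 1: addition on tagged
    # integers needs no untagging, only a correction by 1. This is the
    # kind of simplification a middle-end can apply to tagged arithmetic.
    return a + b - 1

assert untag_int(add_tagged(tag_int(3), tag_int(4))) == 7
```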

## Previous work

### An efficient and formally verified convertibility checker

I did my Ph.D. from September 2019 to August 2022 in the Inria Cambium research team, under the supervision of Xavier Leroy. The goal of my Ph.D. was to write a verified and efficient convertibility test for the Coq proof assistant.

Convertibility is about checking whether two terms reduce to the same term. One of the rules of Coq is that if two types are convertible, then a term that belongs to one type also belongs to the other. The current convertibility test in Coq is however very slow, and there has been work on making more efficient versions of it. Unfortunately, efficient versions are often more complicated and bug-prone, as illustrated by the not-so-old bug in Coq’s `vm_compute` tactic, which made it possible to prove `False`. Therefore, a verified and efficient implementation is much desired, lest we introduce yet another bug into Coq’s trusted code base.
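To give a flavour of the problem, here is a toy convertibility check for the untyped lambda calculus, with terms in de Bruijn notation: normalise both terms, then compare them syntactically. This is only an illustrative sketch with a made-up term representation; Coq's actual test is far more sophisticated.

```python
# Toy convertibility check for the untyped lambda calculus, using
# de Bruijn indices. Terms are tuples: ('var', i), ('lam', body),
# ('app', f, a). Illustration only, not Coq's algorithm.

def shift(t, d, cutoff=0):
    """Shift the free variables of t by d."""
    kind = t[0]
    if kind == 'var':
        return ('var', t[1] + d) if t[1] >= cutoff else t
    if kind == 'lam':
        return ('lam', shift(t[1], d, cutoff + 1))
    return ('app', shift(t[1], d, cutoff), shift(t[2], d, cutoff))

def subst(t, v, j=0):
    """Substitute v for variable j in t, lowering variables above j."""
    kind = t[0]
    if kind == 'var':
        return shift(v, j) if t[1] == j else \
               ('var', t[1] - 1) if t[1] > j else t
    if kind == 'lam':
        return ('lam', subst(t[1], v, j + 1))
    return ('app', subst(t[1], v, j), subst(t[2], v, j))

def normalise(t):
    """Full beta-normalisation (may diverge on non-normalising terms)."""
    kind = t[0]
    if kind == 'var':
        return t
    if kind == 'lam':
        return ('lam', normalise(t[1]))
    f, a = normalise(t[1]), normalise(t[2])
    if f[0] == 'lam':
        return normalise(subst(f[1], a))
    return ('app', f, a)

def convertible(t, u):
    return normalise(t) == normalise(u)

# (fun x -> x) (fun x y -> x)  is convertible to  fun x y -> x
ident = ('lam', ('var', 0))
k = ('lam', ('lam', ('var', 1)))
assert convertible(('app', ident, k), k)
```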

I developed a machine able to decide the convertibility of two lambda-terms, and able to take a shortcut when the same constant is applied to different arguments: if `x` and `y` are convertible, then we do not need to expand the definition of `f` to prove that `f x` and `f y` are convertible as well. Furthermore, this machine has a worst-case complexity that depends only on the length of the shortest proof of convertibility between its input terms (unlike Coq, which can take arbitrarily long in the worst case). It has also been formally verified using Coq, lending confidence in its correctness.
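The shortcut can be sketched as follows; the function names and representations here are hypothetical, not those of the actual machine.

```python
# Sketch of the shortcut described above: when comparing `c a1 ... an`
# with `c b1 ... bn` for the same defined constant c, first try to show
# the arguments convertible pairwise, and only unfold c's definition if
# that fails. Hypothetical names, not the actual machine.

def convertible_apps(c, args1, args2, conv, unfold):
    """conv: convertibility test on terms; unfold: expands constant c
    applied to a list of arguments."""
    if len(args1) == len(args2) and all(
            conv(a, b) for a, b in zip(args1, args2)):
        return True  # shortcut: no need to look at c's definition
    # Fall back: unfold c on both sides and compare the expanded terms.
    return conv(unfold(c, args1), unfold(c, args2))
```

Note that the fallback is still needed: `f x` and `f y` can be convertible even when `x` and `y` are not, for instance when `f` ignores its argument.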

I will defend my Ph.D. on 19 September 2024 at Inria Paris.

### Control-flow analysis for skeletal semantics

I did an internship from September 2018 to July 2019 in the Celtique research team, with Thomas Jensen and Alan Schmitt. I worked on automatically extracting an analyser from skeletal semantics.

Skeletal semantics are a way to specify the semantics of a programming language at a fine-grained level, taking the primitive operations of the language for granted, much like the main evaluation function of an interpreter once its primitives are defined. From a skeletal semantics, it is possible to automatically extract an interpreter, or analysers such as abstract interpreters. My work was about extracting analysers that give interesting results on programs with non-static control flow, for languages that contain first-class functions, objects, or function pointers.

The extracted analyser works by generating a set of constraints from the program and from the description of the semantics and its primitive operations, then solving those constraints by taking advantage of their special form. The code can be found here. While we did not try to publish this work, it has been invaluable in extending skeletal semantics to make them usable for specifying real-world languages.
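The general flavour of such constraint solving can be sketched as a fixpoint computation over subset constraints, as in control-flow analyses for first-class functions; this is a generic illustration, not the analyser actually extracted from skeletal semantics.

```python
# Minimal subset-constraint solver, in the style of control-flow
# analyses: constant constraints (v contains c) and conditional subset
# constraints (if c is in guard, then src is a subset of dst), solved
# by naive fixpoint iteration. Generic sketch, not the real analyser.

def solve(consts, conds, nvars):
    """consts: list of (var, value) pairs.
    conds: list of (value, guard_var, src_var, dst_var) tuples."""
    sets = [set() for _ in range(nvars)]
    for v, c in consts:
        sets[v].add(c)
    changed = True
    while changed:  # iterate until no set grows any more
        changed = False
        for c, guard, src, dst in conds:
            if c in sets[guard] and not sets[src] <= sets[dst]:
                sets[dst] |= sets[src]
                changed = True
    return sets
```

The special form of the constraints is what makes efficient solving possible: each step only ever adds elements to sets, so the iteration reaches a least fixpoint.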

### Verified code generation for the polyhedral model

I did an internship from March to July 2018 in the Gallium research team, supervised by Xavier Leroy. During this internship, I worked on verifying the code generation step in the polyhedral model.

The goal of polyhedral compilation is to optimise loop nests, both perfect and imperfect. It is traditionally separated into three steps: nested-loop identification, reordering, and code generation. The first step transforms the original program into a representation where each instruction is associated with a polyhedron, and must be executed at each of its integer points, together with dependencies between some iterations that constrain the order in which they may be executed. The second step, reordering, replaces the original polyhedron with another having the same points but a different schedule, while preserving the dependencies. The last step, and the one I worked on verifying, is code generation, where this intermediate representation, after reordering, is transformed back into a traditional loop nest.
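On a tiny example, the first two steps look as follows: the iteration domain of a triangular loop nest is the set of integer points of a polyhedron, and a valid reordering (here, a loop interchange) enumerates the same points in a different order. Purely illustrative.

```python
# The iteration domain of a triangular loop nest as the integer points
# of a polyhedron, and a loop interchange as a reordering that keeps
# exactly the same set of points. Illustrative sketch only.

n = 4

# for i in 0..n-1: for j in 0..i: S(i, j)
original = [(i, j) for i in range(n) for j in range(i + 1)]

# After interchange: for j in 0..n-1: for i in j..n-1: S(i, j)
interchanged = [(i, j) for j in range(n) for i in range(j, n)]

# Same polyhedron {(i, j) | 0 <= j <= i < n}, different schedule.
assert set(original) == set(interchanged)
assert original != interchanged
```

Code generation is the inverse problem: going from the set-of-points description back to nested `for` loops with affine bounds, which is where most of the subtlety (and the verification effort) lies.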

I wrote a code generator performing this last step in Coq, using Verimag's Verified Polyhedra Library for some of the polyhedral operations. It handles code generation for both perfect and imperfect loop nests (one or several polyhedra in its source language), and is fully verified using Coq. Its code can be found here. If you want to read more about it, you can read the report or the slides for the internship defense (both in French), or the paper about it.

### Verified code generation for the PVS2C code generator

From March to July 2017, I was at the Computer Science Laboratory at SRI International for an internship under the supervision of Natarajan Shankar. During that time, I worked on verifying the PVS2C code generator in PVS.

I did not work on verifying the whole PVS2C generator: it is written in Lisp, which PVS cannot verify, and to the best of my knowledge, there is (or at least, was at that time) no formal semantics of PVS in PVS. Instead, I worked on verifying the code generation from an intermediate language in A-normal form to an imperative C-like language, most notably the verification of reference counting and of the use of destructive updates when possible.
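The idea behind the destructive-update optimisation can be shown in miniature: a functional array update may mutate in place when it holds the only reference to the array, and must copy otherwise. This is a hypothetical sketch, not PVS2C's generated C.

```python
# Reference counting enabling destructive updates: a "functional"
# array update mutates in place when the array is uniquely referenced,
# and copies when it is shared. Hypothetical sketch, not PVS2C code.

class RCArray:
    def __init__(self, data):
        self.data = list(data)
        self.refcount = 1

def update(arr, i, x):
    """Functional update: returns an array equal to arr except at i,
    observably without modifying arr (from the caller's viewpoint)."""
    if arr.refcount == 1:
        arr.data[i] = x          # sole owner: safe destructive update
        return arr
    arr.refcount -= 1            # shared: the result is a fresh copy
    new = RCArray(arr.data)
    new.data[i] = x
    return new
```

The correctness argument is exactly what must be verified: when the count is 1, no other live reference can observe the in-place mutation, so the destructive version is indistinguishable from the copying one.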

The resulting code can be found here. The proof is still incomplete in some places, due to a combination of PVS bugs: some made the proofs of certain subgoals needlessly complicated, some caused subgoals not to be presented until later fixes to PVS, and some associated proofs with the wrong subgoals when rerunning them. Nevertheless, the proof is mostly complete, with only a few remaining subgoals that should be simple but tedious to prove, and the effort also allowed us to spot these PVS bugs in the first place. In case you want to read more about it, you can get the report, the slides for the internship defense, or the paper (DOI, mirror) we published about it after some additional work by Antoine Séré.

### Improved widening operators for proving termination by abstract interpretation

I spent seven weeks in June and July 2016 at the Chair of Programming Methodology at ETH Zürich, for an internship supervised by Caterina Urban. I worked on improving the widening operators used in FuncTion, an analyser based on abstract interpretation that derives sufficient preconditions for termination, guarantee and recurrence properties of C programs.

FuncTion works by inferring ranking functions at each program point, which give an upper bound on the number of program steps needed before termination. As usual in abstract interpretation, termination of the analysis depends on the widening operators being used, which are heuristic in nature; they are thus often one of the main sources of precision or imprecision in the analysis. I improved the existing operators, taking inspiration from this paper on precise widening operators for the domain of convex polyhedra.
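For readers unfamiliar with widening, the standard operator on intervals illustrates the kind of heuristic involved: bounds that keep growing between iterations jump straight to infinity, which forces the fixpoint iteration to terminate at the cost of precision. This is the textbook operator, not FuncTion's more precise ones.

```python
# Textbook interval widening: any bound that became less precise since
# the previous iteration is pushed to infinity, so the chain of
# abstract values stabilises in finitely many steps.

INF = float('inf')

def widen(old, new):
    """old, new: intervals as (lo, hi) pairs; returns old widened by new."""
    lo = old[0] if new[0] >= old[0] else -INF   # lower bound dropped: give up
    hi = old[1] if new[1] <= old[1] else INF    # upper bound grew: give up
    return (lo, hi)

# Analysing `i = 0; while i < n: i += 1` naively yields the infinite
# chain [0,0], [0,1], [0,2], ...; widening stabilises it in one step:
assert widen((0, 0), (0, 1)) == (0, INF)
```

The design tension is visible even here: jumping to infinity guarantees termination but loses the bound, which is why more precise widening heuristics, like the ones I worked on, matter for proving termination.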

The code of FuncTion, which includes my improvements, can be found here. If you want to know more about these improved widening operators, you can read my internship report, the slides for the defense, slides for an audience unfamiliar with abstract interpretation, or the paper we published about them (DOI, mirror).

## Publications

- Nathanaëlle Courant, Xavier Leroy. Verified Code Generation for the Polyhedral Model. POPL 2021.
- Nathanaëlle Courant, Antoine Séré, Natarajan Shankar. The Correctness of a Code Generator for a Functional Language. VMCAI 2020. (DOI, mirror)
- Nathanaëlle Courant, Caterina Urban. Precise Widening Operators for Proving Termination by Abstract Interpretation. TACAS 2017. (DOI, mirror)