| Age | Commit message (Collapse) | Author |
|
Main change is splitting apart the Sail->IR compilation stage and the
C code generation and optimization phase. Rather than variously
calling the intermediate language either bytecode (when it's not
really) or simply IR, we give it a name: Jib (a type of Sail). Most of
the types are still prefixed by c/C, and I don't think it's worth
changing this.
The various parts of the C backend are now in the src/jib/ subdirectory
src/jib/anf.ml - Sail->ANF translation
src/jib/jib_util.ml - various Jib AST processing and helper functions (formerly bytecode_util)
src/jib/jib_compile.ml - Sail->Jib translation (using Sail->ANF)
src/jib/c_backend.ml - Jib->C code generator and optimizations
Further, bytecode.ott is now jib.ott and generates jib.ml (which still
lives in src/ for now)
The optimizations in c_backend.ml should eventually be moved in a
separate jib_optimization file.
The Sail->Jib compilation can be parameterised by two functions - one
is a custom ANF->ANF optimization pass that can be specified on a per
Jib backend basis, and the other is the rule for translating Sail
types in Jib types. This can be more or less precise depending on how
precise we want to be about bit-widths etc, i.e. we only care about <64
and >64 for C, but for SMT generation we would want to be as precise
as possible.
Additional improvements:
The Jib IR is now agnostic about whether arguments are allocated on
the heap vs the stack and this is handled by the C code generator.
jib.ott now has some more comments explaining various parts of the Jib
AST.
A Set module and comparison function for ctyps is defined, and some
functions now return ctyp sets rather than lists to avoid repeated
work.
|
|
|
|
Add some comments in constant_fold.ml
|
|
|
|
For example:
sail -c_extra_params "CPUMIPSState *env" -c_extra_args env
would pass a QEMU MIPS cpu state to every non-builtin function call.
Also add documentation for each C compilation option in C_backend.mli
|
|
Can now set a prefix for generated C functions with -c_prefix so
-c_prefix sail_ would give sail_execute_CGetPerm over
zexecute_CGetPerm. We still have to use our standard name-mangling
scheme to avoid possible collisions within the name.
Can build C that doesn't expect the standard runtime, which leaves
operations like read_memory, write_memory etc to be stubbed in by
another C program including the generated Sail. Things like
letbindings are still an issue because we rely on a very small runtime
to initialize global letbindings and similar.
-c_separate_execute splits the execute function apart in the generated C
|
|
|
|
Add a flag in C backend ctx that allows us to generate arbitrary
precision signed integer types, rather than just int64
|
|
If we want to use our low-level intermediate representation to generate
SMT, then we want to be more precise than just splitting integers into
64-bits and larger. This commit changes CT_int and CT_int64 into CT_lint
for large integers and CT_fint n for (signed) fixed precision integers
that fit within n bits. This follows the convention for bitvectors where
we have CT_fbits for fixed-length bitvectors and CT_lbits for large
arbitrary precision bitvectors.
|
|
Run C tests with -O -Oconstant_fold -auto_mono
|
|
Specifically remove branches where flow-typing implies false, as this
allows the optimizer to prove anything, which can result in nonsense
code. This does incur something of a performance hit.
|
|
Useful to see what constraints we are generating that are particularly
hard, and which of our specs work with different solvers.
Refactor code to use smt in names rather than specifically z3
|
|
All sizeof expressions now removed by the type-checker, so it's now
properly a type error if they cannot be removed rather than a bizarre
re-write error. This also greatly improves compilation speed overall, at
the expense of the first type-checking pass.
|
|
Make all backends behave the same when a catch block does not catch a
specific exception.
|
|
Fix pretty printer bug
|
|
|
|
|
|
For example in RISC-V for the translation table walk:
$optimize unroll 2
val walk32 ...
function walk32 ...
would create two extra copies of the walk_32 function,
walk_32_unroll_1 and walk_32_unroll_2, with only walk_32_unroll_2
being recursive. Currently we only support the case where we have
$optimize unroll, directly followed by a valspec, then a function, but
this should be generalised in future.
This optimization nearly doubles the performance of RISC-V
It is implemented using a new Optimize.recheck rewrite that replaces
the ordinary recheck_defs pass. It uses a new typechecker
check_with_envs function that allows re-writes to utilise intermediate
typechecking environments to minimize the amount of AST checking that
occurs, for performance reasons.
Note that older Sail versions including the current OPAM release will
complain about the optimize pragma, so this cannot be used until they
become up to date with this change.
|
|
|
|
We want to ensure that no_devices.sail and devices.sail have the same
effect footprint, because with a snapshot-type release in sail-arm, we
can't rebuild the spec with asl_to_sail every time we switch from
running elf binaries to booting OS's. This commit allows registers to
have arbitrary effects, so registers that are really representing
memory-mapped devices don't have to have the wmem/rmem effect.
|
|
Remove unused name schemes and DEF_kind
|
|
Work on improving the formatting and quality of error messages
When sail is invoked with sail -i, any type errors now drop the user
down to the interactive prompt, with the interactive environment being
the environment at the point the type error occurred, this means the
typechecker state can be interactively queried in the interpreter to help
diagnose type errors.
|
|
Add an extra argument for Type_check.prove for the location of the prove
call (as prove __POS__) to help debug SMT related issues
|
|
|
|
|
|
* Improve type inference for numeric if statements (if_infer test)
* Correctly handle constraints for existentially quantified constructors (constraint_ctor test)
* Canonicalise all numeric types in function arguments, which
triggers some weird edge cases between parametric polymorphism and
subtyping of numeric arguments
* Because of this eq_int, eq_range, and eq_atom etc become identical
* Avoid duplicating destruct_exist in Env
* Handle some odd subtyping cases better
|
|
Change Typ_arg_ to A_. We use it a lot more now typ_arg is used instead of
uvar as the result of unify. Plus A_ could either stand for argument, or
Any/A type which is quite appropriate in most use cases.
Restore instantiation info in infer_funapp'. Ideally we would save this
instead of recomputing it ever time we need it. However I checked and
there are over 300 places in the code that would need to be changed to add
an extra argument to E_app. Still some issues causing specialisation to
fail however.
Improve the error message when we swap how we infer/check an l-expression,
as this could previously cause the actual cause of a type-checking failure
to be effectively hidden.
|
|
Mostly this is to change how we desugar types in order to make us more
flexible with what we can parse as a valid constraint as
type. Previously the structure of the initial check forced some
awkward limitations on what was parseable due to how the parse AST is
set up.
As part of this, I've taken the de-scattering of scattered functions
out of the initial check, and moved it to a re-writing step after
type-checking, where I think it logically belongs. This doesn't change
much right now, but opens up some more possibilities in the future:
Since scattered functions are now typechecked normally, any future
module system for Sail would be able to handle them specially, and the
Latex documentation backend can now document scattered functions
explicitly, rather than relying on hackish 'de-scattering' logic to
present documentation as the functions originally appeared.
This has one slight breaking change which is that union clauses must
appear before their uses in scattered functions, so
union ast = Foo : unit
function clause execute(Foo())
is ok, but
function clause execute(Foo())
union ast = Foo : unit
is not. Previously this worked because the de-scattering moved union
clauses upwards before type-checking, but as this now happens after
type-checking they must appear in the correct order. This doesn't
occur in ARM, RISC-V, MIPS, but did appear in Cheri and I submitted a
pull request to re-order the places where it happens.
|
|
- Completely remove the nexp = nexp syntax in favour of nexp ==
nexp. All our existing specs have already switched over. As part of
this fix every test that used the old syntax, and update the
generated aarch64 specs
- Remove the `type when constraint` syntax. It just makes changing the
parser in any way really awkward.
- Change the syntax for declaring new types with multiple type
parameters from:
type foo('a : Type) ('n : Int), constraint = ...
to
type foo('a: Type, 'n: Int), constraint = ...
This makes type declarations mimic function declarations, and makes
the syntax for declaring types match the syntax for using types, as
foo is used as foo(type, nexp). None of our specifications use types
with multiple type parameters so this change doesn't actually break
anything, other than some tests. The brackets around the type
parameters are now mandatory.
- Experiment with splitting Type/Order type parameters from Int type
parameters in the parser.
Currently in a type bar(x, y, z) all of x, y, and z could be either
numeric expressions, orders, or types. This means that in the parser
we are severely restricted in what we can parse in numeric
expressions because everything has to be parseable as a type (atyp)
- it also means we can't introduce boolean type
variables/expressions or other minisail features (like removing
ticks from type variables!) because we are heavily constrained by
what we can parse unambigiously due to how these different type
parameters can be mixed and interleaved.
There is now experimental syntax: vector::<'o, 'a>('n) <-->
vector('n, 'o, 'a) which splits the type argument list into two
between Type/Order-polymorphic arguments and Int-polymorphic
arguments. The exact choice of delimiters isn't set in stone - ::<
and > match generics in Rust. The obvious choices of < and > / [ and
] are ambigious in various ways.
Using this syntax right now triggers a warning.
- Fix undefined behaviour in C compilation when concatenating a
0-length vector with a 64-length vector.
|
|
- Fix pretty printing nested constraints
- Add flow typing for if condition then { throw exn }; ... blocks
- Add optimisations for bitvector concatenation in C
|
|
Should hopefully fix memory leak in RISC-V.
Also adds an optimization pass that removes copying structs and allows
some structs to simply alias each other and avoid copying their
contents. This requires knowing certain things about the lifetimes of
the structs involved, as can't free the struct if another variable is
referencing it - therefore we conservatively only apply this
optimization for variables that are lifted outside function
definitions, and should therefore never get freed until the model
exits - however this may cause issues outside ARMv8, as there may be
cases where a struct can exist within a variant type (which are not
yet subject to this lifting optimisation), that would break these
assumptions - therefore this optimisation is only enabled with the
-Oexperimental flag.
|
|
This optimisation re-uses variables if possible, rather than
allocating new ones.
|
|
Bitvectors that aren't fixed size, but can still be shown to fit
within 64-bits, now have a specialised representation. Still need to
introduce more optimized functions, as right now we mostly have to
convert them into large bitvectors to pass them into most
functions. Nevertheless, this doubles the performance of the TLBLookup
function in ARMv8.
|
|
- Propagate types more accurately to improve optimization on ANF
representation.
- Add a generic optimization pass to remove redundant variables that
simply alias other variables.
- Modify Sail interactive mode, so it can compile a specification with
the :compile command, view generated intermediate representation via
the :ir <function> command, and step-through the IR with :exec <exp>
(although this is very incomplete)
- Introduce a third bitvector representation, between fast
fixed-precision bitvectors, and variable length large
bitvectors. The bitvector types are now from most efficient to least
* CT_fbits for fixed precision, 64-bit or less bitvectors
* CT_sbits for 64-bit or less, variable length bitvectors
* CT_lbits for arbitrary variable length bitvectors
- Support for generating C code using CT_sbits is currently
incomplete, it just exists in the intermediate representation right
now.
- Include ctyp in AV_C_fragment, so we don't have to recompute it
|
|
Also fix some C optimisations
|
|
This was _really_ slow - about 50secs for ARM. If this changes causes
breakages we should fix them in some other way.
Also using Reporting.err_unreachable in ANF translation, and fix slice
optimization when creating slices larger than 64-bits in C translation
|
|
|
|
This should fix the issue in cheri128
Also introduce a feature to more easily debug the C backend:
sail -dfunction Name
will pretty-print the ANF and IR representation of just the Name
function. I want to make this work for the type-checker as well, but
it's a bit hard to get that to not fire during re-writing passes right
now.
|
|
This goes partway to resolving issue #23, as it now generates C code,
but it seems like there is still an issue with the generated C.
|
|
There is no Reporting_complex, so it's not clear what the basic is
intended to signify anyway.
Add a GitHub issue link to any err_unreachable errors (as they are all
bugs)
|
|
For example, for a function like
```
val aget_X : forall 'n, 0 <= 'n <= 31. int('n) -> bits(64)
function test(n : int) -> unit = {
let y = aget_X(n);
()
}
```
we get the message
> Could not resolve quantifiers for aget_X (0 <= 'ex7# & 'ex7# <= 31)
>
> Try adding named type variables for n : atom('ex7#)
>
> The property (0 <= n & n <= 31) must hold
which suggests adding a name for the type variable 'ex7#, and gives
the property in terms of the variable n. If we give n a type variable name:
```
val test : int -> unit
function test(n as 'N) = {
let y = aget_X(n);
()
}
```
It will suggest a constraint involving the type variable name
> Could not resolve quantifiers for aget_X (0 <= 'ex6# & 'ex6# <= 31)
>
> Try adding the constraint (0 <= 'N & 'N <= 31)
|
|
Changes the representation of function types in the ast from
Typ_fn : typ -> typ
to
Typ_fn : typ list -> typ
to more accurately represent their use in the various backends, where we often compile functions to either their curried representations as in Lem and Isabelle, or just
multiple argument functions in C. There's still some oddity because a single pattern in a function clause can bind against multiple arguments, and maybe we want to
forbid this in the future. The syntax also hasn't changed (yet), so in theory this change shouldn't break anything (but it invariably will...).
In the future we would ideally require that a function with N arguments has exactly N patterns in its declaration, one for each argument so
f : (x, y) -> z
f _ = ...
would be disallowed (as _ matches both x and y), forcing
f(_, _) = z
this would simply quite a few things,
Also we could have a different syntax for function argument lists and tuples, because it's rather hard to define a function that actually takes a tuple with the syntax
how it is now.
Some issues I noticed when doing this refactoring:
Line 1926 of Coq translation. untuple_args_pat is maybe no longer needed? However there's still some funnyness where a pattern can be used to bind multiple function
arguments so maybe it still is.
Line 2306 of monomorphisation. I simplified the logic here. I think it's equivalent now, but I could be wrong.
Line 4517 of rewrites. I'm not sure what make_cstr_mappings is doing here, but hopefully the simpler version is the same.
|
|
When converting to A-normal form I just used the type of the then
branch of if statements to get the type of the whole if statement -
usually they'd be the same, but with flow typing one of the branches
can have a false constraint, which then allows the optimizer to fit
any integer into a 64-bit integer causing an overflow. The fix is to
correctly use the type the typechecker gives for the whole if
statement.
Also add decimal_string_of_bits to the C output.
Rename is_reftyp to is_ref_typ to be more consistent with other
is_X_typ functions in Ast_util.
|
|
This really demonstrates why we should switch to Typ_fn being a typ
list * typ constructor because the implementation here feels *really*
hacky with dummy Typ_tup constructors being used to enforce single
arguments for constructors.
|
|
|
|
|
|
typing
Added a regression test as c/test/downcast_fn.sail
|
|
|
|
constructors
Add a new printing function for debugging that recursively prints
constructor types.
Fix an interpreter bug when pattern matching on constructors with
tuple types.
|
|
Now all we need to do is make sure the RISC-V builtins are mapped to
the correct C functions, and RISC-V in C should work
(hopefully). We're still missing some of the functions in sail.c for
the mappings so those have to be implemented.
|