| Age | Commit message (Collapse) | Author |
|
|
|
Generates much better SMT that assigning each field one-by-one
starting with an undefined struct.
|
|
|
|
|
|
Currently only supports pure termination measures for loops with effects.
The user syntax uses separate termination measure declarations, as in the
previous recursive termination measures, which are rewritten into the
loop AST nodes before type checking (because it would be rather difficult
to calculate the correct environment to type check the separate declaration
in).
|
|
Had to change the hundreds and hundreds of places such values were
used. However this now lets us automatically prove cheri-concentrate
properties. Such as showing
function prop_cap_round_trip(cap: bits(128)) -> bool = {
let cap_rt = capToBits(capBitsToCapability(true, cap));
cap == cap_rt
}
is always true.
|
|
Currently only works with CVC4, test cases are in test/smt. Can prove
that RISC-V add instruction actually adds values in registers and
that's about it for now.
|
|
- Rename DeIid to Operator. It corresponds to operator <string> in the
syntax. The previous name is from when it was called deinfix in
sail1.
- Removed things that weren't actually common from
pretty_print_common.ml, e.g. printing identifiers is backend
specific. The doc_id function here was only used for a very specific
use case in pretty_print_lem, so I simplified it and renamed it to
doc_sia_id, as it is always used for a SIA.Id whatever that is.
- There is some support for anonymous records in constructors, e.g.
union Foo ('a : Type) = {
MkFoo : { field1 : 'a, field2 : int }
}
somewhat similar to the enum syntax in Rust. I'm not sure when this
was added, but there were a few odd things about it. It was
desugared in the preprocessor, rather than initial_check, and the
desugaring generated incorrect code for polymorphic anonymous
records as above.
I moved the code to initial_check, so the pre-processor now just
deals with pre-processor things and not generating types, and I
fixed the code to work with polymorphic types. This revealed some
issues in the C backend w.r.t. polymorphic structs, which is the
bulk of this commit. I also added some tests for this feature.
- OCaml backend can now generate a valid string_of function for
polymorphic structs, previously this would cause the ocaml to fail
to compile.
- Some cleanup in the Sail ott definition
- Add support for E_var in interpreter previously this would just
cause the interpreter to fail
|
|
Previously the specialization would remove any polymorphic union
constructor that was never created anywhere in the
specification. While this wasn't usually problematic, it does leave an
edge case where such a constructor could be matched upon in a pattern,
and then the resulting match would fail to compile as it would be
matching on a constructor kind that doesn't exists.
This should fix that issue by chaging the V_ctor_kind value into an
F_ctor_kind fragment. Previously a polymorphic constructor-kind would
have been represented by its mangled name, e.g.
V_ctor_kind "zSome_unit"
would now be represented as
V_ctor_kind ("Some", unifiers, ty)
where ty is a monomorphic version of the original constructor's type
such that
ctyp_unify original_ty ty = unifiers
and the mangled name we generate is
zencode_string ("Some_" ^ string_of_list "_" string_of_ctyp unifiers)
|
|
Allows us to track the last version of the return variable when the AST
in in SSA form.
|
|
Remove unused experimental optimizations
|
|
Avoids duplication between l-expressions and expressions. Also means that
special variables like current_exception and have_exception are treated
normally by functions such as instr_reads and instr_writes etc. Furthermore
we can now easily annotate Jib identifiers in ways that were not previously
possible with plain sail ids.
|
|
|
|
Add a CL_void l-expression so we don't have redundant unit-typed
variables everywhere, and add an optimization in Jib_optimize called
optimize_unit which introduces these.
Remove the basic control-flow graph in Jib_util and add a new mutable
control-flow graph type in Jib_ssa which allows the IR to be converted
into SSA form. The mutable graph allows for more efficient updates,
and includes both back and forwards references making it much more
convenient to traverse.
Having an SSA representation should make some optimizations much
simpler, and is also probably more natural for SMT generation where
variables have to be defined once using declare-const anyway.
Debug option -ddump_flow_graphs now outputs SSA'd graphs of the
functions in a specification.
|
|
Main change is splitting apart the Sail->IR compilation stage and the
C code generation and optimization phase. Rather than variously
calling the intermediate language either bytecode (when it's not
really) or simply IR, we give it a name: Jib (a type of Sail). Most of
the types are still prefixed by c/C, and I don't think it's worth
changing this.
The various parts of the C backend are now in the src/jib/ subdirectory
src/jib/anf.ml - Sail->ANF translation
src/jib/jib_util.ml - various Jib AST processing and helper functions (formerly bytecode_util)
src/jib/jib_compile.ml - Sail->Jib translation (using Sail->ANF)
src/jib/c_backend.ml - Jib->C code generator and optimizations
Further, bytecode.ott is now jib.ott and generates jib.ml (which still
lives in src/ for now)
The optimizations in c_backend.ml should eventually be moved in a
separate jib_optimization file.
The Sail->Jib compilation can be parameterised by two functions - one
is a custom ANF->ANF optimization pass that can be specified on a per
Jib backend basis, and the other is the rule for translating Sail
types in Jib types. This can be more or less precise depending on how
precise we want to be about bit-widths etc, i.e. we only care about <64
and >64 for C, but for SMT generation we would want to be as precise
as possible.
Additional improvements:
The Jib IR is now agnostic about whether arguments are allocated on
the heap vs the stack and this is handled by the C code generator.
jib.ott now has some more comments explaining various parts of the Jib
AST.
A Set module and comparison function for ctyps is defined, and some
functions now return ctyp sets rather than lists to avoid repeated
work.
|
|
|
|
If we want to use our low-level intermediate representation to generate
SMT, then we want to be more precise than just splitting integers into
64-bits and larger. This commit changes CT_int and CT_int64 into CT_lint
for large integers and CT_fint n for (signed) fixed precision integers
that fit within n bits. This follows the convention for bitvectors where
we have CT_fbits for fixed-length bitvectors and CT_lbits for large
arbitrary precision bitvectors.
|
|
Perhaps suprisingly to some, this did not mean that Sail was
unable to typecheck the identify function.
While doing this rename Effect_opt_pure to Effect_opt_none - as
Effect_opt_pure was the effect equivalent of Typ_annot_opt_none,
and actually means that the function definition lacks an effect
annnotation, not that the function is actually pure, so this was
*extremely* misleading. The effect_opt that actually indicated a
function is pure was (and still is) the succinct:
Effect_opt_aux (Effect_opt_effect (Effect_aux (Effect_set [], _)), _)
In fact because in the grammar we only specify effects on
valspecs (they can always be inferred for fundefs in the absence
of a valspec) effect_opts are basically vestigial and are always
Effect_opt_none.
What might actually be super nice would be to remove rec_opt,
effect_opt and typ_annot_opt from fundefs in ast.ml altogether
and if we want them in the syntax just have them in parse_ast.ml
and pull them into a valspec during the initial check.
|
|
|
|
This supports the following syntax:
type xlen : Int = 64
type ylen : Int = 1
type xlenbits = bits(xlen)
bitfield Mstatus : xlenbits = {
SD : xlen - ylen,
SXL : xlen - ylen - 1 .. xlen - ylen - 3
}
|
|
|
|
|
|
We want to ensure that no_devices.sail and devices.sail have the same
effect footprint, because with a snapshot-type release in sail-arm, we
can't rebuild the spec with asl_to_sail every time we switch from
running elf binaries to booting OS's. This commit allows registers to
have arbitrary effects, so registers that are really representing
memory-mapped devices don't have to have the wmem/rmem effect.
|
|
|
|
|
|
function (preparing for marshalling)
|
|
Remove unused name schemes and DEF_kind
|
|
|
|
|
|
We should infer type variable kinds better in initial_check.ml, but we really don't want to have to deal
with that everywhere, especially when we can no longer easily cheat and assume KOpt_none implies K_int.
|
|
This only applies to recursive functions and uses the termination measure
merely as a limit to the recursive call depth, rather than proving the
measure correct.
|
|
Change Typ_arg_ to A_. We use it a lot more now typ_arg is used instead of
uvar as the result of unify. Plus A_ could either stand for argument, or
Any/A type which is quite appropriate in most use cases.
Restore instantiation info in infer_funapp'. Ideally we would save this
instead of recomputing it ever time we need it. However I checked and
there are over 300 places in the code that would need to be changed to add
an extra argument to E_app. Still some issues causing specialisation to
fail however.
Improve the error message when we swap how we infer/check an l-expression,
as this could previously cause the actual cause of a type-checking failure
to be effectively hidden.
|
|
On a new branch because it's completely broken everything for now
|
|
Mostly this is to change how we desugar types in order to make us more
flexible with what we can parse as a valid constraint as
type. Previously the structure of the initial check forced some
awkward limitations on what was parseable due to how the parse AST is
set up.
As part of this, I've taken the de-scattering of scattered functions
out of the initial check, and moved it to a re-writing step after
type-checking, where I think it logically belongs. This doesn't change
much right now, but opens up some more possibilities in the future:
Since scattered functions are now typechecked normally, any future
module system for Sail would be able to handle them specially, and the
Latex documentation backend can now document scattered functions
explicitly, rather than relying on hackish 'de-scattering' logic to
present documentation as the functions originally appeared.
This has one slight breaking change which is that union clauses must
appear before their uses in scattered functions, so
union ast = Foo : unit
function clause execute(Foo())
is ok, but
function clause execute(Foo())
union ast = Foo : unit
is not. Previously this worked because the de-scattering moved union
clauses upwards before type-checking, but as this now happens after
type-checking they must appear in the correct order. This doesn't
occur in ARM, RISC-V, MIPS, but did appear in Cheri and I submitted a
pull request to re-order the places where it happens.
|
|
This makes dealing with records and field expressions in Sail much
nicer because the constructors are no longer stacked together like
matryoshka dolls with unnecessary layers. Previously to get the fields
of a record it would be either
E_aux (E_record (FES_aux (FES_Fexps (fexps, _), _)), _)
but now it is simply:
E_aux (E_record fexps, _)
|
|
Rather than having K_aux (K_kind [BK_aux (BK_int, _)], _) represent
the Int kind, we now just have K_aux (K_int, _). Since the language is
first order we have no need for fancy kinds in the AST.
|
|
They weren't needed for ASL parser like I thought they would be, and
they increase the complexity of dealing with constraints throughout
Sail, so just remove them.
Also fix some compiler warnings
|
|
Should hopefully fix memory leak in RISC-V.
Also adds an optimization pass that removes copying structs and allows
some structs to simply alias each other and avoid copying their
contents. This requires knowing certain things about the lifetimes of
the structs involved, as can't free the struct if another variable is
referencing it - therefore we conservatively only apply this
optimization for variables that are lifted outside function
definitions, and should therefore never get freed until the model
exits - however this may cause issues outside ARMv8, as there may be
cases where a struct can exist within a variant type (which are not
yet subject to this lifting optimisation), that would break these
assumptions - therefore this optimisation is only enabled with the
-Oexperimental flag.
|
|
- Propagate types more accurately to improve optimization on ANF
representation.
- Add a generic optimization pass to remove redundant variables that
simply alias other variables.
- Modify Sail interactive mode, so it can compile a specification with
the :compile command, view generated intermediate representation via
the :ir <function> command, and step-through the IR with :exec <exp>
(although this is very incomplete)
- Introduce a third bitvector representation, between fast
fixed-precision bitvectors, and variable length large
bitvectors. The bitvector types are now from most efficient to least
* CT_fbits for fixed precision, 64-bit or less bitvectors
* CT_sbits for 64-bit or less, variable length bitvectors
* CT_lbits for arbitrary variable length bitvectors
- Support for generating C code using CT_sbits is currently
incomplete, it just exists in the intermediate representation right
now.
- Include ctyp in AV_C_fragment, so we don't have to recompute it
|
|
The main changes so far are:
* Allow markdown formatting in doc comments. We parse the markdown
using Omd, which is a OCaml library for parsing markdown. The nice
thing about this library is it's pure OCaml and has no dependencies
other the the stdlib. Incidentally it was also developed at OCaml
labs. Using markdown keeps our doc-comments from becoming latex
specfic, and having an actual parser is _much_ nicer than trying to
hackily process latex in doc-comments using OCamls somewhat sub-par
regex support.
* More sane conversion latex identifiers the main approach is to
convert Sail identifiers to lowerCamelCase, replacing numbers with
words, and then add a 'category' code based on the type of
identifier, so for a function we'd have fnlowerCamelCase and for
type synonym typelowerCamelCase etc. Because this transformation is
non-injective we keep track of identifiers we've generated so we end
up with identifierA, identifierB, identifierC when there are
collisions.
* Because we parse markdown in doc comments doc comments can use Sail
identifiers directly in hyperlinks, without having to care about how
they are name-mangled down into TeX compatible things.
* Allow directives to be passed through the compiler to
backends. There are various $latex directives that modify the latex
output. Most usefully there's a
$latex newcommand name markdown
directive that uses the markdown parser to generate latex
commands. An example of why this is useful is bellow. We can also use
$latex noref id
To suppress automatically inserting links to an identifier
* Refactor the latex generator to make the overall generation process
cleaner
* Work around the fact that some operating systems consider
case-sensitive file names to be a good thing
* Fix a bug where latex generation wouldn't occur unless the directory
specified by -o didn't exist
This isn't quite all the requested features for good CHERI
documentation, but new features should be much easier to add now.
|
|
Currently not enabled by default, the flag -Xconstraint_synonyms
enables them
For generating constraints in ASL parser, we want to be able to give
names to the constraints that we attach to certain variables. It's
slightly awkward right now when constraints get long complicated
because the entire constraint always has to be typed out in full
whenever it appears, and there's no way to abstract away from that.
This adds constraint synonyms, which work much like type synonyms
except for constraints, e.g.
constraint Size('n) = 'n in {1, 2, 4, 8} | 128 <= 'n <= 256
these constraints can then be used instead of the full constraint, e.g.
val f : forall 'n, where Size('n). int('n) -> unit
Unfortunatly we need to have a keyword to 'call' the constraint
synonym otherwise the grammer stops being LR(1). This could be
resolved by parsing all constraints into Parse_ast.atyp and then
de-sugaring them into constraints, which is what happens for
n-expressions already, but that would require quite a bit of work on
the parser.
To avoid this forcing changes to any other parts of Sail, the intended
invariant is that all constraints appearing anywhere in a type-checked
AST have no constraint synonyms, so they don't have to worry about
matching on NC_app, or calling Env.expand_typquant_synonyms (which
isn't even exported for this reason).
|
|
Changes the representation of function types in the ast from
Typ_fn : typ -> typ
to
Typ_fn : typ list -> typ
to more accurately represent their use in the various backends, where we often compile functions to either their curried representations as in Lem and Isabelle, or just
multiple argument functions in C. There's still some oddity because a single pattern in a function clause can bind against multiple arguments, and maybe we want to
forbid this in the future. The syntax also hasn't changed (yet), so in theory this change shouldn't break anything (but it invariably will...).
In the future we would ideally require that a function with N arguments has exactly N patterns in its declaration, one for each argument so
f : (x, y) -> z
f _ = ...
would be disallowed (as _ matches both x and y), forcing
f(_, _) = z
this would simply quite a few things,
Also we could have a different syntax for function argument lists and tuples, because it's rather hard to define a function that actually takes a tuple with the syntax
how it is now.
Some issues I noticed when doing this refactoring:
Line 1926 of Coq translation. untuple_args_pat is maybe no longer needed? However there's still some funnyness where a pattern can be used to bind multiple function
arguments so maybe it still is.
Line 2306 of monomorphisation. I simplified the logic here. I think it's equivalent now, but I could be wrong.
Line 4517 of rewrites. I'm not sure what make_cstr_mappings is doing here, but hopefully the simpler version is the same.
|
|
|
|
Make the C l-expression type in Sail more generic and expressive, and
refactor the generation of conversions into a seperate
codegen_conversion function, which can handle more complex cases than
the previous more ad-hoc method.
|
|
Add additional well-formedness check when calling typing rules
|
|
types
|
|
These match the new ASL pattern constructors:
- !p matches if the pattern p does not match
- { p1, ... pn } matches if any of the patterns p1 ... pn match
We desugar the set pattern "{p1, ... pn}" into "p1 | (p2 | ... pn)".
ASL does not have pattern binding but Sail does. The rules at the
moment are that none of the pattern can contain patterns. This could
be relaxed by allowing "p1 | p2" to bind variables provided p1 and p2
both bind the same variables.
|
|
E_internal_cast, E_sizeof_internal, E_internal_exp,
E_internal_exp_user, E_comment, and E_comment_struc were all
unused. For a lem based interpreter, we want to be able to compile it
to iUsabelle, and due to slowness inherent in Isabelle's datatype
package we want to remove unused constructors in our AST type.
Also remove the lem_ast backend - it's heavily bitrotted, and for
loading the ARM ast into the interpreter it's just not viable to use
this approach as it just doesn't scale. We really need a way to be
able to serialise and deserialise the AST efficiently in Lem.
|
|
|
|
|