| Age | Commit message (Collapse) | Author |
|
|
|
can now write e.g.
forall (constant 'n : Int) rather than forall ('n: Int)
which requires 'n to be a constant integer value whenever the function
is called. I added this to the 'addrsize variable on memory
reads/writes to absolutely guarantee in the SMT generation that we
don't have to worry about the address being a variable length
bitvector.
|
|
Previously path conditionals for a node were defined as the path
conditional of the immediate dominator (+ a guard for explicit guard
nodes after conditional branches), whereas now they are the path
conditional of the immediate dominator plus an expression
encapsulating all the guards between the immediate dominator and the
node. This is needed as the previous method was incorrect for certain
control flow graphs.
This slows down the generated SMT massively, because it causes the
path conditionals to become huge when the immediate dominator is far
away from the node in question. It also changes computing path
conditionals from O(n) to O(n^2) which is not ideal as our inlined
graphs can become massive. Need to figure out a better way to generate
minimal path conditionals between the immediate dominator and the
node.
I upped the timeout for the SMT tests from 20s to 300s each but this
may still cause a failure in Jenkins because that machine is slow.
|
|
|
|
|
|
Fixes C backend optimizations that were disabled due to changes in the
IR while working on the SMT generation.
Also add a -Oaarch64_fast option that optimizes any integer within a
struct to be an int64_t, which is safe for the ARM v8.5 spec and
improves performance significantly (reduces Linux boot times by 4-5
minutes). Eventually this should probably be a directive that can be
attached to any arbitrary struct/type.
Fixes the -c_specialize option for ARM v8.5. However this only gives a
very small performance improvment for a very large increase in
compilation time however.
|
|
generated for Sail AST
|
|
Need to get these working again before we can thing about merging back
into sail2
|
|
Get rid of separate V_op and V_unary constructors. jib.ott now defines
the valid operations for V_call including zero/sign extension, in such
a way that the operation ctyp can be inferred. Overall this makes the
IR less ad-hoc, and means we can share more code between SMT and C.
string_of_cval no longer used by c_backend, which now uses sgen_cval
following other sgen_ functions in the code generator, meaning
string_of_cval doesn't have to produce valid C code anymore and so can
be used for backend-agnostic debug and error messages.
|
|
|
|
Generates much better SMT that assigning each field one-by-one
starting with an undefined struct.
|
|
|
|
|
|
Currently only supports pure termination measures for loops with effects.
The user syntax uses separate termination measure declarations, as in the
previous recursive termination measures, which are rewritten into the
loop AST nodes before type checking (because it would be rather difficult
to calculate the correct environment to type check the separate declaration
in).
|
|
Had to change the hundreds and hundreds of places such values were
used. However this now lets us automatically prove cheri-concentrate
properties. Such as showing
function prop_cap_round_trip(cap: bits(128)) -> bool = {
let cap_rt = capToBits(capBitsToCapability(true, cap));
cap == cap_rt
}
is always true.
|
|
Currently only works with CVC4, test cases are in test/smt. Can prove
that RISC-V add instruction actually adds values in registers and
that's about it for now.
|
|
- Rename DeIid to Operator. It corresponds to operator <string> in the
syntax. The previous name is from when it was called deinfix in
sail1.
- Removed things that weren't actually common from
pretty_print_common.ml, e.g. printing identifiers is backend
specific. The doc_id function here was only used for a very specific
use case in pretty_print_lem, so I simplified it and renamed it to
doc_sia_id, as it is always used for a SIA.Id whatever that is.
- There is some support for anonymous records in constructors, e.g.
union Foo ('a : Type) = {
MkFoo : { field1 : 'a, field2 : int }
}
somewhat similar to the enum syntax in Rust. I'm not sure when this
was added, but there were a few odd things about it. It was
desugared in the preprocessor, rather than initial_check, and the
desugaring generated incorrect code for polymorphic anonymous
records as above.
I moved the code to initial_check, so the pre-processor now just
deals with pre-processor things and not generating types, and I
fixed the code to work with polymorphic types. This revealed some
issues in the C backend w.r.t. polymorphic structs, which is the
bulk of this commit. I also added some tests for this feature.
- OCaml backend can now generate a valid string_of function for
polymorphic structs, previously this would cause the ocaml to fail
to compile.
- Some cleanup in the Sail ott definition
- Add support for E_var in interpreter previously this would just
cause the interpreter to fail
|
|
Previously the specialization would remove any polymorphic union
constructor that was never created anywhere in the
specification. While this wasn't usually problematic, it does leave an
edge case where such a constructor could be matched upon in a pattern,
and then the resulting match would fail to compile as it would be
matching on a constructor kind that doesn't exists.
This should fix that issue by chaging the V_ctor_kind value into an
F_ctor_kind fragment. Previously a polymorphic constructor-kind would
have been represented by its mangled name, e.g.
V_ctor_kind "zSome_unit"
would now be represented as
V_ctor_kind ("Some", unifiers, ty)
where ty is a monomorphic version of the original constructor's type
such that
ctyp_unify original_ty ty = unifiers
and the mangled name we generate is
zencode_string ("Some_" ^ string_of_list "_" string_of_ctyp unifiers)
|
|
Allows us to track the last version of the return variable when the AST
in in SSA form.
|
|
Remove unused experimental optimizations
|
|
Avoids duplication between l-expressions and expressions. Also means that
special variables like current_exception and have_exception are treated
normally by functions such as instr_reads and instr_writes etc. Furthermore
we can now easily annotate Jib identifiers in ways that were not previously
possible with plain sail ids.
|
|
|
|
Add a CL_void l-expression so we don't have redundant unit-typed
variables everywhere, and add an optimization in Jib_optimize called
optimize_unit which introduces these.
Remove the basic control-flow graph in Jib_util and add a new mutable
control-flow graph type in Jib_ssa which allows the IR to be converted
into SSA form. The mutable graph allows for more efficient updates,
and includes both back and forwards references making it much more
convenient to traverse.
Having an SSA representation should make some optimizations much
simpler, and is also probably more natural for SMT generation where
variables have to be defined once using declare-const anyway.
Debug option -ddump_flow_graphs now outputs SSA'd graphs of the
functions in a specification.
|
|
Main change is splitting apart the Sail->IR compilation stage and the
C code generation and optimization phase. Rather than variously
calling the intermediate language either bytecode (when it's not
really) or simply IR, we give it a name: Jib (a type of Sail). Most of
the types are still prefixed by c/C, and I don't think it's worth
changing this.
The various parts of the C backend are now in the src/jib/ subdirectory
src/jib/anf.ml - Sail->ANF translation
src/jib/jib_util.ml - various Jib AST processing and helper functions (formerly bytecode_util)
src/jib/jib_compile.ml - Sail->Jib translation (using Sail->ANF)
src/jib/c_backend.ml - Jib->C code generator and optimizations
Further, bytecode.ott is now jib.ott and generates jib.ml (which still
lives in src/ for now)
The optimizations in c_backend.ml should eventually be moved in a
separate jib_optimization file.
The Sail->Jib compilation can be parameterised by two functions - one
is a custom ANF->ANF optimization pass that can be specified on a per
Jib backend basis, and the other is the rule for translating Sail
types in Jib types. This can be more or less precise depending on how
precise we want to be about bit-widths etc, i.e. we only care about <64
and >64 for C, but for SMT generation we would want to be as precise
as possible.
Additional improvements:
The Jib IR is now agnostic about whether arguments are allocated on
the heap vs the stack and this is handled by the C code generator.
jib.ott now has some more comments explaining various parts of the Jib
AST.
A Set module and comparison function for ctyps is defined, and some
functions now return ctyp sets rather than lists to avoid repeated
work.
|
|
|
|
If we want to use our low-level intermediate representation to generate
SMT, then we want to be more precise than just splitting integers into
64-bits and larger. This commit changes CT_int and CT_int64 into CT_lint
for large integers and CT_fint n for (signed) fixed precision integers
that fit within n bits. This follows the convention for bitvectors where
we have CT_fbits for fixed-length bitvectors and CT_lbits for large
arbitrary precision bitvectors.
|
|
Perhaps suprisingly to some, this did not mean that Sail was
unable to typecheck the identify function.
While doing this rename Effect_opt_pure to Effect_opt_none - as
Effect_opt_pure was the effect equivalent of Typ_annot_opt_none,
and actually means that the function definition lacks an effect
annnotation, not that the function is actually pure, so this was
*extremely* misleading. The effect_opt that actually indicated a
function is pure was (and still is) the succinct:
Effect_opt_aux (Effect_opt_effect (Effect_aux (Effect_set [], _)), _)
In fact because in the grammar we only specify effects on
valspecs (they can always be inferred for fundefs in the absence
of a valspec) effect_opts are basically vestigial and are always
Effect_opt_none.
What might actually be super nice would be to remove rec_opt,
effect_opt and typ_annot_opt from fundefs in ast.ml altogether
and if we want them in the syntax just have them in parse_ast.ml
and pull them into a valspec during the initial check.
|
|
|
|
This supports the following syntax:
type xlen : Int = 64
type ylen : Int = 1
type xlenbits = bits(xlen)
bitfield Mstatus : xlenbits = {
SD : xlen - ylen,
SXL : xlen - ylen - 1 .. xlen - ylen - 3
}
|
|
|
|
|
|
We want to ensure that no_devices.sail and devices.sail have the same
effect footprint, because with a snapshot-type release in sail-arm, we
can't rebuild the spec with asl_to_sail every time we switch from
running elf binaries to booting OS's. This commit allows registers to
have arbitrary effects, so registers that are really representing
memory-mapped devices don't have to have the wmem/rmem effect.
|
|
|
|
|
|
function (preparing for marshalling)
|
|
Remove unused name schemes and DEF_kind
|
|
|
|
|
|
We should infer type variable kinds better in initial_check.ml, but we really don't want to have to deal
with that everywhere, especially when we can no longer easily cheat and assume KOpt_none implies K_int.
|
|
This only applies to recursive functions and uses the termination measure
merely as a limit to the recursive call depth, rather than proving the
measure correct.
|
|
Change Typ_arg_ to A_. We use it a lot more now typ_arg is used instead of
uvar as the result of unify. Plus A_ could either stand for argument, or
Any/A type which is quite appropriate in most use cases.
Restore instantiation info in infer_funapp'. Ideally we would save this
instead of recomputing it ever time we need it. However I checked and
there are over 300 places in the code that would need to be changed to add
an extra argument to E_app. Still some issues causing specialisation to
fail however.
Improve the error message when we swap how we infer/check an l-expression,
as this could previously cause the actual cause of a type-checking failure
to be effectively hidden.
|
|
On a new branch because it's completely broken everything for now
|
|
Mostly this is to change how we desugar types in order to make us more
flexible with what we can parse as a valid constraint as
type. Previously the structure of the initial check forced some
awkward limitations on what was parseable due to how the parse AST is
set up.
As part of this, I've taken the de-scattering of scattered functions
out of the initial check, and moved it to a re-writing step after
type-checking, where I think it logically belongs. This doesn't change
much right now, but opens up some more possibilities in the future:
Since scattered functions are now typechecked normally, any future
module system for Sail would be able to handle them specially, and the
Latex documentation backend can now document scattered functions
explicitly, rather than relying on hackish 'de-scattering' logic to
present documentation as the functions originally appeared.
This has one slight breaking change which is that union clauses must
appear before their uses in scattered functions, so
union ast = Foo : unit
function clause execute(Foo())
is ok, but
function clause execute(Foo())
union ast = Foo : unit
is not. Previously this worked because the de-scattering moved union
clauses upwards before type-checking, but as this now happens after
type-checking they must appear in the correct order. This doesn't
occur in ARM, RISC-V, MIPS, but did appear in Cheri and I submitted a
pull request to re-order the places where it happens.
|
|
This makes dealing with records and field expressions in Sail much
nicer because the constructors are no longer stacked together like
matryoshka dolls with unnecessary layers. Previously to get the fields
of a record it would be either
E_aux (E_record (FES_aux (FES_Fexps (fexps, _), _)), _)
but now it is simply:
E_aux (E_record fexps, _)
|
|
Rather than having K_aux (K_kind [BK_aux (BK_int, _)], _) represent
the Int kind, we now just have K_aux (K_int, _). Since the language is
first order we have no need for fancy kinds in the AST.
|
|
They weren't needed for ASL parser like I thought they would be, and
they increase the complexity of dealing with constraints throughout
Sail, so just remove them.
Also fix some compiler warnings
|
|
Should hopefully fix memory leak in RISC-V.
Also adds an optimization pass that removes copying structs and allows
some structs to simply alias each other and avoid copying their
contents. This requires knowing certain things about the lifetimes of
the structs involved, as can't free the struct if another variable is
referencing it - therefore we conservatively only apply this
optimization for variables that are lifted outside function
definitions, and should therefore never get freed until the model
exits - however this may cause issues outside ARMv8, as there may be
cases where a struct can exist within a variant type (which are not
yet subject to this lifting optimisation), that would break these
assumptions - therefore this optimisation is only enabled with the
-Oexperimental flag.
|
|
- Propagate types more accurately to improve optimization on ANF
representation.
- Add a generic optimization pass to remove redundant variables that
simply alias other variables.
- Modify Sail interactive mode, so it can compile a specification with
the :compile command, view generated intermediate representation via
the :ir <function> command, and step-through the IR with :exec <exp>
(although this is very incomplete)
- Introduce a third bitvector representation, between fast
fixed-precision bitvectors, and variable length large
bitvectors. The bitvector types are now from most efficient to least
* CT_fbits for fixed precision, 64-bit or less bitvectors
* CT_sbits for 64-bit or less, variable length bitvectors
* CT_lbits for arbitrary variable length bitvectors
- Support for generating C code using CT_sbits is currently
incomplete, it just exists in the intermediate representation right
now.
- Include ctyp in AV_C_fragment, so we don't have to recompute it
|
|
The main changes so far are:
* Allow markdown formatting in doc comments. We parse the markdown
using Omd, which is a OCaml library for parsing markdown. The nice
thing about this library is it's pure OCaml and has no dependencies
other the the stdlib. Incidentally it was also developed at OCaml
labs. Using markdown keeps our doc-comments from becoming latex
specfic, and having an actual parser is _much_ nicer than trying to
hackily process latex in doc-comments using OCamls somewhat sub-par
regex support.
* More sane conversion latex identifiers the main approach is to
convert Sail identifiers to lowerCamelCase, replacing numbers with
words, and then add a 'category' code based on the type of
identifier, so for a function we'd have fnlowerCamelCase and for
type synonym typelowerCamelCase etc. Because this transformation is
non-injective we keep track of identifiers we've generated so we end
up with identifierA, identifierB, identifierC when there are
collisions.
* Because we parse markdown in doc comments doc comments can use Sail
identifiers directly in hyperlinks, without having to care about how
they are name-mangled down into TeX compatible things.
* Allow directives to be passed through the compiler to
backends. There are various $latex directives that modify the latex
output. Most usefully there's a
$latex newcommand name markdown
directive that uses the markdown parser to generate latex
commands. An example of why this is useful is bellow. We can also use
$latex noref id
To suppress automatically inserting links to an identifier
* Refactor the latex generator to make the overall generation process
cleaner
* Work around the fact that some operating systems consider
case-sensitive file names to be a good thing
* Fix a bug where latex generation wouldn't occur unless the directory
specified by -o didn't exist
This isn't quite all the requested features for good CHERI
documentation, but new features should be much easier to add now.
|
|
Currently not enabled by default, the flag -Xconstraint_synonyms
enables them
For generating constraints in ASL parser, we want to be able to give
names to the constraints that we attach to certain variables. It's
slightly awkward right now when constraints get long complicated
because the entire constraint always has to be typed out in full
whenever it appears, and there's no way to abstract away from that.
This adds constraint synonyms, which work much like type synonyms
except for constraints, e.g.
constraint Size('n) = 'n in {1, 2, 4, 8} | 128 <= 'n <= 256
these constraints can then be used instead of the full constraint, e.g.
val f : forall 'n, where Size('n). int('n) -> unit
Unfortunatly we need to have a keyword to 'call' the constraint
synonym otherwise the grammer stops being LR(1). This could be
resolved by parsing all constraints into Parse_ast.atyp and then
de-sugaring them into constraints, which is what happens for
n-expressions already, but that would require quite a bit of work on
the parser.
To avoid this forcing changes to any other parts of Sail, the intended
invariant is that all constraints appearing anywhere in a type-checked
AST have no constraint synonyms, so they don't have to worry about
matching on NC_app, or calling Env.expand_typquant_synonyms (which
isn't even exported for this reason).
|