| Age | Commit message (Collapse) | Author |
|
Changes the representation of function types in the ast from
Typ_fn : typ -> typ
to
Typ_fn : typ list -> typ
to more accurately represent their use in the various backends, where we often compile functions to either their curried representations as in Lem and Isabelle, or just
multiple argument functions in C. There's still some oddity because a single pattern in a function clause can bind against multiple arguments, and maybe we want to
forbid this in the future. The syntax also hasn't changed (yet), so in theory this change shouldn't break anything (but it invariably will...).
In the future we would ideally require that a function with N arguments has exactly N patterns in its declaration, one for each argument so
f : (x, y) -> z
f _ = ...
would be disallowed (as _ matches both x and y), forcing
f(_, _) = z
this would simply quite a few things,
Also we could have a different syntax for function argument lists and tuples, because it's rather hard to define a function that actually takes a tuple with the syntax
how it is now.
Some issues I noticed when doing this refactoring:
Line 1926 of Coq translation. untuple_args_pat is maybe no longer needed? However there's still some funnyness where a pattern can be used to bind multiple function
arguments so maybe it still is.
Line 2306 of monomorphisation. I simplified the logic here. I think it's equivalent now, but I could be wrong.
Line 4517 of rewrites. I'm not sure what make_cstr_mappings is doing here, but hopefully the simpler version is the same.
|
|
|
|
Make the C l-expression type in Sail more generic and expressive, and
refactor the generation of conversions into a seperate
codegen_conversion function, which can handle more complex cases than
the previous more ad-hoc method.
|
|
Add additional well-formedness check when calling typing rules
|
|
types
|
|
These match the new ASL pattern constructors:
- !p matches if the pattern p does not match
- { p1, ... pn } matches if any of the patterns p1 ... pn match
We desugar the set pattern "{p1, ... pn}" into "p1 | (p2 | ... pn)".
ASL does not have pattern binding but Sail does. The rules at the
moment are that none of the pattern can contain patterns. This could
be relaxed by allowing "p1 | p2" to bind variables provided p1 and p2
both bind the same variables.
|
|
E_internal_cast, E_sizeof_internal, E_internal_exp,
E_internal_exp_user, E_comment, and E_comment_struc were all
unused. For a lem based interpreter, we want to be able to compile it
to iUsabelle, and due to slowness inherent in Isabelle's datatype
package we want to remove unused constructors in our AST type.
Also remove the lem_ast backend - it's heavily bitrotted, and for
loading the ARM ast into the interpreter it's just not viable to use
this approach as it just doesn't scale. We really need a way to be
able to serialise and deserialise the AST efficiently in Lem.
|
|
|
|
|
|
|
|
|
|
Registers can now be marked as configuration registers, for example:
register configuration CFG_RVBAR = 0x1300000
They work like ordinary registers except they can only be set by
functions with the 'configuration' effect and have no effect when
read. They also have an initialiser, like a let-binding. Internally
there is a new reg_dec constructor DEC_config. They are intended to
represent configuration parameters for the model, which can change
between runs, but don't change during execution. Currently they'll
only work when compiled to C. Internally registers can now have custom
effects for reads and writes rather than just rreg and wreg, so the
type signatures of Env.add_register and Env.get_register have changed,
as well as the Register lvar, so in the type checker we now write:
Env.add_register id read_effect write_effect typ
rather than
Env.add_register id typ
For the corresponding change to ASL parser there's a function
is_config in asl_to_sail.ml which controls what becomes a
configuration register for ARM. Some things we have to keep as
let-bindings because Sail can't handle them changing at runtime -
e.g. the length of vectors in other top-level definitions. Luckily
__SetConfig doesn't (yet) try to change those options.
Together these changes allow us to translate the ASL __SetConfig
function, which means we should get command-line option compatibility
with ArchEx for running the ARM conformance tests.
|
|
|
|
Previously the ANF->IR translation cared too much about how things
were allocated in C, so it had to constantly check whether things
needed to be allocated on the stack or heap, and generate different
cequences of IR instructions depending on either. This change removes
the ialloc IR instruction, and changes iinit and idecl so that the
code generator now generates different C for the same IR instructions
based on the variable types involved.
The next change in this vein would be to merge icopy and iconvert at
the IR level so that conversions between uint64_t and large-bitvectors
are inserted by the code generator. This would be good because it
would make the ANF->IR translation more robust to changes in the types
of variables caused by flow-typing, and optimization passes could
convert large bitvectors to uint64_t as local changes.
|
|
|
|
|
|
(involved some manual tinkering with gitignore, type_check, riscv)
|
|
|
|
optimizations.
Move the utility functions for graph generation and pretty printing of
intermediate representation instructions into a separate file,
bytecode_util.ml, by analogy with ast_util.ml.
Add an optimization pass that searches for specific patterns of struct
updates and removes uncessary copying of the structs involved. With
this optimisation pass the time taken for u-boot to run approx
57,000,000 instructions goes down from about 11-12 minutes to 8
minutes (about 120,000 IPS).
|
|
Also fixes to C backend for compiling MIPS spec to C
- Fix an issue with const correctness in internal_vector_update
functions generated by C backend
- Add builtins for MIPS to sail.h
- Fix an issue where reg_deref didn't work when called on pointers to
large bitvectors, i.e. vectors containing references to large
bitfields as in the MIPS TLB code
- Various bug fixes and changes for running U-boot on ARM model,
including for interpreter and OCaml compilation.
- Fix memory leak issues and incorrect shadowing for foreach loops
- Update C header file. Fixes memory leak in memory read/write
builtins.
- Add aux constructor to ANF representation to hold environment
information.
- Fix undefined behavior caused by optimisation left shifting uint64_t
vectors 64 or more times. Unfortunately there's more issues because
the same happens for X >> 64 right shifts. It would make sense for
this to be zero, because that would guarantee the property that ((X
>> n) >> m) == (X >> (n + m)) but we probably need to do (X >> (n -
1) >> 1) in the optimisation to ensure that we don't cause
UB. Shifting by 63 and then by 1 is well-defined, but shifting by 64
in one go isn't according to the C standard. This issue with
right-shifts only occurs for zero-length vectors, so it's not a huge
deal, but it's still annoying.
- Add versions of print_bits and print_int that print to
stderr. Follows OCaml convention of print/prerr. Should make things
more explicit. Different backends had different ideas about where
print should output to, not every backend needs to have this
(e.g. theorem prover backends don't need to print) but having both
stderr and stdout seperate and clear is useful for executable models
(UART needs to be stdout, debug messages should be stderr).
|
|
This means that a mapping which formerly had to be pre-declared like
val name : a <-> b
...
mapping name {
x <-> y,
...
}
can now be shortened to
mapping name : a <-> b {
x <-> y,
...
}
|
|
|
|
1. Experiment with allowing some flow typing on mutable variables for
translating ASL in a more idiomatic way. I realise after updating some
of the test cases that this could have some problematic side effects
for lem translation, where mutable variables are translated into
monadic code. We'd need to ensure that whatever flow typing happens
for mutable variables also works for monadic code, including within
transformed loops. If this doesn't work out some of these changes may
need to be reverted.
2. Make the type inference for l-expressions a bit smarter. Splits the
type checking rules for l-expressions into a inference part and a
checking part like the other bi-directional rules. Should not be able
to type check slightly more l-expresions, such as nested vector slices
that may not have checked previously.
The l-expression rules for vector patterns should be simpler now, but
they are also more strict about bounds checking. Previously the bounds
checks were derived from the corresponding operations that would
appear on the RHS (i.e. LEXP_vector would get it's check from
vector_access). This meant that the l-expression bounds checks could
be weakend by weakening the checks on those operations. Now this is no
longer possible, there is a -no_lexp_bounds_check option which turns
of bounds checking in l-expressions. Currently this is on for the
generated ARM spec, but this should only be temporary.
3. Add a LEXP_vector_concat which mirrors P_vector_concat except in
l-expressions. Previously there was a hack that overloaded LEXP_tup
for this to translate some ASL patterns, but that was fairly
ugly. Adapt the rewriter and other parts of the code to handle
this. The rewriter for lexp tuple vector assignments is now a rewriter
for vector concat assignments.
4. Include a newly generated version of aarch64_no_vector
5. Update the Ocaml test suite to use builtins in lib/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rename l2.ott to sail.ott
|
|
|
|
Can now compile RISCV. Requires some library tweaks before it'll pass any tests,
Also adds hyperlinks to wip latex output
|
|
Some functions are partial, e.g. converting a bitvector to an integer, which
might fail for the bit list representation due to undefined bits. Undefined
cases can be handled in different ways:
- call Lem's failwith, which maps to undefined/ARB in Isabelle and HOL (the
default so far),
- return an option type,
- raise a failure in the monad, or
- use a bitstream oracle to resolve undefined bits.
This patch adds different versions of partial functions corresponding to those
options. The desired behaviour can be selected by choosing a binding in the
Sail prelude. The naming scheme is that the failwith version is the default,
while the other versions have the suffixes _maybe, _fail, and _oracle,
respectively.
|
|
Also work on making C backend compile RISC-V
|
|
Previously union types could have no-argument constructors, for
example the option type was previously:
union option ('a : Type) = {
Some : 'a,
None
}
Now every union constructor must have a type, so option becomes:
union option ('a : Type) = {
Some : 'a,
None : unit
}
The reason for this is because previously these two different types of
constructors where very different in the AST, constructors with
arguments were used the E_app AST node, and no-argument constructors
used the E_id node. This was particularly awkward, because it meant
that E_id nodes could have polymorphic types, i.e. every E_id node
that was also a union constructor had to be annotated with a type
quantifier, in constrast with all other identifiers that have
unquantified types. This became an issue when monomorphising types,
because the machinery for figuring out function instantiations can't
be applied to identifier nodes. The same story occurs in patterns,
where previously unions were split across P_id and P_app nodes - now
the P_app node alone is used solely for unions.
This is a breaking change because it changes the syntax for union
constructors - where as previously option was matched as:
function is_none opt = match opt {
Some(_) => false,
None => true
}
it is now matched as
function is_none opt = match opt {
Some(_) => false,
None() => true
}
note that constructor() is syntactic sugar for constructor(()), i.e. a
one argument constructor with unit as it's value. This is exactly the
same as for functions where a unit-function can be called as f() and
not as f(()). (This commit also makes exit() work consistently in the
same way) An attempt to pattern match a variable with the same name as
a union-constructor now gives an error as a way to guard against
mistakes made because of this change.
There is probably an argument for supporting the old syntax via some
syntactic sugar, as it is slightly prettier that way, but for now I
have chosen to keep the implementation as simple as possible.
The RISCV spec, ARM spec, and tests have been updated to account for
this change. Furthermore the option type can now be included from
$SAIL_DIR/lib/ using
$include <option.sail>
|
|
Rather than just using strings to represent literals, now use value
types from sail_lib.lem to represent them. This allows for expressions
to be evaluated at compile time, which will be useful for future
optimisations involving constant folding and propagation, and allows
the intermediate bytecode to be interpreted using the same lem
builtins that the shallow embedding uses.
To get this to work I had to tweak the build process slightly to allow
ml files to import lem files from gen_lib/. Hopefully this doesn't
break anything!
|
|
With these optimisations on, now get about 10x performance over OCaml.
|
|
Just need to implement builtins, fix-up a few re-write passes, and
integrate some kind of elf-loading and it should work.
|
|
|
|
|
|
|
|
|
|
|
|
Describes precisely the intermediate representation used in the C
backend in an ott grammar, and also removes several C-isms where raw C
code was inserted into the IR, so in theory this IR could be
interpreted by a small VM/interpreter or compiled to LLVM bytecode
etc. Currently the I_raw constructor for inserting C code is just used
for inserting GCC attributes, so it can safely be ignored.
Also augment and refactor the instruction type with an aux constructor
so location information can be propagated down to this level.
|
|
|
|
|
|
|
|
For example:
bitfield cr : vector(8, dec, bit) = {
CR0 : 7 .. 4,
LT : 7,
CR1 : 3 .. 2,
CR2 : 1,
CR3 : 0,
}
The difference this creates a newtype wrapper around the vector type,
then generates getters and setters for all the fields once, rather
than having to handle this construct separately in every backend.
|