sail - Formal specification language for ISAs

Age	Commit message (Collapse)	Author
2019-03-08	C: Refactor C backend	Alasdair Armstrong
	Main change is splitting apart the Sail->IR compilation stage and the C code generation and optimization phase. Rather than variously calling the intermediate language either bytecode (when it's not really) or simply IR, we give it a name: Jib (a type of Sail). Most of the types are still prefixed by c/C, and I don't think it's worth changing this. The various parts of the C backend are now in the src/jib/ subdirectory src/jib/anf.ml - Sail->ANF translation src/jib/jib_util.ml - various Jib AST processing and helper functions (formerly bytecode_util) src/jib/jib_compile.ml - Sail->Jib translation (using Sail->ANF) src/jib/c_backend.ml - Jib->C code generator and optimizations Further, bytecode.ott is now jib.ott and generates jib.ml (which still lives in src/ for now) The optimizations in c_backend.ml should eventually be moved in a separate jib_optimization file. The Sail->Jib compilation can be parameterised by two functions - one is a custom ANF->ANF optimization pass that can be specified on a per Jib backend basis, and the other is the rule for translating Sail types in Jib types. This can be more or less precise depending on how precise we want to be about bit-widths etc, i.e. we only care about <64 and >64 for C, but for SMT generation we would want to be as precise as possible. Additional improvements: The Jib IR is now agnostic about whether arguments are allocated on the heap vs the stack and this is handled by the C code generator. jib.ott now has some more comments explaining various parts of the Jib AST. A Set module and comparison function for ctyps is defined, and some functions now return ctyp sets rather than lists to avoid repeated work.
2019-02-22	Generalize CT_int64 for arbitrary fixed size integers	Alasdair
	If we want to use our low-level intermediate representation to generate SMT, then we want to be more precise than just splitting integers into 64-bits and larger. This commit changes CT_int and CT_int64 into CT_lint for large integers and CT_fint n for (signed) fixed precision integers that fit within n bits. This follows the convention for bitvectors where we have CT_fbits for fixed-length bitvectors and CT_lbits for large arbitrary precision bitvectors.
2018-11-27	Fix memory leak in string_of_bits	Alasdair Armstrong
	Should hopefully fix memory leak in RISC-V. Also adds an optimization pass that removes copying structs and allows some structs to simply alias each other and avoid copying their contents. This requires knowing certain things about the lifetimes of the structs involved, as can't free the struct if another variable is referencing it - therefore we conservatively only apply this optimization for variables that are lifted outside function definitions, and should therefore never get freed until the model exits - however this may cause issues outside ARMv8, as there may be cases where a struct can exist within a variant type (which are not yet subject to this lifting optimisation), that would break these assumptions - therefore this optimisation is only enabled with the -Oexperimental flag.
2018-11-23	C backend improvements	Alasdair Armstrong
	- Propagate types more accurately to improve optimization on ANF representation. - Add a generic optimization pass to remove redundant variables that simply alias other variables. - Modify Sail interactive mode, so it can compile a specification with the :compile command, view generated intermediate representation via the :ir <function> command, and step-through the IR with :exec <exp> (although this is very incomplete) - Introduce a third bitvector representation, between fast fixed-precision bitvectors, and variable length large bitvectors. The bitvector types are now from most efficient to least * CT_fbits for fixed precision, 64-bit or less bitvectors * CT_sbits for 64-bit or less, variable length bitvectors * CT_lbits for arbitrary variable length bitvectors - Support for generating C code using CT_sbits is currently incomplete, it just exists in the intermediate representation right now. - Include ctyp in AV_C_fragment, so we don't have to recompute it
2018-08-20	Refactor tuple conversions in Sail to C compilation	Alasdair Armstrong
	Make the C l-expression type in Sail more generic and expressive, and refactor the generation of conversions into a seperate codegen_conversion function, which can handle more complex cases than the previous more ad-hoc method.
2018-08-02	Start working on a solution for correctly monomorphising polymorphic variant ↵	Alasdair Armstrong
	types
2018-07-05	Passes all tests and now builds mips and cheri again	Alasdair

2018-06-29	Try to fix some tricky C compilation bugs, break everything instead	Alasdair Armstrong

2018-06-26	Add configuration registers so __SetConfig ASL can be translated	Alasdair Armstrong
	Registers can now be marked as configuration registers, for example: register configuration CFG_RVBAR = 0x1300000 They work like ordinary registers except they can only be set by functions with the 'configuration' effect and have no effect when read. They also have an initialiser, like a let-binding. Internally there is a new reg_dec constructor DEC_config. They are intended to represent configuration parameters for the model, which can change between runs, but don't change during execution. Currently they'll only work when compiled to C. Internally registers can now have custom effects for reads and writes rather than just rreg and wreg, so the type signatures of Env.add_register and Env.get_register have changed, as well as the Register lvar, so in the type checker we now write: Env.add_register id read_effect write_effect typ rather than Env.add_register id typ For the corresponding change to ASL parser there's a function is_config in asl_to_sail.ml which controls what becomes a configuration register for ARM. Some things we have to keep as let-bindings because Sail can't handle them changing at runtime - e.g. the length of vectors in other top-level definitions. Luckily __SetConfig doesn't (yet) try to change those options. Together these changes allow us to translate the ASL __SetConfig function, which means we should get command-line option compatibility with ArchEx for running the ARM conformance tests.
2018-06-21	Simplify the ANF->IR translation	Alasdair Armstrong
	Previously the ANF->IR translation cared too much about how things were allocated in C, so it had to constantly check whether things needed to be allocated on the stack or heap, and generate different cequences of IR instructions depending on either. This change removes the ialloc IR instruction, and changes iinit and idecl so that the code generator now generates different C for the same IR instructions based on the variable types involved. The next change in this vein would be to merge icopy and iconvert at the IR level so that conversions between uint64_t and large-bitvectors are inserted by the code generator. This would be good because it would make the ANF->IR translation more robust to changes in the types of variables caused by flow-typing, and optimization passes could convert large bitvectors to uint64_t as local changes.
2018-06-14	Refactor C backend, and split RTS into multiple files	Alasdair

2018-06-06	Factor utility functions for IR into separate file and struct update ↵	Alasdair Armstrong
	optimizations. Move the utility functions for graph generation and pretty printing of intermediate representation instructions into a separate file, bytecode_util.ml, by analogy with ast_util.ml. Add an optimization pass that searches for specific patterns of struct updates and removes uncessary copying of the structs involved. With this optimisation pass the time taken for u-boot to run approx 57,000,000 instructions goes down from about 11-12 minutes to 8 minutes (about 120,000 IPS).
2018-05-31	Fixes to get ARM u-boot working in Sail.	Alasdair Armstrong
	Also fixes to C backend for compiling MIPS spec to C - Fix an issue with const correctness in internal_vector_update functions generated by C backend - Add builtins for MIPS to sail.h - Fix an issue where reg_deref didn't work when called on pointers to large bitvectors, i.e. vectors containing references to large bitfields as in the MIPS TLB code - Various bug fixes and changes for running U-boot on ARM model, including for interpreter and OCaml compilation. - Fix memory leak issues and incorrect shadowing for foreach loops - Update C header file. Fixes memory leak in memory read/write builtins. - Add aux constructor to ANF representation to hold environment information. - Fix undefined behavior caused by optimisation left shifting uint64_t vectors 64 or more times. Unfortunately there's more issues because the same happens for X >> 64 right shifts. It would make sense for this to be zero, because that would guarantee the property that ((X >> n) >> m) == (X >> (n + m)) but we probably need to do (X >> (n - 1) >> 1) in the optimisation to ensure that we don't cause UB. Shifting by 63 and then by 1 is well-defined, but shifting by 64 in one go isn't according to the C standard. This issue with right-shifts only occurs for zero-length vectors, so it's not a huge deal, but it's still annoying. - Add versions of print_bits and print_int that print to stderr. Follows OCaml convention of print/prerr. Should make things more explicit. Different backends had different ideas about where print should output to, not every backend needs to have this (e.g. theorem prover backends don't need to print) but having both stderr and stdout seperate and clear is useful for executable models (UART needs to be stdout, debug messages should be stderr).
2018-03-23	Fix build issue	Alasdair Armstrong

2018-03-19	Fixes to C backend for RISCV-compilation	Alasdair Armstrong
	Can now compile RISCV. Requires some library tweaks before it'll pass any tests, Also adds hyperlinks to wip latex output
2018-03-09	Specialise constructors for polymorphic unions	Alasdair Armstrong
	Also work on making C backend compile RISC-V
2018-03-02	Use sail_lib.lem values in C backend	Alasdair Armstrong
	Rather than just using strings to represent literals, now use value types from sail_lib.lem to represent them. This allows for expressions to be evaluated at compile time, which will be useful for future optimisations involving constant folding and propagation, and allows the intermediate bytecode to be interpreted using the same lem builtins that the shallow embedding uses. To get this to work I had to tweak the build process slightly to allow ml files to import lem files from gen_lib/. Hopefully this doesn't break anything!
2018-02-26	Add some obvious optimisations to C backend.	Alasdair Armstrong
	With these optimisations on, now get about 10x performance over OCaml.
2018-02-21	Have aarch64/no_vector compiling to C	Alasdair Armstrong
	Just need to implement builtins, fix-up a few re-write passes, and integrate some kind of elf-loading and it should work.
2018-02-15	List support in C backend	Alasdair Armstrong

2018-02-13	Support for large bitvector literals in C backend	Alasdair Armstrong

2018-02-12	Add support for top-level letbindings to C backend	Alasdair Armstrong

2018-02-09	Improve IR pretty-printing for debugging	Alasdair Armstrong

2018-02-09	Formalize C backend intermediate representation in Ott	Alasdair Armstrong
	Describes precisely the intermediate representation used in the C backend in an ott grammar, and also removes several C-isms where raw C code was inserted into the IR, so in theory this IR could be interpreted by a small VM/interpreter or compiled to LLVM bytecode etc. Currently the I_raw constructor for inserting C code is just used for inserting GCC attributes, so it can safely be ignored. Also augment and refactor the instruction type with an aux constructor so location information can be propagated down to this level.