aboutsummaryrefslogtreecommitdiff
path: root/kernel/byterun
AgeCommit message (Collapse)Author
2021-03-31[dune] Rename byterun to coqrunEmilio Jesus Gallego Arias
This seems the official name, the byterun name is just an artifact from the very preliminary dune build.
2021-03-13Set the lsb of return addresses on the bytecode interpreter stack.Guillaume Melquiond
This makes it possible to skip the check when scanning the stack for the garbage collector.
2021-03-03[build] Split stdlib to it's own opam package.Emilio Jesus Gallego Arias
We introduce a new package structure for Coq: - `coq-core`: Coq's OCaml tools code and plugins - `coq-stdlib`: Coq's stdlib [.vo files] - `coq`: meta-package that pulls `coq-{core,stdlib}` This has several advantages, in particular it allows to install Coq without the stdlib which is useful in several scenarios, it also open the door towards a versioning of the stdlib at the package level. The main user-visible change is that Coq's ML development files now live in `$lib/coq-core`, for compatibility in the regular build we install a symlink and support both setups for a while. Note that plugin developers and even `coq_makefile` should actually rely on `ocamlfind` to locate Coq's OCaml libs as to be more robust. There is a transient state where we actually look for both `$coqlib/plugins` and `$coqlib/../coq-core/plugins` as to support the non-ocamlfind plus custom variables. This will be much improved once #13617 is merged (which requires this PR first), then, we will introduce a `coq.boot` library so finally `coqdep`, `coqchk`, etc... can share the same path setup code. IMHO the plan should work fine.
2021-02-26Signed primitive integersAna
Signed primitive integers defined on top of the existing unsigned ones with two's complement. The module Sint63 includes the theory of signed primitive integers that differs from the unsigned case. Additions to the kernel: les (signed <=), lts (signed <), compares (signed compare), divs (signed division), rems (signed remainder), asr (arithmetic shift right) (The s suffix is not used when importing the Sint63 module.) The printing and parsing of primitive ints was updated and the int63_syntax_plugin was removed (we use Number Notation instead). A primitive int is parsed / printed as unsigned or signed depending on the scope. In the default (Set Printing All) case, it is printed in hexadecimal.
2021-02-26Merge PR #13676: Protect caml_process_pending_actions_exn with ↵Pierre-Marie Pédrot
caml_something_to_do. Reviewed-by: gasche Ack-by: ppedrot Reviewed-by: xavierleroy
2021-02-19Be less permissive with respect to nonsensical bytecode.Guillaume Melquiond
2021-02-19Make the generated file the official source of arity.Guillaume Melquiond
2021-02-19Add a file coq_arity.h generated by genOpcodeFiles.ml.Guillaume Melquiond
This avoids forgetting to add opcodes to coq_fix_code.c, and thus prevents arities being mistakenly set to zero.
2021-02-16Fix missing arities of VM opcodes.Guillaume Melquiond
Since the compiler initializes the arities to zero, coq_tcode_of_code wrongly believes that the word following a primitive operation contains an opcode, while it is the global index of the primitive operation. So, the function will try to translate it and thus corrupt it. But as long as the evaluated term fully reduces (which is always the case for CoqInterval), the corrupted word will never be read. At this point, it all depends on the arity of the global index (seen as an opcode). If it is zero, then coq_tcode_of_code will recover and correctly translate the following opcodes. If it is nonzero, then the function starts translating random words, possibly corrupting the memory past the end of the translation buffer. Independently of this memory corruption, coq_interprete will execute random code once it gets to the opcode following the primitive operation, since it has not been translated. The reason CoqInterval is not always crashing due to this bug is just plain luck. Indeed, the arity of the pseudo opcode only depends on the global index of the primitive operations. So, as long as this arity is zero, the memory corruption is fully contained. This happens in the vast majority of cases, since coq_tcode_of_code translates any unrecognized opcode to STOP, which has arity zero. This bug is exploitable.
2021-01-10Remove MAKEPROD.Guillaume Melquiond
MAKEPROD is just MAKEBLOCK2(0), but one word shorter. Since this opcode is never encountered in the fast path, this optimization is not worth the extra complexity.
2021-01-10Remove SETFIELD0 and SETFIELD1.Guillaume Melquiond
The generated bytecode almost never needs to modify the field of an OCaml object. The only exception is the laziness mechanism of coinductive types. But it modifies field 2, and thus uses the generic opcode SETFIELD anyway.
2021-01-10Remove LSLINT63CONST1 and LSRINT63CONST1 as they are unused.Guillaume Melquiond
2021-01-10Remove PUSHACC0, as it is strictly equivalent to PUSH.Guillaume Melquiond
2020-12-26Protect caml_process_pending_actions_exn with caml_something_to_do.Guillaume Melquiond
When using OCaml >= 4.10, function caml_process_pending_actions_exn is called whenever the bytecode interpreter tries to apply a term. This function exits immediately when caml_something_to_do is not set. But since term application happens every few opcodes, even exiting immediately still accounts between 5% and 10% of the whole interpreter. So, this commit makes sure the function is not called unless caml_something_to_do is already set (i.e., when the user presses Ctrl+C). This means that this conditional branch is perfectly predicted by the processor. On the following benchmark, this commit makes the VM 13% faster. Time Eval vm_compute in Pos.iter (fun x => x) tt 100000000. Note that, before OCaml 4.10, the VM code was checking the value of caml_signals_are_pending before calling caml_process_pending_signals. So, this commit actually fixes a regression.
2020-12-01Avoid compiler warnings.Guillaume Melquiond
2020-11-20Make sure accumulators do not exceed the minor heap (partly fix #11170).Guillaume Melquiond
Accumulators can grow arbitrarily large, even when well-typed. So, this commit makes sure they are allocated on the major heap when they are too large. If so, fields need to be filled with caml_initialize, in case they point to the minor heap.
2020-11-13Hardcode next_up and next_down instead of relying on nextafter.Guillaume Melquiond
Unfortunately, compilers are currently unable to optimize the nextafter function, even in the degenerate case where the second argument is explicitly infinite. So, this commit implements this case by hand. On the following testcase, this gives a 15% speedup. From Coq Require Import Int63 BinPos PrimFloat. Definition foo n := let eps := sub (next_up one) one in Pos.iter (fun x => next_down (add x eps)) two n. Time Eval vm_compute in foo 100000000. And when looking at the cost of just the allocation-free version of next_down, the speedup is 1500%. Said otherwise, the latency of next_down is now on par with the measurement noise due to cache misses and the like.
2020-11-13Add allocation-free variants of the nextup and nextdown floating-point ↵Guillaume Melquiond
operations. Floating-point values are boxed, which means that any operation causes an allocation. While short-lived, they nonetheless cause the minor heap to fill, which in turn triggers the garbage collector. To reduce the number of allocations, I initially went with a shadow stack mechanism for storing floating-point values. But assuming the CoqInterval library is representative, this was way too complicated in practice, as most stack-located values ended up being passed to nextdown and nextup before being stored in memory. So, this commit implements a different mechanism. Variants of nextdown and nextup are added, which reuse the allocation of their input argument. Obviously, this is only correct if there is no other reference to this argument. To ensure this property, the commit only uses these opcode during a peephole optimization. If two primitive operations follow one another, then the second one can reuse the allocation of the first one, since it never had time to even reach the stack. For CoqInterval, this divides the number of allocations due to floating-point operations by about two. The following snippet is made 4% faster by this commit (and 13% faster if we consider only the floating-point operations). From Coq Require Import Int63 BinPos PrimFloat. Definition foo n := let eps := sub (next_up one) one in Pos.iter (fun x => next_down (add x eps)) two n. Time Eval vm_compute in foo 100000000.
2020-11-13Remove floating-point comparison operators as they are no longer needed.Guillaume Melquiond
2020-11-13Turn coq_float64.h into a .c file as it is no longer needed by coq_interp.c.Guillaume Melquiond
2020-11-13Inline the functions from coq_float64.h.Guillaume Melquiond
Since the code is compiled in -fPIC mode, the compiler cannot inline the functions, due to the ABI mandating the ability to interpose visible symbols. Hiding the symbols of coq_float64.h would work, except that they float64.ml needs to reference them. (See #13124 for more details.) This commit improves performances by 7% on the following code. From Coq Require Import Int63 BinPos PrimFloat. Definition foo n := let two := of_int63 2 in Pos.iter (fun x => sub (mul x two) two) two n. Time Eval vm_compute in foo 100000000. If we consider only the floating-point operations (by ignoring the cost of the loop), the speedup is actually 30%.
2020-11-13Make callback opcodes more generic.Guillaume Melquiond
This does not make the code any slower, since Is_coq_array(accu) && Is_uint63(sp[0]) and !Is_accu(accu) && !Is_accu(sp[0]) take the exact same number of tests to pass in the concrete case. In the accumulator case, it takes one more test to fail, but we do not care about the performances then.
2020-11-13Optimize Is_accu a bit.Guillaume Melquiond
2020-11-13Remove some unused opcodes from VM.Guillaume Melquiond
2020-11-13Remove unchecked arithmetic operations from VM, as they are not used.Guillaume Melquiond
2020-11-13Optimize handling of the VM stack with respect to the GC.Guillaume Melquiond
1. There is no point in marking plain integers as GC roots. 2. There is no need to restore the stack pointer, as the stack is not allocated on the OCaml heap (contrarily to coq_env).
2020-09-22Use the closure tag for accumulators.Guillaume Melquiond
The first field of accumulators points out of the OCaml heap, so using closures instead of tag-0 objects is better for the GC. Accumulators are distinguished from closures because their code pointer necessarily is the "accumulate" address, which points to an ACCUMULATE instruction.
2020-09-22Use the same memory layout as closures for accumulators.Guillaume Melquiond
That way, accumulators again can be used directly as execution environments by the bytecode interpreter. This fixes the issue of the first argument of accumulators being dropped when strongly normalizing.
2020-09-22Modify bytecode representation of closures to please OCaml's GC (fix #12636).Guillaume Melquiond
The second field of a closure can no longer be the value of the first free variable (or another closure of a mutually recursive block) but must be an offset to the first free variable. This commit makes the bytecode compiler and interpreter agnostic to the actual representation of closures. To do so, the execution environment (variable coq_env) no longer points to the currently executed closure but to the last one. This has the following consequences: - OFFSETCLOSURE(n) now always loads the n-th closure of a recursive block (counted from last to first); - ENVACC(n) now always loads the value of the n-th free variable. These two changes make the bytecode compiler simpler, since it no longer has to track the relative position of closures and free variables. The last change makes the interpreter a bit slower, since it has to adjust coq_env when executing GRABREC. Hopefully, cache locality will make the overhead negligible.
2020-07-22Remove redundant data from VM case switch.Pierre-Marie Pédrot
No need to store the case_info, all the data is reconstructible from the context. Furthermore, this reconstruction is performed in a context where we already access the environment, so performance is not at stake. Hopefully this will also reduce the number of globally allocated VM values, since the switch representation now only depends on the shape of the inductive type.
2020-07-06Primitive persistent arraysMaxime Dénès
Persistent arrays expose a functional interface but are implemented using an imperative data structure. The OCaml implementation is based on Jean-Christophe Filliâtre's. Co-authored-by: Benjamin Grégoire <Benjamin.Gregoire@inria.fr> Co-authored-by: Gaëtan Gilbert <gaetan.gilbert@skyskimmer.net>
2020-03-27Merge PR #11102: Use the Alloc_small macro from the OCaml runtime rather ↵Maxime Dénès
than our own. Ack-by: aaronpuchert Ack-by: gadmm Reviewed-by: maximedenes Ack-by: ppedrot Reviewed-by: proux01
2020-03-18Update headers in the whole code base.Théo Zimmermann
Add headers to a few files which were missing them.
2020-03-09Do not rely on the implicit declaration of caml_minor_collection.Guillaume Melquiond
This commit also prefixes young_ptr and young_limit along the way, so as to not rely on OCaml's compatibility layer. This is a gratuitous change, since this code is only meant to be compiled with OCaml < 4.10 anyway.
2019-12-22Use the Alloc_small macro from the OCaml runtime rather than our own.Guillaume Melquiond
We cannot use caml_alloc_small because the macros Setup_for_gc and Restore_after_gc are still relevant (and critical). This means defining the CAML_INTERNALS macro, but it is a legit use and actually documented in the OCaml manual. This will help with forward compatibility with OCaml compilers, e.g., issue #10603. Unfortunately, it also means that we can no longer use #9914 to prevent memory corruption. The old macro is still used for OCaml versions prior to 4.10, as the upstream macro might process Ctrl+C when it is called.
2019-12-17failwith -> caml_failwithGuillaume Munch-Maccagnoni
2019-12-17Fatal error in VM if SIGINT was seen but no exception occured.Guillaume Munch-Maccagnoni
2019-12-17Fix signal polling for OCaml 4.10Guillaume Munch-Maccagnoni
Issue #10603
2019-12-17[VM] fix volatile declarationGuillaume Munch-Maccagnoni
2019-12-12[vm] Untabify the VM C code.Emilio Jesus Gallego Arias
This is a follow up of #11010 ; I've realized that for example in #11123 a large part of the patch is detabification as indeed the files are mixed in tabs/space style so developers are forced to do the cleanup each time they work on them. Command used: ``` for i in `find . -name '*.c' -or -name '*.h'; do expand -i "$i" | sponge "$i"; sed -e's/[[:space:]]*$//' -i.bak "$i"; done ``` Checked empty diff with `git diff --ignore-all-space`
2019-12-04[dune] Update to dune language version 2.0Emilio Jesus Gallego Arias
This is the minimal set of changes requires for Coq to build under 2.0 mode. We may likely take advantage of some more new features. Note that Dune 2.0 requires OCaml >= 4.06.0, OPAM allows to use Dune in older versions as it will install a secondary compiler.
2019-11-11Have only one dune rule calling configurePierre Roux
2019-11-01Communicate CFLAGS to dunePierre Roux
2019-11-01Fix ldshiftexpPierre Roux
* Fix the implementations and add tests * Change shift from int63 to Z (was always used as a Z) * Update FloatLemmas.v accordingly Co-authored-by: Erik Martin-Dorel <erik.martin-dorel@irit.fr>
2019-11-01Add "==", "<", "<=" in PrimFloat.vErik Martin-Dorel
* Add a related test-suite in compare.v (generated by a bash script) Co-authored-by: Pierre Roux <pierre.roux@onera.fr>
2019-11-01Make primitive float work on WindowsPierre Roux
2019-11-01Reimplement is_floatPierre Roux
is_float was relying on Obj.tag triggering a check that we are in the OCaml heap which is expensive. On extreme examples, this can lead to a global 2x speedup. Thanks to Maxime Dénès and Jacques-Henri Jourdan for their help in diagnosing this.
2019-11-01Make primitive float work on x86_32Pierre Roux
Flag -fexcess-precision=standard is not enough on x86_32 where -msse2 -mfpmath=sse is required (-msse is not enough) to avoid double rounding issues in the VM. Most floating-point operation are now implemented in C because OCaml is suffering double rounding issues on x86_32 with 80 bits extended precision registers used for floating-point values, causing double rounding making floating-point arithmetic incorrect with respect to its specification. Add a runtime test for double roundings.
2019-11-01Add next_{up,down} primitive float functionsPierre Roux
2019-11-01Implement classify on primitive floatPierre Roux