aboutsummaryrefslogtreecommitdiff
path: root/kernel/vmemitcodes.ml
AgeCommit message (Collapse)Author
2021-03-26Make it more obvious when the calling convention of APPLY changes.Guillaume Melquiond
Despite their names, APPLY1 to APPLY4 are completely different from APPLY(n) with n = 1 to 4. Indeed, the latter assumes that the return address was already pushed on the stack, before the arguments were. On the other hand, APPLY1 to APPLY4 insert the return address in the middle of the already pushed arguments.
2021-02-26Signed primitive integersAna
Signed primitive integers defined on top of the existing unsigned ones with two's complement. The module Sint63 includes the theory of signed primitive integers that differs from the unsigned case. Additions to the kernel: les (signed <=), lts (signed <), compares (signed compare), divs (signed division), rems (signed remainder), asr (arithmetic shift right) (The s suffix is not used when importing the Sint63 module.) The printing and parsing of primitive ints was updated and the int63_syntax_plugin was removed (we use Number Notation instead). A primitive int is parsed / printed as unsigned or signed depending on the scope. In the default (Set Printing All) case, it is printed in hexadecimal.
2021-01-10Remove MAKEPROD.Guillaume Melquiond
MAKEPROD is just MAKEBLOCK2(0), but one word shorter. Since this opcode is never encountered in the fast path, this optimization is not worth the extra complexity.
2021-01-10Remove SETFIELD0 and SETFIELD1.Guillaume Melquiond
The generated bytecode almost never needs to modify the field of an OCaml object. The only exception is the laziness mechanism of coinductive types. But it modifies field 2, and thus uses the generic opcode SETFIELD anyway.
2021-01-10Add a peephole optimization for PUSHFIELDS(1).Guillaume Melquiond
The PUSHFIELDS opcode is a costly one, yet lots of constructors have at most one usable argument (e.g., option, nat, positive, Z, Acc). For those constructors, PUSHFIELDS(1) is replaced by GETFIELD(0);PUSH, assuming the accu register is no longer used afterwards. Replacing one single opcode by two opcodes might seem like a pessimization, but it is not. Indeed, pattern-matching branches usually start by filling the accu register with a constructor argument or the value of a free variable or a constant. All of those offer peephole optimizations for PUSH, which means that the number of opcodes actually stay constant. Note that, for the same reason, the assumption above holds in practice: the accu register is no longer used after PUSHFIELDS.
2021-01-10Remove PUSHACC0, as it is strictly equivalent to PUSH.Guillaume Melquiond
2020-11-13Add allocation-free variants of the nextup and nextdown floating-point ↵Guillaume Melquiond
operations. Floating-point values are boxed, which means that any operation causes an allocation. While short-lived, they nonetheless cause the minor heap to fill, which in turn triggers the garbage collector. To reduce the number of allocations, I initially went with a shadow stack mechanism for storing floating-point values. But assuming the CoqInterval library is representative, this was way too complicated in practice, as most stack-located values ended up being passed to nextdown and nextup before being stored in memory. So, this commit implements a different mechanism. Variants of nextdown and nextup are added, which reuse the allocation of their input argument. Obviously, this is only correct if there is no other reference to this argument. To ensure this property, the commit only uses these opcode during a peephole optimization. If two primitive operations follow one another, then the second one can reuse the allocation of the first one, since it never had time to even reach the stack. For CoqInterval, this divides the number of allocations due to floating-point operations by about two. The following snippet is made 4% faster by this commit (and 13% faster if we consider only the floating-point operations). From Coq Require Import Int63 BinPos PrimFloat. Definition foo n := let eps := sub (next_up one) one in Pos.iter (fun x => next_down (add x eps)) two n. Time Eval vm_compute in foo 100000000.
2020-11-13Make callback opcodes more generic.Guillaume Melquiond
This does not make the code any slower, since Is_coq_array(accu) && Is_uint63(sp[0]) and !Is_accu(accu) && !Is_accu(sp[0]) take the exact same number of tests to pass in the concrete case. In the accumulator case, it takes one more test to fail, but we do not care about the performances then.
2020-11-13Turn Ksequence into a unary opcode, as its binary structure was never used.Guillaume Melquiond
2020-11-13Remove some unused opcodes from VM.Guillaume Melquiond
2020-11-13Remove unchecked arithmetic operations from VM, as they are not used.Guillaume Melquiond
2020-10-21Same little game with Projection.Pierre-Marie Pédrot
2020-10-21Deprecate the non-qualified equality functions on kerpairs.Pierre-Marie Pédrot
This allows to quickly spot the parts of the code that rely on the canonical ordering. When possible we directly introduce the quotient-aware versions.
2020-10-08Remove occurrences of Parray.reroot.Guillaume Melquiond
2020-09-22Modify bytecode representation of closures to please OCaml's GC (fix #12636).Guillaume Melquiond
The second field of a closure can no longer be the value of the first free variable (or another closure of a mutually recursive block) but must be an offset to the first free variable. This commit makes the bytecode compiler and interpreter agnostic to the actual representation of closures. To do so, the execution environment (variable coq_env) no longer points to the currently executed closure but to the last one. This has the following consequences: - OFFSETCLOSURE(n) now always loads the n-th closure of a recursive block (counted from last to first); - ENVACC(n) now always loads the value of the n-th free variable. These two changes make the bytecode compiler simpler, since it no longer has to track the relative position of closures and free variables. The last change makes the interpreter a bit slower, since it has to adjust coq_env when executing GRABREC. Hopefully, cache locality will make the overhead negligible.
2020-08-18Rename VM-related kernel/cfoo files to kernel/vmfooGaëtan Gilbert