| Age | Commit message (Collapse) | Author |
|
When using OCaml >= 4.10, function caml_process_pending_actions_exn is
called whenever the bytecode interpreter tries to apply a term. This
function exits immediately when caml_something_to_do is not set. But since
term application happens every few opcodes, even exiting immediately still
accounts between 5% and 10% of the whole interpreter. So, this commit
makes sure the function is not called unless caml_something_to_do is
already set (i.e., when the user presses Ctrl+C). This means that this
conditional branch is perfectly predicted by the processor.
On the following benchmark, this commit makes the VM 13% faster.
Time Eval vm_compute in Pos.iter (fun x => x) tt 100000000.
Note that, before OCaml 4.10, the VM code was checking the value of
caml_signals_are_pending before calling caml_process_pending_signals. So,
this commit actually fixes a regression.
|
|
It is the only place where it starts making sense in the whole codebase. It also
fits nicely there since there are other functions manipulating this type in that
module.
In any case this type does not belong to the kernel.
|
|
notations.
Reviewed-by: jfehrle
Reviewed-by: proux01
Ack-by: Zimmi48
Ack-by: SkySkimmer
|
|
Reviewed-by: ppedrot
Reviewed-by: proux01
|
|
|
|
|
|
|
|
|
|
We use a variant of skewed lists enriched over a monoid, whose purpose is
to count the number of lifts added to the substitution. This makes addition
O(1) and lookup O(log n).
|
|
|
|
- float and array values are now supported for printing and parsing in custom notations (and in notations defined using the ocaml API)
- the three constants bound to the primitive float, int and array type are now allowed anywhere inside a term to print, to handle them similar to `real` type constructors
- Grants #13484: String notations for primitive arrays
- Fixes #13517: String notation printing fails with primitive integers inside lists
|
|
They are not used there, and removing the redundance of the the case
representation requires access to the environment, so we push their use
further up.
|
|
This is presumably not usable from the surface language. But an ML module
could easily create a proof of false by passing a negative number to
Const.mkInt.
|
|
|
|
Before this commit, function mask63 was called implicitly at each step.
|
|
|
|
|
|
of cases.
|
|
cumulative inductive types
Reviewed-by: SkySkimmer
|
|
types
|
|
|
|
|
|
This is just an experiment, but makes the uses of the API easier as we
don't mess with the global state anymore.
|
|
This allows proper treatment in notations, ie fixes #13303
The "glob" representation of universes (what pretyping sees) contains
only fully interpreted (kernel) universes and unbound universe
ids (for non Strict Universe Declaration).
This means universes need to be understood at intern time, so intern
now has a new "universe binders" argument. We cannot avoid this due to
the following example:
~~~coq
Module Import M. Universe i. End M.
Definition foo@{i} := Type@{i}.
~~~
When interning `Type@{i}` we need to know that `i` is locally bound to
avoid interning it as `M.i`.
Extern has a symmetrical problem:
~~~coq
Module Import M. Universe i. End M.
Polymorphic Definition foo@{i} := Type@{M.i} -> Type@{i}.
Print foo. (* must not print Type@{i} -> Type@{i} *)
~~~
(Polymorphic as otherwise the local `i` will be called `foo.i`)
Therefore extern also takes a universe binders argument.
Note that the current implementation actually replaces local universes
with names at detype type. (Asymmetrical to pretyping which only gets
names in glob terms for dynamically declared univs, although it's
capable of understanding bound univs too)
As such extern only really needs the domain of the universe
binders (ie the set of bound universe ids), we just arbitrarily pass
the whole universe binders to avoid putting `Id.Map.domain` at every
entry point.
Note that if we want to change so that detyping does not name locally
bound univs we would need to pass the reverse universe binders (map
from levels to ids, contained in the ustate ie in the evar map) to
extern.
|
|
Reviewed-by: SkySkimmer
|
|
|
|
The Norm name was confusing enough to have introduced a few kernel
bugs, so it deserved to be changed as suggested in #13274.
Contrarily to what it seemingly meant, it was actually standing
for neutral terms rather than normal ones. I have kept the 4-letter
naming scheme for simplicity and renamed it into Ntrl.
|
|
Accumulators can grow arbitrarily large, even when well-typed. So, this
commit makes sure they are allocated on the major heap when they are too
large. If so, fields need to be filled with caml_initialize, in case they
point to the minor heap.
|
|
Ack-by: ppedrot
Reviewed-by: proux01
|
|
Reviewed-by: ppedrot
|
|
|
|
|
|
|
|
|
|
Reviewed-by: ppedrot
|
|
information constructors
Reviewed-by: SkySkimmer
|
|
Unfortunately, compilers are currently unable to optimize the nextafter
function, even in the degenerate case where the second argument is
explicitly infinite. So, this commit implements this case by hand.
On the following testcase, this gives a 15% speedup.
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let eps := sub (next_up one) one in
Pos.iter (fun x => next_down (add x eps)) two n.
Time Eval vm_compute in foo 100000000.
And when looking at the cost of just the allocation-free version of
next_down, the speedup is 1500%. Said otherwise, the latency of next_down
is now on par with the measurement noise due to cache misses and the like.
|
|
operations.
Floating-point values are boxed, which means that any operation causes an
allocation. While short-lived, they nonetheless cause the minor heap to
fill, which in turn triggers the garbage collector.
To reduce the number of allocations, I initially went with a shadow stack
mechanism for storing floating-point values. But assuming the CoqInterval
library is representative, this was way too complicated in practice, as
most stack-located values ended up being passed to nextdown and nextup
before being stored in memory.
So, this commit implements a different mechanism. Variants of nextdown and
nextup are added, which reuse the allocation of their input argument.
Obviously, this is only correct if there is no other reference to this
argument. To ensure this property, the commit only uses these opcode
during a peephole optimization. If two primitive operations follow one
another, then the second one can reuse the allocation of the first one,
since it never had time to even reach the stack.
For CoqInterval, this divides the number of allocations due to
floating-point operations by about two.
The following snippet is made 4% faster by this commit (and 13% faster if
we consider only the floating-point operations).
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let eps := sub (next_up one) one in
Pos.iter (fun x => next_down (add x eps)) two n.
Time Eval vm_compute in foo 100000000.
|
|
|
|
|
|
Since the code is compiled in -fPIC mode, the compiler cannot inline the
functions, due to the ABI mandating the ability to interpose visible
symbols. Hiding the symbols of coq_float64.h would work, except that they
float64.ml needs to reference them. (See #13124 for more details.)
This commit improves performances by 7% on the following code.
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let two := of_int63 2 in
Pos.iter (fun x => sub (mul x two) two) two n.
Time Eval vm_compute in foo 100000000.
If we consider only the floating-point operations (by ignoring the cost of
the loop), the speedup is actually 30%.
|
|
This does not make the code any slower, since
Is_coq_array(accu) && Is_uint63(sp[0])
and
!Is_accu(accu) && !Is_accu(sp[0])
take the exact same number of tests to pass in the concrete case.
In the accumulator case, it takes one more test to fail, but we do not
care about the performances then.
|
|
|
|
|
|
|
|
|
|
Otherwise, these constructs would be followed by a spurious Kreturn
opcode, when in tail position.
|
|
|
|
|
|
1. There is no point in marking plain integers as GC roots.
2. There is no need to restore the stack pointer, as the stack is not
allocated on the OCaml heap (contrarily to coq_env).
|