| Age | Commit message (Collapse) | Author |
|
|
|
of cases.
|
|
cumulative inductive types
Reviewed-by: SkySkimmer
|
|
types
|
|
|
|
|
|
This is just an experiment, but makes the uses of the API easier as we
don't mess with the global state anymore.
|
|
This allows proper treatment in notations, ie fixes #13303
The "glob" representation of universes (what pretyping sees) contains
only fully interpreted (kernel) universes and unbound universe
ids (for non Strict Universe Declaration).
This means universes need to be understood at intern time, so intern
now has a new "universe binders" argument. We cannot avoid this due to
the following example:
~~~coq
Module Import M. Universe i. End M.
Definition foo@{i} := Type@{i}.
~~~
When interning `Type@{i}` we need to know that `i` is locally bound to
avoid interning it as `M.i`.
Extern has a symmetrical problem:
~~~coq
Module Import M. Universe i. End M.
Polymorphic Definition foo@{i} := Type@{M.i} -> Type@{i}.
Print foo. (* must not print Type@{i} -> Type@{i} *)
~~~
(Polymorphic as otherwise the local `i` will be called `foo.i`)
Therefore extern also takes a universe binders argument.
Note that the current implementation actually replaces local universes
with names at detype type. (Asymmetrical to pretyping which only gets
names in glob terms for dynamically declared univs, although it's
capable of understanding bound univs too)
As such extern only really needs the domain of the universe
binders (ie the set of bound universe ids), we just arbitrarily pass
the whole universe binders to avoid putting `Id.Map.domain` at every
entry point.
Note that if we want to change so that detyping does not name locally
bound univs we would need to pass the reverse universe binders (map
from levels to ids, contained in the ustate ie in the evar map) to
extern.
|
|
Reviewed-by: SkySkimmer
|
|
The Norm name was confusing enough to have introduced a few kernel
bugs, so it deserved to be changed as suggested in #13274.
Contrarily to what it seemingly meant, it was actually standing
for neutral terms rather than normal ones. I have kept the 4-letter
naming scheme for simplicity and renamed it into Ntrl.
|
|
Accumulators can grow arbitrarily large, even when well-typed. So, this
commit makes sure they are allocated on the major heap when they are too
large. If so, fields need to be filled with caml_initialize, in case they
point to the minor heap.
|
|
Ack-by: ppedrot
Reviewed-by: proux01
|
|
Reviewed-by: ppedrot
|
|
|
|
|
|
|
|
|
|
Reviewed-by: ppedrot
|
|
information constructors
Reviewed-by: SkySkimmer
|
|
Unfortunately, compilers are currently unable to optimize the nextafter
function, even in the degenerate case where the second argument is
explicitly infinite. So, this commit implements this case by hand.
On the following testcase, this gives a 15% speedup.
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let eps := sub (next_up one) one in
Pos.iter (fun x => next_down (add x eps)) two n.
Time Eval vm_compute in foo 100000000.
And when looking at the cost of just the allocation-free version of
next_down, the speedup is 1500%. Said otherwise, the latency of next_down
is now on par with the measurement noise due to cache misses and the like.
|
|
operations.
Floating-point values are boxed, which means that any operation causes an
allocation. While short-lived, they nonetheless cause the minor heap to
fill, which in turn triggers the garbage collector.
To reduce the number of allocations, I initially went with a shadow stack
mechanism for storing floating-point values. But assuming the CoqInterval
library is representative, this was way too complicated in practice, as
most stack-located values ended up being passed to nextdown and nextup
before being stored in memory.
So, this commit implements a different mechanism. Variants of nextdown and
nextup are added, which reuse the allocation of their input argument.
Obviously, this is only correct if there is no other reference to this
argument. To ensure this property, the commit only uses these opcode
during a peephole optimization. If two primitive operations follow one
another, then the second one can reuse the allocation of the first one,
since it never had time to even reach the stack.
For CoqInterval, this divides the number of allocations due to
floating-point operations by about two.
The following snippet is made 4% faster by this commit (and 13% faster if
we consider only the floating-point operations).
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let eps := sub (next_up one) one in
Pos.iter (fun x => next_down (add x eps)) two n.
Time Eval vm_compute in foo 100000000.
|
|
|
|
|
|
Since the code is compiled in -fPIC mode, the compiler cannot inline the
functions, due to the ABI mandating the ability to interpose visible
symbols. Hiding the symbols of coq_float64.h would work, except that they
float64.ml needs to reference them. (See #13124 for more details.)
This commit improves performances by 7% on the following code.
From Coq Require Import Int63 BinPos PrimFloat.
Definition foo n :=
let two := of_int63 2 in
Pos.iter (fun x => sub (mul x two) two) two n.
Time Eval vm_compute in foo 100000000.
If we consider only the floating-point operations (by ignoring the cost of
the loop), the speedup is actually 30%.
|
|
This does not make the code any slower, since
Is_coq_array(accu) && Is_uint63(sp[0])
and
!Is_accu(accu) && !Is_accu(sp[0])
take the exact same number of tests to pass in the concrete case.
In the accumulator case, it takes one more test to fail, but we do not
care about the performances then.
|
|
|
|
|
|
|
|
|
|
Otherwise, these constructs would be followed by a spurious Kreturn
opcode, when in tail position.
|
|
|
|
|
|
1. There is no point in marking plain integers as GC roots.
2. There is no need to restore the stack pointer, as the stack is not
allocated on the OCaml heap (contrarily to coq_env).
|
|
Fix #13354
This change is very specific to array, but should not be a significant
obstacle to generalization of the feature to eg axioms if we want to later.
|
|
BTW it was incorrect (array needs an instance)
|
|
not only on subidentifiers of an identifier
Reviewed-by: Zimmi48
|
|
There was not any difference between those after the cleanup patches that
come before.
|
|
It was a hidden invariant of the code.
|
|
|
|
Polymorphic side-effects generated in monomorphic mode would be counted towards
trusted subcomponents. This would allow to make ill-typed terms pass as
legitimate by mimicking the shape of the inlining of monomorphic side-effects in
such a proof.
|
|
|
|
Instead we store that data in the native code that was generated in adapt
the compilation scheme accordingly. Less indirections and less imperative
tinkering makes the code safer.
The global symbol table was originally introduced in #10359 as a way not to
depend on the Global module in the generated code. By storing all the
native-related information in the cmxs file itself, this PR also makes other
changes easier, such as e.g. #13287.
|
|
are in custom output path
Reviewed-by: maximedenes
Reviewed-by: herbelin
|
|
|
|
No need to zip the stack if the machine has made no progress.
|
|
Reviewed-by: SkySkimmer
|
|
Reviewed-by: ppedrot
|
|
By no means a float is a neutral value. When put inside a Zprimitive context
it can trigger computation.
|
|
A partially applied primitive was considered CClosure.Norm, i.e. neutral. But
this is not true, because substituting this term as the head of an application
may trigger further reduction. In this respect, primitive functions behave like
fixpoints.
|
|
Not sure if we can get a bug from this omission.
|