| Age | Commit message (Collapse) | Author |
|
|
|
This makes sense as a step towards a more functional handling of the
state.
|
|
Latest refactorings allow us to make the signature Coq parser a
standard `Grammar.S` one; the only bit needed is to provide the extra
capabilities to the `Lexer` signature w.r.t. to comments state.
The handling of Lexer state is still a bit ad-hoc, in particular it is
global whereas it should be fully attached to the parsable. This may
work ok in batch mode but the current behavior may be buggy in the
interactive context.
This PR doesn't solve that but it is a step towards a more functional
solution.
|
|
Four types of numerals are introduced:
- positive natural numbers (may include "_", e.g. to separate thousands, and leading 0)
- integer numbers (may start with a minus sign)
- positive numbers with mantisse and signed exponent
- signed numbers with mantisse and signed exponent
In passing, we clarify that the lexer parses only positive numerals,
but the numeral interpreters may accept signed numerals.
Several improvements and fixes come from Pierre Roux. See
https://github.com/coq/coq/pull/11703 for details. Thanks to him.
|
|
Add headers to a few files which were missing them.
|
|
|
|
Rather than integers '[0-9]+', numeral constant can now be parsed
according to the regexp '[0-9]+ ([.][0-9]+)? ([eE][+-]?[0-9]+)?'.
This can be used in one of the two following ways:
- using the function `Notation.register_rawnumeral_interpreter` in an OCaml plugin
- using `Numeral Notation` with the type `decimal` added to `Decimal.v`
See examples of each use case in the next two commits.
|
|
Instead of just string (and empty strings for tokens without payload)
|
|
|
|
One can now register a quotation using a grammar rule with
QUOTATION("name:"). "name:" becomes a keyword and the token is
generated for name: followed by a an identifier or a parenthesized
text. Eg
constr:x
string:[....]
ltac:(....)
ltac:{....}
The delimiter is made of 1 or more occurrences of the same parenthesis,
eg ((.....)) or [[[[....]]]]. The idea being that if the text happens to
contain the closing delimiter, one can make the delimiter longer and avoid
confusion (no escaping). Eg
string:[[ .. ']' .. ]]
Nesting the delimiter is allowed, eg ((..((...))..)) is OK.
The text inside the quotation is returned as a string (including the
parentheses), so that a third party parser can take care of it.
Keywords don't need to end in ':'.
|
|
Tokens were having a double role:
- the output of the lexer
- the items of grammar entries, especially terminals
Now tokens are the output of the lexer, and this paves the way for
using a richer data type, eg including Loc.t
Patterns, as in Plexing.pattern, only represent patterns (for tokens)
and now have a bit more structure (eg the wildcard is represented
as None, not as "", while a regular pattern for "x" as Some "x")
|
|
In this way one can also set the current offsets in a file,
useful if you are parsing a Coq fragment within a file instead
of a full file starting from the first line.
|
|
Diff code uses the lexer to recognize tokens in the inputs, which can be
Pp.t's or strings. To add the highlights in the Pp.t, the diff code
matches characters in the input to characters in the tokens. Current
code fails for inputs containing quote marks or "(*" because the quote
marks and comments don't appear in the tokens. This commit adds a "diff
mode" to the lexer to return those characters, making the diff routine
more robust.
|
|
This makes the make-based build system stop linking to Camlp5's
gramlib and instead links to our own gramlib.
We use the style done in the packing of `Stdlib` in OCaml 4.07.
As to introduce a minimal amount of noise in history we use an
autogenerated `gramlib__pack` directory.
Co-authored-by: Gaëtan Gilbert <gaetan.gilbert@skyskimmer.net>
|
|
A few of them will be of help for future cleanups. We have spared the
stuff in `Names` due to bad organization of this module following the
split from `Term`, which really difficult things removing the
constructors.
|
|
|
|
We first load the file, then we print it as a post-processing
step. This is both more flexible and clearer.
We also refactor the comments handling to minimize the logic that is
living in the Lexer. Indeed, the main API is now living in the
printer, and complex interactions with the state are not possible
anymore, including the removal of messing with low-level summary and
state setting!
|
|
|
|
|
|
|
|
This is a second try at removing the hooks for the legacy xml export
system which can't currently be tested.
It is also not included in the API, so it should either be included in
it or this PR be applied.
|
|
|
|
We remove the camlp4 compatibility layer, and try to clean up
most structures. `parsing/compat` is gone.
We added some documentation to the lexer/parser interfaces that are
often obscured by module includes.
|
|
This is useful for debugging purposes.
|
|
8a8caba3).
- Adding cLexer.current_file to the lexer state, i.e. making it a
component of the type "coq_parsable" of lexer state (it was
forgotten in b8ae2de5 and 8a8caba3).
- Inlining save_translator/restore_translator which have now lost most
of their substance.
|
|
It was not used any more by coqdoc since b8194b22 (Dec 2010).
The table is now only part of the lexer function closure
(and only in the camlp5 case).
|
|
done by the Ppcmd_comment token) and so that lexing/parsing
side-effects are collected at the same place, i.e. in CLexer.
|
|
the state of parsable streams, so that different lexing/parsing
processes can be started independently without conflicting.
Note however that these different lexing/parsing processes cannot be
run concurrently as they still work on the same piece of global memory
(i.e. calls to entry_parse should remain atomic). To go further, one
would typically need to be able to functionally pass the lexing state
to each call to the lexer.
Note that currently the beautifier is also running in the context of a
lexer/parser state (for the mapping of location to comments).
In particular, this fixes #5102 (parsing/lexing of bullets depending on
the lexing state which was global).
|
|
|
|
Coq locations already had support for this, but were containing dummy
information. We now don't need anymore to reconstruct this information by
browsing the file when printing an error message or enriching exceptions on the
fly.
It also became easier to interface with Coq since locations emitted by the
lexer now always contain full information.
On the API side, Loc.represent disappeared and Loc.t is now exposed as record.
It is less error-prone than manipulating a tuple of 5 integers. Also,
Loc.create takes 5 arguments instead of 3 and a pair.
|
|
|