aboutsummaryrefslogtreecommitdiff
path: root/parsing/tok.ml
AgeCommit message (Collapse)Author
2020-09-11[parsing] Rename token NUMERAL to NUMBERPierre Roux
2020-03-22Centralizing all kinds of numeral string management in numTok.ml.Hugo Herbelin
Four types of numerals are introduced: - positive natural numbers (may include "_", e.g. to separate thousands, and leading 0) - integer numbers (may start with a minus sign) - positive numbers with mantisse and signed exponent - signed numbers with mantisse and signed exponent In passing, we clarify that the lexer parses only positive numerals, but the numeral interpreters may accept signed numerals. Several improvements and fixes come from Pierre Roux. See https://github.com/coq/coq/pull/11703 for details. Thanks to him.
2020-03-18Update headers in the whole code base.Théo Zimmermann
Add headers to a few files which were missing them.
2019-06-17Update ml-style headers to new year.Théo Zimmermann
2019-04-02Add parsing of decimal constants (e.g., 1.02e+01)Pierre Roux
Rather than integers '[0-9]+', numeral constant can now be parsed according to the regexp '[0-9]+ ([.][0-9]+)? ([eE][+-]?[0-9]+)?'. This can be used in one of the two following ways: - using the function `Notation.register_rawnumeral_interpreter` in an OCaml plugin - using `Numeral Notation` with the type `decimal` added to `Decimal.v` See examples of each use case in the next two commits.
2019-04-02Rename the INT token to NUMERALPierre Roux
In anticipation of future uses of this token for non integer numerals.
2019-03-31Multiple payload types in tokensPierre Roux
Instead of just string (and empty strings for tokens without payload)
2019-03-31[parsing] add keyword-protected generic quotationEnrico Tassi
One can now register a quotation using a grammar rule with QUOTATION("name:"). "name:" becomes a keyword and the token is generated for name: followed by a an identifier or a parenthesized text. Eg constr:x string:[....] ltac:(....) ltac:{....} The delimiter is made of 1 or more occurrences of the same parenthesis, eg ((.....)) or [[[[....]]]]. The idea being that if the text happens to contain the closing delimiter, one can make the delimiter longer and avoid confusion (no escaping). Eg string:[[ .. ']' .. ]] Nesting the delimiter is allowed, eg ((..((...))..)) is OK. The text inside the quotation is returned as a string (including the parentheses), so that a third party parser can take care of it. Keywords don't need to end in ':'.
2019-03-31[parsing] Split Tok.t into Tok.t and Tok.patternEnrico Tassi
Tokens were having a double role: - the output of the lexer - the items of grammar entries, especially terminals Now tokens are the output of the lexer, and this paves the way for using a richer data type, eg including Loc.t Patterns, as in Plexing.pattern, only represent patterns (for tokens) and now have a bit more structure (eg the wildcard is represented as None, not as "", while a regular pattern for "x" as Some "x")
2018-12-20Make diffs work for more input stringsJim Fehrle
Diff code uses the lexer to recognize tokens in the inputs, which can be Pp.t's or strings. To add the highlights in the Pp.t, the diff code matches characters in the input to characters in the tokens. Current code fails for inputs containing quote marks or "(*" because the quote marks and comments don't appear in the tokens. This commit adds a "diff mode" to the lexer to return those characters, making the diff routine more robust.
2018-12-10For diffs, return exactly the characters that make up STRING and FIELD tokensJim Fehrle
2018-02-27Update headers following #6543.Théo Zimmermann
2018-02-17Change references to CAMLP4 to CAMLP5 to be more accurate since we noJim Fehrle
longer use camlp4.
2017-07-04Bump year in headers.Pierre-Marie Pédrot
2016-10-05Fix incorrect token description for bullets.Guillaume Melquiond
2016-10-05Revert "Move bullet detection from lexer to parser (bug #5102)."Guillaume Melquiond
This reverts commit 466b7e69e49a5f4bba36b834a2e046f120ece07c.
2016-10-02Move bullet detection from lexer to parser (bug #5102).Guillaume Melquiond
That way, bullet detection no longer depends on a global variable indicating whether a line is starting. This causes a small change in the recognized language. Before the commit, "--++" was recognized as a bullet "--" followed by a keyword "++" when at the start of a line; now it is always recognized as a keyword "--++". This also fixes a bug in Tok.to_string as a side-effect.
2016-07-03Remove lexing of ordinal notations.Maxime Dénès
This was implemented in anticipation of a part of PR#164 that we decided not to merge.
2016-03-19Further reducing the dependencies of the EXTEND macros.Pierre-Marie Pédrot
2016-02-24Removing the METAIDENT token, as it is not used anymore.Pierre-Marie Pédrot
METAIDENT were idents of the form $foobar, only used in quotations. Note that it removes two dollars in the Coq codebase! Guess I'm absolved for the $(...) syntax.
2016-01-21Merge branch 'v8.5'Pierre-Marie Pédrot
2016-01-20Update copyright headers.Maxime Dénès
2015-12-15Adding a token "index" representing positions (1st, 2nd, etc.).Hugo Herbelin
2015-11-26Fixing the "parsing rules with idents later declared as keywords" problem.Hugo Herbelin
The fix was actually elementary. The lexer comes with a function to compare parsed tokens against tokens of the parsing rules. It is enough to have this function considering an ident in a parsing rule to be equal to the corresponding string parsed as a keyword.
2015-01-12Update headers.Maxime Dénès
2014-08-05Uncountably many bullets (+,-,*,++,--,**,+++,...).Hugo Herbelin
2013-10-22Removing some generic equalities.ppedrot
git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@16915 85f007b7-540e-0410-9357-904b9bb8a0f7
2013-06-21Revert "KEYID token makes parsing more robust in face of notations"gareuselesinge
This reverts commit 950086cd26f83aa8896dde9dc7ef5b3d87307c56. git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@16599 85f007b7-540e-0410-9357-904b9bb8a0f7
2013-06-19KEYID token makes parsing more robust in face of notationsgareuselesinge
A notation may introduce a new keyword that masks an quasi keyword (IDENT "of" for example) used in some grammar rules. KEYID "of" acepts both "of" as an ident and "of" as a keyword. Rephrasing grammar rules with KEYID makes hence them more robust in face of user notations. git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@16590 85f007b7-540e-0410-9357-904b9bb8a0f7
2012-08-08Updating headers.herbelin
git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@15715 85f007b7-540e-0410-9357-904b9bb8a0f7
2012-04-27Partial revert of r15148 in order to compile with Camlp4pboutill
+ comment correction git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@15253 85f007b7-540e-0410-9357-904b9bb8a0f7
2012-04-12Remove print call that do not use the pp mechanismpboutill
Signed-off-by: Edward Z. Yang <ezyang@mit.edu> git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@15148 85f007b7-540e-0410-9357-904b9bb8a0f7
2010-07-24Updated all headers for 8.3 and trunkherbelin
git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@13323 85f007b7-540e-0410-9357-904b9bb8a0f7
2010-06-22New script dev/tools/change-header to automatically update Coq files headers.herbelin
Applied it to fix mli file headers. git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@13176 85f007b7-540e-0410-9357-904b9bb8a0f7
2010-05-19Nicer representation of tokens, more independant of camlp*letouzey
Cf tok.ml, token isn't anymore string*string where first string encodes the kind of the token, but rather a nice sum type. Unfortunately, string*string (a.k.a Plexing.pattern) is still used in some places of Camlp5, so there's a few conversions back and forth. But the penalty should be quite low, and having nicer tokens helps in the forthcoming integration of support for camlp4 post 3.10 git-svn-id: svn+ssh://scm.gforge.inria.fr/svn/coq/trunk@13018 85f007b7-540e-0410-9357-904b9bb8a0f7