Fix #9507 Describe accurately acceptable package names #9508

Merged
merged 1 commit into from
Jan 15, 2025
5 changes: 5 additions & 0 deletions .typos.toml
@@ -1,2 +1,7 @@
[default]
extend-ignore-re = ["(?s)(#|//)\\s*spellchecker:off.*?\\n\\s*(#|//)\\s*spellchecker:on"]

+ [default.extend-words]
+ # Extinguish false positive in cabal-package-description-file.rst. 'Nd' is a
+ # Unicode category, not a misspelling of 'And'.
+ nd = "nd"
7 changes: 5 additions & 2 deletions Cabal-described/src/Distribution/Described.hs
@@ -168,8 +168,8 @@ reUnqualComponent = RENamed "unqual-name" $
-- currently the parser accepts "csAlphaNum `difference` "0123456789"
-- which is larger set than CS.alpha
--
- -- Hackage rejects non ANSI names, so it's not so relevant.
- <> RECharSet CS.alpha
+ -- Hackage, however, rejects non ANSI names.
+ <> RECharSet csAlphaNumNotDigit
<> REMunch reEps (RECharSet csAlphaNum)

reDot :: GrammarRegex a
@@ -194,6 +194,9 @@ csAlpha = CS.alpha
csAlphaNum :: CS.CharSet
csAlphaNum = CS.alphanum

+ csAlphaNumNotDigit :: CS.CharSet
+ csAlphaNumNotDigit = CS.alphanumNotDigit

csUpper :: CS.CharSet
csUpper = CS.upper

13 changes: 10 additions & 3 deletions Cabal-described/src/Distribution/Utils/CharSet.hs
@@ -27,16 +27,17 @@ module Distribution.Utils.CharSet (
-- * Special lists
alpha,
alphanum,
+ alphanumNotDigit,
upper,
) where

- import Data.Char (chr, isAlpha, isAlphaNum, isUpper, ord)
+ import Data.Char (chr, isAlpha, isAlphaNum, isDigit, isUpper, ord)
import Data.List (foldl', sortBy)
import Data.Monoid (Monoid (..))
import Data.String (IsString (..))
import Distribution.Compat.Semigroup (Semigroup (..))
import Prelude
- (Bool (..), Bounded (..), Char, Enum (..), Eq (..), Int, Maybe (..), Num (..), Ord (..), Show (..), String, concatMap, flip, fst, otherwise, showParen,
+ (Bool (..), Bounded (..), Char, Enum (..), Eq (..), Int, Maybe (..), Num (..), Ord (..), Show (..), String, (&&), concatMap, flip, fst, not, otherwise, showParen,
showString, uncurry, ($), (.))

#if MIN_VERSION_containers(0,5,0)
@@ -229,10 +230,16 @@ alpha :: CharSet
alpha = foldl' (flip insert) empty [ c | c <- [ minBound .. maxBound ], isAlpha c ]
{-# NOINLINE alpha #-}

+ -- | Note: this set varies depending on @base@ version.
+ --
+ alphanumNotDigit :: CharSet
+ alphanumNotDigit = foldl' (flip insert) empty [ c | c <- [ minBound .. maxBound ], isAlphaNum c && not (isDigit c) ]
+ {-# NOINLINE alphanumNotDigit #-}

-- | Note: this set varies depending on @base@ version.
--
alphanum :: CharSet
- alphanum = foldl' (flip insert) empty [ c | c <- [ minBound .. maxBound ], isAlphaNum c ]
+ alphanum = foldl' (flip insert) alphanumNotDigit ['0' .. '9' ]
{-# NOINLINE alphanum #-}

-- | Note: this set varies depending on @base@ version.
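Reviewer note: the comprehension for `alphanumNotDigit` relies on `Data.Char.isDigit` matching only the ASCII digits `0` to `9`, while `isAlphaNum` also accepts Unicode letters and numbers. The following is a minimal standalone sketch of that distinction. It uses only `Data.Char`, not the `CharSet` type from this diff, and the names `isAlphaNumNotDigit`, `samples`, `classify`, and `unionMatchesAlphaNum` are hypothetical, as are the sample characters, which were picked only for illustration.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- | Membership test mirroring the predicate in the list comprehension for
-- 'alphanumNotDigit': alphanumeric per "Data.Char", but not an ASCII digit.
isAlphaNumNotDigit :: Char -> Bool
isAlphaNumNotDigit c = isAlphaNum c && not (isDigit c)

-- | Hypothetical sample characters (an assumption for illustration only).
samples :: [(String, Char)]
samples =
  [ ("ASCII letter", 'a')
  , ("ASCII digit", '7')
  , ("Greek small lambda", '\x03BB')   -- category Ll
  , ("Arabic-Indic digit", '\x0663')   -- category Nd, but not an ASCII digit
  , ("Roman numeral four", '\x2163')   -- category Nl
  , ("hyphen", '-')
  ]

-- | Pair each sample label with its membership in the sketched set.
classify :: [(String, Bool)]
classify = [ (label, isAlphaNumNotDigit c) | (label, c) <- samples ]

-- | Checks that adding the ASCII digits back to the sketched set gives exactly
-- the characters accepted by 'isAlphaNum', which is the property the rewritten
-- definition of 'alphanum' depends on.
unionMatchesAlphaNum :: Bool
unionMatchesAlphaNum =
  all (\c -> isAlphaNum c == (isAlphaNumNotDigit c || c `elem` ['0' .. '9']))
      [minBound .. maxBound]
```

In this sketch only the ASCII digit and the hyphen fall outside the set; the non-ASCII decimal digit stays in, which is why the documentation in this PR speaks of "the digits `0` to `9`" rather than of numbers in general.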
7 changes: 4 additions & 3 deletions Cabal-described/src/Distribution/Utils/GrammarRegex.hs
@@ -194,9 +194,10 @@ mathtt d = "\\mathtt{" <<>> d <<>> "}"

charsetDoc :: CS.CharSet -> PP.Doc
charsetDoc acs
- | acs == CS.alpha = terminalDoc "alpha"
- | acs == CS.alphanum = terminalDoc "alpha-num"
- | acs == CS.upper = terminalDoc "upper"
+ | acs == CS.alpha = terminalDoc "alpha"
+ | acs == CS.alphanum = terminalDoc "alpha-num"
+ | acs == CS.alphanumNotDigit = terminalDoc "alpha-num-not-digit"
+ | acs == CS.upper = terminalDoc "upper"
charsetDoc acs = case CS.toIntervalList acs of
[] -> "\\emptyset"
[(x,y)] | x == y -> inquotes $ mathtt $ charDoc x
4 changes: 2 additions & 2 deletions buildinfo-reference-generator/src/Main.hs
@@ -52,8 +52,8 @@ main = do
"String as in Haskell; it's recommended to avoid using Haskell-specific escapes."
, zproduction "unqual-name" reUnqualComponent $ unwords
[ "Unqualified component names are used for package names, component names etc. but not flag names."
- , "Unqualified component name consist of components separated by dash, each component is non-empty alphanumeric string, with at least one alphabetic character."
- , "In other words, component may not look like a number."
+ , "An unqualified component name consists of components separated by a hyphen, each component is a non-empty alphanumeric string, with at least one character that is not the digits ``0`` to ``9``."
+ , "In other words, a component may not look like a number."
]

, zproduction "module-name" (describe (Proxy :: Proxy ModuleName))
11 changes: 7 additions & 4 deletions buildinfo-reference-generator/template.zinza
@@ -12,7 +12,7 @@ Field syntax is described as they are in the latest cabal file format version.

.. math::

- \mathord{"}\mathtt{example}\mathord{"}
+ \mathord{``}\mathtt{example}\mathord{"}

* non-terminals are type set in italic:

@@ -25,13 +25,13 @@ Field syntax is described as they are in the latest cabal file format version.

.. math::

- [ \mathord{"}\mathtt{1}\mathord{"} \cdots \mathord{"}\mathtt{9}\mathord{"} ]
+ [ \mathord{``}\mathtt{1}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]

Character set complements have :math:`c` superscript:

.. math::

- [ \mathord{"}\mathtt{1}\mathord{"} \cdots \mathord{"}\mathtt{9}\mathord{"} ]^c
+ [ \mathord{``}\mathtt{1}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]^c

* repetition is type set using regular expression inspired notation.
Superscripts tell how many time to repeat:
@@ -125,7 +125,10 @@ Optional comma separated
Non-terminals
-------------

- In the syntax definitions below the following non-terminal symbols are used:
+ In the syntax definitions below the following non-terminal symbols are used. In addition:

+ .. math::
+ {\mathop{\mathit{alpha\text{-}num\text{-}not\text{-}digit}}} = {\mathop{\mathit{alpha\text{-}num}}}\cap{[ \mathord{``}\mathtt{0}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]^c}

{% for production in productions %}
{{ production.name }}
15 changes: 9 additions & 6 deletions doc/buildinfo-fields-reference.rst
@@ -12,7 +12,7 @@ Field syntax is described as they are in the latest cabal file format version.

.. math::

- \mathord{"}\mathtt{example}\mathord{"}
+ \mathord{``}\mathtt{example}\mathord{"}

* non-terminals are type set in italic:

@@ -25,13 +25,13 @@ Field syntax is described as they are in the latest cabal file format version.

.. math::

- [ \mathord{"}\mathtt{1}\mathord{"} \cdots \mathord{"}\mathtt{9}\mathord{"} ]
+ [ \mathord{``}\mathtt{1}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]

Character set complements have :math:`c` superscript:

.. math::

- [ \mathord{"}\mathtt{1}\mathord{"} \cdots \mathord{"}\mathtt{9}\mathord{"} ]^c
+ [ \mathord{``}\mathtt{1}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]^c

* repetition is type set using regular expression inspired notation.
Superscripts tell how many time to repeat:
@@ -125,7 +125,10 @@ Optional comma separated
Non-terminals
-------------

- In the syntax definitions below the following non-terminal symbols are used:
+ In the syntax definitions below the following non-terminal symbols are used. In addition:

+ .. math::
+ {\mathop{\mathit{alpha\text{-}num\text{-}not\text{-}digit}}} = {\mathop{\mathit{alpha\text{-}num}}}\cap{[ \mathord{``}\mathtt{0}\mathord{"} \cdots \mathord{``}\mathtt{9}\mathord{"} ]^c}

hs-string
String as in Haskell; it's recommended to avoid using Haskell-specific escapes.
@@ -134,10 +137,10 @@ hs-string
\mathop{\mathord{``}\mathtt{\text{"}}\mathord{"}}{\left\{ {[\mathop{\mathord{``}\mathtt{\text{"}}\mathord{"}}\mathop{\mathord{``}\mathtt{\text{\\}}\mathord{"}}]^c}\mid\left\{ \begin{gathered}\mathop{\mathord{``}\mathtt{\text{\\}\text{&}}\mathord{"}}\\\mathop{\mathord{``}\mathtt{\text{\\}\text{\\}}\mathord{"}}\\\left\{ \mathop{\mathord{``}\mathtt{\text{\\}n}\mathord{"}}\mid\mathop{\mathit{escapes}} \right\}\\\mathop{\mathord{``}\mathtt{\text{\\}}\mathord{"}}[\mathop{\mathord{``}\mathtt{0}\mathord{"}}\cdots\mathop{\mathord{``}\mathtt{9}\mathord{"}}]\\\mathop{\mathord{``}\mathtt{\text{\\}o}\mathord{"}}[\mathop{\mathord{``}\mathtt{0}\mathord{"}}\cdots\mathop{\mathord{``}\mathtt{7}\mathord{"}}]\\\mathop{\mathord{``}\mathtt{\text{\\}x}\mathord{"}}[\mathop{\mathord{``}\mathtt{0}\mathord{"}}\cdots\mathop{\mathord{``}\mathtt{9}\mathord{"}}\mathop{\mathord{``}\mathtt{A}\mathord{"}}\cdots\mathop{\mathord{``}\mathtt{F}\mathord{"}}\mathop{\mathord{``}\mathtt{a}\mathord{"}}\cdots\mathop{\mathord{``}\mathtt{f}\mathord{"}}]\\\left\{ \mathop{\mathord{``}\mathtt{\text{\\}\text{^}\text{@}}\mathord{"}}\mid\mathop{\mathit{control}} \right\}\\\left\{ \mathop{\mathord{``}\mathtt{\text{\\}NUL}\mathord{"}}\mid\mathop{\mathit{ascii}} \right\}\end{gathered} \right\} \right\}}^\ast_{}\mathop{\mathord{``}\mathtt{\text{"}}\mathord{"}}

unqual-name
- Unqualified component names are used for package names, component names etc. but not flag names. Unqualified component name consist of components separated by dash, each component is non-empty alphanumeric string, with at least one alphabetic character. In other words, component may not look like a number.
+ Unqualified component names are used for package names, component names etc. but not flag names. An unqualified component name consists of components separated by a hyphen, each component is a non-empty alphanumeric string, with at least one character that is not the digits ``0`` to ``9``. In other words, a component may not look like a number.

.. math::
- {\left({\mathop{\mathit{alpha\text{-}num}}}^\ast_{}\mathop{\mathit{alpha}}{\mathop{\mathit{alpha\text{-}num}}}^\ast_{}\right)}^+_{\mathop{\mathord{``}\mathtt{\text{-}}\mathord{"}}}
+ {\left({\mathop{\mathit{alpha\text{-}num}}}^\ast_{}\mathop{\mathit{alpha\text{-}num\text{-}not\text{-}digit}}{\mathop{\mathit{alpha\text{-}num}}}^\ast_{}\right)}^+_{\mathop{\mathord{``}\mathtt{\text{-}}\mathord{"}}}

module-name
Haskell module name as recognized by Cabal parser.
33 changes: 24 additions & 9 deletions doc/cabal-package-description-file.rst
@@ -313,24 +313,39 @@ describe the package as a whole:
tools require the package-name specified for this field to match
the package description's file-name :file:`{package-name}.cabal`.

- Package names are case-sensitive and must match the regular expression
- (i.e. alphanumeric "words" separated by dashes; each alphanumeric
- word must contain at least one letter):
- ``[[:digit:]]*[[:alpha:]][[:alnum:]]*(-[[:digit:]]*[[:alpha:]][[:alnum:]]*)*``.
+ A valid package name comprises an alphanumeric 'word'; or two or more
+ such words separated by a hyphen character (``-``). A word cannot be
+ comprised only of the digits ``0`` to ``9``.

- Or, expressed in ABNF_:
+ An alphanumeric character belongs to one of the Unicode Letter categories
+ (Lu (uppercase), Ll (lowercase), Lt (titlecase), Lm (modifier), or
+ Lo (other)) or Number categories (Nd (decimal), Nl (letter), or No (other)).

+ Package names are case-sensitive.

+ Expressed as a regular expression:

+ ``[0-9]*[\p{L}\p{N}-[0-9]][\p{L}\p{N}]*(-[0-9]*[\p{L}\p{N}-[0-9]][\p{L}\p{N}]*)*``

+ Expressed in ABNF_:

.. code-block:: abnf

package-name = package-name-part *("-" package-name-part)
- package-name-part = *DIGIT UALPHA *UALNUM
+ package-name-part = *DIGIT UALPHANUM-NOT-DIGIT *UALNUM

DIGIT = %x30-39 ; 0-9

- UALNUM = UALPHA / DIGIT
- UALPHA = ... ; set of alphabetic Unicode code-points
+ UALNUM = UALPHANUM-NOT-DIGIT / DIGIT
+ UALPHANUM-NOT-DIGIT = ... ; set of Unicode code-points in Letter or
+ ; Number categories, other than the DIGIT
+ ; code-points

.. note::

- Hackage restricts package names to the ASCII subset.
+ Hackage will not accept package names that use alphanumeric characters
+ other than ``A`` to ``Z``, ``a`` to ``z``, and ``0`` to ``9``
+ (the ASCII subset).

.. pkg-field:: version: numbers (required)
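Reviewer note: the revised prose, regular expression, and ABNF in this hunk all describe the same rule, which can be sketched as a small checker. This is an illustration of the documented grammar only, not Cabal's parser, and it does not model the Hackage ASCII restriction. The names `splitOnHyphen`, `isValidPart`, and `isValidPackageName` are hypothetical, and the sketch assumes `Data.Char.isAlphaNum` as the alphanumeric test and `Data.Char.isDigit` for the digits `0` to `9`.

```haskell
import Data.Char (isAlphaNum, isDigit)

-- | Split on hyphens, keeping empty parts so that leading, trailing, or
-- doubled hyphens show up as empty (and therefore invalid) parts.
splitOnHyphen :: String -> [String]
splitOnHyphen s = case break (== '-') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitOnHyphen rest

-- | One @package-name-part@: only alphanumeric characters, at least one of
-- which is not an ASCII digit. The second condition also rejects the empty
-- string.
isValidPart :: String -> Bool
isValidPart part = all isAlphaNum part && any (not . isDigit) part

-- | A package name is one or more valid parts joined by single hyphens.
isValidPackageName :: String -> Bool
isValidPackageName = all isValidPart . splitOnHyphen
```

Under these assumptions `isValidPackageName` accepts names such as `HUnit` and `3d-graphics`, and rejects `foo-123`, `foo--bar`, and the empty string. Requiring "all alphanumeric, at least one non-digit" is equivalent to the ABNF form `*DIGIT UALPHANUM-NOT-DIGIT *UALNUM`, because the first non-digit character of a part always serves as the `UALPHANUM-NOT-DIGIT` element.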

2 changes: 1 addition & 1 deletion doc/package-concepts.rst
@@ -43,7 +43,7 @@ Package names and versions
All packages have a name, e.g. "HUnit". Package names are assumed to be
unique. Cabal package names may contain letters, numbers and hyphens,
but not spaces and may also not contain a hyphened section consisting of
- only numbers. The namespace for Cabal packages is flat, not
+ only of the digits ``0`` to ``9``. The namespace for Cabal packages is flat, not
hierarchical.

Packages also have a version, e.g "1.1". This matches the typical way in