Store spaces/tabs as actives in v-type arg #1615

Open
wants to merge 4 commits into
base: develop
6 changes: 6 additions & 0 deletions base/changes.txt
@@ -6,6 +6,12 @@ to completeness or accuracy and it contains some references to files that are
not part of the distribution.
================================================================================

+2025-01-11 Joseph Wright <Joseph.Wright@latex-project.org>
+
+* ltcmd.dtx:
+Correct catcode of space/tab in v-type arg
+Adjust handling of catcodes for letters in v-type arg
+
2025-01-03 Frank Mittelbach <Frank.Mittelbach@latex-project.org>

* lthooks.dtx (subsubsection{Updating code for hooks}):
7 changes: 7 additions & 0 deletions base/doc/ltnews41.tex
@@ -278,6 +278,13 @@ \subsection{Tab character as a special}
be used in for example a \texttt{v}~specification document command without
additional steps.

+\subsection{Refinement of \texttt{v}~specification category codes}
+
+Work on verbatim argument handling has highlighted that storing
+all characters as \enquote{other} (category code~12) when using a
+\texttt{v}~specification in \pkg{ltcmd} was problematic. We have now
+revised this to capture letters with their original category code.
+
\subsection{Logging text command and symbol declarations}

For thirty years the documentation claimed that \cs{DeclareTextSymbol},
14 changes: 8 additions & 6 deletions base/doc/usrguide.tex
@@ -35,15 +35,15 @@
\usepackage{url}

\title{\LaTeX\ for authors\\ current version}
-\author{\copyright~Copyright 2020--2024, \LaTeX\ Project Team.\\
+\author{\copyright~Copyright 2020--2025, \LaTeX\ Project Team.\\
All rights reserved.%
\footnote{This file may be distributed and/or modified under the
conditions of the \LaTeX{} Project Public License, either version 1.3c
of this license or (at your option) any later version. See the source
\texttt{usrguide.tex} for full details.}%
}

-\date{2024-11-17}
+\date{2025-01-12}

\NewDocumentCommand\cs{m}{\texttt{\textbackslash\detokenize{#1}}}
\NewDocumentCommand\marg{m}{\arg{#1}}
@@ -837,10 +837,12 @@ \subsection{Using the verbatim argument types}

Some additional details that may be useful for those with more \TeX{}
knowledge: do not worry if this does not make sense to you! Spaces and tabs are
-stored as active characters. In Unicode engines, all other characters are of
-type \enquote{other}. In $8$-bit engines, the ASCII characters other than tab
-and space are of type \enquote{other}, and non-ASCII characters are active. As
-such, token-based comparisons are likely to fail unless set up properly.
+stored as active characters. In $8$-bit engines, non-ASCII characters are
+\enquote{active}, the ASCII letters a--z and A--Z are letters, and all other
+ASCII characters are \enquote{other}. In Unicode engines, non-ASCII codepoints
+are either letters or \enquote{other}, based on the standard \LaTeX{} settings
+derived from Unicode data. For token-based comparisons, the active spaces and
+tabs will likely need to be replaced: this can be done conveniently by expansion.

\subsection{Typesetting verbatim-like material}

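A minimal sketch of the "replace by expansion" remark in the `usrguide.tex` paragraph above. It assumes the active space still has its kernel meaning, a macro expanding to a normal space, as the `github-0876.tlg` output in this PR shows. The names `\l_demo_arg_tl` and `\IfVerbIsFooBar` are hypothetical and only serve the illustration. Tabs and non-ASCII input in 8-bit engines are not covered here and may need separate handling.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical scratch variable for this sketch.
\tl_new:N \l_demo_arg_tl
% Hypothetical command: compares a verbatim argument with "foo bar".
\NewDocumentCommand \IfVerbIsFooBar { v }
  {
    % e-type expansion replaces each active space by its replacement
    % text (assumed to be a normal space); letters and "other"
    % characters are unexpandable and pass through unchanged.
    \tl_set:Ne \l_demo_arg_tl {#1}
    \tl_if_eq:NnTF \l_demo_arg_tl { foo~bar }
      { match }
      { no~match }
  }
\ExplSyntaxOff
\begin{document}
\IfVerbIsFooBar+foo bar+
\end{document}
```

With the category codes introduced by this PR the comparison should succeed, because the letters keep category code 11 and the expanded space matches the `~` written under `\ExplSyntaxOn`. With the earlier all-"other" storage the same test would be expected to fail.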
55 changes: 42 additions & 13 deletions base/ltcmd.dtx
@@ -34,8 +34,8 @@
%%% From File: ltcmd.dtx
%
% \begin{macrocode}
-\def\ltcmdversion{v1.2f}
-\def\ltcmddate{2024-12-25}
+\def\ltcmdversion{v1.2g}
+\def\ltcmddate{2025-01-11}
% \end{macrocode}
%
%<*driver>
@@ -3663,20 +3663,37 @@
% \end{macro}
%
% \begin{macro}{\@@_grab_v_aux_catcodes:}
+% \changes{v1.2g}{2025-01-11}{Store spaces and tabs as active chars}
% \begin{macro}{\@@_grab_v_aux_abort:n}
% The approach for short verbatim arguments is to make the end-line
% character a macro parameter character: this is forbidden by the
% rest of the code. Then the error branch can check what caused the
% bail out and give the appropriate error message.
% \begin{macrocode}
+%<latexrelease>\IncludeInRelease{2025/06/01}{\@@_grab_v_aux_catcodes:}%
+%<latexrelease> {Active~spaces~and~tabs}
\cs_new_protected:Npn \@@_grab_v_aux_catcodes:
{
\cs_set_eq:NN \do \char_set_catcode_other:N
\dospecials
+\char_set_catcode_active:n { `\ }
+\char_set_catcode_active:n { `\^^I }
\bool_if:NTF \l_@@_long_bool
{ \char_set_catcode_other:n { \tex_endlinechar:D } }
{ \char_set_catcode_parameter:n { \tex_endlinechar:D } }
}
+%<latexrelease>\EndIncludeInRelease
+%<latexrelease>\IncludeInRelease{2020/10/01}{\@@_grab_v_aux_catcodes:}%
+%<latexrelease> {Active~spaces~and~tabs}
+%<latexrelease>\cs_new_protected:Npn \@@_grab_v_aux_catcodes:
+%<latexrelease> {
+%<latexrelease> \cs_set_eq:NN \do \char_set_catcode_other:N
+%<latexrelease> \dospecials
+%<latexrelease> \bool_if:NTF \l_@@_long_bool
+%<latexrelease> { \char_set_catcode_other:n { \tex_endlinechar:D } }
+%<latexrelease> { \char_set_catcode_parameter:n { \tex_endlinechar:D } }
+%<latexrelease> }
+%<latexrelease>\EndIncludeInRelease
\cs_new_protected:Npn \@@_grab_v_aux_abort:n #1
{
\@@_grab_v_group_end:
@@ -3703,26 +3720,38 @@
%
% \begin{macro}{\@@_grab_v_aux_put:N}
% \changes{v1.2d}{2024/03/21}{Collect \cs{endlinechar} as \cs{obeyedline}}
-% Storing one token in the collected argument. Most tokens are
-% converted to category code $12$, with the exception of active
-% characters, and spaces (not sure what should be done for those).
+% \changes{v1.2g}{2025-01-11}{Simplify catcode handling}
+% Storing one token in the collected argument: every token is kept as-is,
+% except that an end-of-line becomes \cs{obeyedline}; \cs{exp_not:N}
+% prevents expansion of active characters.
% \begin{macrocode}
-%<latexrelease>\IncludeInRelease{2024/06/01}{\@@_grab_v_aux_put:N}%
-%<latexrelease> {Endlines~as~\obeyedline}
+%<latexrelease>\IncludeInRelease{2025/06/01}{\@@_grab_v_aux_put:N}%
+%<latexrelease> {Use~more~std~catcodes}
\cs_new_protected:Npn \@@_grab_v_aux_put:N #1
{
\tl_put_right:Nx \l_@@_v_arg_tl
{
-\token_if_active:NTF #1
+\int_compare:nNnTF {`#1} = \tex_endlinechar:D
+{ \exp_not:N \obeyedline }
{ \exp_not:N #1 }
-{
-\int_compare:nNnTF {`#1} = \tex_endlinechar:D
-{ \exp_not:N \obeyedline }
-{ \token_to_str:N #1 }
-}
}
}
%<latexrelease>\EndIncludeInRelease
+%<latexrelease>\IncludeInRelease{2024/06/01}{\@@_grab_v_aux_put:N}%
+%<latexrelease> {Endlines~as~\obeyedline}
+%<latexrelease>\cs_new_protected:Npn \@@_grab_v_aux_put:N #1
+%<latexrelease> {
+%<latexrelease> \tl_put_right:Nx \l_@@_v_arg_tl
+%<latexrelease> {
+%<latexrelease> \token_if_active:NTF #1
+%<latexrelease> { \exp_not:N #1 }
+%<latexrelease> {
+%<latexrelease> \int_compare:nNnTF {`#1} = \tex_endlinechar:D
+%<latexrelease> { \exp_not:N \obeyedline }
+%<latexrelease> { \token_to_str:N #1 }
+%<latexrelease> }
+%<latexrelease> }
+%<latexrelease> }
+%<latexrelease>\EndIncludeInRelease
%<latexrelease>\IncludeInRelease{2020/10/01}{\@@_grab_v_aux_put:N}%
%<latexrelease> {Endlines~as~\obeyedline}
%<latexrelease>\cs_new_protected:Npn \@@_grab_v_aux_put:N #1
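The `\IncludeInRelease` guards above mean that documents relying on the earlier category codes could request them through `latexrelease`. A minimal sketch, assuming the change ships with the 2025/06/01 release as the guards indicate; the date `2024-11-01` is simply one that precedes it, and `\demo` is a hypothetical command.

```latex
% latexrelease has to be loaded before \documentclass.
% Any date earlier than 2025/06/01 should select the older
% definitions of the v-type grabber shown in the rollback blocks.
\RequirePackage[2024-11-01]{latexrelease}
\documentclass{article}
% Hypothetical command used only to exercise a v-type argument.
\NewDocumentCommand \demo { v } {\texttt{#1}}
\begin{document}
\demo|a b|
\end{document}
```

This only selects among the definitions recorded in `latexrelease`, so it requires an installation whose `latexrelease` already contains the entries added here.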
52 changes: 26 additions & 26 deletions base/testfiles-ltcmd/github-0876.tlg
@@ -1,48 +1,48 @@
This is a generated file for the LaTeX2e validation system.
Don't change this file in any respect.
The token list contains the tokens:
-> b (the character b)
-> a (the character a)
-> r (the character r)
-> (blank space )
-> b (the character b)
-> a (the character a)
-> r (the character r).
+> b (the letter b)
+> a (the letter a)
+> r (the letter r)
+> (active character=macro:-> )
+> b (the letter b)
+> a (the letter a)
+> r (the letter r).
<recently read> }
l. ...bar+
^^M
The token list contains the tokens:
-> (blank space )
-> b (the character b)
-> a (the character a)
-> r (the character r)
+> (active character=macro:-> )
+> b (the letter b)
+> a (the letter a)
+> r (the letter r)
> \obeyedline (control sequence=\protected macro:->\par )
-> b (the character b)
-> a (the character a)
-> r (the character r).
+> b (the letter b)
+> a (the letter a)
+> r (the letter r).
<recently read> }
l. ...bar+
^^M
The token list contains the tokens:
-> b (the character b)
-> a (the character a)
-> r (the character r)
+> b (the letter b)
+> a (the letter a)
+> r (the letter r)
> \obeyedline (control sequence=\protected macro:->\par )
-> b (the character b)
-> a (the character a)
-> r (the character r).
+> b (the letter b)
+> a (the letter a)
+> r (the letter r).
<recently read> }
l. ...bar+
^^M
The token list contains the tokens:
> \obeyedline (control sequence=\protected macro:->\par )
-> b (the character b)
-> a (the character a)
-> r (the character r)
+> b (the letter b)
+> a (the letter a)
+> r (the letter r)
> \obeyedline (control sequence=\protected macro:->\par )
-> b (the character b)
-> a (the character a)
-> r (the character r).
+> b (the letter b)
+> a (the letter a)
+> r (the letter r).
<recently read> }
l. ...bar+
^^M
10 changes: 2 additions & 8 deletions base/testfiles-ltcmd/ltcmd003.lvt
@@ -24,16 +24,10 @@
I~got~'\token_to_str:N\foo
\IfBooleanT #1 {*}
\IfNoValueF {#2} {[\tl_to_str:n{#2}]}
--#3-
+-[See~tokenlist~analysis~below]-
\tl_to_str:n { {#4} }
}
-\IfNoValueF {#3}
-{
-\exp_args:No \tl_if_eq:nnF
-{ \tl_to_str:n {#3} }
-{#3}
-{\ERROR}
-}
+\tl_analysis_show:n {#3}
}
\foo+\foo+{}
\foo+% ^^+^^+
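The updated test inspects the grabbed argument with `\tl_analysis_show:n` instead of comparing it against its stringified form. The same check can be tried outside the test suite with a sketch like the following, where `\showverbtokens` is a hypothetical name used only for this illustration.

```latex
\documentclass{article}
\ExplSyntaxOn
% Hypothetical command: shows every token of a v-type argument,
% together with its category, in the terminal and log.
\NewDocumentCommand \showverbtokens { v }
  { \tl_analysis_show:n {#1} }
\ExplSyntaxOff
\begin{document}
\showverbtokens+bar bar+
\end{document}
```

With this PR applied, the listing should resemble the first block of the new `github-0876.tlg` above: letters reported as letters and the space as an active character. Note that `\tl_analysis_show:n` pauses the run in the same way as `\show`.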