expl3 from day one. A basic expl3 function name such as \foo:nn shows how many unmodified braced arguments it takes: so-called n-type arguments. We can then create variants, which can lead to expansion only once (o-type), to the value of a variable (V-type) or to the value retrieved by constructing the name of a variable and then finding the value (v-type). We can do the same with single-token (N-type) arguments, which are often themselves functions and can be given as a constructed name (c-type).
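As a sketch of how these variants work in practice (the \mypkg_... names here are hypothetical, invented purely for illustration):

```latex
\ExplSyntaxOn
% A hypothetical two-argument function taking braced (n-type) arguments
\cs_new:Npn \mypkg_pair:nn #1#2 { [ #1 | #2 ] }
% Ask expl3 to generate a variant whose first argument is the
% value of a variable (V-type)
\cs_generate_variant:Nn \mypkg_pair:nn { Vn }
\tl_new:N \l_mypkg_demo_tl
\tl_set:Nn \l_mypkg_demo_tl { stored }
% Equivalent to \mypkg_pair:nn { stored } { direct }
\mypkg_pair:Vn \l_mypkg_demo_tl { direct }
\ExplSyntaxOff
```

The point is that only the base :nn function carries the implementation; every variant is generated mechanically from it.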
How about exhaustively expanding all of the tokens in an argument? To date, that has been handled by x-type expansion. This uses \edef behind the scenes, so the experienced TeX programmer will see that it cannot itself work in an expansion context. Using \edef also has the side effect that # tokens need to be doubled in the input.
A little while ago now, the LaTeX Team arranged for a ‘new’ primitive \expanded to be added to the major TeX engines. This works almost in the same way as \edef except that it is itself expandable and it does not require # tokens to be doubled. Using this primitive, we added e-type expansion to expl3, and have used it for creating variants of expandable functions.
That left us with two almost-identical variants and a tricky task of explaining which to use, as there are places we want e-type expansion even if the underlying function isn’t expandable (where that # doubling business is an issue). In particular, with a bit of care for a few edge cases, it turns out that everything that is set up for x-type expansion can be converted to e-type. That includes things like \cs_set_nopar:Npx, which when you look closely we should have called \cs_set_nopar:Npe from the word go: there’s no # doubling as this is just \edef renamed.
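To illustrate that renaming (using a hypothetical \mypkg_... function; the two lines behave identically in current expl3):

```latex
\ExplSyntaxOn
\tl_set:Nn \l_tmpa_tl { hello }
% Old name: the x here really just means 'defined with \edef'
\cs_set_nopar:Npx \mypkg_tmp:n #1 { \tl_use:N \l_tmpa_tl ~ (#1) }
% New name: exactly the same behaviour, now consistently labelled e.
% Note the single # in both cases: no doubling is needed here
\cs_set_nopar:Npe \mypkg_tmp:n #1 { \tl_use:N \l_tmpa_tl ~ (#1) }
\ExplSyntaxOff
```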
So we’ve now made the decision to pivot toward e-type expansion across the board. We’ll be retaining the (now deprecated) x-type variants that are already in expl3, but the documentation and all new variants will only be e-type. Once the new release is out, package authors are encouraged to move all of their x-type usage to e-type. The timeframe will of course depend on the stability approach of individual package authors: for siunitx, I’m simply going to step the minimum required expl3 release and be done with it, but others may be more cautious.
What is important here is that almost all users should see minimal impact: provided the installed expl3 core files match, there should be no obvious change for end users. What we gain, though, at the code level is a lot more consistency and clarity of design choice.
That leaves one additional variant: f-type, which is almost like e-type but stops at the first non-expandable token. It exists largely because we needed something expandable before we had e-type, but it still has a few edge use cases. So we won’t be dropping it, but in almost all cases code using f-type expansion can move to e-type. Again, I’ve done that in siunitx and will do a sweep over the expl3 core soon. So we can expect to see a move to almost no use of f-type other than in some specialist low-level places.
A key driver in the tidy-up here is that we would like to provide, as far as possible, pre-defined variants for the core expl3 functions. That means having some way of avoiding a combinatorial explosion: the more variants we need, the more this is an issue. So we are aiming to get the ‘core’ set to n, V, v and e, plus N and c, with o and f where they are required. The latest expl3 release fleshes out more pre-defined variants for this set, and we expect that to grow a little more as we try to standardise more functions around this core set.
The keen expl3 programmer is likely wondering what they need to do in detail. Working on the basis that you are already requiring the expl3 release with these changes (2023-10-10), then:

- x-type variants provided by expl3 have now got a matching e-type, so you can simply change the naming, unless there are # tokens in an argument, in which case you also need to undouble them
- for internal code, replace your own x-type variants with e-type ones

As an example of the point on # tokens, you might currently have something like
\use:x
{
\cs_new:Npn \exp_not:N \mypkg_foo:w ##1 \c_colon_str ##2 \c_underscore_str
{
% Code using ##1 and ##2
}
}
which would need to change to
\use:e
{
\cs_new:Npn \exp_not:N \mypkg_foo:w #1 \c_colon_str #2 \c_underscore_str
{
% Code using #1 and #2
}
}
Of course, if there are no doubled # tokens to worry about, it’s really just a search-and-replace. So we can all now get on and use the ‘Jag’!
siunitx has supported uncertainty values in numbers. Uncertainties are a key piece of information about a lot of scientific values, and so it’s important to have a convenient way to present them. The most common uncertainty we see is one that is symmetrical: a value plus-or-minus some number, for example 1.23 ± 0.04. This could be a standard deviation from repeated measurement, or a tolerance, or derived some other way. Luckily for me, the source of such a value doesn’t matter: siunitx just needs to be able to read the input, store it and print the output. For both reading and printing, siunitx has two ways of handling these symmetrical uncertainties.
In version 3 of siunitx, I took that existing support and added a long-requested new feature: rounding to an uncertainty. That means that if you have something like 1.2345 ± 0.0367 and ask to round to one place, the uncertainty is first rounded (to 0.04), then the main value is rounded to the same precision (to 1.23).
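In input terms, that behaviour is controlled by the rounding options; a minimal sketch (option names as I read them from the siunitx v3 documentation):

```latex
\usepackage{siunitx}
\sisetup{round-mode = uncertainty, round-precision = 1}
% 1.2345 ± 0.0367, rounded to one place in the uncertainty:
% this should come out as 1.23 ± 0.04
\num{1.2345 \pm 0.0367}
```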
Building on that, v3.1 added the idea of multiple uncertainties. These come up in some areas (astronomy is one, particle physics another) where there are clear sources of distinct uncertainty elements. Supporting multiple uncertainties also means supporting descriptions for them: if you are dividing up uncertainty, you likely want to say why. So in v3.1, you can say 1.23(4)(5) or 1.23 ± 0.04 ± 0.05, set up the descriptors, and have something like 1.23 ± 0.04 (sys) ± 0.05 (stat) get printed. I’ve not had any feedback yet on this new feature: fingers crossed that means it all works 100%!
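A sketch of the input side (the uncertainty-mode and uncertainty-descriptors option names are my reading of the v3.1 documentation, so treat them as assumptions):

```latex
% Two uncertainties with descriptors; this is intended to print
% something like 1.23 ± 0.04 (sys) ± 0.05 (stat)
\sisetup{
  uncertainty-mode        = separate ,
  uncertainty-descriptors = { sys , stat } ,
}
\num{1.23(4)(5)}
```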
Now, for v3.3, I’ve looked at another long-standing request: asymmetric uncertainties. For this release, I’ve kept this area simple, as it’s one I know less about. There’s just a ‘compact’ input form, and one (compact) output form. So we can input 1.23(4:5) and get, in TeX terms, $1.23^{+0.04}_{-0.05}$ typeset. Asymmetric and symmetric uncertainties can be intermixed, and you can have multiple asymmetric ones. I’m hoping this feature gets picked up by users, and that I get some idea of what to do next. I suspect there might be alternative output formats requested, and I wonder whether a ‘long’ input form 1.23 + 0.04 - 0.05 will be asked for: I’ve not done that yet as it’s more tricky if the user misses one part out!
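Putting the new input form together (the mixed example below is my own extrapolation from the statement that asymmetric and symmetric uncertainties can be intermixed):

```latex
% Compact asymmetric input: +0.04 above, -0.05 below
\num{1.23(4:5)}
% An asymmetric uncertainty mixed with a symmetric one
\num{1.23(4:5)(6)}
% The same input form works inside a quantity
\qty{1.23(4:5)}{\electronvolt}
```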
Hopefully, with the introduction of asymmetric uncertainty support, siunitx covers just about all types of uncertainty in scientific data: it is aiming to be a comprehensive (SI) units package, after all!
café, it is made up of four codepoints. So we could in XeTeX/LuaTeX use a simple mapping to grab one character at a time and do stuff with it. However, that’s not always the case. Take for example Spın̈al Tap. The dotless-i is a single codepoint, but there is not a codepoint for an umlauted-n. Instead, that is represented by two codepoints: a normal n and a combining umlaut. As a user, it’s clear that we’d want to get a single ‘character’ here. So there’s clearly more work to do.
Luckily, this is not just a TeX problem and the Unicode Consortium have thought about it for us. They provide a data file and rules that describe how to divide input into graphemes: ‘user perceived characters’. So ‘all’ that is needed is to examine the input using these rules, and to divide it up so that ‘characters’ stay together.
For pdfTeX, there’s an additional wrinkle: it uses bytes, not codepoints, and so if we use a naïve TeX mapping, we would divide up any codepoint outside the ASCII range into separate bytes: not good. Luckily, the nature of codepoints is predictable: all that is needed is to examine the first byte and collect the right number of further bytes to re-combine into a valid codepoint.
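That first-byte rule can be sketched in expl3 terms (this \mypkg_... helper is purely illustrative, not part of the kernel): byte values below "80 stand alone, leading bytes below "E0 start two-byte sequences, those below "F0 three-byte sequences, and the rest four-byte ones.

```latex
\ExplSyntaxOn
% Expand to the total number of bytes in the UTF-8 sequence whose
% first byte has the integer value #1. Continuation bytes ("80-"BF),
% which never start a sequence, are not handled in this sketch.
\cs_new:Npn \mypkg_utfviii_length:n #1
  {
    \int_compare:nNnTF {#1} < { "80 }
      { 1 }
      {
        \int_compare:nNnTF {#1} < { "E0 }
          { 2 }
          { \int_compare:nNnTF {#1} < { "F0 } { 3 } { 4 } }
      }
  }
\ExplSyntaxOff
```

For example, é is the byte pair C3 A9, and C3 is below "E0, so two bytes are collected and recombined into the single codepoint U+00E9.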
This work isn’t something the average end user wants to do. Luckily, they don’t have to, as the LaTeX team have looked at this and created a suitable set of expl3 functions to do it: \text_map_function:nN and \text_map_inline:nn. So for example we can do
\ExplSyntaxOn
\text_map_inline:nn { Spın̈al ~ Tap } { (#1) }
\ExplSyntaxOff
and get
(S)(p)(ı)(n̈)(a)(l)( )(T)(a)(p)
in any TeX engine (assuming we are set up to print the characters, of course).
Taking a more ‘serious’ example (and one that is going to use LuaTeX for font reasons), we might want to map over Bangla text. It’s easy to do that with the expl3 function \tl_map_inline:nn, but it gives very odd results. In contrast, \text_map_inline:nn divides up the characters correctly.
\documentclass{article}
\usepackage{fontspec}
\newfontface\harfbengali
{NotoSansBengali-VariableFont_wdth,wght.ttf}[Renderer=HarfBuzz,Script=Bengali]
\begin{document}
\harfbengali
\ExplSyntaxOn
ন্দ্রকিন্দ্র
\par
\text_map_inline:nn{ন্দ্রকিন্দ্র}{(#1)}
\par
\tl_map_inline:nn{ন্দ্রকিন্দ্র}{(#1)}
\ExplSyntaxOff
\end{document}
which gives the correct grapheme division only in the \text_map_inline:nn case. (You’ll need Noto Sans Bengali available to make this work locally.)
So, as you can see, mapping over ‘real’ text is easy with expl3: you just need to know that the tools are there.
There are three ways of marking up inline math in LaTeX source:

- $...$
- \(...\)
- \begin{math} ... \end{math}

The last version is clearly far too verbose for routine use, but the first and second approaches have a much less clear-cut division.
Plain TeX uses the $...$ construct exclusively, and that means many experienced (La)TeX users simply use this without any further consideration. There are good arguments in favour of the syntax, most obviously that this switches directly into math mode (it uses the underlying TeX idea of category codes with no macro expansion required). On the other hand, it lacks any possibility of matching begin and end points.
LaTeX’s \(...\) syntax was introduced by Lamport early in the development of the format. Using separate begin and end marks means that it does allow error detection in the editor, and it is also linked visually to LaTeX’s display math \[...\] approach. (More on that below.)
So which one to use? Experience suggests that whilst Lamport made many good decisions in the design of LaTeX’s input syntax, \(...\) wasn’t the best of them. The number of times that pair-matching is helpful simply doesn’t compete with the extra complexity of the input. At the same time, there’s no difference in the results between the two syntaxes, so there isn’t a downside to using $...$. So I (and the majority of the current LaTeX team) favour using $...$.
I think it’s important to contrast this with (unnumbered) display math mode. There, the LaTeX \[...\] syntax is the officially-supported approach, and the plain TeX $$...$$ is not. For display math, there are significant differences in what can be done using \[...\] compared to directly switching to TeX’s display math mode using $$...$$, and so the situation is clear: use \[...\].
siunitx v3.1. One area where I’ve now been able to commit improvements is the handling of complex values. In v2, you could give complex values in the normal argument to \num or \SI. I removed that for v3, and of course that was not entirely popular. Instead, I introduced dedicated commands, \complexnum and \complexqty. Part of the reason for that was that it makes the implementation of \num and \qty/\SI easier. But the other was that I wanted to address polar form, and that really didn’t look viable if it was mixed in with the normal numerical argument type.
I’ve now committed a change that introduces support for polar form in siunitx. So what happens now is that if you give a value such as \complexnum{10:30}, it’s treated as a magnitude and an angle. The latter has a setting to determine whether it’s regarded as being in degrees or radians. The package can then typeset the result in a similar form, using the \angle symbol between the two parts. You can also set up conversion between the classical (Cartesian) and polar forms of the value. So hopefully this shows why I wanted to separate out complex numbers: they need special handling, and now they get it.
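A sketch of the new behaviour (the option names complex-angle-unit and complex-mode are how I read the development documentation, so treat them as assumptions):

```latex
% Polar input: magnitude 10, angle 30
\complexnum{10:30}
% Interpret the angle as degrees rather than radians
\sisetup{complex-angle-unit = degrees}
% Convert polar input to Cartesian (a + bi) form on output
\sisetup{complex-mode = cartesian}
\complexnum{10:30}
```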
siunitx on the v3.0.x branch. These have addressed quite a few minor bugs: I expected to have to do a bit of work, since the shift from v2 was quite major. Things are now settling down: the open issues I’ve had recently are mainly on the border of feature requests, and there don’t seem to be additional changes I’ve introduced by accident. With the TeX Live freeze coming up, now looks like an excellent time to turn my thoughts to siunitx v3.1. The plan there is to deal with two areas. Of course, that doesn’t rule out further bugs being fixed in the v3.0 branch: I will continue to fix things that come up there. Depending on how the big issues go, I might manage a v3.1 release in the summer (June-August).
I consulted with my favourite duck internet buddy, Paulo Cereda, and he pointed me to the rather flexible Hamilton theme. You’ll see I’m tweaking it a bit, so there will be minor changes over time, but I think it looks good: it balances not being totally plain against the fact that I have zero design ability! Paulo himself doesn’t have a blog, but he’s part of the excellent Island of TeX, most famous for arara. Paulo tells me they don’t use Jekyll, but rather Zola, but that didn’t stop him helping me :)
siunitx was out in May, after the TeX Live 2021 DVD. That means it’s been picked up primarily by more active users: people who install TeX between the ‘fixed’ DVD releases (or who use MiKTeX). It also didn’t initially appear on Overleaf, as they take a while to test TeX Live images before making them public.
I’ve been making maintenance releases between May and now, and have reached v3.0.36, picking off small (or less small) issues I’d missed initially. At the same time, Overleaf now have a TeX Live 2021 image (currently featuring siunitx v3.0.23). So I now have an increasing number of ‘normal’ users: people who don’t want to deal with testing, and just want their documents to work.
What I notice is that increased usage hasn’t raised any truly major issues. Yes, there have been corrections (see the ChangeLog for the detail), but they were mainly at the level of predictable issues: places that I’d not explored quite enough. I hope Overleaf will consider an in-place update to somewhere around the latest release: whilst the issues have been minor in the grand scheme, it would be good to get a reasonably bug-reduced version out there (I’m not claiming bug-free)!
So I’m seeing the release as in the end quite a big success: I’ve addressed the issues I knew about, got better testing, have cleaner interfaces and am already offering new features. My mind is therefore turning to v3.1: I have a list of issues to consider that I’d like to take for that release, plus I could pick off some others. I might of course not tackle all of these: I’m thinking of starting over the Christmas period and looking to release in March/April 2022. By then of course we might be at v3.0.50, so it will also help to ‘reset’ the patch level!
So it was quite interesting to be talking yesterday at a chemistry conference (the ACS Fall 2021 Meeting) about siunitx. I’d been invited by Stuart Chalk to a session on units and data reuse: much more like metrology/computer science than my usual day-to-day wet chemistry!
It was good to see that many of the things I do in siunitx fit into wider efforts by people who do day-to-day work on units. The idea of logical mark-up for unit input, the ability to decompose units into parts and the realities of less-than-ideal input from users were all there. Hopefully, siunitx will help with the work being done by groups such as DRUM (Digital Representation of Units of Measure) to make information more computer-readable. I’ll also be looking at QUDT for inspiration about the real technical detail of the myriad of units in real use.

I also managed to get in a few comments about some LaTeX work that’s important for data reuse more widely: tagged PDFs and tex4ht.
So it was a pretty productive use of an evening!
siunitx out, I am as expected getting quite a few questions about moving from v2. In the main, this is quite easy, as there is a decent amount of compatibility code. Here, I’ll pick out a few cases where you might want some adjustments.

One thing that people sometimes need is to work with the latest version but allow their input to work with the older version: that’s particularly true if you work with people using Overleaf, as it will be some time before they update to v3. You can of course just stick to the v2 interfaces, but if you’d prefer to have v3 if possible, then you will need to define \qty and \unit (and maybe others) conditionally. I’d recommend doing that using
\usepackage{siunitx}
\ifdefined\qty\else
\ifdefined\NewCommandCopy
\NewCommandCopy\qty\SI
\else
\NewDocumentCommand\qty{O{}mm}{\SI[#1]{#2}{#3}}
\fi
\fi
\ifdefined\unit\else
\ifdefined\NewCommandCopy
\NewCommandCopy\unit\si
\else
\NewDocumentCommand\unit{O{}m}{\si[#1]{#2}}
\fi
\fi
That then leaves options, but almost always these should be set in the preamble, so are a ‘one shot’. You can of course add to my tests above to know which version is in use, and set selectively.
For people who’ve been using products or complex numbers in \SI in v2, one could use a similar approach to the above to ‘keep’ the functionality by setting it equivalent to the new \qtyproduct or \complexqty commands: of course, if you want both then you’ve got to make bigger changes. For example, to retain the ability to use products in \SI, you’d use
\usepackage{siunitx}
\ifdefined\qtyproduct\else
\ifdefined\DeclareCommandCopy
\DeclareCommandCopy\SI\qtyproduct
\else
\DeclareDocumentCommand\SI{O{}mm}{\qtyproduct[#1]{#2}{#3}}
\fi
\fi
at the cost that the code is a bit slower than \qty for input without products. Complex values would be handled the same way, just changing the command you use as a ‘replacement’.
In v2, \litre and \liter produced different output: that was not the best interface decision. So in v3 they are the same, but that means of course that you might see a change. Luckily, you can set the output you want and get the same in both v2 and v3.

\DeclareSIUnit\litre{l}