xparse: optional arguments (at the end)

For many years, the LaTeX team have been wondering about a subtle question: how do we deal with spaces before optional arguments? It’s easy enough if we know there are more mandatory arguments to look for:

\foo{first}  [optional]  {second}

should be treated in the same way as

\foo{first}[optional]{second}

However, when the optional argument comes last, it’s a bit more tricky. It’s easy enough to have

\foo{first}[optional]
\foo{first}  [optional]

both do the same thing, but there is one classic LaTeX2e case to worry about. When you load amsmath, you’ll find that

\begin{align}
   a  & b \\
  [c] & d \\
\end{align}

doesn’t treat [c] as the optional argument to \\: spaces are not skipped here. (This is pretty sensible for mathematics: it’s quite possible to have something in square brackets.)

To date, we’ve handled this by simply not allowing any spaces before optional arguments when they come ‘at the end’ (after any mandatory ones). However, that means it applies to all cases, which we’ve thought for some time was not ideal: most of the time, spaces here are likely fine.

For the next release of xparse, we’ve revisited this area and introduced a new ! modifier for optional arguments. Unmodified optional arguments will now allow spaces, with the ! modifier preventing this. Thus \\ can now be described in xparse terms as

\DeclareDocumentCommand{\\}{!s !o}{<code>}

meaning no spaces are allowed, whilst most other commands will now allow spaces. This should affect very few end user documents, but does make for a better long-term approach.
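As a minimal sketch of the difference, the document below defines two commands that differ only in the ! modifier. The names \demoA and \demoB are invented here for illustration and are not part of xparse, and the example assumes an xparse release that already includes the new modifier.

```latex
\documentclass{article}
\usepackage{xparse}

% Hypothetical command: trailing optional argument, spaces allowed before it
\NewDocumentCommand{\demoA}{m o}{%
  (#1)\IfValueT{#2}{ with option: #2}%
}

% Hypothetical command: the ! modifier forbids spaces before the optional argument
\NewDocumentCommand{\demoB}{m !o}{%
  (#1)\IfValueT{#2}{ with option: #2}%
}

\begin{document}
% Here [y] is grabbed as the optional argument despite the space
\demoA{x} [y]

% Here [y] is not grabbed: it is typeset as ordinary text after (x)
\demoB{x} [y]
\end{document}
```

With no space before the bracket, both commands should behave identically, so the modifier only matters for cases like \\ where a following bracket may well be content.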

LaTeX2e: UTF-8 as standard

Stability is a key idea when making changes to the LaTeX2e kernel: users need to know that they’ll always get the same output for the same (valid) input. At the same time, the world moves on and we need to respond to real-world use. To date, the LaTeX kernel has stuck to purely ‘classical’ 7-bit input support ‘out of the box’, meaning that with pdfTeX you need to load inputenc to properly use any extended characters. For English speakers that doesn’t really show up, but almost everyone else needs

\usepackage[<encoding>]{inputenc}

at least unless they are using XeTeX or LuaTeX. When there were many competing input encodings, having to load one specifically made more sense, but today UTF-8 is the standard, and (almost) all new documents should be using it.

For the next LaTeX2e release, the team are changing this position, and UTF-8 will be understood as default: https://github.com/latex3/latex2e/issues/24. For most users, this will be entirely transparent: they’ll already be using inputenc. Of course, there will be a few users who need to make adjustments, most obviously those who have relied on the default settings for the upper-half of the 8-bit range (they will need \usepackage[latin1]{inputenc}).
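A rough sketch of what this means for a pdfTeX document saved as UTF-8 is below. The fontenc line and the sample text are my own additions for illustration; the commented-out lines show what was needed before and what a Latin-1 file relying on the old defaults would need afterwards.

```latex
\documentclass{article}
\usepackage[T1]{fontenc} % assumption: T1 output encoding for accented glyphs

% Before the change, a UTF-8 file needed this line with pdfTeX:
% \usepackage[utf8]{inputenc}

% After the change, a file saved as Latin-1 would instead need:
% \usepackage[latin1]{inputenc}

\begin{document}
Café, naïve, Größe
\end{document}
```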

Testing is underway and a few areas are still being addressed. The aim is to make life easier for all LaTeX users: feedback is most welcome.

LaTeX2e kernel development moves to GitHub

The LaTeX team have two big jobs to do: maintaining LaTeX2e and working on LaTeX3 (currently as new packages on top of LaTeX2e). For quite a while now the LaTeX3 code has been available on GitHub as a mirror of the master repository. At the same time, the core LaTeX2e code was also available publicly using Subversion (SVN) via the team website. At least in the web view, the latter has always been a bit ‘Spartan’, both in appearance and in features (only the most recent revision could be seen).

Coupled to viewing the code for any project is tracking the issues. For LaTeX2e, the team have used GNATS for over twenty years. GNATS has served the team well, but like the web view of Subversion is showing its age.

We’ve now decided that the time is right to make a change. Eagle-eyed users will already have spotted the new LaTeX2e GitHub page, which is now the master repo for the LaTeX kernel. We’ve not yet frozen the existing GNATS database, but new bugs should be reported on GitHub. (For technical reasons, the existing GNATS bugs list is unlikely to be migrated to GitHub.)

Frank Mittelbach (LaTeX team lead developer) has written a short article on the new approach, which will be appearing in TUGboat soon. As Frank says, we hope that most users don’t run into bugs in the kernel (it is pretty stable and the code has been pushed pretty hard over the years), but this new approach will make reporting that bit easier and clearer.

Accompanying the move of LaTeX2e to GitHub, the LaTeX3 Subversion repository has also been retired: the master location for this is also now on GitHub. So everything is in a sense ‘sorted’: all in one place.

Of course, the team maintain only a very small amount of the LaTeX ‘ecosystem’: there are over 5000 packages on CTAN. To help users know whether a bug should be reported to the team or not, we have created the latexbug package.  An example using it:


\RequirePackage{latexbug}
\documentclass{article}
\begin{document}
Problems here
\end{document}

will give a warning if there is any code that isn’t covered by the team (and so should be reported elsewhere). We hope this helps bugs get to the right places as easily as possible.


I handled most of the conversion from Subversion to Git, and I’d like to acknowledge SubGit from TMate Software for making the process (largely) painless. As LaTeX is an open source project, we were able to use this tool for free. We used SubGit for the ‘live’ mirroring of LaTeX3 to GitHub for several years, and it worked flawlessly. The same was true for the trickier task of moving LaTeX2e: the repo history had a few wrinkles that were slightly more difficult to map to Git, but we got there.

TeX Live 2017 Pretesting

Eager TeX users will have noticed that a few days ago TeX Live 2016 updates were frozen for ever. We now have the pretest available for TeX Live 2017. As always, using pre-release software is not without risk, but as you can install it in parallel with the older releases there is not a big problem. The LaTeX team will be updating a few things on CTAN to go into the new release, and I’ll probably mention some of that in future posts. A quick look over the changes tells us that there are minor (and perhaps not-so-minor) engine changes to explore: I’m particularly keen to try out the new XeTeX math mode approach, using HarfBuzz.

Standard font loading in LaTeX2e with XeTeX and LuaTeX

The LaTeX Project have been making efforts over the past few years to update support in the LaTeX2e kernel for XeTeX and LuaTeX. Supporting these Unicode-enabled engines provides new features (and challenges) compared to the ‘classical’ 8-bit TeX engines (probably pdfTeX for most users). Over recent releases, the team have made the core of LaTeX ‘engine-aware’ and pulled a reasonable amount of basic Unicode data directly into the kernel. The next area we are addressing is font loading, or rather the question of what the out-of-the-box (text) font should be.

To date, the LaTeX kernel has loaded Knuth’s Computer Modern font in his original ‘OT1’ encoding for all engines. Whilst there are good reasons to load at least the T1-encoded version rather than the OT1 version, with an 8-bit engine using the OT1 version can be justified: it’s a question of stability, and nothing is actually out-and-out wrong.

Things are different with the Unicode engines: some of the basic assumptions change. In particular, there are some characters in the upper-half of the 8-bit range for T1 that are not in the same place in Unicode. That means that hyphenation will be wrong for words using some characters unless you load a Unicode font. At the same time, both LuaTeX and XeTeX have changed a lot over recent years: stability in the pdfTeX sense isn’t there. Finally, almost all ‘real’ documents using Unicode engines will be loading the excellent fontspec package to allow system font access. Under these circumstances, it’s appropriate to look again at the standard font loading.

After careful consideration, the team have therefore decided that as of the next (2017) LaTeX2e release, the standard text font loaded when XeTeX and LuaTeX are in use will be Latin Modern as a Unicode-encoded OpenType font. (This is the font chosen by fontspec so for almost all users there will be no change in output.) No changes are being made to the macro interfaces for fonts, so users wanting anything other than Latin Modern will continue to be best served by loading fontspec. (Some adjustments are being made to the package to be ready for this.)
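For anyone who does want a different text font, a minimal fontspec sketch looks something like the following. The choice of TeX Gyre Pagella is purely an assumption for illustration (any installed OpenType font name would do), and the file needs to be compiled with XeLaTeX or LuaLaTeX.

```latex
% Compile with xelatex or lualatex, not pdflatex
\documentclass{article}
\usepackage{fontspec}

% Assumption: TeX Gyre Pagella is installed (it ships with TeX Live)
\setmainfont{TeX Gyre Pagella}

\begin{document}
Text set in a font other than Latin Modern.
\end{document}
```

Leaving out the \setmainfont line gives Latin Modern, which is the case where the kernel change should make no visible difference.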

It’s important to add that no change is being made in math mode: the Unicode maths font situation is not anything like as clear as the text mode case.

There are still some details being finalised, but the general approach is clear and should make life easier for end users.

TeX on Windows: TeX Live versus MiKTeX revisited

On Windows, users have two main choices of TeX system to install: TeX Live or MiKTeX. I’ve looked at this before a couple of times: first in 2009 then again in 2011. Over the past few years both systems have developed, so it seems like a good time to revisit this. (I know from my logs that this is one of the most popular topics I’ve covered!)

The first thing to say is that for almost all ‘end users’ (with a TeX system on their own PC just for them to use), both options are fine: they’ll probably notice no difference between the two in use. It’s also worth noting that there is a third option: W32TeX. I’ve mentioned this before: it’s popular in the Far East and is where the Windows binaries for TeX Live come from. (There’s a close relationship between W32TeX and TeX Live, with W32TeX more ‘focussed’ and expecting more user decisions in installing.)

Assuming you are going for one of the ‘big two’, what is there to think about? For most people, it’s simply:

  • Both MiKTeX and TeX Live include a ‘full’ set of TeX-related binaries, including the engines pdfTeX, XeTeX, LuaTeX and support programs such as BibTeX, Biber, MakeIndex and Xindy.
  • The standard installer for MiKTeX installs ‘just the basics’ and uses on-the-fly installation for anything else you need; the standard install for TeX Live is ‘everything’ (about 4.5 GB!). Which is right for you will depend on how much space you have: you can of course customise the installation of either system to include more or less of the ‘complete’ set up.
  • MiKTeX has a slightly more flexible approach to licensing than TeX Live does: there are a small number of LaTeX packages that MiKTeX includes that TeX Live does not. (Probably the most obvious example is thesis.)
  • TeX Live has a Unix background so the management GUI looks slightly less ‘standard’ than the MiKTeX one.
  • TeX Live has a strict once-a-year freeze, which means that to update you have to do a fresh install once a year. On the other hand, MiKTeX versions change only when there is a significant change and otherwise ‘roll onward’.

So the decision is likely to come down to whether you want auto-installation of packages. (If you do go for MiKTeX on a one-user PC, choose the ‘Just for me’ installation option: it makes life a lot simpler!)

For more advanced users there are a few more factors you probably want to consider:

  • TeX Live was originally developed on Unix and so is available for Linux and on the Mac (and other systems) as well as Windows; MiKTeX is a Windows system so is (more-or-less) Windows-only. So if you want exactly the same set up on Windows and other operating systems, this of course means you need to use TeX Live.
  • Both systems have graphical management tools as well as command line interfaces. They have a lot in common, but they are not identical (in particular, MiKTeX tends to emulate TeX Live command line interfaces, but the reverse is not true).
  • The engine binaries in TeX Live are (almost) never updated other than in the yearly freeze period, meaning that for a given release you know which version of pdfTeX, etc., you’ll have: MiKTeX is more flexible with such updates. (At different times, one or other of the systems can be more ‘up to date’: this is not necessarily predictable! The W32TeX system often has very up-to-date testing binaries.)
  • The two systems differ slightly in how local trees are managed (places to add TeX files that are not controlled by the TeX system itself). TeX Live automatically expects <installation root>/texmf-local to hold system-wide ‘local’ additions and <user root>/texmf to hold per-user additions, whereas MiKTeX has no out-of-the-box locations, but does make it easier to add and remove them from the command line. MiKTeX also makes it easy to add multiple per-user trees, whereas for TeX Live there’s more of an assumption that all user additions will be added in one place. (This makes it easier in MiKTeX to add/remove local additions by altering a setting in the TeX system rather than deleting files.)
  • TeX Live has a team doing the work; MiKTeX is a one-man project. This cuts both ways: you know exactly who is doing everything in MiKTeX (Christian Schenk), and he’s very fast, but there is more ‘spread’ in TeX Live for the work.
  • For people wanting to step quickly between different versions of TeX system, the fact that TeX Live freezes once a year makes life convenient. (I have TeX Live 2009, 2010, 2011, 2012, 2013, 2014, 2015 and 2016 installed at present, plus MiKTeX 2.9 of course!) You can switch installations by adjusting the PATH or by choosing the appropriate version from your editor, so you have a ‘fall back’ if there is an issue when you update.
  • TeX Live has built-in package backup during maintenance updates.

Dependencies

There’s been some recent discussion on the TeX Live mailing list about recording dependencies for (La)TeX packages. This is a good idea but means that package authors need to think about their dependency situation. So I thought a few words on this would be helpful, at least from the point of view of the most common case: LaTeX packages.

It’s pretty easy to accumulate \RequirePackage lines in your source, but if you are serious about giving a useful set of dependencies you need to know what each one is for. In many ways the rule is easy: require each package you use. What makes that more complicated is that you might use features which are available when you load package X but are actually provided by package Y. For example, if you load my siunitx package, it loads array, which means that you can do, for example,

\begin{tabular}{>{$}l<{$}}

So how do you tell what your ‘real’ dependencies are? The usual rule is that you check the documentation: does it say that package X itself provides the features you use? In the case above, siunitx doesn’t document that syntax extension for tabular: it’s documented by array. So if you wrote a package that uses siunitx but also needs to use features from array you should

\RequirePackage{array}
\RequirePackage{siunitx}

This means that even if at some future stage there’s a change in the internals of a package you load, things should still all work.

If you want to track down where stuff might be coming from, you can always use \listfiles to get a full overview of your current package use (starting from a small example).
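A small test file along these lines is enough to see what is really being loaded. This is only a sketch: the table content is a placeholder, and the exact file list in the log will depend on the package versions installed.

```latex
\listfiles % placing this first is the usual convention
\documentclass{article}
\usepackage{siunitx} % loads array behind the scenes

\begin{document}
% This column syntax is documented by array, not by siunitx
\begin{tabular}{>{$}l<{$}}
  x^2 \\
\end{tabular}
\end{document}
```

After a run, the ‘*File List*’ block at the end of the log should include array.sty even though the document never asks for it directly, which is the hint that a package relying on this syntax should require array itself.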

There are a few places where packages are so closely linked you might not have to list them both. The most obvious is TikZ/pgf: the two are different ‘layers’ of the same set up but are documented together, so if you load TikZ you can assume pgf. Of course, there is no harm in listing both!

LaTeX2e and e-TeX

LaTeX2e was released in 1994 and since then the LaTeX3 Project have been committed to keeping it working smoothly for users. That means balancing up keeping the code stable with fixing bugs and adding new features.

Back in 2003 the team announced that the e-TeX extensions would be used by the kernel when they were available. The new primitives offered by e-TeX make many parts of TeX programming easier and often there’s no way in ‘classical’ TeX to get the same effect. As e-TeX was finalised in 1999, starting to use it seriously in around 2004 meant most people had access to the extensions.

Since then, the availability and use of e-TeX have spread, and almost all users have the extensions available. Indeed, the standard format-building routines for LaTeX have included them for many years. There are also a lot of packages on CTAN that use e-TeX, most obviously any using the expl3 programming language that the LaTeX3 Project have created.

The team had always meant to say at some stage that e-TeX was now required, and indeed thought we had until I checked over the official newsletters! So as of the next LaTeX2e release, scheduled for the start of 2017, the kernel will only build if e-TeX is enabled. For this release, we are likely to add a test for e-TeX but no actual use directly in the kernel, though in the future there will probably be more use of the extensions.
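To check whether your own set up has the extensions, something like the following works at the document level. This is my own sketch of such a test, not the code the kernel will use: it relies only on the e-TeX primitives \eTeXversion and \eTeXrevision and on the LaTeX internal \@undefined.

```latex
\documentclass{article}
\makeatletter
% \eTeXversion is only defined when the e-TeX extensions are enabled
\ifx\eTeXversion\@undefined
  \typeout{e-TeX extensions are NOT available}
\else
  \typeout{e-TeX extensions available: version \the\eTeXversion\eTeXrevision}
\fi
\makeatother
\begin{document}
See the terminal or log for the result.
\end{document}
```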

pgfplots: Showing points as just error bars

Presenting experimental work in a clear form is an important skill. For plotting data, I like the excellent pgfplots package, which makes it easy to put together consistent presentations of complex data. At the moment, I’m doing some experiments where showing the error bars on the raw data is important, but at the same time I want to show fit lines clearly. The best style I’ve seen for this is one where the data are shown as simple vertical bars which have length determined by the error bars for the measurements. The fit lines then stand out clearly without overcrowding the plot. That style isn’t built in to pgfplots but it’s easy to set up with a little work:

\documentclass{standalone}
\usepackage{pgfplots}

% Use features from current release
\pgfplotsset{compat = 1.12}

% Error 'sticks'
\pgfplotsset{
  error bars/error mark options = {draw = none}
  % OR more low-level
  % error bars/draw error bar/.code 2 args = {\draw #1 -- #2;} 
}

\begin{document}
\begin{tikzpicture}
  \begin{axis}
    [
      error bars/y dir      = both,
      error bars/y explicit = true,
    ]
    \addplot[draw = none] table[y error index = 2]
      {
        0   0.023 0.204
        1   0.956 0.332
        2   4.234 0.552
        3   8.764 0.345
        4  17.025 0.943
        5  27.201 2.445
      };
    \addplot[color = red, domain = 0:5, samples = 100] {x^2};
  \end{axis}
\end{tikzpicture}
\end{document}

My demo only has a few data points, but this style really shows its worth as the number of points rises.