TUG2018: Day Two

The second day of TUG2018 picked up with a few announcements for those of us here at IMPA, before we moved on to the business end.

Early morning session

Frank Mittelbach started the day’s proceedings, talking about his doc package for literate programming. He explained the background, what works and, more importantly, what didn’t. The success of doc as a standard makes change challenging, but at the same time there is a need for updates. He then laid out goals for a new version: backward compatibility, new mark-up and out-of-the-box hyperref support. He showed us the features for creating new mark-up. There are some wrinkles, for example that hyperref support still has to be manually activated. Frank wrapped up by pointing to the testing version, and gave us a likely release date (for TL’19).

I then gave my first talk of the day, looking at expl3 concepts related to colour and graphics. I outlined the LaTeX2e background, what is happening with the LaTeX2e drivers and then moved on to my expl3 experiments. First I talked about colo(u)r, and the idea of colour expressions as introduced by xcolor. These are trivial to work out in expl3 due to the expandable FPU we have. I then looked at creating graphics, particularly how I’ve been inspired by pgf/TikZ. I showed how I’ve used the fact that pgf has a clear structure, and mapped that to expl3 concepts. I showed some examples of the existing drawing set up, and where I’ll be going next.

After coffee

We returned after coffee for a short talk from Boris Veytsman on tackling an apparently simple issue: putting leaders level with the first line of a long title! He showed that this is a non-trivial requirement, and how as a contractor he has to explain this to his customers. He then showed how he solved the issue, leading to a lively discussion about other possible approaches.

I then came back for my second talk of the day, this time on siunitx. I started by explaining the history of the package, beginning with the initial comp.text.tex post that led to its creation. I outlined the core features, present from version 1, and why I’ve now re-written the package twice. I finished by promising a first alpha version of version 3: that’s available here.

Frank then returned for a morning of symmetry, talking about compatibility requirements. He covered the historical situation, starting from Knuth’s introduction of TeX and taking us through the development of LaTeX, PDF support and Unicode engines. He then moved on to look at the LaTeX2e approach to compatibility, starting with the 1994 approach, fixltx2e. He explained how that was intended to work, and why it didn’t. The new approach, latexrelease, tackles the same problems but starts from the idea that it applies both to the kernel and to packages. Frank covered the idea of rollback in packages, and how this works at the user and developer levels. He finished off with some thoughts about the future, and the fact that most new users probably pick up these ideas without issue.

After lunch

Our conference chair, Paulo Ney de Souza, took the first slot after lunch to speak about how he’s approached a major challenge: managing the abstracts for the upcoming ICM2018 meeting. His talk ranged over topics such as citation formatting, small-device output, production workflows and dealing with author preambles. He covered the wide range of tools his team have assembled to automate PDF creation from a heterogeneous set of sources. His wide-ranging talk was a tour de force in automated publication.

After a brief break, we moved on to Tom Hejda (who TeX-sx users know as yo’), speaking about his tool yoin. He explained that his workflow for producing journal issues is currently a mix of a range of tools, which is likely not sustainable in the long term. He then showed how yoin can be used to compile both the master file for an issue and, as required, each article within it.

The last talk of the day was from Joachim Heinze, formerly of Springer. He talked about journal publishing, and how online accessibility of publications has changed the landscape for publishers. He gave an entertaining look into this world, posing the question ‘Where is the information we have lost in data?’.

With the formal business done, some of the group remained at IMPA for a workshop on R and Knitr, led by Boris Veytsman. Later, we all met up again for the conference dinner at Rubaiyat Rio.

TUG2018: Day One

Most of the foreign delegates for TUG2018 had met up by last night at the conference hotel, and the chats continued over breakfast. Then it was down to the minibus to head to the venue, IMPA.

Opening session

After a brief introduction from the conference chair, Paulo Ney de Souza, the floor was handed to Roberto Ierusalimschy to start us off with a bang: an overview of Lua development. He gave us an insight into how Lua grew from its early beginnings, and how it got picked up by games developers: a really big part of Lua’s importance. He then covered the two key aspects of Lua’s success: the ability to embed and to extend the language. That’s led to Lua being embedded in a range of applications, particularly games but also devices as varied as cars and routers. We had a lively question session, ranging from Unicode support to what might have been done differently.

We then moved on to Eduardo Ochs, talking about using Lua as a pre-parser to convert ‘ASCII art’ into complex mathematical diagrams. He explained the pre-history: the origin of the ASCII art as comments to help understand complex TeX code! After a summary of the original pre-processor, he showed how using Lua(TeX), the processing can be done in-line in the file with no true pre-processing step. He showed how this can be set up in an extensible and powerful way.

Morning post-coffee

After the coffee break (plus cake), we reconvened for three talks. Mico Loretan started, focussing on his package selnolig. He began by showing us examples of ‘unfortunate’ ligatures in English words, and how they can be suppressed by babel and by selnolig. He then focussed in on the detail: what a ligature is, why they are needed and how different fonts provide them. He moved on to detail why you need to suppress ligatures, in particular where they cross morpheme boundaries. Mico then gave us a very useful summary of how the linguistics works here and how it needs to link to typography. After showing us the issues with other approaches, he moved on to the detail of how selnolig uses LuaTeX callbacks to influence ligatures ‘late’ in processing. His rule-based interface means that ligatures can be suppressed for whole classes of words.

I spoke next, focussing on l3build. I gave a brief overview of LaTeX testing, from the earliest days of the team to the current day. I covered why we’ve picked Lua for our current testing set-up, what works and what (currently) doesn’t.

Paulo Cereda then talked about his build tool, arara. He started with an overview of other tools, before explaining how arara is different: it’s a ‘no-guesswork’ approach. He showed us the core, simple, syntax, before moving on to a timeline of releases to date. He summed up the new features in version 4.0, before moving to a series of live demonstrations. These started with simple ideas and moved on to new, complex ideas such as conditionals and taking user input. He then finished by looking to the future, both of arara and of araras (parrots).

After lunch

We started back after lunch with a couple of slides from Barbara Beeton, sadly absent from the meeting, presented by TUG President Boris Veytsman.

Will Robertson then took the podium. He started with some non-TeX thoughts on questions he gets as an Australian. His koala pictures were particularly fun. His talk then moved to his work with the Learning Management System (LMS) used by his employer. This system (Canvas) has a programmable API for controlling the information made available to students. He laid out the issues with the documentation he had: a very large, unmaintainable word-processing document. Will talked about various tools for creating HTML from LaTeX, the workflow he has chosen, and then showed more detail on the system he is using, LaTeXML. He then expanded on how, using LaTeXML plus scripting, he can populate the LMS in a (semi-)automated way, making his work more efficient.

The second speaker in the ‘Australian panel’ session was Ross Moore. Ross started with a demo of why tagging PDFs is needed: making the information accessible not just to people but widely to the computer, to allow re-use in alternative views. He expanded on the drivers for this, in particular legal requirements for accessible documents.

After afternoon break

Our next talk came in remotely from Sandro Coriasco. He started by outlining the team involved in this work, which is focussed on making material accessible to the blind. Their work has been targeted at mathematical formulae, generating ‘actual text’ which can then be used by screen readers or similar tools. He then showed how this makes additional useful information available to such software.

We then had a non-TeX talk: Doris Behrendt on GDPR. She started by looking at the EU Official Journal on the GDPR, and we had an excursion into the font used for typesetting (Albertina). She then gave details of the regulations, along with a number of extremely amusing examples of how people have approached them.

Presentations over, the TUG AGM took place, concluding the formal business of the day.

TUG2018 Preview

The TUG2018 meeting starts tomorrow in Rio de Janeiro, Brazil, and the delegates have begun to gather (many of us are staying at the Everest Rio Hotel). I’ll be trying to write up notes each day to summarise the talks, discussions, etc., but you’ll also be able to watch live. There’s also a chat room on TeX StackExchange dedicated to the meeting.

Informal discussions are already ongoing (the LaTeX team members have been hard at it since breakfast), so it should be a productive time.

The TeX Frequently Asked Question List: New hosting

The TeX Frequently Asked Question (FAQ) List has been a fixture of the TeX world for many years. It started out as a regular column in the (now dormant) UK-TUG journal Baskerville, before being taken up as an essentially one-person project by Robin Fairbairns.

Since Robin’s retirement, the FAQ has remained available online, but maintenance has essentially been ‘in hibernation’. That’s largely because the structure of the sources was tricky: they were designed to be typeset, and to give HTML output following scripted conversion. For the ‘new’ team (currently David Carlisle, Stefan Kottwitz, Karl Berry and me) looking after the material, that has been awkward, as we are not editing the sources directly on the server (Robin’s old set-up).

To keep the FAQ up-to-date and easy-to-maintain, the sources have been converted to Markdown to allow them to be used in a GitHub Pages set up. The traditional http://www.tex.ac.uk website now redirects to texfaq.org, which will be the canonical site address. You can also go ‘directly’ to the GitHub Pages site, texfaq.github.io. (There are a few final adjustments to make, so at the moment you might get redirected from texfaq.org to texfaq.github.io.)

The aim remains to have a curated set of FAQ, not growing too big and staying authoritative. Of course, the core team appreciate help making that the case: you can access the material on GitHub to log issues or make suggestions for change.

texdoc.net

After a bit of an interruption in service, the great texdoc.net is back on-line. For those not familiar with it, the site provides a web interface to the texdoc command from TeX Live/MiKTeX: you can type in (or link to) a package name and see the current docs, for example http://texdoc.net/pkg/siunitx. The updated site now has TeX Live 2018, the current release of the major TeX system for Linux (which is what the site runs on). Many thanks to Stefan Kottwitz for his continued work on this, and to DANTE for supporting the cost of running the site.

BibTeX futures

Those people on the LaTeX-L list will have spotted a pretty important mail today: bibtex futures: url, doi, ?, posted by Karl Berry.

The key questions raised there are focussed in two areas: whether the core BibTeX .bst files should support url and doi data (and if so, how), and whether any effort should be made to support Unicode data. These are important questions, and I’d encourage everyone to take a look and to contribute. If anyone wants points raised but is not subscribed to the list, drop me a mail or leave a comment and I’ll forward it.
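
For anyone less familiar with the detail, the fields in question look like this in a .bib entry (the entry, DOI and URL below are invented purely for illustration):

@article{sample2018,
  author  = {A. N. Author},
  title   = {A sample article},
  journal = {Journal of Examples},
  year    = {2018},
  doi     = {10.1234/example.2018},
  url     = {https://example.org/sample},
}

The question is then whether the standard styles (plain.bst and friends) should print these fields at all, and if so in what format.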

xparse: optional arguments (at the end)

For many years, the LaTeX team have been wondering about a subtle question: how do we deal with spaces before optional arguments? It’s easy enough if we know there are more mandatory arguments to look for:

\foo{first}  [optional]  {second}

should be treated in the same way as

\foo{first}[optional]{second}

However, when the optional argument comes last, it’s a bit more tricky. It’s easy enough to have

\foo{first}[optional]
\foo{first}  [optional]

both do the same thing, but there is one classic LaTeX2e case to worry about. When you load amsmath, you’ll find that

\begin{align}
   a  & b \\
  [c] & d \\
\end{align}

doesn’t treat [c] as the optional argument to \\: spaces are not skipped here. (This is pretty sensible for mathematics: it’s quite possible to have something in square brackets.)

To date, we’ve handled this by simply not allowing any spaces before optional arguments when they come ‘at the end’ (after any mandatory ones). However, that means it applies to all cases, which we’ve thought for some time was not ideal: most of the time, spaces here are likely fine.

For the next release of xparse, we’ve revisited this area and introduced a new ! modifier for optional arguments. Unmodified optional arguments will now allow spaces, with the ! modifier preventing this. Thus we can now describe \\ in xparse terms as

\DeclareDocumentCommand{\\}{!s !o}{<code>}

meaning no spaces are allowed, whilst most other commands will now allow spaces. This should affect very few end user documents, but does make for a better long-term approach.
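
As a quick sketch of the new behaviour (the \demo command below is invented purely for illustration, it is not part of any package):

\DeclareDocumentCommand{\demo}{m o}
  {#1\IfNoValueTF{#2}{}{ with [#2]}}

% \demo{first} [extra]  -> the space is skipped, so [extra] is picked up as the optional argument
% With the spec {m !o}, the same input would instead leave ' [extra]' as ordinary text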

LaTeX2e: UTF-8 as standard

Stability is a key idea when making changes to the LaTeX2e kernel: users need to know that they’ll always get the same output for the same (valid) input. At the same time, the world moves on and we need to respond to real-world use. To date, the LaTeX kernel has stuck to purely ‘classical’ 7-bit input support ‘out of the box’, meaning that with pdfTeX you need to load inputenc to properly use any extended characters. For English speakers that doesn’t really show up, but almost everyone else needs

\usepackage[<encoding>]{inputenc}

at least unless they are using XeTeX or LuaTeX. When there were many competing input encodings, having to load one specifically made more sense, but today UTF-8 is the standard, and (almost) all new documents should be using it.

For the next LaTeX2e release, the team are changing this position, and UTF-8 will be assumed as the default input encoding: https://github.com/latex3/latex2e/issues/24. For most users, this will be entirely transparent: they’ll already be loading inputenc. Of course, there will be a few users who need to make adjustments, most obviously those who have relied on the default settings for the upper half of the 8-bit range (they will need \usepackage[latin1]{inputenc}).
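
To sketch what this means in practice (the example file below is purely illustrative):

% With the new default, a pdfLaTeX file saved as UTF-8 needs no inputenc line
\documentclass{article}
\begin{document}
Grüße from São Paulo
\end{document}

% A legacy file actually saved as latin-1 must now say so explicitly:
% \usepackage[latin1]{inputenc}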

Testing is underway and a few areas are still being addressed. The aim is to make life easier for all LaTeX users: feedback is most welcome.

beamer developments

I’ve been looking after beamer for a few years, largely ‘by accident’ (this seems to happen quite a lot). Relatively recently, I moved the code from BitBucket to GitHub, mainly because there’s a slow drift in that direction for LaTeX projects. The advantage of that is the chance to pick up additional help.

Eagle-eyed readers will have noticed that over the last few months there have been a lot of beamer check-ins from Louis Stuart. He’s doing excellent work on tackling a lot of tricky beamer bugs, and I hope this will mean a better user experience. Of course, changing a complex product like beamer does have some risks, and so it’s also important to get the release set up working smoothly. To that end, I’ve migrated from some custom Makefile structures to using l3build (with some new features in the latter to help). That should mean a more regular release schedule.  It also means we can integrate testing into the coding: currently there is just the one test, but I’d welcome additions!

LaTeX2e kernel development moves to GitHub

The LaTeX team have two big jobs to do: maintaining LaTeX2e and working on LaTeX3 (currently as new packages on top of LaTeX2e). For quite a while now the LaTeX3 code has been available on GitHub as a mirror of the master repository. At the same time, the core LaTeX2e code was also available publicly using Subversion (SVN) via the team website. At least in the web view, the latter has always been a bit ‘Spartan’, both in appearance and in features (only the most recent revision could be seen).

Coupled to viewing the code for any project is tracking the issues. For LaTeX2e, the team have used GNATS for over twenty years. GNATS has served the team well but, like the Subversion web view, it is showing its age.

We’ve now decided that the time is right to make a change. Eagle-eyed users will already have spotted the new LaTeX2e GitHub page, which is now the master repo for the LaTeX kernel. We’ve not yet frozen the existing GNATS database, but new bugs should be reported on GitHub. (For technical reasons, the existing GNATS bugs list is unlikely to be migrated to GitHub.)

Frank Mittelbach (LaTeX team lead developer) has written a short article on the new approach, which will be appearing in TUGboat soon. As Frank says, we hope that most users don’t run into bugs in the kernel (it is pretty stable and the code has been pushed pretty hard over the years), but this new approach will make reporting that bit easier and clearer.

Accompanying the move of LaTeX2e to GitHub, the LaTeX3 Subversion repository has also been retired: the master location for this is also now on GitHub. So everything is in a sense ‘sorted’: all in one place.

Of course, the team maintain only a very small part of the LaTeX ‘ecosystem’: there are over 5000 packages on CTAN. To help users know whether a bug should be reported to the team or not, we have created the latexbug package. An example of using it:


\RequirePackage{latexbug}
\documentclass{article}
\begin{document}
Problems here
\end{document}

will give a warning if there is any code that isn’t covered by the team (and so should be reported elsewhere). We hope this helps bugs get to the right places as easily as possible.


I handled most of the conversion from Subversion to Git, and I’d like to acknowledge SubGit from TMate Software for making the process (largely) painless. As LaTeX is an open source project, we were able to use this tool for free. We used SubGit for the ‘live’ mirroring of LaTeX3 to GitHub for several years, and it worked flawlessly. The same was true for the trickier task of moving LaTeX2e: the repo history had a few wrinkles that were slightly more difficult to map to Git, but we got there.