A LaTeX format beyond LaTeX2e

The question of why LaTeX3 development is not focussed on LuaTeX came up yesterday on the TeX-sx site. I’ve added an answer there covering some of the issues, but I thought that something a bit more open-ended might also be useful on the same topic.

Before I look at the approaches that are available, it’s worth asking why a format is needed beyond LaTeX2e. There are several reasons I feel it’s needed, but three stand out.

The first, strangely, is stability. LaTeX2e is stable: there will be no changes other than bug fixes. That means that a document written 10 or more years ago should still give the same output when typeset today. That sounds great, but there is an issue here. While the kernel is stable, packages are not, and the limitations of the kernel mean that there are a lot of packages. So for a lot of real documents, stability in the kernel does not mean that they will still work after many years, at least without some effort. So we need a kernel which provides a lot more of the basics, and perhaps new approaches to providing stable code.

Secondly, and related, is the fact that most real documents need a lot of packages, and that is a barrier to new users. Again, stability is great but not if it means we don’t continue to attract new people to the LaTeX world. I think that the LaTeX approach is a good one, so that is important to me. So I feel that we need a format which works well and provides a lot more functionality as standard.

Thirdly, there are some fundamental issues which are hard to address, such as inter-paragraph spacing, the placement of floats and better separation of design from input. These all need big changes in LaTeX, and it’s not realistic to hope to bolt such changes on to LaTeX2e and have everything continue to work.

All of that tells me we need a new kernel. So the question is how to achieve that. There are at least four programming approaches I’ve thought about.

Two are closely related: stick with TeX macro programming and cross-engine working, but make things more systematic. Perhaps the simplest way to do this is to adopt an approach similar to the etoolbox package, and to essentially add to the structures already available. The more radical approach in the same area is to do what the LaTeX3 Project have done to date, and define a new programming language from the ground up using TeX macros. There are arguments in favour of both of these approaches: I’ve done some experiments with a more etoolbox-like method for creating a format. My take here is that if you really want something more systematic than LaTeX2e then you do have to go to something like the LaTeX3 method: dealing with expansion with names like \csletcs gets too unwieldy as you try to construct an entire format.
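
To give a flavour of the naming point, here is a rough side-by-side of one small task: making one macro a copy of another, where either side may be given as a name rather than as a control sequence. The macro names mynew and myold are made up for the illustration (and \myold is assumed to be defined already); the expl3 lines need to sit inside \ExplSyntaxOn ... \ExplSyntaxOff.

% etoolbox: a separate command name for each combination
\cslet{mynew}\myold
\letcs\mynew{myold}
\csletcs{mynew}{myold}
% expl3: one base name, with the argument types recorded after the colon
\cs_set_eq:cN { mynew } \myold
\cs_set_eq:Nc \mynew { myold }
\cs_set_eq:cc { mynew } { myold }

In the expl3 version, c marks an argument given as a name and N one given as a single token, so the same pattern carries over to other functions without new names to learn.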

Moving to a LuaTeX-only solution, and doing a lot of the programming in Lua, is the method that the ConTeXt team has decided on. This brings in a proper programming language without any direct effort, but leaves open some issues. Using Lua does not automatically solve the challenges in writing a better format, and using LuaTeX does not mean that there is no TeX programming to do. So a LuaTeX-only approach would still need some TeX work.

Finally, there is the argument for parsing LaTeX-like input in an entirely new way. In this model, you don’t use TeX at all to read the user’s input: that’s done by another language, and TeX is only involved at all when you do the typesetting. That sounds challenging, and the big issue here is finding someone who has the necessary programming skills (I certainly do not).

Of the four approaches, it seems to me that from where we are now, the LaTeX3 approach is not so bad. If you were starting today with no code at all, and no background in programming expl3 or Lua, you might pick the LuaTeX method. That’s not, however, where we are: there is experience of expl3 available, and there is also code written (but in need of revision). Of course, the proof of that will be in delivering a working LaTeX3 format: on that, back to work!

Changes to The LaTeX Community

Many readers will be familiar with The LaTeX Community, an online forum for LaTeX advice that has been running now for a few years. The site was set up in 2007 by Sven Weigand, and has grown since then to over 56 000 posts.

Running such a big site is clearly not easy, and requires some expertise at the ‘back end’. Sven has recently handed over running the site to Stefan Kottwitz, a moderator on the forum since soon after it was started. Stefan has moved the site itself to a new machine, and is now improving the features. Highlights include:

  • Syntax highlighting for code blocks
  • The ability to mark up inline code
  • A quick method to link to package homepages on CTAN

Stefan’s work rate in helping users is amazing, and I’m sure that he’ll continue to add new ideas to The LaTeX Community!

Programming LaTeX3: Integers and integer expressions

In the last entry, I talked about token list variables. As we’ve seen, these can be used to hold basically anything, but at the cost that there is no internal structure. I’ve also hinted that LaTeX3 provides a number of richer data types. One that we will need sooner rather than later is the int type for storing integers. At the same time, we can look more widely at what are called integer expressions: calculations which work with whole numbers.

Storing integers

Based on what we have already seen with token lists, it should be no surprise that we can create and set int variables with function names you might be able to guess:

\int_new:N \l_my_a_int
\int_set:Nn \l_my_a_int { 1 + 1 }
\int_show:N \l_my_a_int % => '2'

Creating and setting the variable should seem easy enough here, but you might wonder about the result of showing the content here: it’s not what we put in. That’s because LaTeX3 treats the second argument of \int_set:Nn as an integer expression: something to be evaluated to give an integer.
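
A few related functions follow the same naming pattern. The lines below are only a sketch: the variable name \l_my_demo_int is made up for the illustration, and the code is assumed to be run with a reasonably recent expl3 (on older kernels, load the expl3 package first).

\ExplSyntaxOn
\int_new:N \l_my_demo_int
\int_set:Nn \l_my_demo_int { 2 * 3 } % evaluated on assignment: 6
\int_add:Nn \l_my_demo_int { 4 }     % 6 + 4 = 10
\int_sub:Nn \l_my_demo_int { 1 + 2 } % 10 - 3 = 7
\int_incr:N \l_my_demo_int           % 7 + 1 = 8
\int_show:N \l_my_demo_int           % => '8'
\ExplSyntaxOff

In each case the n argument is evaluated before the variable is changed, so what gets stored is always a plain number.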

Integer expressions

All LaTeX3 functions which work with integers are set up to evaluate integer expressions, so it’s important to understand what they do. Expressions can use the standard arithmetic operations +, -, * (times) and /, plus parentheses. There are also some functions available for more complicated mathematical operations (for example \int_mod:nn to calculate the remainder on division).
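
As a quick illustration of these operations, the sketch below sets a made-up variable, \l_my_calc_int, a few times over. One point worth knowing is that / rounds to the nearest integer rather than truncating; \int_div_truncate:nn is there when truncation is wanted.

\ExplSyntaxOn
\int_new:N \l_my_calc_int
\int_set:Nn \l_my_calc_int { ( 1 + 2 ) * 3 } % => 9
\int_set:Nn \l_my_calc_int { 7 / 2 }         % => 4 (3.5 is rounded)
\int_set:Nn \l_my_calc_int { \int_div_truncate:nn { 7 } { 2 } } % => 3
\int_set:Nn \l_my_calc_int { \int_mod:nn { 7 } { 3 } }          % => 1
\ExplSyntaxOff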

More significantly, we can include other functions which themselves yield integers. For example, we’ve seen that it’s possible to work out the length of a token list, which is an integer:

\int_set:Nn \l_my_a_int { \tl_length:n { Hello } * 2 } % => 10

We can’t use just any function here: there are some restrictions. Clearly we need to get an integer out, but the functions also need to be expandable: that will be the topic of the next post!

Integer conditionals

A key use of integers is in conditionals. Earlier, we saw that conditionals in LaTeX3 are defined so that we have distinct true and false branches to follow. That applies to integer conditionals in exactly the same way as anything else:

\int_new:N \l_my_b_int
\int_set:Nn \l_my_b_int { 7 }
\int_compare:nTF { 1 = \l_my_a_int }
  { TRUE }
  { FALSE }
\int_compare:nNnTF { \l_my_a_int } = { \l_my_b_int }
  { TRUE }
  { FALSE }

You might wonder what is going on here: there are two different conditionals, both of which do a comparison. Well, there are two types of integer conditionals. The first type works out where the comparator is, and so only requires three arguments. The second type has to be given the two integer expressions to compare separately. It’s a bit more awkward to read, but the latter version is faster (it’s closer to the underlying TeX). You can pick whichever one you prefer: as I work on low-level code, I go for speed!
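
Both forms take full integer expressions on either side of the comparison, not just single variables. Here is a small sketch using two made-up variables, \l_my_x_int and \l_my_y_int, with values chosen so the outcomes are easy to check by hand.

\ExplSyntaxOn
\int_new:N \l_my_x_int
\int_new:N \l_my_y_int
\int_set:Nn \l_my_x_int { 2 }
\int_set:Nn \l_my_y_int { 7 }
% Expressions on both sides: 2 + 5 = 7
\int_compare:nNnTF { \l_my_x_int + 5 } = { \l_my_y_int }
  { TRUE }
  { FALSE } % => TRUE
% The one-argument form also understands <=, >= and !=
\int_compare:nTF { \l_my_x_int * 4 >= \l_my_y_int }
  { TRUE }
  { FALSE } % => TRUE, as 8 >= 7
% If only one branch is needed, drop the other letter
\int_compare:nNnT { \l_my_x_int } > { \l_my_y_int }
  { This~is~skipped,~as~2~is~not~greater~than~7 }
\ExplSyntaxOff

The nNn form only accepts <, = and > as its middle argument, which is part of why it can stay close to the underlying TeX.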

Closely related to conditionals are loops, and again these come pre-defined.

\int_zero:N \l_my_a_int % Hopefully obvious!
\int_while_do:nn { \l_my_a_int < 10 }
  {
    \int_use:N \l_my_a_int \\
    \int_incr:N \l_my_a_int
  }

Hopefully most of this code is clear: we zero the counter, then loop until it reaches 10. For each loop, I’ve printed (used) the value directly, then incremented it by one. (There is a whole family of these functions, with do_while in addition to while_do and nNn versions as for conditionals.)
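
To show two other members of the family, here is a short sketch with a made-up counter, \l_my_loop_int. The first loop is the nNn form of while_do, which tests before each pass; the second uses do_while, which runs the code once before testing at all.

\ExplSyntaxOn
\int_new:N \l_my_loop_int
% Test first, then run the code: typesets 0, 1 and 2
\int_zero:N \l_my_loop_int
\int_while_do:nNnn { \l_my_loop_int } < { 3 }
  {
    \int_use:N \l_my_loop_int \\
    \int_incr:N \l_my_loop_int
  }
% Run the code, then test: typesets 10 once, even though 10 < 3 is false
\int_set:Nn \l_my_loop_int { 10 }
\int_do_while:nn { \l_my_loop_int < 3 }
  {
    \int_use:N \l_my_loop_int \\
    \int_incr:N \l_my_loop_int
  }
\ExplSyntaxOff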

Integer expressions beyond \int_ functions

Integer expressions are not limited to \int_ functions. Indeed, we’ve already seen one in \prg_replicate:nn. This illustrates a general point: anywhere that LaTeX3 expects an integer, it’s coded to accept integer expressions.
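
For instance, the first argument of \prg_replicate:nn can be a calculation rather than a literal number. In this sketch \l_my_count_int is a made-up variable.

\ExplSyntaxOn
\int_new:N \l_my_count_int
\int_set:Nn \l_my_count_int { 2 }
\prg_replicate:nn { \l_my_count_int + 3 } { x } % typesets 'xxxxx'
\ExplSyntaxOff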

One function that I can’t miss out here is \int_eval:n, which just works out the value of the expression and leaves it in the input. It underlies a lot of the higher-level use of integer expressions, and we are certain to meet it later.
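
As a small sketch of what that means in practice, the lines below first typeset a result directly and then store one in a token list variable. The names \l_my_n_int and \l_my_result_tl are made up for the example, and the x in \tl_set:Nx asks for the argument to be fully expanded before it is stored, so it is the digits that end up in the variable rather than the calculation.

\ExplSyntaxOn
\int_new:N \l_my_n_int
\int_set:Nn \l_my_n_int { 4 }
\int_eval:n { 3 * ( \l_my_n_int + 5 ) } % typesets '27'
\tl_new:N \l_my_result_tl
\tl_set:Nx \l_my_result_tl { \int_eval:n { \l_my_n_int * \l_my_n_int } }
\tl_show:N \l_my_result_tl % => '16'
\ExplSyntaxOff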