Progress update on siunitx version 2

Progress on siunitx version 2 has not been as fast as I’d like. There are a few reasons. First, my “TeX time” has been squeezed since the start of the year as I’m doing a couple of Open University courses. Some of the “TeX time” I do have has also been taken up with jobs for UK-TUG, mainly the website and membership renewals. Finally, when I looked at the wish-list for version 2 it turned out to be rather more complex than I’d remembered!

The net result is that I need to revise my initial timetable. I’m still aiming for release before the end of the year, so I’d like to get a testing version finished by early autumn (late September or early October). My key aim is to get some internal structures improved, so that I can build new features over time (even if not everything makes version 2.0).

Improving LaTeX for the user

I’ve been discussing some points about the future of (La)TeX with various people, and some key issues come to mind. Most LaTeX users do not want to meddle with the internal parts of LaTeX or TeX. In an ideal world, I suspect most users would like to need little beyond the correct document class to get things “just right” in their layout. Perhaps a few simple settings, but really little more than that.

With the correct packages and class loaded, you can do many things in LaTeX. However, you really shouldn’t need to load specific support for hyperlinks, T1 encoding, basic font changing, creating new float types and so on in 2009 (let alone 2010, 2011, etc.). The efforts of the LaTeX3 team have to date focussed on programming and to a lesser extent document design. How things will work at the user level is much less clear.
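To make the point concrete, here is roughly the kind of preamble a pdfLaTeX user ends up writing today just to get the basics listed above. It is only an illustration: the float type name "scheme" is made up for the example, and the package choices are one common combination rather than the only one.

```latex
% Typical pdfLaTeX boilerplate, shown purely as an illustration
\documentclass{article}
\usepackage[utf8]{inputenc} % declare the input encoding
\usepackage[T1]{fontenc}    % T1 font encoding
\usepackage{lmodern}        % scalable fonts available in T1
\usepackage{float}          % provides \newfloat
\usepackage{hyperref}       % hyperlinks

% "scheme" is a hypothetical float type for this example
\newfloat{scheme}{tbp}{los}
\floatname{scheme}{Scheme}

\begin{document}
\begin{scheme}
  A reaction scheme would go here.
  \caption{An example scheme}
\end{scheme}
\end{document}
```

None of these lines says anything about the document itself, which is why I think they belong in the kernel or the class.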

I’d suggest that a real focus on getting something for users would be the best way forward. This might mean less improvement internally, but I’d think that a LaTeX kernel which could do everything in The LaTeX Companion would be pretty successful, even with few changes “under the hood”. This would mainly be a re-coding exercise from existing packages, which in a way is similar to what I’ve tried to do with siunitx. Many of the basics in siunitx are taken from other packages (at least in terms of user interface), but it brings several ideas together in one place. The same idea could easily be applied to the kernel. Of course, this might leave some of the clever ideas for LaTeX3 out of the code at this stage, but I’d hope it would get momentum behind a more regularly updated system.

One particular area to think about is fonts. With both XeTeX and LuaTeX able to handle system fonts directly, the basic LaTeX system seems very antiquated. At present, LaTeX3 only requires e-TeX, not LuaTeX (in contrast to ConTeXt Mark IV). Should the LaTeX team say something like:

For current testing purposes, only the e-TeX extensions are needed, but this is likely to change. XeTeX or LuaTeX will be required to run the release version of LaTeX3 with full functionality.

I’d say yes, as I think that it’s time to move on from complex font installation and usage restrictions. I’d also be very tempted to say that LaTeX3 will assume UTF-8 input unless otherwise specified (as both XeTeX and LuaTeX are native UTF-8 systems).
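For comparison, this is a minimal sketch of what the same job looks like with XeTeX or LuaTeX and the fontspec package. The font name is an assumption: any font installed on the system (or in the TeX distribution) could be substituted.

```latex
% Compile with xelatex or lualatex, not pdflatex
\documentclass{article}
\usepackage{fontspec}
% Assumption: this font is available; replace with any installed font
\setmainfont{TeX Gyre Pagella}
\begin{document}
Naïve café, Größe: typed directly as UTF-8, with no encoding packages.
\end{document}
```

There is no font installation step and no input or font encoding to declare, which is the simplification I have in mind.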

This type of approach will make LaTeX easier to use, and I’d hope to see it arrive! After all, users are the TeX community.

LaTeX3: Key points

A comment on one of my other posts raises the important issue of what the key targets are for LaTeX3. Only the team can answer this, but reading the project webpage, the mailing list and the code, you can get some ideas.

  1. A well-designed, consistent and documented coding system. This is the most complete part of the current code base. The expl3 module provides a lot of low-level coding methods, which deal much better with control of expansion than using plain TeX or LaTeX2e (no long \expandafter runs). As I’ve said in another post, there are ideas that you can take from this for current work.
  2. A much larger and more functional kernel. The current LaTeX kernel provides only a limited number of functions and customisation possibilities. This is one reason for the very large number of LaTeX packages around. The new kernel will need to cover a large amount of what goes on in packages under LaTeX2e. This is much more the ConTeXt model, with the core team providing most of the basics.
  3. Internal macros taking a clear and fixed number of arguments. Currently, a lot of coding and design decisions are mixed together. User macros take optional arguments and various complex ways are used to send these through to the underlying functions. The idea for LaTeX3 is that there will be a “glue” layer, which converts from user to internal syntax. At the internal level, each function will have a strictly fixed set of arguments, and the glue layer will provide those from user input.
  4. Separation of design from code (and day-to-day use). The current kernel has design decisions all over the place: in the kernel, in classes, in packages and in documents. The LaTeX3 team are aiming for a system where functions are general, and make no assumptions about design. The glue layer described in (3) then brings in design decisions, which again are separated from user macros. Exact details are still some way off (although template shows a way of doing things).
  5. User syntax decoupled from internal design. It seems likely that the main user “interface” to LaTeX3 will remain a file starting \documentclass and ending \end{document}. However, the ideas in (3) and (4) mean that it should be possible to use a different glue layer to typeset completely different user input with the LaTeX3 kernel. This should allow LaTeX3 to take on entirely different, but structured, input formats (most obviously XML-based).
  6. New ideas for complex layouts. A lot of work is going into areas such as grid typesetting and complex float handling. This is intimately bound up with the output routine, and so LaTeX3 will provide a totally revised system in this area.
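
The split described in points (1) and (3) can be sketched with the expl3 and xparse packages. This is only a small illustration under my own assumptions, not the team’s design: the names \note and \demo_note:nn are invented for the example, and the xparse interface shown is the one in current releases, which may differ from what eventually ships.

```latex
\documentclass{article}
\usepackage{xparse} % loads expl3; recent LaTeX kernels already include both
\ExplSyntaxOn
% Internal function (hypothetical name): always exactly two arguments
\cs_new_protected:Npn \demo_note:nn #1#2
  { \textbf {#1} ~ \emph {#2} }
% "Glue" layer: turns the user syntax (optional argument with a
% default, then a mandatory one) into the fixed internal form
\NewDocumentCommand \note { O{Note} m }
  { \demo_note:nn {#1} {#2} }
\ExplSyntaxOff
\begin{document}
\note{Uses the default label.}

\note[Warning]{Uses a custom label.}
\end{document}
```

The internal function never has to look for an optional argument, so a different glue layer could call it from an entirely different input syntax.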

There are a lot of challenges there, and progress in different areas is variable. A lot of the work, of course, is in the thinking stage rather than writing the code. I’d say that all of these ideas are good things, and so it only remains to implement them all!

Taking good practice from LaTeX3

LaTeX3 provides a well-thought-out, low-level programming environment. There is a lot the (La)TeX programmer can learn about good coding practice from the current experimental code, and I’m trying to use this in siunitx version 2. There are lots of little points that I could highlight, but I’ll pick a few out:

  • Naming internal functions in a systematic way, which means longer but more logical names, such as \si@num@out@uncert@int, a function in siunitx 2 for processing the output of the integer part of an uncertainty in a number. Logical names make for easier to follow code.
  • Using lots of small functions, rather than long ones with complex nesting.
  • Adding @aux to a function name to show an auxiliary part of another macro, rather than some complex vowel substitution using @.
  • Creating new functions for expansion control, rather than having \expandafter runs all over the place.

I’m also using some of the other ideas in siunitx version 2. For example, I’m aiming to use \edef only where it is really needed (unknown levels of expansion). If something needs to be expanded a known number of times, I’m going for controlled expansion instead. I also really like the idea of switches that don’t use \iffalse and \iftrue to work: much less risk of problems.
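
As a rough sketch of what these points look like in LaTeX2e-style code: all of the \demo@... names below are invented for this example and are not the actual siunitx internals. The switch shown is just one possible way to avoid \iftrue and \iffalse, by storing a macro that selects one of two branches.

```latex
\documentclass{article}
\makeatletter
% Hypothetical helper: expand the first token of #2 exactly once,
% then hand the result to #1 as a braced argument.
\newcommand*\demo@exp@once[2]{%
  \expandafter\demo@exp@once@aux\expandafter{#2}{#1}%
}
% The @aux function just puts the two pieces back in order
\newcommand*\demo@exp@once@aux[2]{#2{#1}}
% A function that needs the contents of a storage macro, not its name
\newcommand*\demo@show[1]{\texttt{\detokenize{#1}}}
\newcommand*\demo@value{\textbf{12}}
% A switch held as a macro that picks one of two branches:
% no \iftrue or \iffalse token is ever stored
\let\demo@switch\@secondoftwo % starts as "false"
\newcommand*\demo@switch@true{\let\demo@switch\@firstoftwo}
\newcommand*\demo@switch@false{\let\demo@switch\@secondoftwo}
% Document-level test commands
\newcommand*\demotest{%
  \demo@exp@once\demo@show\demo@value % i.e. \demo@show{\textbf{12}}
  \space is
  \demo@switch{on}{off}%
}
\newcommand*\demoon{\demo@switch@true}
\makeatother
\begin{document}
\demotest

\demoon\demotest
\end{document}
```

The \expandafter pair appears once, inside the helper, and everything else reads as ordinary function calls. \demo@show receives the one-step expansion of \demo@value, and the switch selects the “off” branch until \demoon is used.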

Coding in public

I see that the biblatex package has been updated to version 0.8c, and that there is now a SourceForge page for the package. The latter is just holding bug and feature tracking for the package, and no actual code. Placing development code in a public place means that your commits (or lack of them) are there for all to see, so I can see why keeping things private is attractive. Of course, just because nothing happens in public doesn’t mean nothing is happening, and in any case these things are done as a “hobby”.

Units in ConTeXt

Looking to gather ideas, I was looking at the ConTeXt “units” module. The approach taken there is to create free-standing macros, such as \Second or \Candela, and to use them to build up a string of units. Like some of the LaTeX solutions, this means that the user has to manually include symbols (such as \Times) to get the formatting right. On the other hand, ConTeXt uses a glossary-like method for defining units (something I’ve thought about for siunitx in the past). I’ll certainly be thinking about something like that for siunitx version 2.

Units outside of \SI and \si

A recent discussion on the LaTeX Community forums has raised the point that siunitx currently defines a lot of short unit macros outside of the scope of the \SI and \si macros. I did this to match the functionality of the unitsdef package, but I’m thinking of revising this for version 2. My plan is to change the default behaviour so that these macros will only work inside \SI and \si (plus the s column). There will be an option to get unitsdef-like behaviour (which I don’t think is actually very good).
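
To illustrate the distinction, here is a short sketch of the usage that the planned default would keep working, with the usage that would need the option described in a comment. The particular numbers and units are just examples.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Unit macros inside \SI and \si: unaffected by the planned change
\SI{9.81}{\metre\per\second\squared}

\si{\kilo\gram\metre\per\second}

% A bare unit macro in running text, for example "10\,\metre",
% is the unitsdef-like usage that would only work with the
% compatibility option turned on under the new default.
\end{document}
```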

Looking further ahead, I suspect I’ll be more definite about things when I write a LaTeX3 version of siunitx. There, backward compatibility is not an issue, so I’ll be free to do what is most sensible for the long term.