Archive for the ‘LaTeX3’ Category
Fixed point calculations in TeX
It’s well-known that TeX is good at integer arithmetic, and does not provide any primitive functions for real number calculations. Experienced TeX programmers will know that you can make use of dimen registers to do real number calculations at speed, at the cost of accuracy (as TeX truncates dimensions at five decimal places). However, this is not exactly ideal and can be a bit awkward to use.
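As a rough illustration of the dimen-register trick, the sketch below multiplies 1.5 by 2.25 by storing the value as a length and scaling it. The register name \demodim is made up for this example, and the result still carries the "pt" unit, which is part of what makes the approach awkward.

```latex
\documentclass{article}
\begin{document}
% Hypothetical register name, used only for this sketch
\newdimen\demodim
\demodim=1.5pt         % store the "real number" 1.5 as a length
\demodim=2.25\demodim  % multiply by 2.25
\the\demodim           % typesets 3.375pt
\end{document}
```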
An alternative for LaTeX users is to use the fp package (which has been around since LaTeX 2.09). This is a powerful package, but can be rather slow. Part of the reason is that it allows 18 digits either side of the decimal point: a wide range of numbers, but almost certainly overkill in most cases. For applications that really need large numbers (such as pgfplots) input using floating points (such as 1.23e20) is probably better. Floating point calculations make life a bit more complex again, but can be done in TeX (see for example pgfmath).
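For comparison, a minimal sketch of the same sort of calculation with the fp package might look like the following. The macro name \result is only an illustration; fp stores the answer in whatever macro is given as the first argument.

```latex
\documentclass{article}
\usepackage{fp}
\begin{document}
\FPadd\result{1.5}{2.25}% \result := 1.5 + 2.25
\FPprint\result         % typeset the stored value
\end{document}
```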
For LaTeX3 I’ve been working on what would be a sensible approach: real number functions are needed in various places. At the moment, the plan is to support fixed-point numbers with nine digits either side of the decimal point, so up to 999999999.999999999. That should be enough for most applications, and at some stage support for floating-point numbers might well be added. To date there is only basic arithmetic (add, subtract, multiply and divide) in the new code, but the plan is to add trigonometric and logarithmic functions. I’m sure that there will be other functions to add: I’ll be interested to see what is asked for.
One area that I’ve been working on is overall performance compared to the fp package. The biggest single gain is of course moving from 18 plus 18 digits to 9 plus 9, which makes quite a big difference on its own. However, there are various places inside the code where there are opportunities to save time. First, as LaTeX3 requires ε-TeX I’ve exploited the \numexpr primitive where it makes a difference (mainly where it cuts down on the number of assignments needed). At the same time, there are places where using delimited macros is faster than actually doing mathematics! The exact performance gains depend on what exactly you are doing, but it is possible to draw some comparisons by doing lots of repeated calculations with fp and with the new LaTeX3 module. On my system 100 000 additions take 31.6 s with fp and 6.0 s with l3fp, while 20 000 divisions take 64.8 s with fp and 4.1 s using l3fp. Quite some speed enhancement, and I think enough to justify using the new code!
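To show what "cutting down on assignments" means, here is a small sketch (not taken from the l3fp code) that computes 7 × 6 + 3 twice: once with a count register, which needs three assignments, and once with \numexpr, which needs none. The register name \democount is hypothetical.

```latex
\documentclass{article}
\begin{document}
% Classical approach: one assignment per step
\newcount\democount
\democount=7
\multiply\democount by 6
\advance\democount by 3
\the\democount\par              % typesets 45

% e-TeX approach: a single expandable expression, no assignments
\the\numexpr 7 * 6 + 3 \relax   % typesets 45
\end{document}
```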
siunitx version 2 released
After many months of work, I’m pleased to announce that I’ve just sent version 2 of siunitx to CTAN. Many readers will be familiar with the package and some of the development process. Here, I’ve put together a summary as ‘release notes’ for the new version.
A comprehensive (SI) units package
Typesetting values with units requires care to ensure that the combined mathematical meaning of the value plus unit combination is clear. In particular, the SI units system lays down a consistent set of units with rules on how these are to be used. However, different countries and publishers have differing conventions on the exact appearance of numbers (and units).
The siunitx package provides a set of tools for authors to typeset numbers and units in a consistent way. The package has an extended set of configuration options which make it possible to follow varying typographic conventions with the same input syntax. The package includes automated processing of numbers and units, and the ability to control tabular alignment of numbers.
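A short sketch of what this looks like in a document is given below, using the version 2 user commands \num, \si and \SI plus the S column type. The values are made up for illustration, and the exact output depends on the options in force.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
\num{12345.678}\par                               % a number on its own
\si{\kilo\gram\metre\per\second\squared}\par      % a unit on its own
\SI{3.00e8}{\metre\per\second}                    % a value with a unit

% The S column aligns numbers on the decimal marker;
% non-numeric cells such as headers go in braces
\begin{tabular}{S}
  {Value} \\
  1.2     \\
  12.34   \\
  123.456 \\
\end{tabular}
\end{document}
```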
Version 2
Over the past two years siunitx has developed to include many features not originally foreseen when development began. While it has been possible to add a range of new features, some of the underlying limitations of the version 1 code have made this difficult. At the same time, renewed effort by the LaTeX Team on the development of LaTeX3, and in particular the expl3 programming system, has offered a more robust method to create the internal structure of siunitx. As a result, version 2 of siunitx has been almost completely re-written internally.
As well as fixing a number of bugs and limitations in the original release, version 2 has been written with speed in mind. As a result, most users should see performance enhancements with this new release of siunitx.
As part of the revision of siunitx, the option system and user macros have been completely re-thought. The options now have longer, descriptive names and also a much clearer range of input values. The options which in version 1 took either a key word or a literal value have been replaced by ones which take literals only: in some cases this means that advice has been added to the documentation on how to get particular output effects.
Moving from version 1 to version 2
Depending on how you use siunitx, there may be very little to do to move to version 2. The new version includes a compatibility support file, meaning that loading siunitx using:
\usepackage[load-configurations = version-1]{siunitx}
should mean that existing documents compile with very few changes.
There are some changes to standard settings between version 1 and version 2, which may lead to some alterations in documents. At the same time, a small number of the features of siunitx version 1 which I feel did not work cleanly have been dropped. At present, some of these are scheduled to be re-examined for inclusion in later releases of siunitx.
While there is a back-compatibility layer for users upgrading, it is strongly recommended that documents are updated to use the new option names and functions. The new approach has been chosen as it is an improvement on the previous version, and in the longer term this layer may be removed.
Installation
Most users will obtain siunitx as part of their TeX distribution. MiKTeX 2.8 should include siunitx version 2 after a short delay (a few days after CTAN upload). For TeX Live users, there will be a slight delay as the package will appear in updated form in TeX Live 2010 but not TeX Live 2009 (which is frozen).
For users who wish to install siunitx themselves, the package is available as a pre-extracted zip file, siunitx.tds.zip. Simply unzip this in your local texmf directory and run ‘texhash’ to update the database of file locations. Version 2 of siunitx requires up to date versions of the LaTeX3 packages expl3 and xpackages. These are also available from CTAN in ready to install format (as expl3.tds.zip and xpackages.tds.zip), and can be installed in the same way if necessary.
If you want to unpack the dtx yourself, running ‘tex siunitx.dtx’ will extract the package whereas ‘latex siunitx.dtx’ will extract it and also typeset the documentation. Typesetting the documentation requires a number of packages in addition to those needed to use siunitx. These should all be available in a complete TeX Live 2010 or MiKTeX 2.8 installation.
Development code and bug database
In order to help users see what is happening, and also to allow me to work efficiently, the development code for siunitx is available on the code hosting site BitBucket.
You can download the very latest code from there: of course, this may or may not work properly depending on exactly what I have added to the code.
The BitBucket site includes an issue tracker, where you can report bugs or make feature requests. I also add bugs to the database from e-mails I get from users. Filling in the bug database helps to make sure that I do not forget things, and also helps other users see what issues are known.
If you want to contribute code to siunitx, you can of course send patches directly to me. Alternatively, the code is hosted using the revision control system Mercurial, which was chosen as it is decentralised and is easy to install on a range of operating systems (I use MacOS X, Windows XP, Windows 7 and Ubuntu!). I’m happy to explain to potential contributors how Mercurial works for developing siunitx.
Roadmap for future releases
The bug database already includes a number of feature requests which are marked to be looked at for version 2.1. The current intention is that the next few months will be devoted to bug fixes in this release (v2.0), with moves to add features for v2.1 beginning in the late summer. I anticipate that v2.1 will be released toward the end of 2010.
It is likely that not all of the features currently marked as to be looked at for v2.1 will be fully working by the time it is released. At the same time, there are some longer term areas which may also need attention. Version 2.2 of siunitx is therefore planned, but with no current list of features marked for inclusion. This version is likely to appear in Spring 2011.
One longer term aim is to include LuaTeX support in siunitx, so that the entire package can work much more rapidly with LuaTeX than when using TeX macros alone. This is not likely to happen until next year (2011), but is in the bug database and is part of the longer term development plan for siunitx.
The internals of siunitx
Currently, the only documented interface to any of the functionality of siunitx is via the key-value control system and functions described in the manual. The internal code of the package is not documented, and there is therefore no guarantee of stability of internal functions. While it is common for users to have to modify the internals of LaTeX2e packages as part of their documents, this is not good programming practice and is not encouraged for siunitx, or indeed in general.
If there is a user function that you require that is not available using the documented tools, please either e-mail or report a bug in the database. One of the general aims of siunitx is to provide a proper documented interface for all of the functions of the package. I am therefore very happy to add interfaces to internal processes as necessary.
Programmers should note that siunitx is coded using the LaTeX3 ‘expl3’ programming system. This looks somewhat different to traditional TeX or LaTeX programming. Details of the programming environment are documented as part of the expl3 bundle. Currently, none of the internal functions or interfaces are documented, and so are not meant for use outside of siunitx. Other programmers wanting to make use of internal siunitx functions are encouraged to get in contact with me. This will enable me to ensure that the parts of siunitx which are needed by others are documented and are not changed without consultation.
From \newcommand to \NewDocumentCommand
Following on from my last post, I thought it might be useful to give some simple examples of how the xparse package works and why it’s useful. I want to do this to show end users of LaTeX how it can replace \newcommand, so the examples will not involve anything too complex, code-wise.
Why would you want to use xparse’s \NewDocumentCommand in place of \newcommand? First, \NewDocumentCommand can make macros that take a mixture of arguments that \newcommand cannot. With \newcommand, you can make a macro that takes a number of mandatory arguments, or one where the first argument is optional and in square brackets, but that is it. Anything else then needs the use of TeX programming or internal LaTeX macros: not really helpful for end users. The second thing is that \newcommand macros are not ‘robust’. This shows up where you need to \protect things, which can be very confusing. Macros created with \NewDocumentCommand are robust, and this means that they work more reliably.
I’m going to illustrate moving from \newcommand to \NewDocumentCommand with a series of simple examples. For all of them, you need to load the xparse package:
\usepackage{xparse}
Macros with no arguments
The simplest type of macro is one with no arguments at all. This isn’t going to show off xparse very much but it is a starting point. The traditional method to do this is
\newcommand\NoArgs{Text to insert}
which becomes
\NewDocumentCommand\NoArgs{}{Text to insert}
That does not look too bad, I hope. Notice that I’ve got an empty argument in the xparse case: this is where the arguments are listed, and with \NewDocumentCommand there always has to be a list of arguments, even if it is empty. That’s a contrast with the \newcommand approach, where we only need to mention arguments when there are any.
One or more mandatory arguments
The most common type of argument for a macro is a mandatory one. With \newcommand, we’d give a number of arguments to use:
\newcommand\OneArg[1]{Text using #1}
\newcommand\TwoArgs[2]{Text using #1 and #2}
\NewDocumentCommand is a bit different. Since it can work with different types of argument, each one is given separately as a letter. A mandatory argument is ‘m’, so we’d need
\NewDocumentCommand\OneArg{m}{Text using #1}
\NewDocumentCommand\TwoArgs{mm}{Text using #1 and #2}
This is still pretty similar to \newcommand: the useful stuff starts when life gets a little more complicated.
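Put into a complete document, the two definitions above could be tried out as follows. This is only a minimal test file; the argument text is arbitrary.

```latex
\documentclass{article}
\usepackage{xparse} % recent LaTeX kernels provide \NewDocumentCommand without this
\NewDocumentCommand\OneArg{m}{Text using #1}
\NewDocumentCommand\TwoArgs{mm}{Text using #1 and #2}
\begin{document}
\OneArg{first}\par            % Text using first
\TwoArgs{first}{second}       % Text using first and second
\end{document}
```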
One or more optional (square brackets) arguments
To really get something clever out of xparse, the arguments need to be a little more varied than I’ve shown so far. Let’s look at optional arguments, which LaTeX puts in square brackets. If I want the first argument to be optional, then LaTeX can help me
\newcommand\OneOptOfTwo[2][]{Text with #2 and perhaps #1}
\newcommand\OneOptOfThree[3][]{Text with #2, #3 and perhaps #1}
If I want anything else, I’m on my own (so no more \newcommand examples!). First, let’s do the two examples using xparse. An optional argument in square brackets, which works like a \newcommand one, is ‘O’ followed by {}:
\NewDocumentCommand\OneOptOfTwo{O{}m}%
{Text with #2 and perhaps #1}
\NewDocumentCommand\OneOptOfThree{O{}mm}%
{Text with #2, #3 and perhaps #1}
How about two optional arguments? It’s pretty obvious:
\NewDocumentCommand\TwoOptOfThree{O{}O{}m}%
{Text with #3 and perhaps #1 and #2}
What if we want something as a default value for the optional argument? With \newcommand, that would be
\newcommand\OneOptWithDefault[2][default]%
{Text using #1 (could be the default) and #2}
which would become
\NewDocumentCommand\OneOptWithDefault{O{default}m}%
{Text using #1 (could be the default) and #2}
The same idea applies to each optional argument: whatever is in the braces after the O is the default value.
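As a quick check of how the default behaves, the sketch below calls the command once without and once with the optional argument. The argument text is arbitrary.

```latex
\documentclass{article}
\usepackage{xparse}
\NewDocumentCommand\OneOptWithDefault{O{default}m}%
  {Text using #1 (could be the default) and #2}
\begin{document}
\OneOptWithDefault{x}\par          % #1 is "default"
\OneOptWithDefault[custom]{x}      % #1 is "custom"
\end{document}
```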
More complex optional arguments
You might be wondering why we need the ‘{}’ after ‘O’ when there is no default value: why not just ‘o’? Well, there is ‘o’ as well. Unlike \newcommand, \NewDocumentCommand can tell the difference between an optional argument that is not given and one that is empty. To do that, it provides a test to see if the argument was given at all:
\NewDocumentCommand\OneOptOfTwoWithTest{om}{%
\IfNoValueTF{#1}
{Do stuff with #2 only}
{Do stuff with #1 and #2}%
}
Don’t worry if you forget to do the test: the special marker that is used here will simply print ‘-NoValue-’ as a reminder!
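The difference between "not given" and "empty" can be seen with three calls to a command of this type. The sketch below uses its own replacement text so that the two branches are easy to tell apart.

```latex
\documentclass{article}
\usepackage{xparse}
\NewDocumentCommand\OneOptOfTwoWithTest{om}{%
  \IfNoValueTF{#1}
    {Only #2}%
    {[#1] and #2}%
}
\begin{document}
\OneOptOfTwoWithTest{a}\par      % no optional argument: first branch
\OneOptOfTwoWithTest[]{a}\par    % empty optional argument: second branch
\OneOptOfTwoWithTest[b]{a}       % optional argument given: second branch
\end{document}
```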
Two types of optional argument
Sometimes you might want two different optional arguments, and be able to tell which is which. This can be done by using something other than square brackets, often using angle brackets (‘<’ and ‘>’). We can do that using the letter ‘d’ (or ‘D’ if we give a default).
\NewDocumentCommand\TwoTypesOfOpt{D<>{}O{}m}%
{Text using #1, #2 and #3}
What input syntax does this make? Let’s look at some examples
\TwoTypesOfOpt{text} % One mandatory
\TwoTypesOfOpt[text]{text} % A normal optional
\TwoTypesOfOpt<text>{text} % A special optional
\TwoTypesOfOpt<text>[text]{text} % Both optionals
How did that work? The first two characters after the ‘D’ are used to find the optional argument, so in this case ‘<’ and ‘>’.
Finding stars or other special markers
Another common idea in LaTeX is to use a star to indicate some special version of a macro. Creating those with \newcommand is difficult, but it is easy with \NewDocumentCommand
\NewDocumentCommand\StarThenArg{sm}{%
\IfBooleanTF#1
{Use #2 with a star}
{Use #2 without a star}%
}
Here, ‘s’ represents a star argument. You’ll see that it ends up as #1, while the mandatory argument is #2. You’ll also see that there needs to be a test to see if there is a star (\IfBooleanTF). This doesn’t mention stars as the test can be used for other things.
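Used in a document, the starred and unstarred forms then look like this. It is a minimal sketch with braces added around #1 in the test, which is equivalent here.

```latex
\documentclass{article}
\usepackage{xparse}
\NewDocumentCommand\StarThenArg{sm}{%
  \IfBooleanTF{#1}
    {Use #2 with a star}%
    {Use #2 without a star}%
}
\begin{document}
\StarThenArg{x}\par     % Use x without a star
\StarThenArg*{x}        % Use x with a star
\end{document}
```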
Summing up
There is more to xparse than I’ve mentioned here, but I hope that this is a useful flavour of what it can be used for. To get more flexibility there is a bit more to think about compared to \newcommand, but the overall consistency is hopefully worth it.
EPS graphics with PDF(La)TeX
One issue a lot of people find confusing with (La)TeX is the rules about which types of graphic files work with which engines. EPS files are fine when going via the DVI route, but do not work with direct PDF creation. The solution is to turn the EPS files into PDFs, and the problem goes away. However, there is then the question of how to do the conversion.
For most documents, having to convert every file by hand is not a sensible choice. The next nearest thing is the epstopdf package, which will do the same thing but from within a LaTeX run. However, it needs \write18 enabled, and this is not always desirable. More importantly, a lot of people who struggle with the graphics problem do not know how to turn on \write18 anyway. A good way around has been added to the latest version of TeX Live, which is currently in the final testing stages. TeX Live 2009 has some restricted \write18 functions enabled as standard, and also has a version of epstopdf “built in”. The result is that EPS files are automatically converted to PDF files, in a transparent manner. Of course, this only happens if the PDF does not also exist! At the moment, this feature is not in MiKTeX 2.8, so it is one reason to favour TeX Live 2009 even on Windows.
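For systems where the conversion is not automatic, a document using the epstopdf package might look like the sketch below. The file name figure.eps is hypothetical, and pdfLaTeX must be allowed to run the epstopdf program (via full or restricted \write18) for the conversion to happen.

```latex
\documentclass{article}
\usepackage{graphicx}
\usepackage{epstopdf} % converts EPS files to PDF during the run
\begin{document}
% With pdfLaTeX, figure.eps (a made-up name) is converted on the fly;
% by default the result is saved as figure-eps-converted-to.pdf
\includegraphics{figure.eps}
\end{document}
```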
There are places where epstopdf will not help: for example, when using psfrag or pstricks. There, the best solution will be either auto-pst-pdf or pstool. Both are written by Will Robertson, and both need \write18 enabled to work. pstool is more efficient (it only re-creates graphics as needed), but for some cases only auto-pst-pdf will work. Will has documented both packages very well, so the best way to learn about them is to have a read of the documentation.
notes2bib two
I’ve just sent a new version of my notes2bib package to CTAN. notes2bib lets you include the text of notes in the body of a file, but have them appear in the bibliography: for chemists, this is pretty common. The new version is a re-working of the existing code plus some ideas I explored with an experimental version called xnotes2bib. I’ve re-jigged the options to make them more descriptive, and some of the macros have been renamed. My original choices were not always the best. The biggest change, though, is internal, where I’ve recoded everything using expl3, the coding base for LaTeX3. That means that users will need to install a couple of support packages, but I hope it also means that the code makes a bit more sense (at least to me!). The only way expl3 will get tested is if people use it, so I’m prepared to have a go and hope everything works. So far, all looks good.
Speed versus clarity
Programming complex functions in TeX tends to involve a lot of manipulation of variables. At the low level, TeX doesn’t provide a lot of variable types, and so most things are done using macros. This means that a lot of array or record-type work is done by using large numbers of macros and giving them appropriate names. This is fast, as there is no need to recover data from a larger structure, but doesn’t make for easy to read code.
One aim of LaTeX3 is to provide better programming tools built on TeX, and this includes higher-level variable support. If you take a look at the expl3 manual, you’ll find that there is support for ideas such as property lists and sequence stacks. These are, of course, implemented using macros or token registers, but provide some of the more abstract ideas that other programming languages provide for variables. The pay-off is in speed: there will always be a cost to adding layers on top of the TeX basics.
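To make the contrast concrete, here is a small sketch that stores two pieces of data about a unit in a property list rather than in two separately named macros. The function names follow current expl3 releases, which may differ from the code available when this was written, and \l_demo_unit_prop and \demounit are made up for the example.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
% One property list replaces a family of macros such as
% \demo@unit@name and \demo@unit@symbol (hypothetical names)
\prop_new:N  \l_demo_unit_prop
\prop_put:Nnn \l_demo_unit_prop { name   } { metre }
\prop_put:Nnn \l_demo_unit_prop { symbol } { m }
% Document-level accessor: look up one key in the list
\cs_new:Npn \demounit #1 { \prop_item:Nn \l_demo_unit_prop {#1} }
\ExplSyntaxOff
\begin{document}
\demounit{name} (\demounit{symbol}) % metre (m)
\end{document}
```

The lookup has to search the list, which is where the speed cost comes from, but the data for one item now lives in one place.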
I’ve been worrying about how to use the new structures as I work on siunitx version 2. My original code uses lots of macros to store things, but the result is that the code is difficult to read and even harder to change. My earlier attempts at improving the code stuck with this approach, but rationalised it into something more systematic. Better than version 1, but I’ve not been entirely happy with the results. I’m now trying again, this time using the high-level variables of LaTeX3 to do things. There is going to be a cost in speed, but I hope that making the code readable will be reward enough. I’m also spotting ways to improve the flow of code, so that there should be less to actually do in the new version. Looking to the future, making the code more maintainable should more than compensate for a little loss of speed. In the end, I’m sure I’ll be asked for more features, and having data structures I can rely on is essential for doing that.
Moving siunitx to LaTeX3 coding
If you take a look at the development version of siunitx, you will find that I’m getting on with moving the code over to using LaTeX3 coding internally. That does not mean that the package will not work using LaTeX2ε: everything still works nicely in a normal LaTeX document! However, it does mean that I can use the new coding ideas to make my life easier, and hopefully the code more robust and a little faster.
I’ve also moved to using my own LaTeX3 keyval package for option handling. I recently added it to the main LaTeX3 code base, so I know that it will be available if LaTeX3 is! Following the pattern for other LaTeX3 keyval settings, I’m changing the original key naming plan somewhat. I’ll probably post more detail on this once things are working a little better. The idea is to use more descriptive key names than the current version of siunitx, for example:
\sisetup{
group-digits = true,
round-mode = places,
round-places = 3,
separate-uncertainty = true
}
There will be a delay of a week as I’m on holiday, then I hope to get the experimental code doing almost everything that the current release version does. There is going to be a bigger delay as I sort out one idea that has been requested several times. I want to add some floating point stuff, but to do that I need to write support for that for LaTeX3. I have some ideas, but it will take a while to actually do the work.
Keyval methods in LaTeX3
I’ve been working for a while on a reasonably powerful method for creating keyval input for LaTeX3. This has been going under the working title “keys3”, but yesterday I took the plunge and added the code to the LaTeX3 development repository. For the moment, this can only be accessed via SVN, but if the rest of the team are happy with the idea, it will appear in the next snapshot that is sent to CTAN.
The ideas in the new module (now called l3keys) have been inspired by the pgfkeys package. By using keyval methods to create keys, the idea is to make life a lot easier for the programmer. However, things are somewhat modified compared to the pgfkeys package, mainly to try to keep the input syntax simple but powerful enough for most uses. For example, there are separate functions for creating keys and setting them, an idea that all other keyval packages use. So a typical setup block might look like:
\keys_define:nn { module } {
key-one .set:N = \l_module_one_tl, % Store input in variable
key-one .value_required:,
key-two .bool:N = \l_module_two_bool, % Either true or false
key-three .code:n = { Code using an argument #1 },
key-three .default:n = text
}
while keys are set using:
\keys_set:nn { module } {
key-one = Value,
key-two = true,
key-three
}
Hopefully, the ideas are flexible and clear enough for people to get working with. One thing to notice: the key names are hyphenated. It looks like that will be the “style guide” suggestion for LaTeX3 keys. At some point, I’ll look at writing a TUGboat article on how everything works.
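For anyone wanting to experiment, a self-contained sketch along the same lines is given below. It uses the property names found in current expl3 releases (.tl_set:N, .bool_set:N and .value_required:n), which differ from the names in the snippet above, and the module name demo and the commands \demosetup and \demoshow are invented for the example.

```latex
\documentclass{article}
\usepackage{expl3}
\ExplSyntaxOn
\tl_new:N   \l_demo_one_tl
\bool_new:N \l_demo_two_bool
\keys_define:nn { demo }
  {
    key-one   .tl_set:N         = \l_demo_one_tl ,   % store input in variable
    key-one   .value_required:n = true ,
    key-two   .bool_set:N       = \l_demo_two_bool , % either true or false
    key-three .code:n           = { [three:~#1] } ,  % code using the value
    key-three .default:n        = text
  }
% Document-level wrappers (hypothetical names)
\cs_new_protected:Npn \demosetup #1 { \keys_set:nn { demo } {#1} }
\cs_new:Npn \demoshow
  {
    \tl_use:N \l_demo_one_tl
    \bool_if:NT \l_demo_two_bool { ~(two~is~true) }
  }
\ExplSyntaxOff
\begin{document}
\demosetup{key-one = Value, key-two = true, key-three}\par
\demoshow
\end{document}
```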
Progress on siunitx version 2
I’ve been doing quite a bit of thinking about siunitx version 2. My original plan was to write everything in standard LaTeX2e code, but the more I thought about things the more it has not looked like such a good idea. siunitx needs a lot of programming tools, and these are almost all available in the experimental LaTeX3 code which is now approaching stability. At the same time, there are still some difficult issues to solve for siunitx, and I’m going to need some more support stuff, I think. So I’ve revised my ideas on how to progress.
Currently, I’m moving the code I have already re-written to use LaTeX3 conventions internally. I’m then going to try to solve the remaining issues for version 2. The aim is to have a package which will run on LaTeX2e, using LaTeX3 internals, so that I can eventually create a LaTeX3-only edition with the same interface ideas. I’m going to try to crack on with things, but it will take a while to solve the remaining issues!
LaTeX3: xparse
The next step for LaTeX3 development is to revise the two existing “xpackages” which are available to make a link between the code level and the user: xparse and template. Of the two, the xparse package is by far the easier to understand.
In LaTeX2e, you can either use the LaTeX method to create new commands:
\newcommand*\mycommand[2][]{%
code goes here, using #1 and #2
}
or you can do things yourself using a mixture of TeX primitives and LaTeX internal functions (and ignoring the issue of default values):
\def\mycommand{%
\@ifnextchar[{%
\@mycommand
}{%
\@mycommand[]%
}%
}
\def\@mycommand[#1]#2{%
code goes here, using #1 and #2
}
This does not make for code which is easy to alter to reflect different input syntax (for example XML), or to change internal functions without knowing how the user input works. The idea of xparse is to separate out the user syntax from the internal code. Currently, exactly what tools need to be available is still being decided. The current version of xparse would expect the following syntax:
\DeclareDocumentCommand \mycommand { o m } {
code goes here, using #1 and #2
}
The idea is that each argument is represented by a letter, for example o for an optional argument, m for a mandatory one or s for an optional star. This makes it possible to see immediately from the definition how many arguments are needed (one for each letter). It also means that the internal functions, which implement things, can be separated totally from the user part of the system. In that way, internal functions can have a fixed number of arguments, and leave xparse to supply them. So if the internal function needs to be changed, it does not matter how it is used, or vice versa.
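A minimal sketch of that separation might look like the following. The internal function \demo_typeset:nn is a made-up name, and \NewDocumentCommand is used here although \DeclareDocumentCommand would work in the same way.

```latex
\documentclass{article}
\usepackage{xparse}
\ExplSyntaxOn
% Internal function: always exactly two arguments (hypothetical name)
\cs_new:Npn \demo_typeset:nn #1#2
  { \tl_if_empty:nTF {#1} {#2} { #2 ~ (#1) } }
% User-level syntax: xparse collects the arguments and hands them on
\NewDocumentCommand \mycommand { O{} m }
  { \demo_typeset:nn {#1} {#2} }
\ExplSyntaxOff
\begin{document}
\mycommand{text}\par        % text
\mycommand[note]{text}      % text (note)
\end{document}
```

Changing the input syntax, say to two mandatory arguments, would only touch the \NewDocumentCommand line, while the internal function stays as it is.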
That all sounds very good, but there are some outstanding issues. For example, handling verbatim arguments is not straight-forward. There are a couple of possible approaches to this. Either stick to a simple system, and accept that not everything can be done in the same way, or make the system more flexible but complicated. I’m currently in favour of the first approach: almost every user function is simple (especially as we have the ε-TeX extensions and so can \scantokens our way out of a lot of problems). I’d be interested to hear what other people think would be useful.