Some TeX Developments

Coding in the TeX world

Building biblatex-biber

with 17 comments

The biblatex-biber project provides probably the best way for biblatex users to switch to pure UTF-8 bibliography information. However, getting it to build can cause problems: biblatex-biber is a Perl programme, and needs various downloads from CPAN. I thought it would therefore be useful to put some simple recipes here, explaining what I’ve done to get a working biblatex-biber on Windows, Ubuntu and Mac OS X.  I’m assuming that the latest biblatex-biber release has been downloaded and unzipped somewhere, and that the Command Prompt/Terminal/Shell is open in that directory (folder).

Mac OS X 10.6 (Snow Leopard)

Mac OS X comes with Perl installed, so life is relatively easy. At the Terminal, you need to run the cpan script as root:

sudo cpan

The cpan script has its own prompt, but this is very similar to the Terminal one. First, I updated cpan itself and installed a helper module with

install CPAN
reload cpan
install YAML

That done, the various requirements for biblatex-biber can be installed, using the single call

install Data::Dump List::AllUtils Readonly Text::BibTeX Readonly::XS

For the more cautious person (such as me), each install instruction can be given on a separate line: this keeps things a bit more controlled. I accepted the standard settings, except when asked about installing items that were only needed for testing, where I said no.

Once cpan has done all of the installing, you can leave it by typing

exit

So now back at the Terminal prompt, a few simple instructions

perl Build.PL
./Build
sudo ./Build install

That put biblatex-biber onto the path for all users: everything then worked correctly.

Ubuntu 9.10

Once again, Perl is installed as standard in Linux distributions: I’m using Ubuntu as a representative system. Before starting cpan, there is an additional step, which is to install an extra Ubuntu package. So at the Terminal, you need to do

sudo apt-get install libxml2-dev

This is needed as otherwise you get some very odd errors in a bit. Now, essentially the same recipe works as for the Mac. First run cpan running and update and install YAML. Then there is a long list of items to install

install Data::Dump List::AllUtils Readonly Text::BibTeX Readonly::XS XML::Writer XML::LibXML File::Slurp

which can again be done one at a time, for the more cautious.

After exiting cpan, the same three lines at the Terminal should work as in the Mac section.

perl Build.PL
./Build
sudo ./Build install

Windows (XP, Vista and 7)

To date, my attempts to build biblatex-biber on Windows (using Strawberry Perl) have failed as I can’t get the Perl module Text::BibTeX to install. This is supposed to be optional, but without it biblatex-biber does not seem to work, although I do get it to build. Luckily, there is a self-contained binary for Windows available from the project site. This includes its own Perl system, so there is no need to get Perl set up before trying it. Everything seems to work for me with this version. Any ideas on what is necessary would be helpful!

Written by Joseph Wright

January 23rd, 2010

Posted in LaTeX, biblatex

Tagged with , ,

Asking for help

with 2 comments

I get quite a few e-mails asking for help with my packages, and also spot a few questions in various public places (comp.text.tex and The LaTeX Community, mainly). I’m always happy to help where I can, and it always makes my life a bit easier if the question arrives with all the information to start with.

One of the things that get repeated by almost everyone trying to help out users is the need for a minimal example. For LaTeX, that typically looks something like:

\documentclass{article} % Or perhaps memoir, beamer, ...
\usepackage{...}
\begin{document}
Example here
\end{document}

ConTeXt examples are not dissimilar, while plain TeX ones need very little at all (but I get very few of these). It’s much faster to supply everything to start with than to send a snippet with key details left out. It’s surprising how many things people simply assume are obvious, when they are not (“Surely everyone uses package wibble”).

Another useful thing to do in advance of sending a query is to send a log file. Again, LaTeX users are normally best advised to include \listfiles in their input, so the kernel makes a neat listing of everything in use (hopefully including the versions).

Often, I get description of either how things should look or why they are wrong, rather than an actual example. If at all possible, a “reference rendering” is much easier to follow. That can be something done in a drawing programme, using some alternative (but awkward) TeX code, or a screenshot of something.  Describing things in words (especially if the questioner is not a native English speaker) can be a recipe for a long and painful process.

For questions posted in public, it’s always best to drop me a line so I spot it: I do my best but sometimes I miss things. For places such as comp.text.tex, that can simply be a CC on the posting, or for forums a quick note that says

Hello,

I posted a question about your <whatever> package:

http://some-url-here/

Please take a look.

will make sure that the issue gets me interest.

On the other hand, questions direct to me obviously get straight to the point but miss out on the public record part of a posting to Usenet. The same issues do tend to pop up more than once! Sometimes there is good reason for avoiding public postings: I get questions including unpublished material about achemso, for example. So I’d encourage anyone with an issue they think is general (such as a bug in one of my releases) to post something in public if possible. I always try to follow up postings as well as e-mailing people directly as well.

As I say, I’m always happy to try to help. I hope that most of the time I succeed.

Written by Joseph Wright

January 21st, 2010

Posted in General

achemso: Cross-referencing to floats

without comments

I’ve had a a few questions about the achemso bundle concerning creating references such as “Figures 1 to 3” automatically. The problem has been that I’ve been using the varioref package to turn \ref{some-fig} in “Figure 1” automatically, rather than having to put Figure~\ref{some-fig}. Unfortunately, varioref does not handle multiple references in an automated fashion.

For version 3.4 of achemso (which I’ve just sent to CTAN), I’ve switched to the cleveref package. This means that you can now write something like \ref{fig-1,fig-2,fig-3} and get “Figures 1 to 3” without further effort. I hope that this makes life much easier for users, although it does mean that there is another package required to use achemso. I think that this is the right balance: I’m sure users will let me know if I’ve got it wrong.

Written by Joseph Wright

January 16th, 2010

Posted in LaTeX, achemso

Tagged with ,

PDF Version and file size

with 9 comments

The PDF format has evolved over the years as Adobe have released new versions of their Acrobat and Reader software. New ideas have been added to the file format, and as a result there are different versions of the PDF format. If you take a look at a PDF in Adobe Reader, you can see which version the file is using in the Document Properties information. Of course, files using the newer versions of the PDF format need a suitable viewer, be that Reader or something else.

This is relevant to TeX users as PDF tends to be the target format, either directly or via DVI files, for many users. Tools such as pdfTeX are not tied to one version of the PDF specification. For example, when creating a PDF directly with pdfTeX the \pdfminorversion primitive can be used to set the PDF version to 1.3, 1.4 or 1.5.

Why would you want to do this? Well, obviously the newer versions bring new features. A particularly significant one is the compression of non-stream objects. The detail of these objects is not really important, but they relate to items such as links within documents. Version 1.5 of the PDF specification allows these to be compressed, which can make quite a difference to the resulting file size. For example, I did a trial run with the siunitx manual, and by adding the lines

\pdfminorversion=5
\pdfobjcompresslevel=2

resulted in reducing the file size from around 700 KiB to around 550 KiB, a saving of roughly 20 %.

There is some discussion ongoing at the moment on the TeX Live mailing list about possibly changing the default PDF version produced by tools such as pdfLaTeX, XeLaTeX, etc. The current standard setting is version 1.4, which makes larger files but does have the advantage of being readable by a wider range of viewers. On the other hand, PDF version 1.5 was first released in 2003, and there is pretty good support for it in most of the well-known readers. As long as switching to version 1.5 also enables the compression, this looks like a good idea: just moving to version 1.5 without using the features available seems a bit odd to me.

There are times where you need to use PDF version 1.4 (for example for archive-type PDFs), but for those you also need to check other features of the PDF. So I feel that the change looks like a good idea, provided there is a good way to set the version to something else.

Written by Joseph Wright

January 11th, 2010

Posted in General

Tagged with , ,

Chemistry journals: publishers support of LaTeX

with 6 comments

As the author of the achemso bundle (for supporting submissions to the American Chemical Society), I get a few queries about the support various publishers provide for LaTeX. Unlike more physics-focussed journals, the chemistry journals never typeset directly from authors LaTeX sources.  As a result, the acceptance of LaTeX material from authors is rather less popular, and tends to be patchy. So I thought I’d summarise things as I currently understand them.

American Chemical Society (ACS)

As I said above, I’ve written the achemso bundle specifically for submissions to the ACS. However, while the central office are happy to host a copy on their website and so on, the ACS don’t officially support the bundle. That means, in practice, that some journals are happier with LaTeX submissions than others. Each journal has its own office, and so I hear different things from people submitting to different journals. It also means that I have to pick up the requirements of each office based on feedback via authors, rather than getting any formalised instructions. There are mistakes in the achemso bundle, and there are also requirements that I don’t know about. So feedback is always useful (good or bad).

Royal Society of Chemistry (RSC)

The RSC have rather less information about LaTeX on their website than the ACS. They do mention TeX, but only very briefly. I’ve written some BibTeX styles, and a very basic article template, which are available in the rsc bundle. I’ve had a bit of feedback on these, and I hope that they at least provide a starting point for writing a submission to the RSC in LaTeX. More generally, I think the best advice is to check with the editorial office for the relevant journal before writing anything, and to stick to the basic LaTeX article class when you do.

Wiley

As with the RSC, Wiley don’t have a lot of LaTeX information. What they do say is that they only accept PDF submissions: you can’t send your source. They also say to stick to the plain article class, and basically to keep things simple.

Elsevier

Elsevier have recently had a new class written for journal submissions, elsarticle. From what I can make out on their site, you can use this for most of their journals, which should include the chemistry ones. As this has actually been written for them to order, I imagine that Elsevier is the best place to be sending LaTeX submissions to. Hopefully other publishers will see that they have made life easier for their authors and will take note.

Written by Joseph Wright

January 5th, 2010

Posted in LaTeX

Tagged with

pgfplots v1.3

with 2 comments

A new version of the very useful pgfplots package has been released. pgfplots provides a very handy interface on top of pgf/TikZ to generate print-quality plots without too much effort. As many readers will know, pgf works with both DVI and PDF output methods, making pgfplots very handy for generating plots without worrying about other content.

For me, the stand-out new feature in v1.3 of pgfplots is the ability to automatically reverse the axes. As a chemist, I need to do this as convention dictates that some types of data are displayed with the x axis running from high values to low ones. So for me not having to do this by hand is a really significant reason to upgrade. There are lots of other new features as well: I see that the manual now includes a number of 3D surface style graphs, which many people like.

If you are plotting data in TeX, the pgfplots should be very high on your list of packages to consider.

Written by Joseph Wright

January 4th, 2010

Posted in General

Tagged with , ,

siunitx performance (again)

without comments

My previous post mentioned some efforts to improve the performance of the siunitx parser. I’ve now committed an entirely new version of the parsing code to the repository. I’ve also done my best to speed up the rest of the package. The speed you see very much depends on the type of input involved. With my test file, I got a time of roughly 3 minutes using version 1 of siunitx, about 2 minutes before improving version 2, and about 1.5 minutes after. For comparison, doing no processing at all take the time down below 10 seconds for the same file (roughly 700 pages of repeated input). For the interested reader, an siunitx snapshot TDS-style zip is available.

One thing that I find interesting in all of this is that even before optimising the code, version 2 still worked faster than version 1, even though it does more things. A lot of that is because the pre-built looping material in expl3 does a much better job than my own attempts in version 1 of siunitx. Of course, good programmers will always use fast loops, but for the rest of us I think this shows how a sensible tool-kit can bring benefits “behind the scenes”.

Written by Joseph Wright

January 4th, 2010

Posted in LaTeX, siunitx

Tagged with , ,

siunitx performance

without comments

I had an e-mail today about using siunitx when there are a lot of calls to the package. As you might expect, things can get a bit slow, and the person who contacted me felt that things get rather too slow. There are differences between the current release version of siunitx and the development code (version 2), and I’ve also added a few features to help speed things up where appropriate using version 2. So I thought I’d put a bit of information on the comparison in the public domain.

First, a baseline is not to use siunitx at all, and to simply test everything by hand. For that, I tried the simple test file:

\documentclass{article}
\usepackage{siunitx,xparse}
\ExplSyntaxOn
\DeclareDocumentCommand \repeated { m m }{
  \prg_replicate:nn {#1} {#2}
}
\ExplSyntaxOff
\begin{document}

\repeated{10000}{$1.23\,\text{m}$ }

\end{document}

This repeats the same text 10 000 times: boring but handy for testing. Using the command-line time program, I get an overall time of 1.714 s for this.

A very slight change of the file lets me test with siunitx version 1:

\documentclass{article}
\usepackage{siunitx,xparse}
\ExplSyntaxOn
\DeclareDocumentCommand \repeated { m m }{
  \prg_replicate:nn {#1} {#2}
}
\ExplSyntaxOff
\begin{document}

\repeated{10000}{\SI{1.23}{\metre} }

\end{document}

With the latest release version of siunitx (v1.3g), I get a time of 80.878 s for this on the same system.

In siunitx version 2, I’ve recoded all of the loops and parsing code, and so things are faster using the standard settings: 58.944 s. With the very latest code (SVN 243), I’ve included two options to make things move faster: parse-numbers and parse-units. Of course, these do mean that you get less of the power of siunitx. But for many people they might be useful. Turning both parsing systems off, the time needed for the test file drops to 14.975 s (just turning off the number parser gives a time of 18.803 s).

I may take another look at trying to improve the performance of the number parser. The problem is at least in part that making the code faster will either mean making some of it less powerful or, more likely, a lot harder to read and maintain. I hope that for most people, most of the time, the performance is acceptable. Of course, at some point I’ll try to do some Lua-based code for the parsers, at least. But that won’t help for most users now.

Ultimately, there is a limit to how fast things can work. Whether the performance hit of using siunitx is worthwhile is something is down to users. I think it’s worth it, as the better logic in the mark-up more than makes up for the extra time required. But then I would say that!

Written by Joseph Wright

December 28th, 2009

Posted in LaTeX, siunitx

Tagged with ,

siunitx: Getting the micro symbol right

with one comment

I get a few e-mails about siunitx and the micro symbol. People tend to be surprised that the symbol ‘sticks’ to a look very much like Computer Modern. The reason is that picking a proper upright (not italic) μ is not so easy in TeX.You don’t get one in Computer Modern, so siunitx takes one from the TS1 (text support) set in the absence of a better plan. I’ve set up some auto-detection for a few obvious alternatives (such as the upgreek package), but that doesn’t really work for XeTeX users.

XeTeX users are likely to load system fonts, and I’d hope be using UTF-8 input. That makes it hard to auto-detect what they are doing, but should make life easier for them to get things right. A lot of more comprehensive fonts include Greek letters in the main font, so getting the μ right is simple:

\sisetup{
  mathsmu = \text{μ},
  textmu  = μ
}

or for people testing version 2 of siunitx:

\sisetup{
  maths-micro = \text{μ},
  text-micro  = μ
}

There may be a bit of testing required: this will not work if, for example, you are using the Latin Modern font.

Written by Joseph Wright

December 22nd, 2009

Posted in LaTeX, siunitx

Tagged with , ,

TeXworks v0.3 snapshot

with one comment

Jonathan Kew has posted new ‘snapshots’ for the experimental (v0.3) trunk of TeXworks. As usual, these are for Windows and the Mac, with Linux users to compile themselves from the SVN (not usually difficult). Looking through the change list, it looks like mainly small refinements rather than any big changes. Everything looks like it’s working, which is the main thing!

Written by Joseph Wright

December 20th, 2009

Posted in TeXworks