Exploring ChemFig: Going further

In the first two parts of this short series, I’ve looked at some ChemFig basics and improving the settings used to get to publication-quality appearance. In this final part, I want to look at some more complex effects. I’m going to keep using the customisations I made in part two, so the demos here all use them in the preamble.

Decorating bonds

Chemists don’t only use simple line bonds: we use bold, dashed and wavy lines a lot. ChemDraw has all of these set up ‘out of the box’:

ChemFig does not have a simple input syntax for them, unlike = for a double bond or ~ for a triple bond. However, it does let us customise bond appearance: the basic syntax we need is to put [,,,,<settings>] after the bond to be customised (there are four commas here as ChemFig has other settings to alter bonding). The settings are TikZ commands, and it’s possible to set up these customisations as styles, which is better than doing everything by hand.
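As a minimal sketch of that syntax before any styles are defined, raw TikZ options can go straight into the fifth field of a bond’s optional argument (the first four, as I read the ChemFig manual, are the angle, the length coefficient and the departure and arrival atom numbers). The atom labels A and B are placeholders:

\documentclass{article}
\usepackage{chemfig}
\begin{document}
% Fifth optional field: TikZ options applied to this bond only
\chemfig{A-[,,,,red]B}
\chemfig{A-[,,,,line width=2pt]B}
% Several options can be given together, separated by commas
\chemfig{A-[,,,,red,line width=2pt]B}
\end{document}

Wrapping such options up as named styles, as done below, keeps the structure input itself readable.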

First, we need two settings from ChemDraw: the thickness to use for a bold bond and the spacing in a hashed bond. I’ll want to use these a few times, so I’ll save them with readable names:

\newcommand*{\bondboldwidth}{0.22832 em} %'Bold Width'
\newcommand*{\bondhashlength}{0.25737 em} % 'Hash Spacing'

Bold, hashed and dashed bonds are then easy to set up

\tikzset{
  bold bond/.style = {line width = \bondboldwidth},
  dash bond/.style =
    {dash pattern = on \bondhashlength off \bondhashlength},
  hash bond/.style =
    {
      dash pattern = on \bondwidth off \bondhashlength,
      line width   = \bondboldwidth
    },
}

Wavy bonds are a bit more tricky. TikZ has a ‘decorations’ library including the idea of a ‘snake’ line, but this is not quite right. Instead, I’ll use a ‘real’ sine wave as described on the TeX-sx site.
At the same time, I want to pick up something ‘internal’ from ChemFig: the inter-atom spacing, which we set using \setatomsep. That’s stored in the macro \CF@atom@sep, which I want as wavy bonds should have an integer number of repetitions along a standard-length bond:

\tikzset{
  wavy bond/.style =
    {
      decorate,
      decoration =
        {
          complete sines,
          amplitude   = \bondboldwidth,
          post length = 0 pt,
          pre length  = 0 pt,
          % Use the atom spacing saved by ChemFig in \CF@atom@sep
          segment length = 
            \the\dimexpr\csname CF@atom@sep\endcsname/5\relax
        }
    }
}
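Note that ‘complete sines’ is not a stock TikZ decoration, so the style above only works once that decoration has been declared. What follows is a reduced sketch of such a declaration, written on the assumption that one full sine period per segment is all that is needed; it is not the code from the TeX-sx answer, and in particular it does not stretch the segment length to fit the path exactly, so a final period may be dropped if rounding leaves the path fractionally short.

\usetikzlibrary{decorations}
\pgfdeclaredecoration{complete sines}{initial}{
  % One full sine period per segment; the state repeats for as long
  % as a whole segment still fits on the remaining path
  \state{initial}[width = \pgfdecorationsegmentlength]{
    % Rise to the crest over the first quarter of the period
    \pgfpathsine{\pgfpoint{0.25\pgfdecorationsegmentlength}{0.5\pgfdecorationsegmentamplitude}}
    % Fall back to the axis
    \pgfpathcosine{\pgfpoint{0.25\pgfdecorationsegmentlength}{-0.5\pgfdecorationsegmentamplitude}}
    % Drop to the trough
    \pgfpathsine{\pgfpoint{0.25\pgfdecorationsegmentlength}{-0.5\pgfdecorationsegmentamplitude}}
    % Return to the axis, ready for the next period
    \pgfpathcosine{\pgfpoint{0.25\pgfdecorationsegmentlength}{0.5\pgfdecorationsegmentamplitude}}
  }
  \state{final}{}
}

This would go in the preamble before the \tikzset that defines the wavy bond style.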

Okay, so how does this all look? The document input is not so bad

\chemfig{
  *6((-[,,,,hash bond])-
  -(-[,,,,wavy bond])
  -(-[,,,,dash bond])-
  -(-[,,,,bold bond])-)
}

and gives the result

If you look really carefully, you’ll see that this highlights an issue. The bond junctions are just flat ‘ends’, which does not show very much for the single bonds but does where the bold bond meets the ring. If you compare with ChemDraw, you’ll see that it does not make the same error: the bonds ‘run in’ to each other. I’ve not found a way to solve that, unfortunately.

Into three dimensions

Chemical structures exist in three dimensions, and it’s very common to show this using wedged bonds, invented by Cram.

ChemFig lets us use < in place of - for a filled wedged bond, with <| for a hollow one and <: for a dashed (backward) one. So the input we want here is

\chemfig{
  *6((<)-(<:)-(<|)-(>)-(>:)-(>|)-)
}

If you try that with no setting changes, the bonds are too wide at the ends. That’s controlled by \setcrambond, which has three parameters: the width of the bond, the thickness of hash lines and the hash line spacings. ChemDraw seems to set the wider end to the width of a bold bond plus two normal bonds, so I used

\setcrambond
  {\the\dimexpr \bondwidth * 2 + \bondboldwidth \relax}
  {\bondwidth}{\bondhashlength}

and got

Here, the issue with bond joins is clearer than in the earlier cases: it’s still reasonably subtle, but definitely more noticeable.
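As a check on the arithmetic in the \setcrambond call, the first argument works out from the values saved earlier as 2 × 0.06642 em + 0.22832 em = 0.36116 em. Assuming those values are unchanged, a hard-coded equivalent would therefore be

% Wide end = two normal bonds plus one bold bond
%          = 2 * 0.06642 em + 0.22832 em = 0.36116 em
\setcrambond{0.36116 em}{0.06642 em}{0.25737 em}

Keeping the \dimexpr form is still the better choice, since it follows any later change to \bondwidth or \bondboldwidth automatically.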

Schemes and so on

I’m focussing here on drawing individual structures, but should mention schemes and compound numbering. In the MyChemistry article I’ve already linked to, there is quite a bit about this, using a combination of ChemFig (for the schemes) and chemnum for the numbering. What I will say is that it works well provided you don’t have complex alignment needs: one of the tricky parts of creating a good-looking scheme is deciding exactly what to line up!

The other quick note I’d add on schemes is that the arrow width really should match that of bonds. So I’d use

\setarrowdefault{,,line width = \bondwidth}

in my preamble to have everything match.
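To show that setting in context, here is a minimal sketch of a two-compound scheme. The two ring structures are placeholders chosen only to have something on either side of the arrow, and the \set... commands are the older interface used throughout these posts, so a recent ChemFig release may need them translated.

\documentclass{article}
\usepackage{chemfig}
\newcommand*{\bondwidth}{0.06642 em} % 'Line Width' from part two
\setbondstyle{line width = \bondwidth}
% Empty angle and length fields: only the arrow style is changed
\setarrowdefault{,,line width = \bondwidth}
\begin{document}
\schemestart
  \chemfig{*6(-=-=-=)}
  \arrow
  \chemfig{*6(------)}
\schemestop
\end{document}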

Conclusions

With a bit of effort with the settings, ChemFig can produce quality output, and can get quite a bit ‘right’ (although there are a few gaps). However, as I said in the first part, I won’t be abandoning ChemDraw any time soon. I deliberately picked something reasonably straightforward for my tests, and the sort of thing I do in my research work would be a lot harder to draw and maintain using ChemFig. In particular, I don’t fancy trying to show three-dimensional effects (for example a benzene ring going ‘into’ the page) using a text-based approach.

So what could I recommend ChemFig for? First, the most obvious case is for people who don’t have a copy of ChemDraw. There are other graphical editors, but none of the free ones are as good as ChemDraw. So if you want high-quality output without paying, this looks the best approach I’ve seen. It also looks good for creating stand-alone documents (using ChemDraw means needing graphics files). That does look useful for me for teaching, where the structures will be in general not so complicated and where it will be perhaps better to have only a single .tex file. There’s also the fact that drawing using TikZ means that the font match using ChemFig is exact: no need to try to measure up different fonts by eye. So there are uses for ChemFig, and it’s certainly an interesting package. Now all we need is someone to write a ChemDraw to ChemFig converter!

Exploring ChemFig: Customising appearance

In my previous post, I looked at the basics of using the ChemFig package to create chemical structures. I finished that post with a structure that is complete but which I think does not look great compared with the reference version I created in ChemDraw. (There’s a MyChemistry entry that looks at similar customisation: worth a look!)

Atom placement

The first issue to tackle is the placement of atom labels. ChemFig ‘detects’ atoms, so that the labels are correctly centred relative to bonds. However, that does not work with numbered R-groups, as the numbers need to be ‘ignored’ for alignment purposes. This is a pretty common requirement, so ChemFig provides a way to ‘split’ labels, using |:

\chemfig{
  *6(-(-R|^2)=-
    (-=[::-60]N-*6(=(-R|^3)-=(-R|^4)-=(-R|^3)-))
  =(-OH)-(-R|^1)=)
}

which gives the output

To see the difference, compare for example R2 here with the version in the previous post: it’s subtle, but it is there!

Atom spacing and bond width

The standard settings for ChemFig share a ‘feature’ with those for ChemDraw: they don’t look very good! As I said in the previous post, I use the Royal Society of Chemistry’s template for my structures, as I think they look much better. The template uses 7 pt text, and so the lengths, etc. all match that size. For use in LaTeX, I want things to be more flexible so wanted to convert the values into em (i.e. relative dimensions based on font size).

There are three key dimensions used by both ChemDraw and ChemFig to set how bonds look: the bond length, the line width and the gap between lines when drawing double bonds. There is also a ‘margin’ used between atom labels and bonds, so the two don’t touch. After a bit of work doing the calculation (using the LaTeX3 FPU), I found that

\setdoublesep{0.35700 em}  % 'Bond Spacing'
\setatomsep{1.78500 em}    % 'Fixed Length'
\setbondoffset{0.18265 em} % 'Margin Width'
\newcommand{\bondwidth}{0.06642 em} % 'Line Width'
\setbondstyle{line width = \bondwidth}

was the right set up. The comments are the ChemDraw names for settings, and I’ve set the line width as a command as it turns out I’ll want it again for some more advanced things to be covered in the next post.
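Newer ChemFig releases document a single key–value command, \setchemfig, in place of the individual \set... commands. Assuming such a release, and assuming the key names below match the installed version of the manual, the same settings could be sketched as

\newcommand*{\bondwidth}{0.06642 em} % 'Line Width'
\setchemfig{
  double bond sep = 0.35700 em, % 'Bond Spacing'
  atom sep        = 1.78500 em, % 'Fixed Length'
  bond offset     = 0.18265 em, % 'Margin Width'
  bond style      = {line width = \bondwidth}
}

The values are exactly those given above; only the way they are passed to ChemFig differs.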

The central double bond

Changing the bond spacing shows up another issue: the central double bond is not right. Rather than bond to the ‘middle’ of the double bond, we want the chain to choose one ‘side’. That can be done using either _ or ^, depending on which side is required. I decided to match the ChemDraw version using

\chemfig{
  *6(-(-R|^2)=-
    (-=^[::-60]N-*6(=(-R|^3)-=(-R|^4)-=(-R|^3)-))
  =(-OH)-(-R|^1)=)
}
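To see the effect of the two modifiers in isolation, a small sketch with placeholder atoms A and B may help. Which side each one picks depends on the direction in which the bond is drawn, so it is easiest just to try both:

\chemfig{A-=-B}  % second line split evenly about the bond axis
\chemfig{A-=^-B} % double bond shifted to one side
\chemfig{A-=_-B} % double bond shifted to the other side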

Atom font

The final thing to adjust to get this example right is the font used for atom labels: the convention is to use sans serif. ChemFig prints text using the \printatom command, which is set up to ensure math mode and \mathrm. Thus the simplest approach is

\renewcommand*{\printatom}[1]{\ensuremath{\mathsf{#1}}}

Like many people, I use the excellent mhchem to write in-line chemical equations, so I wanted to use the \ce (or faster \cf) command for printing atoms. My initial attempt failed, with an internal error. A quick e-mail to the ChemFig author led to a fix

\makeatletter
\def\CF@node@content{%
  \expandafter\expandafter\expandafter
    \printatom\expandafter\expandafter\expandafter
      {\csname atom@\number\CF@cnt@atomnumber\endcsname}%
    \ensuremath{\CF@node@strut}%
}
\makeatother

followed by

 \renewcommand*{\printatom}[1]{{\sffamily\cf{#1}}}

leads to the final input

\documentclass{article}
\usepackage{chemfig}
\usepackage[version=3]{mhchem}
\makeatletter
\def\CF@node@content{%
  \expandafter\expandafter\expandafter
    \printatom\expandafter\expandafter\expandafter
      {\csname atom@\number\CF@cnt@atomnumber\endcsname}%
    \ensuremath{\CF@node@strut}%
}
\makeatother
\setdoublesep{0.35700 em}  % 'Bond Spacing'
\setatomsep{1.78500 em}    % 'Fixed Length'
\setbondoffset{0.18265 em} % 'Margin Width'
\newcommand{\bondwidth}{0.06642 em} % 'Line Width'
\setbondstyle{line width = \bondwidth}
\renewcommand*{\printatom}[1]{{\sffamily\cf{#1}}}
\begin{document}
\chemfig{
  *6(-(-R|^2)=-
    (-=^[::-60]N-*6(=(-R|^3)-=(-R|^4)-=(-R|^3)-))
  =(-OH)-(-R|^1)=)
}
\end{document}

and output

I’d say that is pretty good: I’d be happy to use this in a publication (although drawing the kind of structures I do my research with would be a challenge!).

In the final part of this series, I’m going to look at some other things that are needed for chemical structures but which don’t show up in the demo I’ve used. We’ll see that many can be done, but there will be one or two outstanding challenges.

Exploring ChemFig: Basics

Drawing chemical structures is one of the most important parts of my job. For me, although I love using LaTeX, the best tool for doing this is graphical: ChemDraw. There are a few reasons why I favour using ChemDraw over other approaches. Most importantly of all it produces the best output I know of (although ChemDoodle is pretty close). Complex structures are hard enough to produce and edit with a graphical tool, and the challenge of using a text-based approach makes this even more tricky. Finally, it’s what my colleagues use, so there is some realism involved.

On the other hand, you always need to be ready to try new approaches, so I’ve been meaning for a while to look at the new-ish ChemFig package, which is based on TikZ. I’m starting as a lecturer next month, so with some teaching material to prepare as an incentive I’ve decided to take another look at ChemFig. I’m going to take two or three posts to look at how I’ve got on. I won’t spoil the conclusions, but I think it’s worth saying now that I won’t be moving from ChemDraw just yet for my research work!

The target

As a first target, I decided to try to reproduce a structure I’m going to need to draw for some practical hand-outs. My favoured settings for ChemDraw are those used by the Royal Society of Chemistry, which are set up for 7 pt text to match 9 pt body text in two-column journals. I’ll be coming back to these settings a bit more in the second part of this mini-series, but for the moment let’s see what the result looks like:

The aim is to get this ‘right’, working out first how to get the structure correct using ChemFig, then get the finer points of the appearance right. In this post, I’ll tackle the basic connectivity, and in the next one how to match the appearance.

Rings and chains

As you’d expect, the ChemFig manual covers how to produce structures in some detail. Here, I’m going to look very briefly at the syntax needed to get us started. Rather than repeat myself multiple times, I’m using a simple LaTeX document

\documentclass{article}
\usepackage{chemfig}
\begin{document}
% Content here
\end{document}

for all of this.

The basic command we are going to need is \chemfig, which takes a single argument: a description of the structure required. As you might expect, this can take a bit of getting used to. For example, a benzene ring is

\chemfig{*6(-=-=-=)}

which comes out as

The syntax here is reasonably clear: * makes a ring, 6 means it’s a six-membered ring and -=-=-= is the bonding pattern in the ring.
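Reading the syntax that way, other rings follow by changing the number and the bonding pattern. As a small sketch (the choice of rings is just for illustration):

\chemfig{*6(------)} % six-membered ring, all single bonds
\chemfig{*5(-=--=)}  % five-membered ring with two double bonds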

If we just wanted a linear structure, we could omit the ring part with \chemfig{-=-=-=} giving

Decorating the ring

Adding substituents is not too hard once you work out that the first position on the ring is not the bottom but is the lower of the two left-hand atoms, and that the sequence runs anti-clockwise. The parentheses in the ring part above might give you a clue that they are used to define groups inside the structure. So the left-hand ring we want is written

\chemfig{*6(-(-R^2)=-(-)=(-OH)-(-R^1)=)}

and gives

Hopefully the pattern is reasonably clear: you need to have a - inside the parentheses to have the bond coming off the ring, and can use ^ for superscripts in the usual TeX way.
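Stripped back to a single substituent, the same pattern can be sketched with two simple (and purely illustrative) examples:

\chemfig{*6(-=-(-OH)=-=)}   % one branch carrying an OH group
\chemfig{*6(-=-(-CH_3)=-=)} % subscripts use _ in the usual TeX way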

Completing the structure

The same scheme applies to constructing the rest of the molecule: you can put one ring as a substituent on another, and can have an atom in a chain simply by including the atom name ‘in place’. However, there’s a slight issue, as

\chemfig{
  *6(-(-R^2)=-
    (-=N-*6(=(-R^3)-=(-R^4)-=(-R^3)-))
  =(-OH)-(-R^1)=)
}

is not quite right:

As you can see, the bond angle in the chain part is wrong: ChemFig does not ‘auto-stagger’ things. Of course, this is a pretty basic requirement, so there is a syntax to set the angle of a join: [::-60] will set the relative angle to 60 degrees clockwise, and all will then be well.

\chemfig{
  *6(-(-R^2)=-
    (-=[::-60]N-*6(=(-R^3)-=(-R^4)-=(-R^3)-))
  =(-OH)-(-R^1)=)
}

That completes the connectivity we want, and as you can see the input is starting to look a bit frightening (see my comment at the start of the post). It’s also not great looking compared with the ChemDraw reference version: in the next post, I’ll see how that can be addressed.
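As a footnote on the angle syntax used above, the ChemFig manual describes three ways to give a bond angle in the first optional field. A short sketch with placeholder atoms:

\chemfig{A-[1]B}             % predefined: an integer multiple of 45 degrees
\chemfig{A-[:30]B}           % absolute: 30 degrees from the horizontal
\chemfig{A-[::30]B-[::-60]C} % relative: measured from the previous bond

The relative form is the one that makes sense inside a substituent, as the result does not depend on how the parent ring is oriented.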

Tracking chemical compounds with chemcompounds

As a chemist, one of the things I want to do is track compound numbers (which are normally given as bold numbers, 1, 2, etc.). The traditional way to do that is by hand, which works but does require some concentration. Recent versions of ChemDraw have included an add-in for Word to do things automatically, and of course there is LaTeX support for the same idea.

In LaTeX there is a choice between two packages for tracking what is what. First, there is the bpchem package. It provides for the idea of subdivisions, so you can have 1a, 1b, 1c and so forth. However, I find the interface in bpchem is a bit awkward. The alternative is the chemcompounds package. It has a very easy to use approach to tracking, but does not have built-in support for subdivisions. So I’ve been working on how to achieve this easily in some stuff I’m writing at the moment. It turns out to be quite easy when you think about it.

The first stage is of course to load the package.

\usepackage[noimplicit]{chemcompounds}

I’ve decided to go with the option to turn off automatically creating new compound references, which means I have to declare each one separately. This requires a block of declarations in the preamble, but I actually find this easier than doing things ad hoc. The subdivisions I want are all about R groups (chemists will understand!). So I’ve started by setting up some simple R group letters (I have a family of compounds, and so it makes sense to use the same letter for the same R group in each case):

\declarecompound[a]{Mes}
\declarecompound[b]{iPr}

Hopefully you can see how this works: the optional argument sets up the label that will print, and the mandatory one is the label I’ll use to refer to the compound.
Then I need to set up the general compounds (the ones that will be 1, 2 and so on). I can let chemcompounds do the numbering, so this is easy:

\declarecompound{imidazole}
\declarecompound{pincer:salt}
\declarecompound{pincer:carbene}

The last stage in the preamble is to create the subdivided compounds. Rather than have to track the numbers and letter myself, I’ve found that I can simply refer back to the existing labels:

\declarecompound[\compound{imidazole}\compound{Mes}]
  {imidazole:Mes}
\declarecompound[\compound{imidazole}\compound{iPr}]
  {imidazole:iPr}
\declarecompound[\compound{pincer:salt}\compound{Mes}]
  {pincer:salt:Mes}
\declarecompound[\compound{pincer:salt}\compound{iPr}]
  {pincer:salt:iPr}
\declarecompound[\compound{pincer:carbene}\compound{Mes}]
  {pincer:carbene:Mes}
\declarecompound[\compound{pincer:carbene}\compound{iPr}]
  {pincer:carbene:iPr}

In the document body, things are now very easy. I just use the \compound macro. So for the general case I’ll have

\compound{imidazole}

(printing say 4) whereas for a single case I might have

\compound{imidazole:Mes}

(printing say 4a). This keeps my source easy to follow (I don’t have to remember numbers and letters, only labels), and avoids mistakes on my part.
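Putting the pieces together, a minimal sketch of a complete document, using only a cut-down set of the declarations above, might read

\documentclass{article}
\usepackage[noimplicit]{chemcompounds}
% R-group letter: printed as 'a'
\declarecompound[a]{Mes}
% General compound: numbered automatically
\declarecompound{imidazole}
% Subdivided compound: label built from the two existing ones
\declarecompound[\compound{imidazole}\compound{Mes}]
  {imidazole:Mes}
\begin{document}
The parent compound \compound{imidazole} and its mesityl
derivative \compound{imidazole:Mes}.
\end{document}

The actual number printed depends on the order in which compounds are declared and used, so none is shown here.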

Royal Society of Chemistry TeX Template

A while ago I talked about the variation between different chemistry publishers in their LaTeX support. Looking for something on the Royal Society of Chemistry site today I find that the people at Physical Chemistry Chemical Physics have created an updated template for TeX users. I’d say that is good news: remember of course that the journals are not typeset from the TeX source.

Chemistry journals: publishers’ support of LaTeX

As the author of the achemso bundle (for supporting submissions to the American Chemical Society), I get a few queries about the support various publishers provide for LaTeX. Unlike more physics-focussed journals, the chemistry journals never typeset directly from authors’ LaTeX sources. As a result, acceptance of LaTeX material from authors is less widespread, and tends to be patchy. So I thought I’d summarise things as I currently understand them.

American Chemical Society (ACS)

As I said above, I’ve written the achemso bundle specifically for submissions to the ACS. However, while the central office are happy to host a copy on their website and so on, the ACS don’t officially support the bundle. That means, in practice, that some journals are happier with LaTeX submissions than others. Each journal has its own office, and so I hear different things from people submitting to different journals. It also means that I have to pick up the requirements of each office based on feedback via authors, rather than getting any formalised instructions. There are mistakes in the achemso bundle, and there are also requirements that I don’t know about. So feedback is always useful (good or bad).

Royal Society of Chemistry (RSC)

The RSC have rather less information about LaTeX on their website than the ACS. They do mention TeX, but only very briefly. I’ve written some BibTeX styles, and a very basic article template, which are available in the rsc bundle. I’ve had a bit of feedback on these, and I hope that they at least provide a starting point for writing a submission to the RSC in LaTeX. More generally, I think the best advice is to check with the editorial office for the relevant journal before writing anything, and to stick to the basic LaTeX article class when you do.

Wiley

As with the RSC, Wiley don’t have a lot of LaTeX information. What they do say is that they only accept PDF submissions: you can’t send your source. They also say to stick to the plain article class, and basically to keep things simple.

Elsevier

Elsevier have recently had a new class written for journal submissions, elsarticle. From what I can make out on their site, you can use this for most of their journals, which should include the chemistry ones. As this has actually been written for them to order, I imagine that Elsevier is the best place to be sending LaTeX submissions to. Hopefully other publishers will see that they have made life easier for their authors and will take note.

LaTeX and Dalton Transactions

For once I have a post which combines TeX directly with my job. I’ve just received a copy of the proofs for an article in the chemistry journal Dalton Transactions (the article has DOI 10.1039/b907982c). At the top of each page I spotted

/usr/local/teTeX/share/texmf/tex/latex/techbooks/als/rsc/base2006/rsc2006v1.cls
(2004/07/27 v1.0 Standard LaTeX document class for RSC Journals)

The great irony is that although the journal (along with many others in chemistry) is typeset in LaTeX, they don’t accept LaTeX submissions! I’d love to get hold of that class file and have a look: pretty much no chance, unfortunately.

Submission template for the RSC

I’ve just uploaded a new version of my rsc package to CTAN. There are a few improvements to the BibTeX styles the package provides (mciteplus is still supported, but is no longer mandatory), but the main change is that I’ve added a short template to the bundle. I get the occasional e-mail seeking advice about writing papers to submit to the RSC, so it seemed like a good idea to provide something a bit more formalised than the odd hint to individuals.

Of course, I don’t know what the RSC want, but I’ve got a pretty good idea about what most chemistry paper drafts look like. I’ve also got the work I’ve done on achemso to go from. The basic points are to keep it simple and not to expect “publication ready” formatting. I think this confuses a lot of people who come from a more physics-based background. A lot of physics journals typeset stuff directly from authors’ drafts, and so print-ready templates are common. On the other hand, in chemistry, papers tend to be submitted in Word format and are extensively altered by the publishers. So there is no real need for print-ready material when submitting to chemistry journals.

Hopefully, the clues I’ve provided in the rsc bundle will make life a little easier for prospective authors.