Beyond BibTeX: the first biber beta release

A notice in my inbox from François Charette alerted me yesterday to the first beta release of biber. This is a cross-platform (Perl) replacement for BibTeX for biblatex users. Moving on from BibTeX brings a number of advantages. First, the problems inherent in the BibTeX code (no Unicode support, memory limitations and so on) are removed. The need for something beyond BibTeX has been recognised for a long time.

More importantly, biber comes with an experimental XML file format as a replacement for the .bib file type. The limitations of the .bib approach are more subtle than those of BibTeX itself. A lot of problems stem from the simplicity of the .bib format. This severely limits how much detail can be given for complex data types. For example, there is no good way to give multiple publishers and locations in the .bib format:

publisher = {{First Company} and {Second Company}},
location = {Town and OtherTown}

So which place goes with which publisher? We might assume the first with the first, the second with the second, but why not both locations for both publishers or something more complex? By starting from the biblatex experience, biber can build in this type of data from the start. Particularly in arts subjects, this looks like a great idea.
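
To make the ambiguity concrete, here is a minimal Python sketch. The helper names (split_names, Publisher, pair_positionally, pair_all) are my own, made up purely for illustration; they are not part of biber, biblatex or the BibLaTeXML format. The sketch only shows that the same two flat fields can be read in at least two ways, whereas a nested record states the pairing explicitly.

```python
from __future__ import annotations

from dataclasses import dataclass


def split_names(field: str) -> list[str]:
    """Split a .bib-style list on ' and ', dropping protective braces."""
    return [part.strip().strip("{}") for part in field.split(" and ")]


@dataclass(frozen=True)
class Publisher:
    """Hypothetical nested record: a publisher together with its own locations."""
    name: str
    locations: tuple[str, ...]


def pair_positionally(publishers: list[str], locations: list[str]) -> list[Publisher]:
    """Reading 1: first publisher with first location, second with second."""
    return [Publisher(p, (loc,)) for p, loc in zip(publishers, locations)]


def pair_all(publishers: list[str], locations: list[str]) -> list[Publisher]:
    """Reading 2: every publisher is based in every listed location."""
    return [Publisher(p, tuple(locations)) for p in publishers]


# The two flat fields from the .bib example above.
publishers = split_names("{First Company} and {Second Company}")
locations = split_names("Town and OtherTown")

one_to_one = pair_positionally(publishers, locations)
all_to_all = pair_all(publishers, locations)

# Both readings are consistent with the flat input, yet they differ.
print(one_to_one)
print(all_to_all)
```

Nothing in the flat fields tells a program which of the two results is intended. A hierarchical format can store the Publisher-style records directly, so the question never arises.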

One important question is publicity. Within the LaTeX world, biblatex is still not that widely used. This is of course partly because it is still in beta. However, for biber to succeed, both it and biblatex need publicity outside of LaTeX. One obvious route is to talk to the people who develop things like JabRef, BibDesk and so on. These programs use the .bib format to store data, but can be used without ever going near LaTeX. Other ideas are, I’m sure, welcome!

Overall, I’m very excited by biblatex and biber. It’s clear that biblatex is a major advance for LaTeX users, and anything that makes life easier is welcome.

11 thoughts on “Beyond BibTeX: the first biber beta release”

  1. For all its limitations, bibtex (the format) has the great advantage that it’s widely used. I’d say that in my experience it is, despite its limitations, probably the most common exchange format (maybe .ris has a similar degree of proliferation). While a replacement for bibtex (the program) is certainly long overdue, we won’t gain much if it uses a niche format. So getting in contact with all kinds of people and institutions dealing with bibliographic data is extremely important IMO.

  2. This is one of the really key points. Of course, biber reads .bib format and gives the correct output, so there is no loss of backward compatibility. My impression is that the BibLaTeXML format is most useful for people who can’t currently get their data into the BibTeX format without issues: mainly people in the arts and humanities. If the current formats are no good to people, they probably have little “investment” in them in any case (as I understand it a lot of LaTeX bibliographies for the humanities have to be done more or less by hand).

  3. This is not meant as an objection to your post; I guess we basically agree. I just want to analyze some problems we face, since several levels are involved:

    – The original BibTeX format is extremely limited in terms of entry types and fields. With the original bibtex it’s quite impossible to create a decent bibliography for the humanities or for languages other than English.

    – The good thing, though, is that the format is extremely simple and you can easily add additional fields without any problem. If the respective bst style or bibliographic app can’t read a field, it simply gets ignored. In this way, new bibtex styles have added all kinds of new fields and types over the years. BibLaTeX also extends the traditional format and adds a plethora of new entry types and fields which already solve many of the traditional problems with BibTeX.

    – One important advantage of the simplicity of the bibtex format is that an app like BibDesk or JabRef can easily be adapted by the user to work with biblatex’s additions. I heavily rely on BibDesk and have zero problems using it with BibLaTeX since, from the perspective of the file format, BibLaTeX really just uses traditional BibTeX files. Unfortunately, the “flatness” of the BibTeX format also means that some problems simply can’t be solved (such as the example you mention). Certain hierarchical structures are just beyond the scope of the BibTeX format. This is where biber and BibLaTeXML (potentially) come in.

    – But if we assume that biber and BibLaTeXML can actually solve all or most of the problems of the traditional BibTeX format, we have another problem: while BibDesk et al. can easily handle the bibtex version of BibLaTeX, they know nothing about BibLaTeXML. While I’m sure that they could be adapted, this would mean much more work than using the BibTeX version of BibLaTeX (which can basically be used out of the box). And once we dabble with hierarchical fields, there are also some not-so-easy questions regarding the user interface (how are hierarchies best presented in an interface?). As I said, I heavily rely on BibDesk; if I actually had to choose between using BibLaTeX’s bibtex version together with BibDesk or BibLaTeX with no proper editor, I would always go for the first option. I guess many users would feel the same. To make a long story short: if a new file format wants to succeed in this area, support from the bibliography managers is the key.

  4. Simon, thanks for the very detailed comment: a good analysis of why BibTeX is hard to beat. My feeling is that in areas where BibTeX is “good enough” (sciences, for example) it will be very hard to displace unless there is a massive reason. (I can see how this *might* happen, but perhaps the topic for a separate post.) On the other hand, if BibTeX is not providing sufficient tools (for humanities and so on) then there is more drive to move across. I guess part of the question is “Are people in humanities using JabRef/BibDesk/whatever at the moment anyway?” If they are not (due to the limitations) then they might be easier to convince with a better format.

    However, editing XML by hand is not most people’s idea of a good time, so some kind of visual editor is still needed. For the moment, I’m sticking with JabRef and using the on-the-fly conversion. At least this means I get UTF8 entries with no issues. Once the BibLaTeXML format is sorted out, then *if* it looks good then support in open-source management tools is a must. (I don’t see closed-source people being very interested, unfortunately.)

  5. From my own experience, most people in humanities don’t use any kind of bibliographic app. I’m sure this varies greatly between disciplines and universities, but at least in my surroundings bibliographic apps are still an exception (and if they use some kind of software, many use that digital piece of crap called Endnote – “because it’s the standard”).

  6. The advantage of an XML format is quite a bonus though: it’s not just like having another arbitrary file format competitor to .bib. XML has widespread support in terms of APIs, tools, libraries and knowledge of how to use it. BibTeX .bib files never had that. So an XML file format has a huge head start over just another file format for which, if you want to use it, you have to write your own parser. Nobody will have to do that.

    It’s a very good thing that the Biber author provides RNG and W3C schemas for the format. Potentially, with enough abstraction and modular interface design, Biber could read any standard XML bib format, which nicely separates the format from the biblatex-oriented processing requirements. It’s already heading in that direction, since Biber can parse .bib files and also biblatexml; these are separate code paths internally, but they converge on a common internal data structure.

  7. On the topic of gaining acceptance, Biber needs to be a “drop-in” replacement in a way that goes beyond generating good .bbl files. It should probably also output STDOUT/STDERR messages in the same format as BibTeX, as things like AUCTeX (and other tools, I suspect) depend on this output format, parsing it in various ways. I am currently modifying my local copy of Biber to do this, to see what’s required to make Biber drop in to AUCTeX so that the various functions which parse BibTeX output can be used as-is.

  8. I think most people who’ve looked at how to go beyond the .bib format have proposed some kind of XML-based solution. The key question seems to be how to get enough people to use the same solution (you can get .bib files from lots of database sources, but each XML export is different). As you say, in principle biber should be able to handle different XML formats, but most people want all of their own references in one or more files in the same format. So getting them there is important.

    As to the type of messages that are written to the terminal, this is of course to be considered. But with biblatex the data comes from latex, not bibtex, in any case. So I’m not sure what messages are really needed.

  9. You’re right: the specific schema people use will be important, but not as important as it was with BibTeX, as this only parses one very specific format. Biber can already do two, and since one of them is XML-based it’s a little easier to add support for more. It was the jump from .bib to .xml that was the major leap. I can imagine Biber using only biblatexml as the XML input of choice, and possibly others will provide some XSLT or whatever to pre-process other commonly used XML formats into this. This provides some pressure to converge. Having Biber parse everything doesn’t provide that sort of pressure, so it is probably not such a good idea now you mention it.

    On the message topic: it’s true that with biblatex, bibtex/biber do less than bibtex usually does, but they still validate the bib data in certain ways (repeated keys, missing keys etc.) and this is reported by bibtex on STDOUT. AUCTeX certainly parses this output and uses it, and it’s fairly easy to make Biber produce the same messages, which avoids having to write code for AUCTeX to integrate Biber nicely. I am sure other TeX tools also parse BibTeX errors/warnings in the same way, so I think this is a worthwhile exercise in order to smooth the migration.

  10. Oh, just by the way: there ARE some people in the humanities using this kind of stuff (those who were doing sciences before, for example)!
