I made a pact with myself this year that I would actually spend the time to learn things and to do things The Right Way™ when it came to physics and math. However, the requirement to get a grade stands in direct opposition to my desire to learn. For instance, in my 3rd year "modern physics" lab, the expectation is for us to use MS Excel for "data analysis" and MS Word to write up our report (we can include hand-drawn graphs, equations, and diagrams as well, pasted into the printed document with tape and glue if need be). References are managed by hand. This teaches me almost no skills that I can apply in anything else I will ever do for the rest of my life as a physicist. Carleton is part of the ATLAS collaboration at the Large Hadron Collider (LHC) at CERN... real physicists working with the ATLAS collaboration (and others) use ROOT for data analysis (it sucks, but it is what it is and it is used for real work). Reports and papers (for publications and/or communication) are produced with the LaTeX document preparation system. References are managed using BibTeX. Figures are all done in Encapsulated PostScript format (EPS). Documents are shared in PDF format. Cutting and pasting paper cutouts into a printed MS Word document just isn't going to help me be a professional physicist. In defense of my 3rd year lab, it is the best that it can do because the students in that class will never have had any training in real tools by the time they get there and need to perform and do writeups on their experiments.
So... I am using LaTeX (I'd used it extensively in the 80s, but I got pretty rusty) for generating the document (it typesets equations beautifully and they are easy to "type in"), and Inkscape for illustrations (it draws in Scalable Vector Graphics (SVG) format and exports beautifully to EPS). I'm converting images (PNG and JPG) to EPS format using ImageMagick's "convert" program, and using ROOT for all my data analysis (exporting graphs and stuff as EPS, and capturing text as required for inclusion in the document). I'm using Zotero to manage my references and export them to BibTeX format, and I'm converting the output of LaTeX into PDF format using the "dvipdf" conversion software. The result is a visually appealing report with proper diagrams and graphs and equations, professional data analysis, and a well managed and properly formatted references section. Every single piece of software I use is also free.
But... as with any set of tools (whether free or commercial), they are developed for the way someone or some group of people thought they should work, and have the features that support the outputs that give the biggest bang for the (development) buck (hey, time is money, even when the software is free). Today's problem is I had never used Zotero to export to BibTeX format before. I have used it extensively and successfully, as a fully integrated citation management tool, with LibreOffice, but not with BibTeX. In my searches, I'm apparently not alone with the issues I had, but as is the danger with open source projects, nobody has stepped up to address the issue I was having. What issue? As with any reference management strategy/system, one will generally want to refer to a source using some abbreviation while editing a document, and then have that reference placed into the bibliography of the document automagically by some software. So far so good with Zotero→BibTeX, it does all those things wonderfully.
The problem? The tag Zotero generates (that I need to reference in my document as I'm inserting the citation) is bollocks. For instance, the tag for the TDS3000 Series Oscilloscope User Manual that I want to cite is the arcane string: "tektronix_inc._tds3000_????". Whut? It is the author name, the title of the document (the "short title" I entered, in this case... it uses the full title if a short title is not specified), and the year (none was specified in the manual, so I had to leave it blank, and "????" was inserted by the Zotero→BibTeX translator module) concatenated with underscores ("_") and with spaces replaced by underscores. So, the assumption here is that I will have a copy of the BibTeX database open (or printed) beside me, and any time I want to insert a citation into my LaTeX document, I need to do the same: find the title (or short title, if specified), author, and year (or use "????" if there is none), replace all spaces with underscores, and then concatenate the three with underscores and use that as my label. In a word: no. In two words: hell no! Onto the 'net to find a solution and... lots of bitching, but no solution. The way I want to work is to assign a mnemonic to the reference so I can easily recall it and use that mnemonic as my tag. For instance, in this specific case, I want to use the tag "tds3000". I had stored that name in the "short title" field in the Zotero entry for the manual, but there was no way for me to tell the Zotero→BibTeX translator to use it instead. In fact, when I went to look, people had made changes to the tags, but by modifying the JavaScript source code! Sigh... so my lab report was going to involve rewriting the tools I was using? So be it!
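To make that scheme concrete, here is a minimal JavaScript sketch of the concatenation as described above. It is not the translator's actual code: the `slug` and `defaultCiteKey` functions and the item field names are made up for illustration, and it assumes the only transformations are lower-casing and swapping spaces for underscores.

```javascript
// Sketch of the default key scheme described above. The item object and
// its field names (author, title, shortTitle, year) are hypothetical.
function slug(text) {
  // Lower-case the text and replace each space with an underscore.
  return text.toLowerCase().replace(/ /g, "_");
}

function defaultCiteKey(item) {
  const author = slug(item.author);
  // The short title wins if present; otherwise the full title is used.
  const title = slug(item.shortTitle || item.title);
  // A missing year becomes "????".
  const year = item.year || "????";
  return [author, title, year].join("_");
}

const manual = {
  author: "Tektronix Inc.",
  title: "TDS3000 Series Oscilloscope User Manual",
  shortTitle: "tds3000",
  year: "",
};

console.log(defaultCiteKey(manual)); // tektronix_inc._tds3000_????
```

Run under Node.js, it prints the same unwieldy key the export produced for the oscilloscope manual.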
The following is detailed just to remind myself what to do... if you don't care about the technical details (and made it this far even), the story part of this post is over and you can get on with your life now ;). For posterity's sake, here's what I did... First, to find the source file. On Windoze XP, which I was using, it is stored in "C:\Documents and Settings\<my user name>\Application Data\Mozilla\Firefox\Profiles\<unique identifier for my Firefox data folder>\zotero\translators\BibTeX.js". In my case, I copied it over to Linux to edit (and copied it back to test... I have shared network drives at home that make this easy), but any text editor on Windows would work as well (as long as it allowed for raw editing, e.g. WordPad). The first thing was to identify where the tag gets its format from. Easy enough: a variable called "citeKeyFormat" set to "%a_%t_%y"... "author", "title", "year". Okay, question: what is the format code to use the "short title" instead? Search the Internet... nothing... rummage in the source code... find the function "citeKeyConversions()" which will return the values for "%a", "%t", and "%y"... and absolutely nothing else. There is no support for any other fields. Sigh... so... start to figure out how to add support for using a "%s" format code to the "citeKeyConversions()" function. But what is the name of the "short title" field even? I tried a few things that didn't work, and eventually found a web page that had my answer: Zotero Item Types (and jumping to the "document" document type, which is what I specified in my reference entry). The answer? The field is called "shortTitle". So...
    var citeKeyConversions = {
        [...]
        "s":function (flags, item) {
            // Use the item's short title, lower-cased, spaces -> underscores
            if(item["shortTitle"]) {
                return item["shortTitle"].toLowerCase().replace(/ /g,"_");
            }
            return "";
        },
        [...]
    }

Then change the "citeKeyFormat" variable from "%a_%t_%y" to "%s". Copy over and test... yay! But... I notice that in the BibTeX.js file, there is a line in the metadata section: "lastUpdated": "2014-03-09 06:00:00"... that seems pretty darned recent... does Zotero automatically update itself, including translators? Search web... answer: indeed it does. If I were to install my modified version, as soon as the official maintainers of BibTeX.js made a change, my copy of it would be overwritten. So, I needed to make my own custom translator, I correctly surmised. I would call it "BibTeX_PF.js" and install it like that... but the translator to use when exporting a collection of references is chosen from a pull-down list, so I presumed (correctly) I needed to modify the metadata to cause Zotero to add mine as a new option. A quick glance at the metadata in BibTeX.js showed it had a "label" of "BibTeX", which is what is shown in the pull-down list. Change it in my version to "BibTeX_PF", good. But then there is this mysterious line: "translatorID": "9cb70025-a888-4a29-a210-93ec52da40d4", which worried me. A little research and it turns out that each module published for use with Zotero must have a Globally Unique IDentifier (GUID), and new modules cannot use the same GUID as existing ones. How to generate a GUID for my custom translator??? More web searching and it turns out that the way all the cool Zotero developers do it is using a tool called "Scaffold - an IDE for Zotero translators". It has a spiffy button: "Generate Translator ID"... so, install the tool (it's an .xpi that installs into Firefox only and creates a new menu item under "Tools" that launches the IDE), get my ID, and copy it into my "BibTeX_PF.js" file and copy that file into the Zotero "translators" folder, and...
BibTeX_PF does not show up in the pull-down list when I try to export... hmmm... restart Firefox (under the presumption Zotero loads its modules when it starts up as part of Firefox... it's an extension to Firefox the way I run it, fyi), and <insert choir of angels noises> there it is. Export the collection to BibTeX format and open the resulting file and... my new tag for the oscilloscope manual is now "tds3000". Mission accomplished! Grab some lunch before blogging about this... yum, homemade chili that I finished at 11PM last night before crawling into bed.
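For reference, the whole change can be approximated in a standalone script that runs under Node.js, outside Zotero. This is a sketch under assumptions, not the real BibTeX.js: `buildCiteKey` and `customHeader` are hypothetical names, the "a", "t", and "y" conversions are simplified stand-ins, the header object holds only the three metadata fields quoted above, and Node's `crypto.randomUUID()` stands in for Scaffold's "Generate Translator ID" button.

```javascript
const { randomUUID } = require("crypto");

// Conversion table keyed by format code. Only "s" mirrors the function
// shown above; "a", "t", and "y" are simplified stand-ins.
const citeKeyConversions = {
  a: (item) => (item.author || "").toLowerCase().replace(/ /g, "_"),
  t: (item) => (item.title || "").toLowerCase().replace(/ /g, "_"),
  y: (item) => item.year || "????",
  s: (item) => (item.shortTitle || "").toLowerCase().replace(/ /g, "_"),
};

// Replace every "%x" in the format with the output of conversion "x".
// Codes without a conversion are left in place untouched.
function buildCiteKey(format, item) {
  return format.replace(/%([a-z])/g, (match, code) =>
    code in citeKeyConversions ? citeKeyConversions[code](item) : match
  );
}

// Copy a translator header, giving the copy a new label and a fresh ID
// so it does not collide with the official translator.
function customHeader(header, label) {
  return { ...header, label: label, translatorID: randomUUID() };
}

const officialHeader = {
  translatorID: "9cb70025-a888-4a29-a210-93ec52da40d4",
  label: "BibTeX",
  lastUpdated: "2014-03-09 06:00:00",
};

const manual = {
  author: "Tektronix Inc.",
  title: "TDS3000 Series Oscilloscope User Manual",
  shortTitle: "tds3000",
  year: "",
};

console.log(buildCiteKey("%s", manual)); // tds3000
console.log(customHeader(officialHeader, "BibTeX_PF").label); // BibTeX_PF
```

Keeping the conversions in a table means supporting another field is a one-line addition, and the format string stays the only thing that has to change to pick a different key style.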
And now, the question: were the hours of frustration worth it? The answer: yes, they were. My goal is to learn how to use a set of tools that will allow me to produce quality science and reports that I can share with other scientists around the world knowing that they will be in a format that is accepted by those I wish to be reading them. It also means that every time I go to produce a new document, it will be more and more about the content and not its production as I learn how to effectively use all the tools required to produce my results and publish them. If it means rolling up my sleeves (well, I'm wearing my gnauga shirt, so it's already short sleeved) and digging into the code of those tools to make them work the way I want, so be it. These two hours will save me endless frustration going forward and more than pay me back in hours saved. The only thing hanging over my head now is the question: should I make the changes to the official translator (and Zotero itself so the translator can be configured from the options page rather than having to modify the code to do it) and publish my changes? Is this even the right way to do it? For instance, another user wrote a modification that used the "tags" fields for a reference entry to specify the BibTeX item tag... if you prepended "bibtex:" to a tag, it would use the string that came after that as the citation tag to use in LaTeX. Dunno. That will have to wait until the summer to answer as I am more than slammed with the school work I have now...
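As a rough idea of how that tag-based alternative might look, here is a sketch that is not the other user's code. It assumes an item's tags arrive as a plain array of strings (the real Zotero item shape may differ), and `citeKeyFromTags` is a hypothetical name. It falls back to the short title when no "bibtex:" tag is present.

```javascript
// Prefix that marks a tag as carrying the citation key.
const PREFIX = "bibtex:";

// Hypothetical item shape: { tags: [string, ...], shortTitle: string }.
function citeKeyFromTags(item) {
  // Take the first tag that starts with the prefix, if there is one.
  const tag = (item.tags || []).find((t) => t.startsWith(PREFIX));
  if (tag) {
    return tag.slice(PREFIX.length).trim();
  }
  // Otherwise fall back to the short title, as with the "%s" code.
  if (item.shortTitle) {
    return item.shortTitle.toLowerCase().replace(/ /g, "_");
  }
  return "";
}

console.log(citeKeyFromTags({ tags: ["manuals", "bibtex:tds3000"] })); // tds3000
```

The appeal of this variant is that the key lives in its own place and the short title stays free for its normal use; the cost is one more tag to maintain per reference.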
Oh, one last update, I have found out that I will definitely be hired as a Research Assistant again this summer (money, yay!), and that I will be going to Fermilab for three weeks in May as part of a team doing a beam test there (there is still a chance the beam test will be cancelled, there are a lot of technical and political issues that need to be resolved for it to happen). More details to follow in a later post :).