I made a pact with myself this year that I would actually spend the time to learn things and to do things The Right Way™ when it came to physics and math. However, the requirement to get a grade stands in direct opposition to my desire to learn. For instance, in my 3rd year "modern physics" lab, the expectation is for us to use MS Excel for "data analysis" and MS Word to write up our report (we can include hand-drawn graphs, equations, and diagrams as well, pasted into the printed document with tape and glue if need be). References are managed by hand. This teaches me almost no skills that I can apply in anything else I will ever do for the rest of my life as a physicist. Carleton is part of the ATLAS collaboration at the Large Hadron Collider (LHC) at CERN... real physicists working with the ATLAS collaboration (and others) use ROOT for data analysis (it sucks, but it is what it is and it is used for real work). Reports and papers (for publications and/or communication) are produced with the LaTeX document preparation system. References are managed using BibTeX. Figures are all done in Encapsulated PostScript format (EPS). Documents are shared in PDF format. Cutting and pasting paper cutouts into a printed MS Word document just isn't going to help me be a professional physicist. In defense of my 3rd year lab, it is doing the best that it can, because the students in that class will never have had any training in real tools by the time they get there and need to perform and do writeups on their experiments.
So... I am using LaTeX (I'd used it extensively in the 80s, but I got pretty rusty) for generating the document (it typesets equations beautifully and they are easy to "type in"), Inkscape for illustrations (it draws in Scalable Vector Graphics (SVG) format and exports beautifully to EPS), I'm converting images (PNG and JPG) to EPS format using ImageMagick's "convert" program, I'm using ROOT for all my data analysis (exporting graphs and stuff as EPS, and capturing text as required for inclusion in the document), I'm using Zotero to manage my references and export them to BibTeX format, and am converting the output of LaTeX into PDF format using the "dvipdf" conversion software. The result is a visually appealing report with proper diagrams and graphs and equations, professional data analysis, and a well managed and properly formatted references section. Every single piece of software I use is also free.
But... as with any set of tools (whether free or commercial), they are developed for the way someone or some group of people thought they should work, and have the features that support the outputs that give the biggest bang for the (development) buck (hey, time is money, even when the software is free). Today's problem is I had never used Zotero to export to BibTeX format before. I have used it extensively and successfully, as a fully integrated citation management tool, with LibreOffice, but not with BibTeX. In my searches, I'm apparently not alone with the issues I had, but as is the danger with open source projects, nobody has stepped up to address the issue I was having. What issue? As with any reference management strategy/system, one will generally want to refer to a source using some abbreviation while editing a document, and then have that reference placed into the bibliography of the document automagically by some software. So far so good with Zotero→BibTeX, it does all those things wonderfully.
The problem? The tag Zotero generates (that I need to reference in my document as I'm inserting the citation) is bollocks. For instance, the tag for the TDS3000 Series Oscilloscope User Manual that I want to cite is the arcane string: "tektronix_inc._tds3000_????". Whut? It is the author name, the title of the document (the "short title" I entered, in this case... it uses the full title if a short title is not specified), and the year (none was specified in the manual, so I had to leave it blank, and "????" was inserted by the Zotero→BibTeX translator module) concatenated with underscores ("_") and with spaces replaced by underscores. So, the assumption here is that I will have a copy of the BibTeX database open (or printed) beside me, and any time I want to insert a citation into my LaTeX document, I need to do the same: find the title (or short title, if specified), author, and year (or use "????" if there is none), replace all spaces with underscores, and then concatenate the three with underscores and use that as my label. In a word: no. In two words: hell no! Onto the 'net to find a solution and... lots of bitching, but no solution. The way I want to work is to assign a mnemonic to the reference so I can easily recall it and use that mnemonic as my tag. For instance, in this specific case, I want to use the tag "tds3000". I had stored that name in the "short title" field in the Zotero entry for the manual, but there was no way for me to tell the Zotero→BibTeX translator to use it instead. In fact, when I went to look, people had made changes to the tags, but by modifying the Javascript source code! Sigh... so my lab report was going to involve rewriting the tools I was using? So be it!
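To make the complaint concrete, here is a minimal sketch of how a key like that comes out of the author, title, and year. It is only an illustration of the pattern described above, not Zotero's actual code: the function name `buildDefaultKey` and the flat item shape (`author`, `title`, `shortTitle`, `year`) are assumptions made up for this example.

```javascript
// Hypothetical helper: mimics the default key layout described above
// (author_title_year). This is NOT the real Zotero translator code.
function buildDefaultKey(item) {
  // Prefer the short title when present, otherwise use the full title.
  const title = item.shortTitle || item.title || "";
  // A missing year shows up as "????", as in the exported file.
  const year = item.year || "????";
  // Join the three parts with "_", lowercase, and replace spaces with "_".
  return [item.author || "", title, year]
    .join("_")
    .toLowerCase()
    .replace(/ /g, "_");
}

// Assumed, simplified version of the oscilloscope manual entry.
const manual = {
  author: "Tektronix Inc.",
  title: "TDS3000 Series Oscilloscope User Manual",
  shortTitle: "TDS3000",
  year: "",
};

console.log(buildDefaultKey(manual)); // tektronix_inc._tds3000_????
```

Nobody is going to remember that string while typing a citation, which is the whole point of wanting a short mnemonic instead.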
The following is detailed just to remind myself what to do... if you don't care about the technical details (and made it this far even), the story part of this post is over and you can get on with your life now ;). For posterity's sake, here's what I did... First, to find the source file. On Windoze XP, which I was using, it is stored in "C:\Documents and Settings\<my user name>\Application Data\Mozilla\Firefox\Profiles\<unique identifier for my Firefox data folder>\zotero\translators\BibTeX.js". In my case, I copied it over to Linux to edit (and copy it back to test... I have shared network drives at home that make this easy), but any text editor on Windows would work as well (as long as it allowed for raw editing, e.g. Wordpad). The first thing was to identify where the tag gets its format from. Easy enough: a variable called "citeKeyFormat" set to "%a_%t_%y"... "author", "title", "year". Okay, question: what is the format code to use the "short title" instead? Search the Internet... nothing... rummage in the source code... find the function "citeKeyConversions()" which will return the values for "%a", "%t", and "%y"... and absolutely nothing else. There is no support for any other fields. Sigh... so... start to figure out how to add support for using a "%s" format code to the "citeKeyConversions()" function. But what is the name of the "short title" field even? I tried a few things that didn't work, and eventually found a web page that had my answer: Zotero Item Types (and jumping to the "document" document type, which is what I specified in my reference entry). The answer? The field is called "shortTitle". So...
    var citeKeyConversions = {
        [...]
        "s":function (flags, item) {
            if(item["shortTitle"]) {
                return item["shortTitle"].toLowerCase().replace(/ /g,"_");
            }
            return "";
        },
        [...]
    }

Then change the "citeKeyFormat" variable from "%a_%t_%y" to "%s". Copy over and test... yay! But... I notice that in the BibTeX.js file, there is a line in the metadata section: "lastUpdated": "2014-03-09 06:00:00"... that seems pretty darned recent... does Zotero automatically update itself, including translators? Search web... answer: indeed it does. If I were to install my modified version, as soon as the official maintainers of BibTeX.js made a change, my copy of it would be overwritten. So, as I correctly surmised, I needed to make my own custom translator. I would call it "BibTeX_PF.js" and install it like that... but the translator to use when exporting a collection of references is chosen from a pull-down list, so I presumed (correctly) I needed to modify the metadata to cause Zotero to add mine as a new option. A quick glance at the metadata in BibTeX.js showed it had a "label" of "BibTeX", which is what is shown in the pull-down list. Change it in my version to "BibTeX_PF", good. But then there is this mysterious line: "translatorID": "9cb70025-a888-4a29-a210-93ec52da40d4", which worried me. A little research and it turns out that each module published for use with Zotero must have a Globally Unique IDentifier (GUID), and new modules cannot use the same GUID as existing ones. How to generate a GUID for my custom translator??? More web searching and it turns out that the way all the cool Zotero developers do it is using a tool called "Scaffold - an IDE for Zotero translators". It has a spiffy button: "Generate Translator ID"... so, install the tool (it's an .xpi that installs into Firefox only and creates a new menu item under "Tools" that launches the IDE), get my ID, and copy it into my "BibTeX_PF.js" file and copy that file into the Zotero "translators" folder, and...
BibTeX_PF does not show up in the pull-down list when I try to export... hmmm... restart Firefox (under the presumption Zotero loads its modules when it starts up as part of Firefox... it's an extension to Firefox the way I run it, fyi), and <insert choir of angels noises> there it is. Export the collection to BibTeX format and open the resulting file and... my new tag for the oscilloscope manual is now "tds3000". Mission accomplished! Grab some lunch before blogging about this... yum, homemade chili that I finished at 11PM last night before crawling into bed.
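For anyone who wants to see the "%s" idea outside of Zotero, here is a small standalone sketch of a format string being expanded through a conversion table. The "s" entry is the one I added; everything else is an assumption for illustration: the `expandCiteKey` function, the simplified "y" conversion, and the flat item object are stand-ins of my own, not the translator's real internals.

```javascript
// Sketch only: a stand-in for the translator's key-building step.
// expandCiteKey and the item shape are assumptions, not Zotero's API.
const citeKeyConversions = {
  // The "%s" conversion: lowercased short title with spaces turned into "_".
  s: function (flags, item) {
    if (item["shortTitle"]) {
      return item["shortTitle"].toLowerCase().replace(/ /g, "_");
    }
    return "";
  },
  // Simplified "%y": the year, or "????" when none is recorded.
  y: function (flags, item) {
    return item["year"] ? String(item["year"]) : "????";
  },
};

function expandCiteKey(format, item) {
  // Replace each "%x" code with the output of its conversion function;
  // codes with no conversion are left in place untouched.
  return format.replace(/%([a-zA-Z])/g, function (match, code) {
    const convert = citeKeyConversions[code];
    return convert ? convert("", item) : match;
  });
}

const manual = { shortTitle: "TDS3000", year: "" };
console.log(expandCiteKey("%s", manual));    // tds3000
console.log(expandCiteKey("%s_%y", manual)); // tds3000_????
```

With the format set to just "%s", the key is whatever mnemonic went into the short title field, which is exactly the behaviour I was after.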
And now, the question: were the hours of frustration worth it? The answer: yes, they were. My goal is to learn how to use a set of tools that will allow me to produce quality science and reports that I can share with other scientists around the world, knowing that they will be in a format that is accepted by those I wish to be reading them. It also means that every time I go to produce a new document, it will be more and more about the content and not its production as I learn how to effectively use all the tools required to produce my results and publish them. If it means rolling up my sleeves (well, I'm wearing my gnauga shirt, so it's already short sleeved) and digging into the code of those tools to make them work the way I want, so be it. These two hours will save me endless frustration going forward and more than pay me back in hours saved. The only thing hanging over my head now is the question: should I make the changes to the official translator (and Zotero itself so the translator can be configured from the options page rather than having to modify the code to do it) and publish my changes? Is this even the right way to do it? For instance, another user wrote a modification that used the "tags" fields for a reference entry to specify the BibTeX item tag... if you prepended "bibtex:" to a tag, it would use the string that came after that as the citation tag to use in LaTeX. Dunno. That will have to wait until the summer to answer as I am more than slammed with the school work I have now...
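For comparison, the "bibtex:" tag approach could look roughly like the sketch below. I have not seen that user's code, so this is purely my guess at the idea: the function name `citeKeyFromTags` is made up, and treating tags as plain strings is an assumption that may not match how Zotero actually hands tags to a translator.

```javascript
// Hypothetical sketch of the "bibtex:" tag idea. Tags are assumed to be
// plain strings here, which may differ from Zotero's real item structure.
function citeKeyFromTags(tags, fallback) {
  const prefix = "bibtex:";
  for (const tag of tags) {
    if (tag.startsWith(prefix)) {
      // Everything after the prefix becomes the citation key.
      const key = tag.slice(prefix.length).trim();
      if (key) {
        return key;
      }
    }
  }
  // No usable "bibtex:" tag found, so keep whatever key we already had.
  return fallback;
}

console.log(citeKeyFromTags(["manual", "bibtex:tds3000"], "tektronix_inc._tds3000_????")); // tds3000
```

The trade-off is that the key lives in the tags rather than in the short title field, so the short title stays free for its normal use.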
Oh, one last update, I have found out that I will definitely be hired as a Research Assistant again this summer (money, yay!), and that I will be going to Fermilab for three weeks in May as part of a team doing a beam test there (there is still a chance the beam test will be cancelled, there are a lot of technical and political issues that need to be resolved for it to happen). More details to follow in a later post :).
frickin BRAVO!
Date: 2014-03-14 05:01 am (UTC)
Re: frickin BRAVO!
Date: 2014-03-14 07:00 am (UTC)
I find it appalling that the tools used are so sub standard
I don't actually find the tools "sub standard"... in fact, the tools are mind-blowingly awesome. As I said, I've used Zotero with LibreOffice and the power and sophistication in that environment is nothing short of astonishing for free software. It beats anything commercial I've ever tried. What seems to be the case is I'm operating (as I seem to often do... sigh) in the wilderness of software environments with what I'm doing at the moment. In such instances, it's not surprising to find bugs or at least rough edges all over. The way open source software works is that I (and you, and everyone) have access to the source code and can enhance and/or repair it if we are so inclined (or modify it in any way we see fit... and depending on the license, have to release our changes back into the free software community to let them reuse our artistry and skill).
Failing the ability to fix it (insufficient skill or time perhaps), the next best thing is generating a quality bug report or feature request. The value of testing and good feedback cannot be overestimated, and can make it much easier to continue the development of the software by the original developers and new developers that want to get involved. The key is that the people doing the work don't owe anyone else anything... they are working on the software because they feel like working on the software and will pick and choose what interests (or at least vexes) them. Etiquette demands a cautious and somewhat passive approach to requesting changes and fixes because many a promising or already amazing project has been abandoned because people using it were too demanding or inconsiderate (I have seen many a "hey dumbass why don't you fix your piece-of-shit software, I can't use it for what I want to do", and that doesn't help anyone).
Thus, my comment of "I need to figure out what I want to do about the BibTeX export functionality grumbles I have". Do I care enough to put in the effort to think about what The Right Thing™ is for addressing the usability problems that I had (others may feel very differently about it)? If so, am I prepared to engage in a public discussion and possibly even debate about the merits of various approaches? Will I have the time and skill to code and test a solution for others to test? Do I think the current package maintainers will welcome external changes? Does the main Zotero program need to be modified to properly support the changes? If so, is there a way of making the changes such that other packages can take advantage of a general framework for setting module parameters? Again, do I personally have the ability to put in that sort of time and effort? If I just make changes for myself, it's just a few hours of work and I'm done. If I want to make changes to the supported free software packages, then I'm dealing with other humans all over the world and need to deal with all that entails. Maybe it'll be good, maybe it'll be bad, heh. There's no way to tell before starting, of course. Alternately, I could just put my thoughts out there (there are usually "bug/feature tracking" repositories associated with most major projects that are used for that sort of thing) and hope that someone else already in that particular development community likes my ideas and runs with the changes themselves. Finally, I can just make the changes I think should be made and share them with the world, and people can use them or ignore them as they see fit (including the package maintainers). Given that others seemed to be having the same struggles as I did, just tossing stuff "out there" might be sufficiently helpful and have a lasting, positive impact (that's more the "hacker" mentality).
Eric S. Raymond wrote what is considered the seminal work on how such an anarchy can, and does, work: The Cathedral and the Bazaar.
P.S. Now that I think about it, what I really, really want is an Emacs extension that interfaces with Zotero so I can include citations in my LaTeX documents as easily as I can do it in LibreOffice. It'd be pretty cool if there was a way of auto-exporting the BibTeX collection from Zotero without having to go through the Zotero user interface (Zotero's LibreOffice integration package handles building the bibliography automatically and dynamically based on the citations pulled into the document being worked on... which is pretty darned cool). See... more work ;).
Re: frickin BRAVO!
Date: 2014-03-14 12:39 pm (UTC)
*of course my comment was based on my understanding or misunderstanding of what you wrote. Hehe
Re: frickin BRAVO!
Date: 2014-03-15 07:06 pm (UTC)
$13.5 billion to find the Higgs bo'sun? Huh, what's all the fuss about?
Re: frickin BRAVO!
Date: 2014-03-18 02:30 am (UTC)
As well, there is also the paranoia of the US NSA having added backdoors to Windows letting them spy on you and your work. With open source, you can actually look at the code to make sure the NSA isn't spying on you.