Damn you ampersand!

I’ve been working on an Excel version of the Google scripts written by Stephen Flynn, which automates the searching of the Sherpa/Romeo database, you can view his blog post and video here. I’m very glad that I found his article, it inspired me to explore this automated workflow again**.

I want to create an Excel version because I want something that anyone in my library can use with a click of a button, and we won’t be limited by Google the number of calls we can make each day. Excel spreadsheets also integrate with our workflow better than XML and Google spreadsheets. Additionally, I want to get specific restrictions for preprint and postprint archiving, as well as all the other conditions for each title, so all required information is brought down to the spreadsheet, we don’t have to run the search again when a title requires double checking.

I’ve been agonising about that piece of VB code, while it works for most of the rows in my test spreadsheet, I can’t seem to get the information for a bunch of journal titles. Coincidentally they’re all of one publisher, I don’t think publisher would make any difference because the Sherpa API returns XML data. I thought there must be something wrong with my code but I cannot determine what’s wrong. Why does it work for every other title I tested but a bunch of titles by a particular publisher? I examined the XML results 2000 times, I rewrote my code in at least 4 different ways and the result is still the same. So I bit the bullet and asked a question on Stack Overflow, mentally prepared for the harsh critics, or even worse, no response at all.

Screen Shot 2013-07-10 at 10.27.35 PM

Someone replied and said there’s an invalid character in the XML document. I was like, WTF? I didn’t notice that?! All this time I was thinking there’s something wrong with my code, but it’s actually something wrong with the Sherpa data….

Screen Shot 2013-07-10 at 10.30.25 PM

So there we go, my codes are probably fine. Looks like it’s the & that appears in the copyright link text tag for all titles of one publisher, that’s why it returned a result but it won’t parse at all, I’ll report the problem to Sherpa. I’m learning something new every day!

Edit: I learned how to bypass the validation from the person who answered my question, and my codes are now working fine. woohoo! 

Ahh… the wonderful world of codes 🙂

** My manager suggested the use of Sherpa API a couple of years ago, but at the time the XML reference in Excel was not available to my machine, therefore it wasn’t possible at the time.

Leave a comment