Tuesday 20 October 2015

Thoughts on the NDFNZ wikipedia panel






Last week I was on an NDFNZ wikipedia panel with Courtney Johnston, Sara Barham and Mike Dickison. Having reflected a little and watched the YouTube recording at https://www.youtube.com/watch?v=3b8X2SQO1UA I've got some comments to make (or to repeat, as the case may be).

Many people, apparently including Courtney, seemed to get the most enjoyment out of writing the ‘body text’ of articles. This is fine, because the body text (the core textual content of the article) is the core of what the encyclopaedia is about. If you can’t be bothered with wikiprojects, categories, infoboxes, common names and wikidata, you’re not alone and there’s no reason you need to delve into them to any extent. If you start an article with body text and references that’s fine; other people will to a greater or lesser extent do that work for you over time. If you’re starting a non-trivial number of similar articles, get yourself a prototype which does most of the stuff for you (I still use https://en.wikipedia.org/wiki/User:Stuartyeates/sandbox/academicbio which I wrote for doing New Zealand women academics). If you need a prototype like this, feel free to ask me.

If you have a list of things (people, public art works, exhibitions) in some machine readable format (Excel, CSV, etc) it’s pretty straightforward to turn them into a table like https://en.wikipedia.org/wiki/Wikipedia:WikiProject_New_Zealand/Requested_articles/Craft#Proposed_artists or https://en.wikipedia.org/wiki/Enjoy_Public_Art_Gallery. Send me your data and tell me what direction you want to take it in.
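
As a rough illustration of how little is involved, here is a minimal Python sketch that turns CSV text into MediaWiki table markup. The function name `csv_to_wikitable`, the `link_column` parameter and the sample row are all made up for this example. It assumes the first row is a header and that no cell contains a pipe character (real data would need that escaped).

```python
import csv
import io


def csv_to_wikitable(csv_text, link_column=None):
    """Convert CSV text (with a header row) into MediaWiki table markup.

    link_column is a hypothetical convenience: values in that column are
    wrapped in [[...]] so they show up as red or blue links on the wiki.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]

    lines = ['{| class="wikitable sortable"']
    lines.append("! " + " !! ".join(header))
    for row in body:
        cells = []
        for name, value in zip(header, row):
            value = value.strip()
            if name == link_column and value:
                value = "[[" + value + "]]"
            cells.append(value)
        lines.append("|-")
        lines.append("| " + " || ".join(cells))
    lines.append("|}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder data, not a real artist.
    sample = "Name,Medium\nExample Artist,Ceramics\n"
    print(csv_to_wikitable(sample, link_column="Name"))
```

The output can be pasted straight into a requested-articles subpage. Excel files would need saving as CSV first.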

If you have a random thing that you think needs a Wikipedia article, add it to https://en.wikipedia.org/wiki/Wikipedia:WikiProject_New_Zealand/Requested_articles; if you have a hundred things that you think need articles, start a subpage, a la https://en.wikipedia.org/wiki/Wikipedia:WikiProject_New_Zealand/Requested_articles/Craft and https://en.wikipedia.org/wiki/Wikipedia:WikiProject_New_Zealand/Requested_articles/New_Zealand_academic_biographies, both completed projects of mine.

Sara mentioned that they were thinking of getting subject matter experts to contribute to relevant wikipedia articles. In theory this is a great idea and some famous subject matter experts contributed to Britannica, so this is well-established ground. However, there have been some recent wikipedia failures, particularly in the sciences. People used to ground-breaking writing may have difficulty switching to a genre where no original ideas are permitted and everything needs to be balanced and referenced.

Preparing for the event, I created a list of things the awesome Dowse team could do as follow-ups to their craft artists work, but we never got to that in the session, so I've listed them here:
  1. [[List of public art in Lower Hutt]]: Since public art is out of copyright, someone could spend a couple of weeks taking photos of all the public art and creating a table with clickable thumbnail, name, artist, date, notes and GPS coordinates. Could probably steal some logic from somewhere to make the table convertible to a set of points inside a GPS for a tour.
  2. Publish from their archives a complete list of every exhibition ever held at the Dowse since founding. Each exhibition is a shout-out to the artists involved and the list can be used to check for potentially missing wikipedia articles.
  3. Digitise and release photos taken at exhibition openings, capturing the people, fashion and feeling of those eras. The hard part of this, of course, is labelling the people.
  4. Reach out to their broader community to use the Dowse blog to publish community-written obituaries and similar content (i.e. encourage the generation of quality secondary sources).
  5. Engage with local artists and politicians by taking pictures at Dowse events, uploading them to commons and adding them to the subjects’ wikipedia articles. The aim is to make attending a Dowse exhibition opening the easiest way for locals to get a new wikipedia image.
I've not listed the 'digitise the collections' option, since at the end of the day, the value of this (to wikipedia) declines over time (because there are more and more alternative sources) and the price of putting them online declines. I'd much rather people tried new innovative things when they had the agility and leadership that lets them do it, because that's how the community as a whole moves forward.
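
On item 1, the table-to-GPS step could look something like the following Python sketch, which writes one GPX 1.1 waypoint per row. This is only an illustration under assumptions: the function name `rows_to_gpx`, the `creator` string and the sample coordinates are invented placeholders, and I've assumed the table has already been reduced to (name, latitude, longitude) rows in decimal degrees.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def rows_to_gpx(rows):
    """Build a GPX 1.1 document with one waypoint per (name, lat, lon) row."""
    gpx = ET.Element("gpx", {
        "version": "1.1",
        "creator": "public-art-table-sketch",  # placeholder creator string
        "xmlns": GPX_NS,
    })
    for name, lat, lon in rows:
        # GPX stores coordinates as decimal-degree attributes on <wpt>.
        wpt = ET.SubElement(gpx, "wpt", {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
        ET.SubElement(wpt, "name").text = name
    return ET.tostring(gpx, encoding="unicode")


if __name__ == "__main__":
    # Placeholder rows: not real artworks or surveyed positions.
    rows = [
        ("Example sculpture A", -41.2100, 174.9000),
        ("Example mural B", -41.2150, 174.9050),
    ]
    print(rows_to_gpx(rows))
```

Most handheld GPS units and phone mapping apps can import a GPX file of waypoints, which is why I picked that format over anything cleverer.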

Thursday 15 October 2015

Feedback on NLNZ ‘DigitalNZ Concepts API’



This blog post is feedback on a recent blog post ‘Introducing the DigitalNZ Concepts API’ http://digitalnz.org/blog/posts/introducing-the-digitalnz-concepts-api by the National Library of New Zealand’s DigitalNZ team. Some of the feedback also rests on conversations I've had with various NLNZ staffers and other interested parties and a great stack of my own prejudices. I've not actually generated an API key and run the thing, since I'm currently on parental leave.
  1. Parts of the Concepts API look very much like authority control, but authority control is not mentioned in the blog post or the docs that I can find. It may be that there are good reasons for this (such as parallel comms in the pipeline for the authority control community) but there are also potentially very worrying reasons. Clarity is needed here when the system goes live.
  2. All the URLs in examples are HTTP, but the ALA’s Freedom to Read Statement requires all practical measures be taken to ensure the confidentiality of the reader’s searching and reading. Thus, if the API is to be used for real-time searching, HTTPS URLs must be an option. 
  3. There is insufficient detail of the identifiers in use. If I'm building a system to interoperate with the Concepts API, which identifiers should I be keeping at my end to identify things at the DigitalNZ end? The clearer this definition is, the more robust this interoperability is likely to be; there’s a very good reason for the highly structured formats of identifiers such as ISNI and ISBN. If nothing else a regexp would be very useful. Personally I’d recommend browsing around http://id.loc.gov/ a little and rethinking the URL structure too.
  4. There needs to be an insanely clear statement on the exact relationship between DigitalNZ Concepts and those authority control systems mapped into VIAF. Both DigitalNZ Concepts and VIAF are semi-automated authority matching systems and if we’re not careful they’ll end up polluting each other (as for example, DNB already has with gender data).
  5. Deep interoperability is going to require large-scale matching of DigitalNZ Concepts with things in a wide variety of GLAM collections and incorporating identifiers into those collections’ metadata. That doesn't appear possible with the current licensing arrangements. Maybe a flat-file dump (csv or json) of all the Concepts under a CC0 license? URLs to rights-obsessed partners could be excluded.
  6. If non-techies are to understand Concepts, http://api.digitalnz.org/concepts/448 is going to have to provide human-comprehensible content without an API key (I’m guessing that this is going to happen when it comes out of beta?).
  7. Mistakes happen (see https://en.wikipedia.org/wiki/Wikipedia:VIAF/errors for recently found errors in VIAF, for example). There needs to be a clear contact point and likely timescale for getting errors fixed. 
Having said all that, it looks great!
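
To make point 3 a bit more concrete, here is a small Python sketch of the kind of thing a published identifier definition would let me write. The concept URL pattern is purely my guess from the single example URL above (a positive integer with no leading zeros, HTTP or HTTPS); nothing in the docs I've seen confirms it, and the names `CONCEPT_URL`, `concept_id` and `isbn13_is_valid` are mine. The ISBN-13 check is included only to show why structured identifiers are robust: the final check digit catches most typos.

```python
import re

# ASSUMPTION: concept URLs look like http://api.digitalnz.org/concepts/448,
# inferred from one example only. A real definition should come from DigitalNZ.
CONCEPT_URL = re.compile(r"https?://api\.digitalnz\.org/concepts/([1-9][0-9]*)")


def concept_id(url):
    """Return the integer concept id from a URL, or None if it does not match."""
    match = CONCEPT_URL.fullmatch(url)
    return int(match.group(1)) if match else None


def isbn13_is_valid(isbn):
    """Check an ISBN-13 using its weighted (1, 3, 1, 3, ...) checksum."""
    digits = isbn.replace("-", "").replace(" ", "")
    if not re.fullmatch(r"[0-9]{13}", digits):
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(digits))
    return total % 10 == 0


if __name__ == "__main__":
    print(concept_id("http://api.digitalnz.org/concepts/448"))
    print(concept_id("http://api.digitalnz.org/concepts/not-a-number"))
    print(isbn13_is_valid("978-0-306-40615-7"))
```

With a definition like this written down by DigitalNZ rather than guessed by me, everyone storing Concept identifiers could validate them the same way before they end up in collection metadata.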