Wednesday 4 February 2009

NDHA demo and the National Library

This morning I went to the NDHA demonstration where a National Library techie talked us through the NDHA ingest tools. The tools are the most visible piece of the NDHA infrastructure, and are designed to unify the ingest of digital documents, whether they are born-digital documents physically submitted (i.e. producers mail in CDs/DVDs etc.); born-digital documents electronically submitted (i.e. producers upload content via wizzy web tools); or digital scans of current holdings produced as part of the ongoing digitisation efforts. The tools form a unified system with different workflows for unpublished material (=archive) and published material (=library). The unification of library and archival functionality seemed like fertile ground for miscommunication.

The infrastructure is (correctly) embedded across the library, and uses all the current tools for collection maintenance, searching and access.

As a whole it looks like the system is going to save time and money for large content producers and capture better metadata for small donors of content, which is great. By moving the capture of metadata closer to the source (while still allowing professional curatorial processes for selection and cataloguing), it looks like more context is going to be captured, which is fabulous.

A couple of things struck me as odd:

  1. The first feedback to the producer/uploader is via a human. Despite having an elaborate validation suite, the user wasn't told immediately "that .doc file looks like a PDF, would you like to try again?" or "Yep, that *.xml file is valid and conforming XML." Studies have shown time and again that immediate feedback, letting people correct their mistakes on the spot, is important and effective.
  2. The range of metadata fields available for tagging content was very limited. For example, there was no Iwi/Hapu list, no Maori Subject Headings, no Gazetteer of Official Geographic Names. When I asked about these I was told "that's the CMS's role" (CMS = Collection Management Software, i.e. those would be added by professional cataloguers later), but if you're going to move metadata collection as close as possible to content generation, it makes sense to at least offer the option of proper authority control over it.
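The immediate-feedback check in point 1 is cheap to do: most formats announce themselves in their first few bytes, regardless of the file extension. A minimal sketch in Python (the magic-number table and function names are illustrative, not anything from the actual NDHA tools):

```python
def sniff_format(path):
    """Guess a file's real format from its leading bytes,
    ignoring whatever extension the uploader gave it."""
    # Magic numbers for a few common formats.
    MAGIC = {
        b"%PDF-": "pdf",
        b"\xd0\xcf\x11\xe0": "doc",  # legacy MS Office (OLE2) container
        b"PK\x03\x04": "zip",        # also .docx/.epub, which are zips
        b"<?xml": "xml",
    }
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, fmt in MAGIC.items():
        if head.startswith(magic):
            return fmt
    return "unknown"

def check_upload(path, claimed_ext):
    """Return an immediate message if the content doesn't
    match the extension the uploader claimed."""
    actual = sniff_format(path)
    claimed = claimed_ext.lstrip(".").lower()
    if actual not in ("unknown", claimed):
        return (f"That .{claimed} file looks like a "
                f"{actual.upper()}, would you like to try again?")
    return "OK"
```

A real validation suite (the demo mentioned one) would go much further, but even this level of sniffing is enough to generate the "looks like a PDF" prompt at upload time rather than days later via a human.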
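Likewise, point 2 doesn't demand the full CMS at ingest time: a field backed by a loaded authority list, with an escape hatch for proposing new terms, would do. A hedged sketch (the class and the vocabulary entries are placeholders I've invented, not the actual Maori Subject Headings or any NDHA interface):

```python
class AuthorityField:
    """A metadata field restricted to a controlled vocabulary,
    with an escape hatch for proposing new terms."""

    def __init__(self, name, terms):
        self.name = name
        self.terms = set(terms)
        self.proposed = set()  # candidates awaiting curatorial review

    def suggest(self, prefix):
        # Offer type-ahead matches from the authority list.
        return sorted(t for t in self.terms
                      if t.lower().startswith(prefix.lower()))

    def assign(self, term):
        if term in self.terms:
            return ("accepted", term)
        # Not in the list: record it for a cataloguer to vet later.
        self.proposed.add(term)
        return ("proposed", term)

# Placeholder vocabulary, not real subject headings.
subjects = AuthorityField("subject", ["Waka", "Waiata", "Whakapapa"])
```

The point is that the uploader gets authority control at the moment they still remember the context, while professional cataloguers keep the final say over anything in the `proposed` pile.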
Or that's my take, anyway. Maybe I'm missing something.

1 comment:

tompasley said...

Stuart, just reading your blog for the first time... and saw this post.

In short, "No - I don't think you're missing anything".

I'm surprised that more "systems" (which require detailed metadata) do not expose more/most metadata fields at the point of ingest/input.

My experience is that this is your best chance of getting (accurate) information which otherwise might not ever be added/corrected by the submitter.

At the point of ingest/input, you should still be able to provide a thesaurus to pick and choose from, as well as the ability to add new terms.