Metadata Madness

August 3, 2008

ISO19139 – hello? hello?

Filed under: inspire, iso19139 — metadatamadness @ 4:24 pm

Hello? What were the designers of ISO19139 thinking? Were any of them, in fact, thinking?

Consider this, my fellow sufferers:

gmd:identificationInfo/gmd:MD_DataIdentification/gmd:pointOfContact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString

This is the hoop-jumping necessary to extricate the email address of a contact person for a data set from a piece of ISO19139 XML. That’s 10, count ‘em, 10 XML element levels deep, and that is not counting the root element they all hang off.
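For the sceptical, here is roughly what that looks like from Python's standard-library ElementTree. This is a minimal sketch, not my actual parser: the function name `contact_emails` and the sample record are made up for illustration, the sample is stripped down to the one branch of interest (so it is nowhere near schema-valid), and I am assuming the usual 2005 `gmd` and `gco` namespace URIs.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for the gmd and gco prefixes.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# The ten-step path quoted above, relative to the root element.
EMAIL_PATH = (
    "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:pointOfContact/"
    "gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact/gmd:address/"
    "gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString"
)

# Hypothetical, heavily trimmed record: only the contact email branch.
SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo><gmd:MD_DataIdentification>
    <gmd:pointOfContact><gmd:CI_ResponsibleParty>
      <gmd:contactInfo><gmd:CI_Contact>
        <gmd:address><gmd:CI_Address>
          <gmd:electronicMailAddress>
            <gco:CharacterString>someone@example.org</gco:CharacterString>
          </gmd:electronicMailAddress>
        </gmd:CI_Address></gmd:address>
      </gmd:CI_Contact></gmd:contactInfo>
    </gmd:CI_ResponsibleParty></gmd:pointOfContact>
  </gmd:MD_DataIdentification></gmd:identificationInfo>
</gmd:MD_Metadata>"""


def contact_emails(xml_text):
    """Return every non-empty contact email found along EMAIL_PATH."""
    root = ET.fromstring(xml_text)
    return [
        el.text.strip()
        for el in root.findall(EMAIL_PATH, NS)
        if el.text and el.text.strip()
    ]


if __name__ == "__main__":
    print(contact_emails(SAMPLE))
```

One string, two namespaces and ten path steps, all to reach a single email address.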

This is metadata madness!

  • Why are so many elements repeated, with a short name then a long one (gmd:contactInfo then gmd:CI_Contact, gmd:address then gmd:CI_Address)?
  • Why is every text string wrapped in a gco:CharacterString element – just in case an XML parser doesn’t realise that it’s looking at text?

I suspect there’s a tyranny of the toolset at work, an expectation that everyone’s working with W3C XML Schema and not with a more relaxed schema language or, imagine, none at all.

19139 is simply grotesque. It makes a joke of the word “standard”. The INSPIRE expert group on Metadata skirted round the issue, but 19139 is not being recommended for use in European projects – the dog’s not barking, though ISO may be.

I wouldn’t care, except that it is starting to affect me now. Organisations buy a proprietary toolkit, read some dodgy abstract reference documents which say that 19115 is The Standard Way To Do Geo-Metadata and 19139 is The Standard Way To Put It In XML. Then some poor muggins has to write code to actually re-use the information.

I am replacing an old, minimal ElementTree-based 19139 parser with an XSLT stylesheet which transforms the data into RDF/XML. The verbosity of XSLT and of 19139 complement each other beautifully, leading to a baroque intertwining which would make great net.art wallpaper but is hopeless for information management purposes.

I am not even going near the topic of model overdesign issues with ISO19115 itself because people with influence and vested interest quite sensibly do not care.

March 30, 2007

after all, why blog about metadata?

Filed under: inspire, iso19115 — metadatamadness @ 8:40 pm

Recently my work has become far too niche, acronym-ridden and full of curious and monotonous purpose to inflict on the Mapping Hacks blog. I recently helped co-ordinate a free and open source software community response to the draft Implementing Rules for Metadata underlying the INSPIRE directive establishing a spatial data infrastructure in Europe, *deep breath*. I learned a lot during that process, and while trying to follow the corresponding US process of establishing a new metadata profile based on the ISO19115 standard. A couple of weeks ago I had a look at ISO 19115 in this rough essay, written after reading the draft North American Profile for metadata, and I’m not alone in holding a dissenting view on the grounds of overcomplexity and lack of machine-reusability.

I’ve been researching metadata models, exchange interfaces and appropriate standards for a BitTorrent-based data distribution project with Terradue. It uses GeoNetwork with a minimal Dublin Core based profile, plus GeoRSS and iCal to indicate more specific spatio-temporal events. The profile is based on a simple model called DCLite4G, a minimal information model for metadata oriented towards GeoRSS and RDF. This is something I have worked on via wiki and email over the last year with Stefan Keller, based on a collective effort by the Geodata Committee at the Open Source Geospatial Foundation, using the FGDC Core standard model as a reference.

Recently I gave a talk to a cosy geoforum convened by Stefan in Zurich. The slides for my talk (huge 23 MB pdf) are in part more visual illustration than narrative of what I was actually saying; I have a half-written essay about “open process” in geodata re-use and redistribution which I’ll post here when it’s done.

At the Open Knowledge Foundation, Rufus has been doing some good work on a web interface for a testable generic metadata repository service for data packages, with transparent versioning in the backend. I hope at some point the work on geospatial data contribution and search services, with the advice of people in the “information retrieval” community, will connect up with this sort of thing.

So I would like to talk on a blog about all this sort of thing and consider that if even just three people really connect with it, the time spent writing it will have been totally worthwhile.
