A different kind of data standard

This year, the UN gave me the chance to bring people together to work on data standards for humanitarian crises (like the Ebola or Syria crisis). We put together a working group from multiple humanitarian agencies and NGOs and got to work in February.  The result is the alpha version of the Humanitarian Exchange Language (HXL, pronounced /HEX-el/), a very different kind of data standard.

What’s wrong with data standards these days?

Unlike most data standards, HXL is cooperative rather than competitive. A competitive standard typically considers the way you currently work to be a problem, and starts by presenting you with a list of demands:

  • Switch to a different data format (and acquire and learn new software tools).
  • Change the information you share (and the way your organisation collects and uses that information).
  • Abandon what is valuable and unique about your organisation’s data (and conform to the common denominator).

For HXL, we reversed the process and started by asking humanitarian responders how they’re actually working right now, then thought out how we could build a cooperative standard to enhance what they already do.

Not JSON or XML

Given the conditions under which humanitarian responders work in the field (iffy connectivity, time pressure, lots to do besides putting together data reports), we realised that an XML-, JSON-, or RDF-based standard wasn’t going to work.

The one data tool people already know is the infamous spreadsheet program, so HXL would have to work with spreadsheets.  But it also had to be able to accommodate local naming conventions (e.g. we couldn’t force everyone to use “ADM1” as a header for what is a province in Guinea, a departamento in Colombia, or a governorate in Syria). So in the end, we decided to come up to add a row of hashtags below the human-readable headers to signal the common meaning of the columns and the start of the actual data. It looks a bit like this:

Location name Location code People affected
#loc #loc_id #aff_num
Town A 01000001 2000
Town B 01000002 750
Town C 01000003 1920

The tagging conventions are slightly more-sophisticated than that, also including special support for repeated fields, multiple languages, and compact-disaggregated data (e.g. time-series data).

HXL in action

While HXL is still in the alpha stage, the Standby Task Force is already using it as part of the international Ebola response, and we’re running informal interoperability trials with the International Aid Transparency initiative, with planned more-formal trials with UNHCR and IOM.

We also have an interactive HXL showcase demo site, and a collection of public-domain HXL libraries available on GitHub. More news soon.

Credits

Thanks to the Humanitarian Data Exchange project (managed by Sarah Telford) at the UN’s Office for the Coordination of Humanitarian Affairs for giving me the opportunity and support to do this kind of work, to the Humanitarian Innovation Fund for backing it financially, to the HXL Working Group for coming together to figure this stuff out, and especially to CJ Hendrix and Carsten Keßler for their excellent work on an earlier incarnation of HXL and for their ongoing support.

About David Megginson

Scholar, tech guy, Canuck, open-source/data/information zealot, urban pedestrian, language geek, tea drinker, pater familias, red tory, amateur musician, private pilot.
This entry was posted in Aid, Data and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s