A better approach to municipal open data

Every municipality that is releasing open data is doing the right thing, but some are doing a better job of it than others. Municipalities are new to open data, so these teething pains should be no surprise. The wider open data community is obliged to point out where municipalities are falling down, but we are equally obliged to point out which ones are opening their data more effectively.

Washington DC

Washington DC is a nice little place on the Potomac River that might be best known for its museums and sports teams. Or perhaps it is best known for being a sister city to Sunderland, England[1]? Washington DC should be best known for the Washington DC open data catalog which is comprehensive in scope and impressive in technical merit.

Washington DC open data highlights

The Washington DC open data catalog is impressive for the diversity of its data sets. Included are auditors' reports, artistic fellowship grants and FOIA requests as well as geo data like sidewalks, no-fly zones, fire stations and hazardous waste locations.

As impressive as the list of data sets is, the presentation of the data in suitable formats is even better. Geographic data is offered in shapefile as well as KML format, and GeoRSS feeds are available for live data. The use of appropriate and open data formats makes the DC data catalog available to the widest possible audience. The catalog even includes a feed for additions and updates. This feature is critically important for increasing use of the data sets. It should seem obvious, but it is missing from many municipal open data catalogs.
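To show why an open feed format lowers the barrier for developers, here is a minimal sketch of reading point locations from a GeoRSS-Simple feed using only the Python standard library. The sample feed, its titles and its coordinates are made up for illustration and are not taken from the DC catalog; the function name parse_georss_points is likewise hypothetical. A real feed may use a different GeoRSS encoding, so treat this as a starting point rather than a description of DC's feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, invented for this example (not real DC data).
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Example service requests</title>
  <entry>
    <title>Pothole reported</title>
    <updated>2009-11-01T12:00:00Z</updated>
    <georss:point>38.8977 -77.0365</georss:point>
  </entry>
  <entry>
    <title>Streetlight out</title>
    <updated>2009-11-01T13:30:00Z</updated>
    <georss:point>38.9072 -77.0369</georss:point>
  </entry>
</feed>
"""

# Namespace prefixes used in the element lookups below.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "georss": "http://www.georss.org/georss",
}


def parse_georss_points(feed_xml):
    """Return a list of (title, latitude, longitude) tuples from an Atom feed.

    Assumes GeoRSS-Simple points, written as "latitude longitude".
    Entries without a georss:point element are skipped.
    """
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.findall("atom:entry", NS):
        title = entry.findtext("atom:title", default="", namespaces=NS)
        point = entry.findtext("georss:point", namespaces=NS)
        if point is None:
            continue
        lat_text, lon_text = point.split()
        results.append((title, float(lat_text), float(lon_text)))
    return results


if __name__ == "__main__":
    for title, lat, lon in parse_georss_points(SAMPLE_FEED):
        print(title, lat, lon)
```

Because the format is open and documented, nothing beyond a standard XML parser is needed, which is the practical benefit of publishing in formats like these.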

The license for the Washington DC open data catalog is pretty good, but still falls down as a home-grown license with unintended drawbacks. The bulk of the license disclaims liability and warns that DC could change the service in the future. The hidden trap is that the data sets carry a notification requirement.

Washington DC open data fails the cake test


This notification requirement, that the user must notify the District of Columbia by email, seems both reasonable and benign. As a term and condition, notification is certainly not onerous for most users, and it serves as a great reminder to thank DC for their open data catalog. The trap here is in the unintended consequences and the edge cases. A notification requirement fails the cake test.

Formulated by Iván Sánchez Ortega, the cake test asks: could I make a gift of this data, in the form of a cake for the data publisher, and serve it at a surprise party? In this case we could not. If we have to notify the District that we are using the data to create a beautiful map cake, then surely it will spoil the surprise. After all, it's just a cake.

Arguments of "Oh, that's silly" and "surely the District of Columbia wouldn't mind if you violated their license just this once" are missing two points. Firstly, the notification requirement discourages wider, approved use of the data. Secondly, the notification requirement has negative, unintended consequences in serious situations, as demonstrated by the desert island test and the dissident test used to assess compliance with the Debian Free Software Guidelines.

Minor changes for a major improvement

Washington DC can remove the unintended drawbacks of their open data catalog license by applying the Public Domain Dedication and License (PDDL) to their data sets. This will bring the DC data into the light, where it passes the cake test, the dissident test and the desert island test as well. Additional benefits to Washington DC are that they no longer have to maintain their own in-house data license, they no longer have to worry about enforcing that license, and they increase adoption of their data in the long tail.

It is no coincidence that the most widely used open data is the US federal government data released to the public domain, such as the TIGER data set. Data publishers may not have the legal tools to put their data into the public domain, depending on their copyright law jurisdiction. The PDDL allows data publishers to release their data in a manner as close as possible to the Public Domain.

Developers using the Washington DC open data will thank DC for adopting a well-known license and sparing them the burden of reading and understanding Yet Another Home Grown Municipal Data License. And that is a delicious piece of cake indeed.

License version

Publishing a license version, along with a method to track changes in the license, would also be welcome.

Related articles

Consider reading this detailed review of the weaknesses in the early version of the open data license used in Edmonton, Toronto and Vancouver.

References

[1] http://en.wikipedia.org/wiki/Washington,_D.C.#Sister_cities

Credits

Thinker statue photo CCBY by Brian Hillegas.
Map Cake photo © 2009 CCBY R.Weait.
