How to Publish Open Data, and Why It's Important
Why Publish Open Data?
There's a lot of data available in Metro Nashville. It's hugely valuable for all of us within Metro's government, since data allows us to spot and solve problems, and track progress along the way.
Data can also help Nashville residents answer their own questions about the city. Wondering what's going on in your neighborhood? Residents can check out NashView! Want to know the number of pothole complaints Metro received this month? Head to hubNashville. Eager to make Nashville a better place to live and work? Residents can visit Metro's Open Data Portal as well as learn what the Urban Institute knows about Open Data.
But all these resources and insights for Nashville's government and residents are only available when data is loaded onto our open data portal. That's where you — department-level Data Champions — come in. On this page, we'll explain how to identify data that can be published and the process to get a dataset live on data.nashville.gov.
“Open data benefits Metro internally by allowing departments to share data across agencies to facilitate interdepartmental collaboration,” says Keith Durbin, Metro Nashville Chief Information Officer and ITS Director.
The Data Publication Process
- Catalog and classify departmental datasets: As a start, each department in Nashville should catalog and inventory all its available data.
- Prioritize identified public data assets for publication: With a catalog in place, the next step is to determine which data should make its way onto the portal. You might prioritize data related to departmental or city council goals or data that is frequently requested by residents. (See recommendations from our open data software provider Socrata.)
- Prepare dataset metadata: Accurate metadata is essential. It makes clear what users will see in the dataset. In a perfect world, metadata will be so accurate and descriptive that no user will ever need to approach the data owner with questions. See more about metadata below.
- Submit metadata and data sample to the Metro Data Steering Committee: The six-person committee will review the dataset's metadata to make sure it's accurate, easy to grasp, and uses the correct terminology.
- Determine "best" visualization to release with data: A wall of numbers is difficult for everyday users — residents, journalists, etc. — to absorb. During the publishing process, consider the optimal way to present data, such as a pie chart, bar graph or map. Read these tips on determining the best visualization option for a given dataset.
- Decide on publicity for your dataset: Publishing the dataset is really just the first step. For staffers and residents to engage with it, they have to know the dataset is available. You can announce that a dataset is live through newsletters and social media, and by informing the media. Don't forget to celebrate when the dataset goes live!
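The first two steps above (catalog and classify, then prioritize) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CatalogEntry` fields, the sample entries, and the ranking rule (goal-related datasets first, then by number of resident requests) are hypothetical and are not part of Metro's actual process or forms.

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    """One row of a hypothetical departmental data inventory."""
    name: str
    is_public: bool         # result of the classification step
    supports_goal: bool     # tied to a departmental or city council goal
    resident_requests: int  # how often residents have asked for this data


def publication_order(catalog):
    """Return public entries, goal-related first, then most requested.

    The ranking rule is an assumption made for this sketch.
    """
    candidates = [entry for entry in catalog if entry.is_public]
    return sorted(
        candidates,
        key=lambda entry: (entry.supports_goal, entry.resident_requests),
        reverse=True,
    )


# Made-up inventory used only to exercise the function.
catalog = [
    CatalogEntry("permits", True, False, 12),
    CatalogEntry("personnel files", False, False, 0),
    CatalogEntry("paving schedule", True, True, 3),
    CatalogEntry("park events", True, False, 40),
]

for entry in publication_order(catalog):
    print(entry.name)
```

Non-public entries are filtered out before ranking, which mirrors the idea that only identified public data assets are candidates for the portal.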
How Long Does It Take to Publish Data?
With a good command of the data and active data owner engagement in the process, it is possible to publish a dataset in a single week, subject to timing restrictions. Every dataset is reviewed by a six-member Data Steering Committee, which meets the first week of every month when there is new data to consider. Complex datasets with non-public data generally have longer consultative reviews with owners.
Metro Data Champions can access our Metro Data Champions SharePoint site for forms and submissions.
Why Metadata Is Important
Metadata is all the information about the data that a data user or consumer would ask for. It's the who, what, when, where, and why of the dataset, if you will. Metadata gives technical (or structural), descriptive, and administrative details to users to allow them to begin to make sense of the data.
Metadata normally includes information about data owners, update intervals, source systems, variables, contents, licensing, and more.
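To make that list concrete, here is a minimal sketch of a metadata record in Python. The class names, field names, and every example value are assumptions chosen for illustration; they do not reproduce Metro's Metadata template or any real dataset.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Variable:
    """One column in the dataset, with a plain-language description."""
    name: str
    description: str
    data_type: str = "text"


@dataclass
class DatasetMetadata:
    """Hypothetical record covering the items listed above."""
    title: str
    description: str      # contents of the dataset
    data_owner: str
    source_system: str
    update_interval: str
    license: str
    variables: list[Variable] = field(default_factory=list)


# Placeholder values only; none of these describe a real Metro dataset.
example = DatasetMetadata(
    title="Example Service Requests",
    description="One row per service request (illustrative).",
    data_owner="Example Department",
    source_system="Example request-tracking system",
    update_interval="monthly",
    license="Placeholder license",
    variables=[
        Variable("request_id", "Unique identifier for the request"),
        Variable("opened_date", "Date the request was opened", "date"),
    ],
)

# Serialize to JSON so the record can be shared or reviewed as text.
print(json.dumps(asdict(example), indent=2))
```

Keeping variables as their own small records makes it harder to publish a column without a description.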
Ensure Solid Metadata
For each dataset that's published, Data Coordinators must fill out Metro's Metadata template (see right).
During your metadata review, examine it to make sure that all the variables in the sample data are present and well described. You'll want to share information about the source of the data, how frequently it's updated, and who the data owner is.
To steer you through this process, think back to the questions you had when you first examined the data.
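One part of the review, confirming that every variable in the sample data is present and described, can be partly automated. The sketch below assumes the sample is available as CSV text and the descriptions as a simple name-to-description mapping; the function name and sample values are hypothetical.

```python
import csv
import io


def undocumented_columns(sample_csv: str, described: dict[str, str]) -> list[str]:
    """Return sample-data columns that have no non-empty description."""
    reader = csv.reader(io.StringIO(sample_csv))
    header = next(reader, [])  # first row holds the column names
    return [col for col in header if not described.get(col, "").strip()]


# Made-up sample data and descriptions used only for illustration.
sample = "request_id,opened_date,status\n1,2024-01-05,Open\n"
described = {
    "request_id": "Unique identifier for the request",
    "opened_date": "Date the request was opened",
}

print(undocumented_columns(sample, described))  # prints ['status']
```

A check like this only shows that a description exists. Whether the description is accurate and easy to grasp still needs a human reviewer.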