short talk delivered during ALA Midwinter 2016, LITA Top Tech Trends Panel
Today, I’d like to discuss a trend I call broad data praxis. Data praxis itself is the combination of the theory and the practice of working with data. Typically, data praxis is naturalized within a particular discipline. There is a long tradition of data praxis in a number of STEM disciplines and, to a lesser degree, in the Social Sciences. Data praxis is emergent in the Humanities, in part through the activity of people working in the Digital Humanities community.
A broad data praxis is distinct insofar as it is not geared toward any one disciplinary community. Rather, a broad data praxis is geared toward communication of information that is intended to resonate with a diffuse community of information consumers. We witness a prime example of a broad data praxis in data-driven journalism.
Think interactive, web-based visualizations accompanying in-depth reporting on police violence in the United States. Think sharing the underlying data and methodologies via something like GitHub.
Given that journalism, generally speaking, is geared toward communication of information to a diffuse population, the broad data praxis that takes place there holds the potential to normalize fundamental components of data praxis. Issues like data provenance, data documentation, and data quality are elaborated in the context of telling broadly resonant stories.
We see evidence of broad data praxis in action in the Quartz “Guide to Bad Data” and BuzzFeed’s “What we’ve learned about sharing our data analysis”.
On the library side, this all raises the profile of working with data, and of understanding how to interpret it, from a specialist domain suited to subsets of the population to something that is core to the experience of a much broader swath of the population. It makes sense, then, that we try to think critically about how a broad data praxis relates to data information literacy efforts. Similarly, we might ask how a broad data praxis impacts library collection infrastructure and interfaces. For more data-fluent users, is item-by-item access enough? Should we torrent our data, e.g. Academic Torrents? With respect to use, should we allow users to fork our collections, e.g. the Knight Foundation-funded Dat? Fun questions.
I think I’ve set a record for saying the word ‘data’ in 5 minutes or less, so I’ll close there!