Introducing Dat: If Git Were Designed For Big Data

Cloning datasets locally, munging them into the format you need, and then indexing and querying them are all parts of a complex data workflow that hasn’t yet been made easy to capture and share.

I believe that being able to represent and share these workflows is what will enable the fabled “GitHub for data”. This presentation will introduce Dat and its approach to data versioning, transformation and sync, and explain why they are important for open (and closed) data.

