The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
This presentation compares, side by side, the ways to write queries and scripts in Hadoop: MapReduce, Pig, Hive, SQL, Spring XD…
You’ll get clear answers to questions such as:
* Should you use MapReduce?
* When should you use Pig and Hive? What are their limitations?
* When should you use Spring XD?
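
For context on the first question, the MapReduce programming model can be sketched in a few lines. The following is a minimal, single-process Python illustration of the map, shuffle, and reduce phases applied to a word count. It does not use any Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["Pig Hive Pig", "hive MapReduce"]))
```

On a real cluster the map and reduce functions run in parallel on different machines and the framework handles the grouping step; higher-level tools such as Pig and Hive generate this kind of job from a script or query instead of requiring it to be written by hand.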