Video | Strata + Hadoop World NYC 2016 | “The Evolution of Massive Scale Data Processing”

In this video, Tyler Akidau presents a whirlwind tour of the evolution of massive-scale data processing at Google: from the original MapReduce paradigm, through the high-level pipelines of Flume and the streaming approach of MillWheel, to the portable, unified streaming/batch model of Google Cloud Dataflow and Apache Beam (incubating).

Tyler also highlights similarities and differences with related open source systems such as Flink, Spark, Storm, and Gearpump, calling out ways in which they’re converging on and diverging from the Beam model and what that means when running Beam pipelines on their respective runners. Watch Video

Edu-Video | Spark Summit East 2016 | Keynote Speakers (Day 1)

Spark Summit East 2016 took place last week in NYC. Here is the video of the Day 1 keynotes. It begins with Matei Zaharia (MIT professor, Databricks co-founder, and creator of Spark) discussing the upcoming release of Spark 2.0.

Four other speakers follow, covering topics such as ‘Democratizing Spark,’ ‘Enterprise Spark,’ and ‘Spark as an Analytics OS’ (1 hr 12 min). Enjoy! Watch Video

Blog Publisher / Head of Data Science Search

Founder & Head of Data Science Search at Starbridge Partners, LLC.