*Also, please take a look at the bottom of this post for links to four “Spark heavy” open positions. Thanks.
Here, Juliet Houghland, a Data Scientist from Cloudera, talks about the client-side demand for PySpark (an API that exposes the Spark programming model to Python) and best practices for using it. The talk concludes with a short Q & A.
Below, Aaron Davidson from Databricks talks about some of the more recent problems he sees emerging as more people use Spark with more data, and the lessons learned. The presentation ends with slides on specific common pitfalls and a Q & A.
A few of our top open positions – Spark, Scala & Python skill-sets