Below is the presentation I gave at the Spark User Meetup on 01/16/2014
Monoids, Store, and Dependency Injection - Abstractions for Spark Streaming Jobs
Abstract:
One of the most difficult aspects of deploying Spark Streaming as part of your technology stack is maintaining all the code associated with stream processing jobs. In this talk I will discuss the tools and techniques that Sharethrough has found most useful for maintaining a large number of Spark Streaming jobs. We will look in detail at the way Monoids and Twitter's Algebird library can be used to create generic aggregations, as well as the way we can create generic interfaces for writing the results of streaming jobs to multiple data stores. Finally we will look at the way dependency injection can be used to tie all the pieces together, enabling rapid development of new streaming jobs.
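To give a flavor of the first idea, here is a minimal sketch of a monoid-based aggregation (illustrative code, not the exact code from the slides). Because the helper is written against Algebird's `Monoid` type class, the same function sums plain counts (`Long`), per-key histograms (`Map[String, Long]`), or sketch types like HyperLogLog:

```scala
import scala.reflect.ClassTag

import com.twitter.algebird.Monoid
import org.apache.spark.streaming.StreamingContext._ // pair-DStream ops in 2014-era Spark
import org.apache.spark.streaming.dstream.DStream

object Aggregations {
  // Sum values per key using whatever Monoid instance the value type
  // provides. The aggregation logic is written exactly once; swapping the
  // value type swaps the aggregation.
  def sumByKey[K: ClassTag, V: ClassTag: Monoid](
      stream: DStream[(K, V)]): DStream[(K, V)] =
    stream.reduceByKey((a, b) => Monoid.plus(a, b))
}
```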
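The store abstraction can be sketched the same way. The trait below is a hypothetical, simplified interface in the spirit of Twitter's Storehaus (the real library's signatures differ): jobs write results only through this trait, so the same job can target MongoDB, Redis, or an in-memory map for tests.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.Future

import com.twitter.algebird.Monoid

// Hypothetical minimal storage interface for streaming aggregation results.
trait MergeableStore[K, V] extends Serializable {
  def get(key: K): Future[Option[V]]
  def put(key: K, value: V): Future[Unit]
  // Monoid-combine the incoming value with whatever is already stored,
  // which is exactly the operation incremental aggregations need.
  def merge(key: K, value: V): Future[Unit]
}

// A toy in-memory implementation, handy for tests.
class InMemoryStore[K, V: Monoid] extends MergeableStore[K, V] {
  private val data = TrieMap.empty[K, V]
  def get(key: K): Future[Option[V]] = Future.successful(data.get(key))
  def put(key: K, value: V): Future[Unit] = Future.successful(data.update(key, value))
  def merge(key: K, value: V): Future[Unit] = Future.successful {
    val merged = data.get(key).map(Monoid.plus(_, value)).getOrElse(value)
    data.update(key, merged)
  }
}
```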
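Finally, a sketch of how plain constructor injection ties the pieces together, reusing `sumByKey` and `MergeableStore` from the snippets above. The class and names are illustrative; a production job would also need to handle serializing the store into Spark closures (for example by constructing it per partition) and the discarded `Future` results.

```scala
import scala.reflect.ClassTag

import com.twitter.algebird.Monoid
import org.apache.spark.streaming.dstream.DStream

// The job depends only on an input DStream and a MergeableStore sink, both
// supplied from outside, so wiring a new streaming job is just composition.
class KeyedAggregationJob[K: ClassTag, V: ClassTag: Monoid](
    source: DStream[(K, V)],
    store: MergeableStore[K, V]
) extends Serializable {
  def start(): Unit =
    Aggregations.sumByKey(source).foreachRDD { rdd =>
      // Fire-and-forget writes for brevity in this sketch.
      rdd.foreachPartition(_.foreach { case (k, v) => store.merge(k, v) })
    }
}

// Usage: new KeyedAggregationJob(clickStream, new InMemoryStore[String, Long]).start()
```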