About the Talk
March 29, 2012 7:25 AM
Marines’ Memorial Club & Hotel, 609 Sutter Street, San Francisco
Data-parallel processing frameworks are being introduced at a fast pace, and Erlang seems particularly well suited to soft real-time applications that combine a high level of data parallelism and processing concurrency with a need for reliability. With the increased memory size and multi-core computing capabilities of modern processors and the introduction of high-performance persistent storage, Erlang covers an increasing portion of the response time versus data volume chart and complements existing large-scale analytics platforms such as Hadoop very well.
This talk makes the case for building soft real-time, scalable data-parallel processing pipelines in Erlang.
We present an architecture and a simple specification language for building data-parallel flows in Erlang, and share use cases covering data-parallel methods such as map-reduce and iterative graph algorithms to illustrate the flexibility of the proposed approach. We also discuss other important elements of the architecture, such as capacity planning for typical use cases, the relationship with other ecosystem components, instrumentation and monitoring, scheduling, replication, and failover.
Talk objectives: Introduce the general area of data-parallel processing. Make the case for using Erlang for data-parallel processing. Share the proposed architecture for building data-parallel pipelines in Erlang. Share use cases and lessons learned, and solicit feedback on the proposed architecture.
Target audience: System and infrastructure architects, large-scale data architects and scientists, CTOs, CIOs.