
Large Data and Clojure: the middle ground between RAM and EC2

A talk by kyle.burton

About the Talk

November 22, 2010 2:00 PM

Wharton's Huntsman Hall in room F65, 3730 Walnut Street, Philadelphia, PA 19104


In this talk I’ll discuss techniques I have learned for working with large data sets: those that are too large to fit in RAM, but not so large that you need distributed computing to work with them. Topics include taking random samples, finding duplicated values, and other types of basic analysis.

Though not specific to Clojure, many of these techniques lead to elegant, easy-to-understand implementations in Clojure by leveraging its sequence abstraction as well as other aspects of the language.
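
As a rough illustration of one technique named above, taking a random sample from data too large for RAM, here is a minimal sketch of reservoir sampling over a Clojure sequence. It is not taken from the talk; the function name `reservoir-sample` and the file name in the usage comment are hypothetical, chosen only for this example.

```clojure
(defn reservoir-sample
  "Returns a vector of up to k items chosen uniformly at random from coll.
  Makes a single pass over coll and holds only k items in memory at a time."
  [k coll]
  (first
   (reduce (fn [[sample n] item]
             (let [n (inc n)]                    ; n = number of items seen so far
               (if (<= n k)
                 ;; Fill the reservoir with the first k items.
                 [(conj sample item) n]
                 ;; Afterwards, keep the new item with probability k/n,
                 ;; overwriting a uniformly chosen slot.
                 (let [j (rand-int n)]
                   (if (< j k)
                     [(assoc sample j item) n]
                     [sample n])))))
           [[] 0]
           coll)))

(comment
  ;; Hypothetical usage: sample 1000 lines from a file that does not fit in RAM.
  ;; line-seq is lazy and reduce consumes it eagerly inside with-open,
  ;; so only the reservoir is retained.
  (require '[clojure.java.io :as io])
  (with-open [rdr (io/reader "big-file.txt")]
    (reservoir-sample 1000 (line-seq rdr))))
```

Because the function accepts any sequence, the same code works on an in-memory collection or on a lazy sequence of lines read from disk.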

