Large Data and Clojure: the middle ground between RAM and EC2
http://spkr8.com/t/5117

Description:

In this talk I’ll discuss techniques I have learned for working with data sets that are too large to fit in RAM but not so large that you need distributed computing to work with them. Topics include taking random samples, finding duplicated values, and other types of basic analysis.

Though not specific to Clojure, many of the techniques lead to elegant, easy-to-understand implementations in Clojure by leveraging its sequence abstraction as well as other aspects of the language.
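
As an illustration of the kind of technique the abstract mentions, here is a minimal sketch of taking a uniform random sample from a stream too large to hold in memory, using reservoir sampling (Algorithm R). This is an assumption about what such a technique could look like, not the method shown in the talk. It is written in Python rather than Clojure, with a Python iterable standing in for a lazy sequence, and the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from a stream.

    Only k items are held in memory at any time, so the input can be
    a lazily read file or generator of any length.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1) by
            # overwriting a uniformly chosen slot.
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical usage: sample 5 values from a generator of one million numbers.
sample = reservoir_sample((n for n in range(1_000_000)), 5)
```

The input is consumed in a single pass, so the same function would work on lines read from a file that does not fit in RAM.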

Comments on this Talk

Nick Canzoneri, 23 Nov 02:06 PM

I really enjoyed the talk, Kyle. The only thing I would recommend is a few more slides showing the intermediate steps of some of the calculations. Go through the first few steps of the algorithms, instead of just "talking it out".

kyle.burton, 23 Nov 04:19 PM

Thank you, Nick. Putting more into the slides wouldn't be difficult, and knowing that it would improve the talk helps. Thank you!

Dave Konopka, 23 Nov 05:55 PM

Really enjoyed the talk as well. You did a good job connecting theory with practical use. Also did a good job of responding to questions and feedback during the talk. It might be helpful to do a quick overview of all approaches up front to set the stage for the structure of the talk.

3 Ratings: 4.83

Delivery: 4.67

Content: 5.00

Time & Location

November 22, 2010 — 07:00 PM
Wharton's Huntsman Hall, room F65, 3730 Walnut Street, Philadelphia, PA 19104

Part of a Series

Philly Lambda (1 talk)