Matthew McCullough 4.58
Description:
Moore’s law has finally hit the wall, and CPU clock speeds have actually decreased in the last few years. The industry is reacting with hardware that has an ever-growing number of cores and software that can leverage “grids” of distributed, often commodity, computing resources. But how is a traditional Java developer supposed to take advantage of this revolution? The answer is the Apache Hadoop family of projects. Hadoop is a suite of open source APIs at the forefront of this grid computing revolution and is considered the gold standard for the divide-and-conquer model of distributed problem crunching. The well-travelled Apache Hadoop framework is currently being leveraged in production by prominent names such as Yahoo, IBM, Amazon, Adobe, AOL, Facebook and Hulu, just to name a few.
Details
In this session, you’ll start by learning the vocabulary unique to the distributed computing space. Next, we’ll discover how to shape a problem and its processing to fit the Hadoop MapReduce framework. We’ll then examine the auto-replicating, redundant and self-healing HDFS filesystem. Finally, we’ll fire up several Hadoop nodes and watch our calculation get devoured live by our Hadoop grid. At this talk’s conclusion, you’ll feel equipped to take on any massive data set and processing job your employer can throw at you.
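To give a feel for what "shaping a problem to fit MapReduce" can mean, here is a minimal single-process sketch of the map, shuffle and reduce phases applied to word counting. It deliberately uses only the Java standard library and none of the Hadoop APIs; the class name `WordCountSketch` and its method names are hypothetical and chosen for illustration. It is not taken from the talk, and a real Hadoop job would distribute these phases across nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: an in-memory imitation of the MapReduce phases.
// Nothing here is part of the Hadoop API.
public class WordCountSketch {

    // Map phase: turn one input line into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all emitted values by key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse the values for one key into a single total.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run the three phases in sequence over all input lines.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : shuffle(pairs).entrySet()) {
            result.put(entry.getKey(), reduce(entry.getKey(), entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {brown=1, dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(List.of("the quick brown fox", "the lazy dog", "The fox")));
    }
}
```

The point of the split is that `map` and `reduce` each see only a small, independent piece of the data, which is what lets a framework such as Hadoop run many copies of them in parallel on different machines.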
Comments on this Talk
Matthew McCullough,
15 Jan 05:07 PM
That is an occasional mistake of mine. I'll be sure to repeat all questions at my next speaking engagement.
axiom6,
15 Jan 11:54 PM
I really liked how you described the MapReduce and Hadoop approach of propagating the functions to the data and getting back the results. With Hadoop you provided insight about what the functional paradigm is all about.

My only complaint about this talk was that I could not hear some of the questions and responses from the audience, and you did not repeat them for the general audience.