About the Talk
July 22, 2010 10:30 AM
Portland, Oregon
How does Twitter analyze its massive dataset? What tools do we use, and where do we focus our analysis? In this talk, I will discuss our transition from a MySQL-based to a Hadoop-based data infrastructure and our use of Pig (a scripting language built on top of Hadoop) to democratize big-data analysis across the company. I will present concrete examples of interesting analyses at each step.