Last week, Criteolabs’ Justin Coffey gave an excellent talk on Enabling a Large Internal Analytic User Base with Hadoop & Co at GOTO Amsterdam.
In his talk, he detailed Criteo’s Hadoop-based analytics stack and how we use technologies like Scalding, Hive, Vertica and Tableau (plus a few in-house items) to support our different user groups’ daily interactions with it.
Abstract
Criteo is a performance-driven company that requires a minimum of analytic prowess from all of its employees. We process up to 50 billion real-time bidding requests daily, resulting in an ingest rate on the order of 25 TB/day to a 1000+ node Hadoop cluster. The multi-petabyte data warehouse is accessed directly by hundreds of engineers, analysts and data scientists, and indirectly by over a thousand operational users via domain-specific datamarts and reporting systems.
Presentation slides can be found here: http://bit.ly/1fAReof
Justin Coffey is a senior staff dev lead at Criteo Paris, in charge of the Analytics Infrastructure team. He oversees (and even manages the occasional contribution to) the development of better tools to manage the petabytes of analytic data used by hundreds of Criteo analysts and engineers across the world. With over 15 years of experience working on the Internet, Justin has worked with web technologies since their inception. Prior to joining Criteo, Justin worked at a number of Internet startups as a hands-on engineering manager, helping drive explosive growth in their early stages.