Big Data is an overloaded term that covers anything from a few terabytes to petabytes. At Criteo, the real-time nature of our bidding system plants us firmly in the truly Big Data camp: it handles 1M queries per second and generates 150+ TB of new data while guaranteeing 99.95% availability. So you can bet that we appreciate all things Big Data. To that end, we recently launched a meetup group in Palo Alto, Criteo Labs Tech Talks, focused on exchanging the knowledge, experience and wisdom gained from engineering and managing Big Data systems.
At our first meetup, we hosted Karthik Ramasamy, who gave a talk on Data Stream Processing at Scale. Karthik initiated and oversaw the development of Heron, Twitter's next-generation streaming system and its alternative to Storm. He is the author of the book “Network Routing: Algorithms, Protocols, and Architectures” and has authored several publications and patents in large-scale data processing.
Karthik gave us an insider’s view of the journey from Storm to Heron, walking us through Twitter’s real-time constraints, the limitations encountered with the popular alternatives Kestrel and Kafka, and the need for distributed logs. He also answered practical questions from the audience on how Heron is provided as a service to other groups and on the ease of upgrading topologies. It was a comprehensive talk that covered all aspects of Heron’s evolution, and we urge you to check it out here.
We also had a lightning talk by our Staff Dev Lead, Neil Thombre, who gave a quick introduction to the Analytics stack at Criteo, which manages 7 petabytes of data and fields time-sensitive queries from more than 1,500 users. Neil also spoke about the Open Source culture at Criteo and highlighted our recent contribution to Vertica and our upcoming Not Another Big Data Conference.
Criteo Labs has an exciting lineup of Big Data speakers for upcoming sessions of our Tech Talk Meetup group, spanning topics from large-scale machine learning to database optimization. We look forward to interacting with you there!
Jump into the talks presented below:
Subscribe to our Palo Alto Tech Talks Meetup
Post written by:
Staff Software Engineer, R&D – Palo Alto