We were happy to host and speak at the Paris Kafka meetup on January 15th. Every month, we host meetups in our Paris offices, and this January we started the year right by welcoming the Kafka community.
As one of the biggest Kafka users worldwide, we couldn’t miss this opportunity to present the challenges we face and how we solve them using Kafka. Sharing the technical details of our challenges, explaining how we made our technical choices, and following up on the results, whether successes or pitfalls, is part of the Criteo R&D culture and of how we choose to give back to the community.
Criteo’s first Kafka cluster was deployed 4.5 years ago to prototype a streaming application with Storm. Three years ago, Kafka replaced our whole business logging pipeline to feed data into our centralized data lake. Today, we use several streaming frameworks for our data workflows and for replication, namely Kafka Connect, Kafka Streams, and Flink.
As streaming is growing fast as a way to reduce latency compared to batch processing, we expect to double our Kafka infrastructure in the coming months. What scale are we talking about? Here are some key figures on Kafka at Criteo:
- Up to 7M+ messages/second (400 billion messages/day)
- 180+TB/day
- 200+ Kafka brokers; 14 clusters across 7 datacenters
- 150+ topics
- 4000+ partitions / cluster
We were glad to see more than 200 attendees enjoy the two talks:
- How we manage one of the biggest Kafka infrastructures in Europe – talk by Ricardo Paiva, Senior DevOps Lead at Criteo
- How Quicksign uses Kafka as a backbone for its microservices infrastructure – talk by Quicksign CTO Cedric Vidal
Want to know more? Find the slides from our presentations here
Want to join a team of experts? Find our open roles here
Want to attend our next events? Find our community here
Our lovely Community Manager / Event Manager is updating you about what's happening at Criteo Labs.