The second year of NABDConf was, dare I say it, even better than the inaugural 2016 edition! This year not only did the talks exceed expectations, but so did the weather, with both lunch and evening festivities held on Criteo Paris’ rooftop deck.
You can of course have a look at the schedule of talks to get the abstracts, but I’d like to touch on a few of the highlights.
Josh Baer of Spotify opened the conference with a great talk on the how and the why of getting Spotify up and running on GCP.
Nicolas Belmonte from Uber then wowed the crowd with some ridiculously beautiful in-browser visualizations built off of deck.gl.
While I don’t want to toot Criteo’s horn too much, we did close out the morning session with François Jehl, Pawel Szostek and Neil Thombre’s work on HyperLogLog (HLL), which shows huge promise for distinct counts on OLAP workloads in Vertica.
In the afternoon we had something of a data-production track, starting with inspiring stuff on the data developer’s work cycle at Spotify from Rafal Wojdyla, followed by a little bit of data workflow development history (and future!) from Guillaume Bort and myself, in which we introduced our new open source scheduler, Cuttle. Marc Bux of Humboldt University of Berlin closed the track with his approach to scheduling scientific workflows on YARN.
Rounding out the talks, we had BigGraphite (Graphite on Cassandra) from Corentin Chary, a look at how we build our billion-node, billion-edge user graph from Bruno Roggeri, and a discussion of the best-named project ever, DataDisco (Criteo’s HDFS data schema/discovery framework), from Francois Visconte and Mathieu Chataigner.
Last year we had lots of requests to put presentations online, and I am very happy to say that not only have we done so, but we took the extra step of filming all of the talks as well. You can relive this year’s experience via the videos below:
Moving to the Cloud: A Story from the Trenches – Josh Baer, Spotify
Visualizing Data with deck.gl – Nicolas Garcia Belmonte, Uber
HLL performance characteristics in large-scale aggregations over structured data – François Jehl, Pawel Szostek, Neil Thombre, Criteo
Building a billion node / billion edge graph – Bruno Roggeri, Criteo
BigGraphite – Graphite meets Cassandra to Scale Monitoring at Criteo – Corentin Chary, Criteo
Data pipeline at Spotify – from the inception to the production – Rafal Wojdyla, Spotify
One schema to rule them all and kill your data legacy – Francois Visconte, Mathieu Chataigner, Criteo
Time-series workflow scheduling with Scala in Langoustine – Guillaume Bort, Justin Coffey, Criteo
Hi-WAY: Execution of Scientific Workflows on Hadoop YARN – Marc Bux, Humboldt University of Berlin
Photos from the event
Curious to see how things went down at this year’s event? Follow this link
A special thank you to our speakers and sponsors (Vertica and Criteo) and of course to our wonderful attendees. Without all of you this conference wouldn’t exist!
See you in 2018 for yet Not Another Big Data Conference.