January 15th, 2015

Great Engineers and Crap CVs

The Criteo R&D recruitment team receives a few applicant CVs, though not as many as we would like, as we are not that well known (yet); far more hires come from headhunted candidates and referrals. We are used to working with LinkedIn profiles, GitHub contributions, and blogs, as well as notes from our discussions about what you do and want to do. I have been known to present only the sparse LinkedIn and GitHub profiles instead of the available (and horrible) CV kindly sent by the great engineer I have been hunting for months. And our CV reviewers are more interested in seeing your code than in what school you went to, or that you like cooking [Hiring Manager chiming in: SAY WHAAT? Of course you get bonus points if you like cooking! We just love home-baked cookies. Now if you “like hiking, reading and going to the movies”, then we couldn’t care less indeed, stop wasting pixels].

October 10th, 2014

Criteo Announces the Mo PB Mo Problems World Tour!

That’s right, Criteo R&D is on its way to NYC for Hadoop World 2014 and is bringing its Mo PB Mo Problems World Tour with it! No, this isn’t a PR / marketing campaign to tell the world how much “big data” we have and why we’re so cutting edge that you should become a client immediately (which is all true, btw). This is our Paris – Bay Area recruitment tour with a (sole) stop at the 2014 NYC Hadoop World conference!

If you haven’t heard, Criteo has a massive infrastructure and massively interesting engineering problems to tackle, so we’re hiring massively :).

September 25th, 2014

Kaggle contest dataset is now available for academic use!

We launched a Kaggle challenge on CTR prediction 3 months ago.
Large participation, close race …
…and the winner will officially be announced next week!

Some updates on the contest were presented at the Paris Machine Learning Meetup. Please visit the site for the video of the meetup and the slides.
We have updated the curves representing the evolution of the contest over time:

[Figure: evolution of the Kaggle contest over time]

Meanwhile the dataset is now available for academic use.
This one is pretty big, so have a lot of fun with it.
http://labs.criteo.com/downloads/2014-kaggle-display-advertising-challenge-dataset/

JB Tien.

September 10th, 2014

PoH – Part 3 – Distributed optimization

In the context of web advertising, it is crucial to be able to make predictions extremely quickly, as little time is given to send a bid to the ad exchange. On average, Criteo is able to predict the click probability of a user in less than 100 microseconds, as opposed to the 50 milliseconds that deep models can require, and does so up to 500 000 times per second. This is the main reason why simple generalized linear models such as logistic regression are still widely used in our industry. Because such models are also faster to train, the move to distributed learning was not as much of a priority as it might have been for other companies.
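To give a feel for why prediction with a logistic regression is so cheap, here is a minimal sketch in Python. It is only an illustration, not Criteo's production code: the feature names, weights and bias are made up, and we assume sparse binary features, so that scoring a display reduces to summing the weights of the few features that are active and applying a sigmoid.

```python
import math


def predict_click_probability(weights, bias, active_features):
    """Return P(click) for a logistic regression over sparse binary features.

    weights:         dict mapping feature name -> learned weight (hypothetical values)
    bias:            intercept of the model
    active_features: names of the features present in this display
    """
    # Only the active features contribute, so the cost is proportional to
    # the number of active features, not to the size of the model.
    # Features never seen during training simply contribute 0.
    score = bias + sum(weights.get(name, 0.0) for name in active_features)

    # Numerically stable sigmoid: never exponentiate a large positive number.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)
    return exp_score / (1.0 + exp_score)


# Hypothetical model and request, for illustration only.
example_weights = {"site=news": 0.4, "hour=20": -0.1, "ad_size=300x250": 0.2}
example_bias = -3.0
example_request = ["site=news", "hour=20", "unseen=feature"]

probability = predict_click_probability(example_weights, example_bias, example_request)
print(f"predicted click probability: {probability:.4f}")
```

The whole prediction is a handful of dictionary lookups, additions and one exponential, which is the kind of work that fits comfortably in a sub-millisecond budget.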