Our Criteo Labs team @ EuroPython ’16
EuroPython is a generalist conference around the Python programming language. Started in Belgium in 2002, it has the particularity of being organized by volunteers and moving its location regularly around Europe. This year, for the 2nd time, it took place in Bilbao, Spain between the 17th and 24th of July.
Two features set this conference apart: the talk selection is mainly based on attendees’ votes, and the recordings of the talks are freely available to anyone, both during the event (via streaming) and afterwards (see here).
From 250 attendees at its beginnings, EuroPython has become the largest Python conference in Europe and the 2nd worldwide (after PyCon) with 1100+ attendees in 2016.
The conference lasts 8 days. It starts with 1 day of workshops, followed by 5 days of talks, trainings, helpdesk and poster sessions held in 7 parallel tracks, and finishes with a weekend of sprints on open source projects.
Criteo was a proud sponsor of this edition of EuroPython and 10 Criteos attended the conference. Among them, Rudy Sicard gave a talk on ‘Leveraging documentation power for better web APIs’.
Why is Python important for Criteo?
While C#, Java and Scala are the main programming languages used at Criteo, Python has found its place in different teams thanks to its flexibility and speed of prototyping:
Every day, tens of software updates are deployed in production on more than 12 000 Windows servers, most of them directly exposed on the web. Software deployment is a critical task, and Python gives us both the scalability and the agility to react to new features and issues.
With such a rhythm, we need safeguards to ensure our business still runs smoothly despite inevitable bugs and issues. All our software and hardware are measured constantly. Specific analyses happen after each update, generic ones run continuously, and the results are summarized in reports, graphs and alerts. Python is present at every step of this monitoring chain.
The Criteo Infrastructure is composed of more than 20 000 servers and 3 000 network devices across 15 data centers worldwide. Those numbers continue to grow. To operate the infrastructure at such a scale, dedicated tools have been developed to manage data centers. Python is used there to speed up the development of new features.
Criteo is a heavy user of machine learning algorithms. Deciding which algorithm to use is the result of a lot of research. Thanks to its data science and machine learning libraries as well as its integration with plenty of tools, Python is the language of choice to investigate new solutions.
Why was Criteo present?
As part of the R&D Culture @ Criteo, we consider it important that our engineers can attend big conferences to keep themselves up-to-date on the technologies they use.
Also, we are constantly looking for new talent to tackle exciting problems at Criteo’s scale, and such a conference is an opportunity to meet talented developers.
So, if you’re interested in working with Python (or any of the other languages used at Criteo) in an international, technology-driven company, or just curious, drop us an email at email@example.com or have a look at our current opportunities.
With so many talks to listen to, it could be hard to decide where to start. To help you, each Criteo engineer who attended the conference picked one talk and explained in a few words why they liked it.
Here is a talk I went to because the title sounded fun and I didn’t know anything about the subject. It covers the personal experience of Radomir, a Red Hat software engineer who is interested in robotics in his spare time. After an introduction to the physics of walking robots (number of legs, balance, number of leg joints, placing a foot in 3-dimensional space) illustrated by hand-made drawings, he presented the various robots he has built. Obviously, most of the programming work was done using Python or MicroPython on various “embedded” boards (BBC micro:bit, Arduino, Raspberry Pi…). In the end, it was a good surprise: I learned a few things and parts of robotics now feel less blurry. (Anael Beutot, Software Engineer, Criteo R&D)
Javier Arias Losada – Machine learning for dummies with Python
Nowadays, we hear a lot about machine learning and the hype around it, and we encounter it every day: smartphones, web search results, image recognition, even your car. There is huge potential in this technology, but ML covers a lot of things, so it was a little bit blurry to me. Javier explained it very clearly, from the main concepts to the types of ML, presenting the different algorithms available and showing concrete Python code examples. (Djothi Carpentier, Software Engineer, Criteo R&D)
Manuel Miranda – Where is the bottleneck?
When we develop a program, we must be sure to meet the performance targets that were set. In his presentation, Manuel discusses 2 important aspects: (1) the strategy for optimization (What do we want to do? At what cost? Do we have enough knowledge of the code, context and environment? …) and (2) the tools for optimization (speed, memory, resources…), from basic ones (time, htop, ntop, …) to memory profiler, line profiler, plop, … You can use this presentation both for general development (the strategy part) and for operational development (the tools part). (Gilles Bourguignon)
If you don’t have much time, just take 1 minute to read Erik’s slides, which are basically a neat bullet-point list of things not to forget when you’re building a service that might become big and heavily used before you realize you’re not ready for it. If you have half an hour, listen to his feedback on his experience of building an efficient, continuously used service on top of well-known technologies. (Hugues Lerebours)
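As a small illustration of the “tools” side, here is a minimal sketch that uses only the standard library’s `cProfile` and `pstats` modules rather than the third-party profilers named above. The functions `slow_sum` and `workload` are hypothetical stand-ins for real application code, not something taken from the talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive: builds a full list before summing it."""
    return sum([i * i for i in range(n)])


def workload():
    """Hypothetical workload that calls the slow function repeatedly."""
    return [slow_sum(10_000) for _ in range(50)]


def profile_workload():
    """Run workload() under cProfile and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    workload()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first
    stats.sort_stats("cumulative").print_stats(5)
    return buffer.getvalue()


report = profile_workload()
print(report)
```

The report lists the functions where most of the cumulative time is spent, which is usually a sensible place to start before reaching for line-level or memory profilers.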
Python’s double-underscore (“__x__”) methods and attributes go by many names, including “special”, “dunder”, and “magic”. They are used by everyone, whether they know it or not: adding objects, getting an item from a list and many other actions are dunders in disguise. However, they can be magical if you know how to hack them. Anjana showed us in a very fun and pedagogical way how we can use them, modify them and even ease our everyday work, creating a CrazyList that does weird stuff. With these examples, Python developers get a better understanding of the dunder mechanism and thus more knowledge to optimize their code. (Remi Guillard)
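To give a feel for the idea, here is a hypothetical `CrazyList`. It borrows only the name from the talk; the behaviours below are invented for illustration and are not the ones shown on stage.

```python
class CrazyList(list):
    """A list subclass whose dunder methods misbehave on purpose (illustrative only)."""

    def __getitem__(self, index):
        # Called for crazy[index]: integer indices count from the end
        if isinstance(index, int):
            return super().__getitem__(-index - 1)
        return super().__getitem__(index)

    def __add__(self, other):
        # Called for crazy + other: the right-hand items are placed first
        return CrazyList(list(other) + list(self))

    def __contains__(self, item):
        # Called for `item in crazy`: always claims the item is there
        return True


crazy = CrazyList([1, 2, 3])
first = crazy[0]       # goes through __getitem__
joined = crazy + [4]   # goes through __add__
found = 99 in crazy    # goes through __contains__
print(first, joined, found)
```

Each operator or built-in syntax is dispatched to one dunder method, so overriding that method changes what the syntax does for instances of the class.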
Mircea Zetea – Managing Technical Debt (slides)
We all know the term ‘Technical Debt’. But have you really thought about it? What does it mean? What is the currency? How does it evolve? Is it OK to have it? Mircea gives us clues about how you could prevent, accept, reduce or pay off a debt, and about what that debt costs. Debt can be contracted out of ignorance or on purpose, as a trade-off: an imperfect something now in exchange for time, like a bank loan. It’s not mandatory to pay it back (don’t try that with your bank!). It is neither good nor bad in itself. It all depends on the interest you pay on it and the time it takes to reduce it. (Rudy Sicard)
Anjana Vakil – Exploring Python Bytecode (slides)
From the point of view of the Python developer, bytecode is a necessary by-product of her/his activity but clearly not something one looks at. In her talk, Anjana showed in a very pedagogical way how you can look at the bytecode generated from your code using the dis module and what you can gain from diving into it. In particular, she used bytecode to explain a counter-intuitive puzzle, showing that knowing what happens down there can help a developer understand what happens at a higher level in his/her code. (Renaud Bauvin)
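For readers who have never tried it, here is a minimal sketch of inspecting bytecode with the standard `dis` module. The function `add_one` is a hypothetical example, not the puzzle from the talk, and the exact instruction names printed depend on the CPython version.

```python
import dis


def add_one(x):
    """Tiny function whose bytecode we want to inspect."""
    return x + 1


# Human-readable listing of the bytecode, printed to stdout
dis.dis(add_one)

# The same information as objects, convenient for programmatic inspection
instructions = list(dis.get_instructions(add_one))
opnames = [instruction.opname for instruction in instructions]
print(opnames)
```

Comparing the listings of two seemingly equivalent snippets is often enough to see why they behave or perform differently.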
Victor Stinner – FAT Python: a new static optimizer for Python 3.6
I love Python for its elegance and plasticity but, sometimes, I wish the interpreter were a bit faster. The “FAT Python” project, led by Victor Stinner, is just about that. FAT Python introduces a new static optimizer that does its best to optimize the abstract syntax tree (AST). It is very different from PyPy with its tracing just-in-time compiler. FAT Python is intended to be seamlessly integrated into and improve upon CPython. The main idea behind FAT Python is that, while Python is a very dynamic language, what is not mutated can still be optimized. For each function, FAT Python holds both the original AST and an optimized one. Then, at runtime, it runs the optimized code if no dependencies have been mutated, or the non-optimized one otherwise. The AST optimizer already implements well-known optimizations like loop unrolling, constant folding, dead code removal, etc. It is written in pure Python, which makes it easy to enhance and experiment with. I don’t know about you, but I find that absolutely exciting!
We take this opportunity to thank the organizers for the great job they did in organizing this edition, and the attendees we met at our booth or during informal discussions around pinchos for the insights they brought.
The location and dates of EuroPython 2017 are not yet known but, for sure, see you next year!
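The two ideas described above, optimizing the AST and guarding the optimized version at runtime, can be sketched in a few lines. This is a toy illustration built on the standard `ast` module (Python 3.9+ for `ast.unparse`), not FAT Python’s actual code or API; `ConstantFolder`, `fold`, `make_guarded` and the other names are hypothetical.

```python
import ast
import builtins
import operator

# Operators this toy folder knows how to evaluate ahead of time
_FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on numeric literals with the computed literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first
        op = _FOLDABLE.get(type(node.op))
        operands = (node.left, node.right)
        if op is not None and all(
            isinstance(operand, ast.Constant)
            and isinstance(operand.value, (int, float))
            for operand in operands
        ):
            folded = ast.Constant(value=op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def fold(source):
    """Return the source of an expression after constant folding."""
    tree = ConstantFolder().visit(ast.parse(source, mode="eval"))
    return ast.unparse(ast.fix_missing_locations(tree))


def make_guarded(original, optimized, guard):
    """Run `optimized` while `guard()` holds, otherwise fall back to `original`."""
    def dispatcher(*args, **kwargs):
        if guard():
            return optimized(*args, **kwargs)
        return original(*args, **kwargs)
    return dispatcher


_ORIGINAL_LEN = builtins.len


def original():
    return len("abc")


def optimized():
    return 3  # only valid while len is still the built-in


guarded = make_guarded(original, optimized, lambda: builtins.len is _ORIGINAL_LEN)

print(fold("x * (2 + 3 * 4)"))
print(guarded())
```

If some code rebinds `builtins.len`, the guard fails and the dispatcher falls back to the unoptimized function, which is the “what is not mutated can still be optimized” principle in miniature.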