Since 2008, Airbnb has grown tremendously with over 6 million listings and 4 million hosts worldwide – becoming a viable alternative to hotels.
By collecting extensive information on hosts, guests, lengths of stay, destinations, etc., Airbnb produces colossal volumes of data every day! To clean, process, manage, and analyze all this data, the leader in accommodation had to build a solid and rigorous data culture in its organization.
In this article, discover the best practices implemented at Airbnb to become a data-driven company – all based on the talk given by Claire Lebarz, Head of Data Science, at Big Data & AI Paris 2022.
The 3 levels of maturity of a data organization according to Airbnb
The term data-driven is very well-known and commonly used to describe a company that makes strategic decisions based on the analysis and interpretation of data. In a truly data-driven company, all employees and leaders harness data naturally and integrate it into their daily tasks.
According to Claire Lebarz, however, the term “data-driven” is often overused: “I prefer to think in terms of three levels of maturity that characterize a data organization: Data Busy, Data Informed, Data Powered.”
At the “Data Busy” level, a company has hired data specialists such as Data Analysts, Data Scientists, or Data Engineers. However, analyses take too long to deliver, or the work of the Data Scientists yields no return on investment.
“At this level, there aren’t any rules in place about the quality of the data, the data is not trusted. Or it represents a bottleneck for the organization,” explains Claire.
At the “Data Informed” level, the organization has implemented data governance, and strategic decisions are increasingly based on the company’s KPIs and metrics rather than on the instincts of top management.
Finally, at the “Data Powered” level, the highest level of the maturity matrix, data is on the critical path of the organization and becomes a key driver for business growth.
“Above all, data is no longer reserved for a group of data experts but for the entire organization – all employees are in tune with data,” explains the Head of Data Science.
The 6 steps to becoming data-driven according to Airbnb
Step 1: The scientific method
In ‘Data Science’, there is above all ‘Science’, explains Claire. So the first step is to take ownership of the scientific approach in the organization. “The idea is not to build a big R&D team, but rather to put on paper all the hypotheses we operate with and find ways to validate them or not.”
This approach implies testing, testing, and… more testing! One lever for this is A/B testing. The Head of Data Science explains that it was crucial for Airbnb during the COVID-19 crisis to formulate different assumptions about the world of today and that of tomorrow in order to make the right strategic pivots for the company.
One example that highlights the importance of A/B testing at Airbnb is the introduction of a minimum and maximum price filter on its booking site. Claire explains that user experience feedback was better when travelers could indicate their maximum budget to book a stay. Without this small addition, travelers spent a lot of time browsing listings and ended up not booking.
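To make the idea concrete, here is a minimal sketch of how the outcome of an A/B test such as the price filter could be checked. It is not Airbnb’s method: the function name `two_proportion_z_test`, the visitor and booking counts, and the choice of a pooled two-proportion z-test are all assumptions made for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rate of variant B against control A.

    Returns (lift, z, p_value), where lift is the absolute difference in
    conversion rate and p_value is two-sided under a normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Hypothetical numbers: control without the price filter, variant with it.
lift, z, p_value = two_proportion_z_test(conv_a=300, n_a=10_000,
                                         conv_b=360, n_b=10_000)
print(f"lift={lift:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A small p-value suggests the difference in bookings is unlikely to be noise, but the threshold for acting on it remains a business decision.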
Step 2: Strategic team alignment
For Claire L., setting up OKRs (Objectives & Key Results) is essential to align the different teams internally. Indeed, the data teams of an organization often tend to focus only on their own data metrics. Yet, it is imperative to put in place common company objectives to truly infuse a data culture in the company: “strategy must come before metrics.”
The global leader in short-term rental experienced this lack of alignment itself. In 2017, the negative consequences were visible in the search experience of the Airbnb site: the query “los angeles” yielded results in multiple categories that made little sense to the user.
Each team was responsible for its own KPI, uncorrelated with the others. The “experience” team was tasked with suggesting things to do in the city, another team with surfacing the cities closest to the search, and so on. All were pushing multiple pieces of information to increase their own performance and drive traffic to their section of the website.
Users would get lost and end up not booking anything because the teams weren’t pulling in the same direction!
Step 3: Measuring uncertainty
For Claire L., “Uncertainty is inherent in running a business and making decisions.” Sometimes the best analysis does not equal the best decision. We need to have organizational discussions, such as: What level of confidence do we need to make decisions? What signals do we need to consider to change decisions?
In the context of OKRs, there is often a temptation to avoid initiatives whose ROI is difficult to measure. However, just because a metric is difficult to measure does not mean that the initiative that depends on it is not the best one. An example that the Head of Data Science gives us is the branding campaigns carried out by Airbnb during the Super Bowl between 2017 and 2021.
“Branding campaigns are the hardest to measure, you can almost never know their ROI. But given our indirect results, building a great branding strategy and moving away from reliance on paid channels like SEM, was perhaps the best marketing strategy to boost organic and direct traffic.”
Step 4: Centralized governance
Governance, according to Claire L., must be centralized. She noticed at Airbnb that as soon as data teams are decentralized and report to the business, the company quickly loses the objectivity of its data. She explains: “Data must be considered as a common asset in the organization, and it is essential to make investments centrally and at the highest level of the organization. Data should be managed as a product with the employees as the customers.”
Indeed, Conway’s law also applies to data: “organizations that design systems inevitably tend to produce designs that are copies of their organization’s communication structure.” Applied to data, this means that the various departments in the organization create their own tables, analytics, and features, based on their own definitions, which are not always aligned with those of other departments.
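One way to picture centralized definitions is a single registry that every team reads metrics from, instead of each department re-implementing its own version. The sketch below is purely illustrative: `Metric`, `MetricRegistry`, `booking_conversion`, and the event format are hypothetical names and structures, not a description of Airbnb’s tooling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Metric:
    name: str
    owner: str          # central team accountable for the definition
    description: str
    compute: Callable[[List[dict]], float]


class MetricRegistry:
    """Single source of truth: one definition per metric name."""

    def __init__(self) -> None:
        self._metrics: Dict[str, Metric] = {}

    def register(self, metric: Metric) -> None:
        # Refuse a second, possibly diverging, definition of the same metric.
        if metric.name in self._metrics:
            raise ValueError(f"metric {metric.name!r} is already defined")
        self._metrics[metric.name] = metric

    def compute(self, name: str, rows: List[dict]) -> float:
        return self._metrics[name].compute(rows)


def booking_conversion(rows: List[dict]) -> float:
    """Bookings divided by searches (hypothetical event format)."""
    searches = sum(1 for r in rows if r["event"] == "search")
    bookings = sum(1 for r in rows if r["event"] == "booking")
    return bookings / searches if searches else 0.0


registry = MetricRegistry()
registry.register(Metric(
    name="booking_conversion",
    owner="central-data",
    description="Bookings per search",
    compute=booking_conversion,
))

events = [{"event": "search"}] * 4 + [{"event": "booking"}]
print(registry.compute("booking_conversion", events))
```

Rejecting duplicate names is the design choice that matters here: a department that wants a different definition has to give it a different name, so two teams cannot silently report diverging numbers under the same label.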
Step 5: The right communication
Claire L. shares one of the best decisions Airbnb has made: hiring Data Scientists who are not only very good technically, but also good at communicating. The company grew very fast in 2017-2018. To get familiar with how Airbnb works, a Data Scientist sometimes had to read between 15 and 20 analyses, and design teams had to spend a lot of time learning the company’s positioning, all of which could quickly become costly.
So Airbnb changed its approach to analytics. Instead of writing traditional memos that tend to get stale over time and need to be constantly updated, the company started building “living documents.” “We set up ‘states of knowledge’, aggregations of all the knowledge of a team on a subject – updated according to the frequency of research on a question,” Claire details.
The Head of Data Science also explains the importance of communication during the COVID crisis. Since the Airbnb teams in San Francisco were no longer face-to-face, it became essential to work on new communication formats: “We observed a great deal of email and screen fatigue in general. So we looked for more effective ways to communicate, such as via podcast or video formats, so that our employees could get information away from their screens. We needed to simplify and make information available in a simple and visual way so that all employees can appropriate the data.”
Step 6: A more human-like Machine Learning
Since its beginnings, Airbnb has used search-matching algorithms between guests and hosts. But it took time for the company to build them at scale, both to improve the user experience and to help cross-functional teams get comfortable discussing modeling decisions.
Claire Lebarz explains that in order to have machine learning algorithms without defects, you have to look at the problem backwards: “Instead of saying that we have to solve a problem through automation and machine learning, we wanted to focus on the opposite: What kind of user experience do we want to create? And then go and inject machine learning where it makes sense to improve those processes.”
The addition of category-based searches on the Airbnb platform illustrates this. Indeed, it was about offering an alternative way to search for a place to stay: by asking the traveler what they would like to do. “Here we’re moving away from our basic model where we propose to enter dates and the place you want to go. Now we can ask you what you want to do or have, like surfing lessons, a nice beach view, or even a pool.”
These algorithms are labor-intensive because they depend on documentation provided by hosts. To avoid having to ask hosts several questions a week, machine learning “searches” for this information and surfaces it in the right categories on the site.
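As a rough illustration of pulling category information out of what hosts have already written, the sketch below tags a listing description with categories. Simple keyword matching stands in for a trained model here; the `CATEGORY_KEYWORDS` table and the `tag_listing` function are hypothetical and do not reflect how Airbnb’s models actually work.

```python
import re

# Hypothetical keyword table; a production system would learn these
# associations from data rather than hard-code them.
CATEGORY_KEYWORDS = {
    "surfing": {"surf", "surfing", "surfboard"},
    "beach view": {"beachfront", "oceanfront", "seaview"},
    "pool": {"pool", "jacuzzi"},
}


def tag_listing(description: str) -> list:
    """Return the sorted categories whose keywords appear in the description."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return sorted(
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords
    )


print(tag_listing("Beachfront bungalow with a private pool, surfboard included"))
```

The point of the sketch is the direction of the flow: categories are inferred from existing host content, so hosts do not have to answer a new questionnaire each time a category is added.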
Conclusion: the 3 data-driven talents according to Airbnb
To ensure a true data culture, hiring the right talent is crucial. According to Claire, here are the three essential data roles of a data-driven enterprise:
- Analytics Engineers: they are the guarantors of data governance and quality. They position themselves between Data Engineering and Analytics to focus on insights and questions.
- Machine Learning Ops: this is a new profession that focuses on the operation and evolution of machine learning algorithms.
- Data Product Managers: they are the ones who instill the practice of managing data as a product and professionalize the data approach in the organization. They provide transparency on roadmaps and new data features, and they serve as a liaison with other functions.
“It is critical to bring these three emerging professions into the organization to truly become Data Powered!”