What makes Mobi.AI’s data so special?

Tesla Wells

2024-09-18

IN A NUTSHELL:

Mobi.AI’s world-class data development team uses cutting-edge AI algorithms to aggregate, clean, and organize travel data from diverse and unconventional sources. Mobi.AI uses this data to power our B2B solutions, which aid in destination discovery and deliver personalized recommendations that answer a traveler’s expressed desires for their next trip. Our in-house dataset currently contains tens of millions of unique “Points of Interest” (POIs). Each POI contains more than the standard “what” information (the address, the phone number, the type of location). We enrich our data by summarizing “why,” “when,” and “where.” Knowing “why” a location is unique enables us to make better recommendations, knowing “when” to visit enables us to make better schedules, and knowing “where” it belongs within a larger geographic context enables us to make better spatial decisions or routes.

Planning the perfect trip takes a tremendous amount of time. Most of this time is spent gathering information; humans (and algorithms!) usually need information about a potential location that is not easy to find. Many travel companies understood the value of collecting travel-related data in one place a decade ago, when they created online travel directories. At that time, most information about travel did not exist on the internet (or existed only buried deep in a blog post). Creating a directory platform for hotels, restaurants, or other attractions was a way to get crowdsourced information from businesses and their customers into a searchable database with a standardized format.

Now, the data landscape looks different. We are living in a time when there is too much data online. Finding “basic” information on a location is more straightforward, but going deeper than the “what is it?” of a location requires us to dig through pages of photos and reviews, cross-reference websites, and consult forums and social media. Additionally, in the last few years, the field of data development has made breakthroughs in processing large amounts of data, interpreting irregular data sources, and analyzing the data once collected. This means there is an emerging business case for “data aggregators”: companies like Mobi.AI that hire world-class data scientists to make world-class datasets in support of their projects.


Mobi.AI chose to build an in-house dataset when we realized other travel datasets were not meeting our standards. The locations in an average travel dataset describe only the “what” of a place. For example, the typical travel directory considers the basic information of a self-registered business (the address, the type of activity, a website, a contact number, etc.) to be a sufficient description of a Point of Interest (POI). While this is useful for identification and provides necessary information, we think capturing a place goes beyond answering the “what.” Our data needs to express the “why”: what makes a place unique, interesting, or worthwhile? Why is this the best seafood restaurant in Boston? Is it because they have the best lobster roll? Or is it because it’s right on the docks? Our data must also contain a flavorful answer to “when” and “where.” This information goes beyond the hours of business and street address and tries to capture a sense of place. How long do people typically spend at this location, and what is the best time of year to visit? How does the location relate to the surrounding neighborhood? How do people usually get there?
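To make the distinction concrete, here is a minimal sketch in Python of what a POI record with “what,” “why,” “when,” and “where” fields could look like. It is an illustration only, not Mobi.AI’s actual schema: the class name, every field name, and the `is_enriched` helper are assumptions invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PointOfInterest:
    """Hypothetical POI record; all field names are illustrative assumptions."""

    # "What": the standard directory fields.
    name: str
    address: str
    category: str
    phone: Optional[str] = None
    website: Optional[str] = None
    # "Why": what makes the place unique, interesting, or worthwhile.
    highlights: list[str] = field(default_factory=list)
    # "When": timing that goes beyond the hours of business.
    typical_visit_minutes: Optional[int] = None
    best_months: list[int] = field(default_factory=list)  # 1 = January
    # "Where": how the place relates to its surroundings.
    neighborhood: Optional[str] = None
    common_access_modes: list[str] = field(default_factory=list)

    def is_enriched(self) -> bool:
        """True when the record has at least one "why", "when", and "where" detail."""
        has_why = bool(self.highlights)
        has_when = self.typical_visit_minutes is not None or bool(self.best_months)
        has_where = self.neighborhood is not None or bool(self.common_access_modes)
        return has_why and has_when and has_where


# A directory-style record answers only the "what".
basic = PointOfInterest(name="Example Seafood", address="1 Dock St", category="restaurant")

# An enriched record also answers "why", "when", and "where".
rich = PointOfInterest(
    name="Example Seafood",
    address="1 Dock St",
    category="restaurant",
    highlights=["lobster roll", "right on the docks"],
    typical_visit_minutes=90,
    best_months=[6, 7, 8],
    neighborhood="Waterfront",
    common_access_modes=["walk", "ferry"],
)
```

In this sketch, `basic.is_enriched()` is false and `rich.is_enriched()` is true, which mirrors the gap between a typical directory listing and the kind of record described above.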

We felt our data needed to capture this information because the time that travelers spend “researching” a location is mostly spent answering these harder questions. The “why,” “when,” and “where” are often buried in comment sections, photos, and links scattered across multiple platforms. Importantly, the information a traveler is seeking does exist on the internet, but finding it requires users to search and combine information from different places.

We want to automate this process. We start by collecting information from a diverse range of sources; the base of these sources is your fairly standard directories. We then pull in information from the same reviews, photos, Wikipedia pages, travel blogs, and discussion boards that your average traveler consults. Then, we go a step further and overlay open-source geospatial data or hobbyist datasets to give people access to information they wouldn’t normally have the technical skills to access themselves.

After we’ve chosen our data sources, we have to do the work of building the dataset. We are able to incorporate these diverse and even unconventional data sources because our data team knows how to extract, process, and stitch together information from different types of media to build “POIs.” Each of the extracted “POIs” is then standardized and combined; our aggregation algorithms verify when different sources are talking about the same location, reconcile conflicting information, and extract common themes from large amounts of supplementary text or images. The team also uses algorithms to monitor our sources for changes to keep our data as current as possible. The result? A high-quality, detailed, and unique data bank of tens of millions of Points of Interest around the world.
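To give a feel for the aggregation step (deciding when two sources describe the same location, then reconciling conflicting fields), here is a minimal sketch in Python. It is not Mobi.AI’s production pipeline. It assumes each source record is a plain dictionary with `name`, `lat`, and `lon` keys, treats two records as the same place when their names are similar and their coordinates are close, and resolves conflicts with a simple majority vote. The function names and the thresholds (`max_dist_m`, `min_name_sim`) are hypothetical choices for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in meters."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))


def same_place(a, b, max_dist_m=100.0, min_name_sim=0.8):
    """Assumed rule: similar names plus nearby coordinates mean one POI."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    return name_sim >= min_name_sim and dist <= max_dist_m


def reconcile(records):
    """Merge records for one POI: the most common non-empty value wins per field."""
    merged = {}
    fields = {key for record in records for key in record}
    for name in sorted(fields):
        values = [r[name] for r in records if r.get(name) not in (None, "")]
        if values:
            merged[name] = Counter(values).most_common(1)[0][0]
    return merged


def aggregate(records):
    """Greedy clustering: attach each record to the first cluster it matches."""
    clusters = []
    for record in records:
        for cluster in clusters:
            if same_place(record, cluster[0]):
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return [reconcile(cluster) for cluster in clusters]


# Made-up records standing in for three sources that overlap on one place.
records = [
    {"name": "Harbor Lobster Shack", "lat": 42.3601, "lon": -71.0500, "phone": "555-0100"},
    {"name": "harbor lobster shack", "lat": 42.3602, "lon": -71.0501, "phone": "555-0100"},
    {"name": "Harbor Lobster Shack", "lat": 42.3601, "lon": -71.0500, "phone": "555-0199"},
    {"name": "City Science Museum", "lat": 42.3677, "lon": -71.0710, "phone": ""},
]
pois = aggregate(records)
```

A real system would need much more than this: fuzzy matching across languages, source-specific trust weights instead of a flat vote, and a smarter clustering strategy than comparing against the first record in each cluster. The sketch only shows the shape of the problem.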

We could simply deliver the dataset as a standalone product (and sometimes, we do), but the quality of this dataset really shines through when paired with our other in-house tools. This dataset is the foundation on which our schedulers, routers, and search engines build trip recommendations. Datasets with a more accurate and nuanced understanding of “place” and “time” not only produce more accurate schedules, but also allow us to use better scheduling and routing techniques. Having a more holistic understanding of what is interesting about a location allows us to use a greater variety of recommendation algorithms and increase the transparency of our recommendations. This dataset is useful not just because it centralizes difficult-to-capture information in one place, but because it allows us to then use better automation and analysis tools. This means a traveler or agent can discover new locations and create unique plans faster and more efficiently than before.
