A look back to look forward: Where do new ideas come from?

Image courtesy of Entertainment Weekly

Introduction to the business case

This was going to be a solo post, but I bring the Director into the conversation by asking her a question at the end. The inspiration for the article comes from the WSJ piece "Meet the United Airlines Executive Who Gets to Pick Its Hot New Routes." Alison Sider, a fellow Longhorn from my alma mater, UT Austin, interviews Patrick Quayle, the senior vice president of global network planning and alliances at United Airlines. She asks interesting questions on a topic that interests me.[1] How does United Airlines decide where to fly, and where not to fly? If a route is a proven cash cow, flying it is an easy decision, but of course, all the competitors are already flying it too. Everyone knows the answer; the only thing left is optimization. Finding a successful route that no competitor has flown yet is a harder problem because there is no data. The challenge is no different in other domains: which new products or services should you offer?

Why is this a problem? Simply because there's an inherent trade-off between learning from data and finding new ideas. Data, by definition, is backward-looking (including the results of experiments, which I'll discuss later). In data analysis, we often strive to find the most accurate patterns and minimize error. This is most apparent in predictive analytics, which is the relevant context here. Airlines, like many other businesses, constantly predict what might happen in the future using historical training data. But a new, exciting route is not in the training data.

In the article, United asks: "How do we capture the imagination of the United customer base?" Our question is, how do you get to that new idea while still using a data-centric approach?

Academic's take

Based on the content of the interview, I have come up with five ways to use data science creatively:

1. Transfer learning and clustering: United seems to cluster destinations to identify new, high-demand routes by using data from similar destinations already in its network. In this case, identifying the similarity becomes key.

"In a place like Greenland, we’re making a bet that people want to go there based on the type of travel we see to other parts of our network that are more exotic, more off the beaten path."

2. Data triangulation: United collects and combines observational data from external sources in addition to (and maybe instead of) asking customers. For example, United uses Instagram (and possibly other social media channels) to monitor trending destinations. It also keeps a close eye on searches that lead to 'destination not found' pages.

"But also, our team is very creative. We’re looking at what’s trending on Instagram. We’re looking at where people are searching on United.com. We’re looking at top travel trends. We see a spike in interest in Bangkok because of “White Lotus,” and so we’re capturing that with our new service to Bangkok, starting this October."

 "Social media does play a role in the process: Where people are going, what is trending, what are hot destinations—we try to get there in the future. "

3. More frequent experimentation: United is doing more experiments to generate new data rather than relying solely on historical records. While an experiment's results are, by definition, historical data, the purpose is to probe new ideas and generate data on new routes, not just the already known ones with existing data.

"What it taught us is that we should experiment more, and to try to look through the data and figure out, what are the key characteristics of a route that works."

4. Real-time falsification of new ideas: United seems to remain data-centric, even when a new route doesn't yield the expected outcomes. The best way to decide 'where to fly' is to also answer the complementary question, 'where not to fly?' Being realistic about what the data shows is more effective than letting wishful thinking and expectations cloud the insights.

"After 12 months, these flights did not make money, and they were not on a path to make money, and we canceled the flight. In this role, you cannot be so egotistical or proud that you’re unwilling to cancel a flight."

5. Combining data science with the art of creativity: United is clear about using all the data they have, but they recognize that hypotheses don't come from the data itself. What they're describing is a two-part process: first, use creativity and imagination to develop a hypothesis, then use data to test it. Creativity and imagination contribute just as much to the solution as data does.

"The science starts with the data. We can look at credit-card data. We can look at U.S. Department of Transportation origin-and-destination data. We can look at pricing data. The art aspect of it is, how do I look at that same data and add a lens of creativity and add a lens of color and passion that could lead me to broaden the aperture and serve more people and add more destinations?"

For me, the most important takeaway from the interview was the last point. What's missing in the data cannot be found by looking at the same data more closely. We must make assumptions, a key part of our data centricity framework. Deciding where to fly is not a data or a model problem. It's an assumptions problem. And no, even some of the most advanced models we have access to today, such as large language models, aren't immune to this problem, as a related article, "Language Models Can't Tell What's Missing," demonstrates.[2]
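Point 1 is a good example of an assumption filling the gap. A new destination has no demand history, so we assume it will behave like the destinations that look most similar to it. Here is a minimal sketch of that idea. Everything in it is hypothetical and chosen for illustration: the three features, the destination labels, the demand figures, and the choice of cosine similarity with a similarity-weighted average. Nothing in the interview says United does it this way.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_demand(candidate, served, k=2):
    """Similarity-weighted average demand of the k most similar served destinations.

    `served` maps a destination label to (feature_vector, observed_demand).
    The key assumption is that the chosen features capture what "similar" means.
    """
    scored = sorted(
        (
            (cosine_similarity(candidate, features), demand)
            for features, demand in served.values()
        ),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total_weight = sum(similarity for similarity, _ in scored)
    if total_weight <= 0:
        raise ValueError("no served destination is similar to the candidate")
    return sum(similarity * demand for similarity, demand in scored) / total_weight


# Hypothetical features on a 0-1 scale: (remoteness, outdoor_appeal, social_buzz).
# Hypothetical demand: weekly passengers on routes already in the network.
served = {
    "served_exotic_a": ((0.8, 0.9, 0.7), 100),
    "served_exotic_b": ((0.7, 0.8, 0.5), 80),
    "served_big_city": ((0.1, 0.2, 0.9), 300),
}
candidate = (0.9, 0.9, 0.6)  # a remote, outdoorsy destination not yet served

estimate = estimate_demand(candidate, served, k=2)
print(f"Estimated weekly demand for the new destination: {estimate:.0f}")
```

The estimate is only as good as the similarity assumption. Swap the features and the "most similar" destinations change, which is exactly why identifying the similarity is the key step.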

What's missing is a new idea, and bringing it to life requires both a realistic view of data and model limitations and the nurturing of imagination. Acknowledging that failures are integral to this creative process is key, as is proactively managing them: Points 1-3 above aim to decrease the probability of failures, while Point 4 aims to minimize their cost.
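One rough way to see this split is a back-of-the-envelope expected loss: the probability that a new route fails, times the monthly loss of a failing route, times the number of months before it is canceled. The function and all numbers below are hypothetical and only meant to show which lever each point pulls. The simple multiplicative form is my assumption, not something from the interview.

```python
def expected_loss(p_fail, monthly_loss, months_until_cancel):
    """Expected loss of launching a route under a deliberately simple model.

    Assumes a failing route loses a constant amount each month until it is
    canceled, and a successful route loses nothing.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    return p_fail * monthly_loss * months_until_cancel


# Hypothetical baseline: a coin-flip route, one unit of loss per month,
# canceled only after two years.
baseline = expected_loss(p_fail=0.5, monthly_loss=1.0, months_until_cancel=24)

# Points 1-3 (similar destinations, external signals, experiments) aim to
# lower the probability of failure.
better_screening = expected_loss(p_fail=0.3, monthly_loss=1.0, months_until_cancel=24)

# Point 4 (being willing to cancel) aims to lower the cost of each failure.
faster_cancellation = expected_loss(p_fail=0.5, monthly_loss=1.0, months_until_cancel=12)

# Doing both compounds, because the two levers multiply.
both = expected_loss(p_fail=0.3, monthly_loss=1.0, months_until_cancel=12)

for label, value in [
    ("baseline", baseline),
    ("better screening", better_screening),
    ("faster cancellation", faster_cancellation),
    ("both", both),
]:
    print(f"{label}: {value:.1f}")
```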

In closing, I ask the Director the same question United tackles: Where do ideas come from in your data science routine?


Director's cut 

Before answering the academic's question, I'd like to clarify what I consider a data science routine to be. A data science routine begins with a business question and a desired strategic outcome. The business question is then broken down into a list of requirements, which include data/features, the modeling approach, and success metrics.

When converting the business question into a modeling plan, it's critical to understand the novelty of the question and the cost of making a suboptimal strategic decision. For novel strategies with high costs, attempts to predict the outcome won't be more precise than experimentation. An example of this would be a brand-new store format or a new product line. On the other hand, strategies that are lower in cost and more traditional, such as a new promotion depth, can often be optimized using historical data.

In retail merchandising, the scope is generally optimizing prices, promotions, shelf space, and product assortment. Data science teams usually need to get creative to fill data gaps when a new product arrives or a new store opens. In these instances, machine learning techniques such as clustering and matching come to the rescue to apply historical learnings and leverage external data. In addition, LLMs now allow us to more easily cluster and match text data from sources like product details and customer feedback, which is a major benefit.
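The matching step can be sketched without any model at all. The example below matches a new product to the closest existing product by comparing word counts in their descriptions. This is a crude stand-in for the LLM-based text matching mentioned above, not a replacement for it. The SKU names and product descriptions are invented for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def text_similarity(a, b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[token] * counts_b[token] for token in shared)
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(new_description, catalog):
    """Return the catalog key whose description is most similar to the new one."""
    return max(catalog, key=lambda sku: text_similarity(new_description, catalog[sku]))


# Hypothetical catalog of existing products with sales history.
catalog = {
    "sku_oat_milk": "unsweetened oat milk plant based 1 liter carton",
    "sku_almond_milk": "almond milk plant based unsweetened 1 liter carton",
    "sku_cola": "cola soft drink 2 liter bottle",
}
# Hypothetical new product with no sales history yet.
new_product = "barista oat milk plant based 1 liter"

match = best_match(new_product, catalog)
print(f"Borrow historical learnings from: {match}")
```

Once a match is found, the matched product's history (price elasticity, promotion response, and so on) can serve as a starting point for the new one until its own data accumulates.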


Footnotes

[1] In my research on creative craft platforms, we study how data and algorithms play a role in the creative process.

[2] Fu, H. Y., Shrivastava, A., Moore, J., West, P., Tan, C., & Holtzman, A. (2025). AbsenceBench: Language Models Can't Tell What's Missing. arXiv preprint arXiv:2506.11440.


