How Upstart’s AI is mastering growth, credit performance, and profitability
- Upstart Co-Founder and CTO Paul Gu explains how the company is leveraging AI to redefine consumer lending by improving credit assessment, automation and servicing.
- Upstart aims to solve lending's biggest challenge, achieving growth, profitability and strong credit performance concurrently, with plans to 10x their AI advantage and cover all consumer credit needs by 2025.

There’s an old theory in lending that you can only master two of three things: growth, credit performance, and profitability. For decades, this has been accepted wisdom — until AI started changing the fundamentals of how we assess credit risk.
Today I’m joined by Paul Gu, co-founder and Chief Technology Officer of Upstart. Paul’s journey reads like a modern Silicon Valley story—from Chinese immigrant to Yale dropout, he became part of the inaugural class of Thiel Fellows before co-founding Upstart in 2012. Under his leadership, Upstart has gone from zero model training data points in 2013 to processing 91 million data points today. Their AI predicts both default and prepayment likelihood for every month of a loan’s term.
Paul believes Upstart’s AI is bringing them closer to achieving all three pillars of lending—an approach that could redefine consumer lending across the entire credit lifecycle. We’ll explore how this evolution is playing out, dive into Upstart’s 2025 roadmap including their push for 10x AI leadership and GAAP profitability, and discuss what this means for the future of credit.
Whether you’re a lender, fintech entrepreneur, or just curious about how AI is impacting finance, Paul’s insights on building advanced technology in a regulated industry are going to be valuable.
From Income Share Agreements to AI-powered lending
Way back in 2010, 2011—long time now—I dropped out of Yale to do the Thiel Fellowship. That was Peter Thiel’s 20 Under 20 program. It was the first year of this program, and the basic deal was, you get $100,000 if you agree to drop out of college. It wasn’t like YC or anything, no structured program. It was just a social experiment to see if you could get 19-year-olds to leave school, and then they could figure it out. And they did.
It took a little while. I started thinking about this problem of young people getting access to money. And it occurred to me that unless you came from money, it was pretty hard to get access to it when you don’t have a lot of history of either income or employment or credit. I thought it would be really cool if you could invest in people like they’re stocks. That was the very first version of Upstart. I met my co-founder, Dave, and he was thinking about a very similar concept, and we kind of joined forces on the basis of that thing.
Now there was this other part of the story. I had spent a summer at this quant fund in New York called D.E. Shaw, a well-known quantitative hedge fund that does a lot of machine learning applied to predicting small changes in securities prices and finding little ways to arbitrage that. Not quite high frequency trading, but something fairly medium frequency. It occurred to me that this used a lot of compute—both computer and human compute. It was using a lot of intelligence, but it just never felt to me like it was solving an important problem for regular people. I wanted to do something that was like high frequency trading, but that would actually help people.
When we started to think about this problem of young people not having access to money, very quickly it was like, okay, could you take some of the same things that are being done at all of these quantitative firms that are hoovering up so much talent, so much compute, but apply it to something that would help ordinary people? We ended up having this model-focused approach to pricing the risk of consumers.
Then it just turned out we were wrong about one small detail, and that was this idea that people wanted this income sharing thing where you would have the upside and the downside and you wouldn’t know how much you have to pay back. Very quickly, we figured out that was not the right answer. Instead, when people ask, “How much do I have to pay you back?” you should just answer the question. And so that’s how we came to the Upstart of today.
The evolution of Upstart’s AI models
There are really a few major ways that the model makes progress at Upstart. You can think of it as there’s rows of data, and there’s columns of data, and then there’s the model architecture. Those are the three major categories.
If you start from the rows of data, when we started out, we had zero rows of data. We had to rely entirely on third party data. And third party data has all sorts of limitations. It doesn’t have the right columns of data attached to it. It doesn’t represent the kinds of borrowers that will come to upstart.com very well.
Then columns of data is something that we’ve just added over time. Our initial thought was, we were focused on younger consumers, so we said, let’s add in data about someone’s education—alternative data to what’s traditionally on a credit report. But over the years, we’ve added more and more data sources. We’ve said, let’s add data about people’s employment and let’s add data about people’s interactions with their bank accounts. Let’s add what’s now called cash flow data. Let’s add data about people’s various kinds of public records that are out there. Let’s add data about how people are interacting with us digitally. So we’ve added more and more column data over the years.
And then the third pillar is model architecture. When we started out, we started with essentially what we call a textbook approach. There literally is a “how to do consumer loans” textbook out there (many of them, in fact) and they all roughly guide you towards the same thing, which is you build a roughly five-to-10-variable logistic regression, really just a linear model that assumes things go up and down in proportion. That model is the one that we started out with back when we had zero rows of data and just relatively few columns of data, because that’s all we had the data for.
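To make the "textbook approach" concrete, here is a minimal sketch of a small logistic regression for default risk. It is an illustration only, not Upstart's model: the two features, the toy data, and every function name are assumptions invented for the example.

```python
import math


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_default_probability(weights, bias, features):
    """Logistic regression: a weighted sum of features passed through a sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights with plain batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_default_probability(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Made-up toy rows: [debt_to_income, scaled_recent_delinquencies] (hypothetical features).
rows = [[0.1, 0.0], [0.2, 0.0], [0.3, 0.5], [0.6, 0.5], [0.7, 1.0], [0.9, 1.0]]
labels = [0, 0, 0, 1, 1, 1]  # 1 = defaulted, 0 = repaid

weights, bias = fit_logistic_regression(rows, labels)
low_risk = predict_default_probability(weights, bias, [0.1, 0.0])
high_risk = predict_default_probability(weights, bias, [0.9, 1.0])
```

Because the score is a weighted sum, each feature pushes the log-odds of default up or down in fixed proportion, which is the "linear" assumption described above.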
But as the amount of data that we had increased, we were able to spend time to build more and more sophisticated model architecture. So you go from maybe just having this one linear model to having multiple kinds of models that you’re ensembling together. Then you’re changing the base models. So instead of a linear model, you get to a tree-based model, and then eventually you get to a neural-network type of model. Then you have multiple kinds of these things. They get ensembled together. You have multiple layers of the model. So you end up with a very rich, proprietary model architecture. And it can get more sophisticated, almost in proportion to how much training data you’ve got. As you mentioned earlier, the rows of data went from zero to about 91 million training data points now. And that keeps going up. As it goes up, it unlocks the ability for us to develop more sophisticated models. But of course, it doesn’t just come for free. You have to actually also do the technical work.
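As a rough illustration of what "ensembling" different base models can mean, the sketch below averages the default probabilities from a linear model and a small tree-style model. The actual architecture is proprietary and not described in this conversation, so the weights, thresholds, and function names here are all hypothetical.

```python
import math


def linear_model(features):
    """Logistic model with hand-set, made-up coefficients."""
    z = -2.0 + 3.0 * features[0] + 1.0 * features[1]
    return 1.0 / (1.0 + math.exp(-z))


def tree_model(features):
    """Depth-two decision tree with made-up split points and leaf values."""
    if features[0] < 0.4:
        return 0.05 if features[1] < 0.5 else 0.15
    return 0.30 if features[1] < 0.5 else 0.60


def ensemble(models, weights, features):
    """Weighted average of the base models' predicted probabilities."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("ensemble weights must sum to a positive number")
    return sum(w * m(features) for m, w in zip(models, weights)) / total


# Hypothetical applicant: [debt_to_income, scaled_recent_delinquencies]
applicant = [0.2, 0.0]
blended = ensemble([linear_model, tree_model], [1.0, 1.0], applicant)
```

A weighted average is only the simplest way to combine models. Stacking, where another model learns how to combine the base predictions, is one plausible reading of "multiple layers of the model," but that is an assumption.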
Business Model Evolution: From ISAs to institutional capital
You’re right. We have evolved a lot. As far as business models go, we’ve probably done all of the above. In our early history, there was a period of time when we flirted with something like a peer-to-peer model. It was never quite only P2P, but there definitely was a marketplace element, a retail investor presence. And earlier, when we didn’t have the proof points in our model to show third-party institutional capital sources, we had to do a lot more on balance sheet. Some of it was directly on our balance sheet. Some of it was money you would raise almost like friends-and-family money.
Really, you had to scrap in every possible way to get over that zero to one hump where you’re doing things in a different way than everybody else, and you don’t have the proof points to show institutional capital that they should buy in and believe that. So it took a good number of years for us to really get going. There was a long time I always thought of us as kind of the tortoise in the tortoise and the hare race, in the great lending or fintech race that was going on. We were always kind of the slowest company. And I think eventually the business model and the differentiated credit approach proved itself out. But there were definitely some dark years in there, where we were wondering why everybody else was always so far ahead of us.
Differentiation in today’s market
I would say we’re in a pretty good spot right now. If we look at kind of a typical measure of accuracy, which would be something like, if you took the 87.5% of the population that is not in default, how far are you from perfectly predicting the probability of default for those people? We are 87.5% of the way towards perfection on that metric. So there’s still 12.5% to go, but we’ve made a lot of progress.
And I think what’s really interesting is that when you compare that to where the industry was when we started, the industry was probably somewhere around 70% on that metric. So we’ve made substantial progress, but there’s still a long way to go. And I think that’s what keeps us excited—there’s still so much room for improvement.
2025 Goals: 10x AI advantage and full credit lifecycle coverage
Two broad thematic goals for us. The first is 10x our advantage in AI. The second is solve all of a consumer’s credit needs across their credit lifecycle.
The first one, I think, given the conversation, is hopefully pretty self-explanatory. No rest at 87.5%—we have a long way to go. Models can keep getting much better. That means more rows, more columns, better model architectures, and we’re just going to do that over and over, and hopefully not even just not slow down, but go faster and faster over time and get some compounding benefits, get some tailwinds from the generative AI companies, and get to a 10x there.
Then there’s applying it to every part of the consumer credit lifecycle and all of their credit needs. We started out as a pure point solution for unsecured consumer loans, which are by far the smallest part of the market. They are a very useful product. In some sense, they’re the best product for what we do, because they have so much risk and uncertainty at the very bottom of the payment waterfall. But for most American consumers, this is not a product they use very regularly.
So then after that, we started building products in auto, which is the most ubiquitous credit product. I think 60-some percent of Americans have auto loans at any given time. In home loans, we’re just in HELOC now. But of course, probably the single most emotionally meaningful credit product that anybody ever uses is the purchase of a home. So certainly that’s something that long term we expect to be in. And then, if you just think about the kind of ongoing daily-use type of credit, I think that is something we aspire to solve for. So if there’s something that the consumer needs in credit, I think ultimately we want to be there.
Personal goals and vision
It’s all linear. If you want to get to those two thematic goals over the course of, say, the next three years, then this is 1/16th of that time. And that’s the goal.
The Surprise: Expanding beyond future prime
I would say the single largest surprise for me has been this: we were always in a position in the market where we were so exclusively focused on what we called the future prime consumer, meaning people who most others wouldn’t recognize as prime today. But that has changed because of some of the wins we’ve had in applying AI to areas like automation and now increasingly servicing, improvements we’ve had in applying AI to marketing and targeting, and also what we call the calibration part of the problem, which is the part of the problem that long-term capital sources really care about.
If you’re a credit investor, you really care about having maximum predictability to the cash flows of your investments. And if you can deliver that, then you can solve for lower cost of capital, because there’s less of a risk premium that they have to apply to you. And it turns out that’s really important for the more prime part of the market, which I think we are now much better able to solve for than before.
So suddenly, we find ourselves in a place where we think for close to 50% of Americans, we probably have just the very best rate, and that is in a market where there are uncountably many players. That spans both people who are quote-unquote future prime and many conventionally prime people, because we’ve been able to deliver AI-fueled wins on model calibration, which then have turned into lower cost of capital via happier investors.
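For readers unfamiliar with calibration, one common way to check it is to group loans by predicted default probability and compare each group's average prediction with the default rate actually observed. The sketch below does that on made-up numbers. It is a generic illustration, not a description of how Upstart measures calibration, and the function name and data are assumptions.

```python
def calibration_table(predicted, defaulted, n_buckets=5):
    """Bucket loans by predicted default probability and compare with outcomes."""
    buckets = [[] for _ in range(n_buckets)]
    for p, y in zip(predicted, defaulted):
        # Equal-width buckets over [0, 1]; a prediction of exactly 1.0 goes in the last one.
        index = min(int(p * n_buckets), n_buckets - 1)
        buckets[index].append((p, y))

    table = []
    for index, members in enumerate(buckets):
        if not members:
            continue  # skip buckets with no loans
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        table.append({
            "bucket": index,
            "count": len(members),
            "mean_predicted": mean_predicted,
            "observed_rate": observed_rate,
            "gap": observed_rate - mean_predicted,
        })
    return table


# Made-up predictions and outcomes (1 = defaulted, 0 = repaid).
predicted = [0.05, 0.10, 0.15, 0.30, 0.35, 0.70, 0.75, 0.90]
defaulted = [0, 0, 0, 0, 1, 1, 0, 1]

table = calibration_table(predicted, defaulted)
for row in table:
    print(row)
```

A well-calibrated model keeps the gap close to zero in every bucket, which is what makes realized cash flows predictable for the credit investor described above.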