Sunday, December 23, 2007

Lendingclub - Bad Math

Lendingclub now provides an estimated "Lender ROI" calculation on their web site. Unfortunately, their mathematics is incompetent, and the resulting numbers are meaningless.

You might say "Gee, why is he so hard on them?" The answer is simple. They want my money. This forces me to set a standard.

I guess us P2P lenders are spoiled because Prosper got much of this right. It looks like Lendingclub tried to copy the presentation on the Prosper performance web page without understanding the calculation.

Lendingstats performance statistics can be found at

Here's a picture of their ROI chart...

[Image: Lendingclub ROI table]

The page containing this table comes with a popup help page that explains the calculation. Let's try to use it to understand what they have done. Sorry if this is a long slog. There's no other way to show you how utterly bogus this is.

Estimated Lenders ROI table
Average rate: The average interest rate for all loans that have closed within the loan grade (A-G)
Losses: The sum of the principal amount of the loans that are expected to default, expressed as a percentage of lenders estimated ROI. A loan is "expected to default" for purposes of this estimated ROI calculation (i) at a 50% clip with respect to loans that are at least 15 days late and (ii) at a 90% clip for loans that are at least 120 days late
Processing Fee: Fee paid by the lenders expressed as a percentage of lenders estimated ROI
Estimated ROI: The net estimated return on investment after subtracting estimated losses and fees

Average rate makes sense.

But when we come to "losses" ... Whoa. "sum of principal amount of the loans" ... That I understand. When they say "that are expected to default" they mean late loans multiplied by a factor, 50% or 90%. That I understand. They go on "expressed as a percentage of lenders estimated ROI." I stared at that for quite some time trying to figure out what the heck they could be doing. You see, we're building up to the ROI calculation. These losses are one component of it. We don't know the ROI yet, so we can't possibly divide by it. This is just wrong.

I reverse engineered the numbers in the table. What they actually have done is to add up all the estimated losses (dollars) and then divide by the total principal amount of the loans in that credit grade. So the guy who wrote the explanation didn't understand the thing he was trying to explain. Ok. The losses numbers they actually calculated are closer to being right, but are not right, for two reasons.

First, they are not properly normalized. Second, they include only the loss in principal. They forgot the interest which is lost when a loan defaults. I'll spend a few sentences explaining each of these errors a bit later, so you can understand how big these problems are. For now let's continue reading the help popup.

The processing fee explanation in the help popup is also wrong. The words "expressed as a percentage of lenders estimated ROI" simply don't belong there. After computing these various percentages, they subtract them, which means they had better all be percentages of the same thing: principal.

It isn't easy to describe a calculation precisely in English. Equations are better. So it seemed good that Lendingclub's popup went on to provide equations. However, these equations are all wrong. (And by all, I mean each one of them is wrong.)

Estimated ROI = Average Interest Rate - (Loss due to Late Loans - Loss due to Default Loans) - Lending Club Processing Fee

Loss due to Late Loans = Sum (50% * (unpaid percentage) * (interest rate of the late loan))
Loss due to Default Loans = Sum (90% * (unpaid percentage) * (interest rate of the default loan))
Lending Club Processing Fee = 1% of all payments received by lenders.

Ok. Let's start at the top. The equation for Estimated ROI says we should subtract some numbers. I expected that. But look at that minus sign inside the parentheses. This equation would have us subtract one kind of loss but then add (instead of subtract) another kind of loss. That's wrong. Either the minus sign inside the parens should be a plus, or the parens should be dropped.
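To see how much that stray minus sign matters, here's a small Python sketch. The function names are mine and the input numbers are invented for illustration; nothing here comes from Lendingclub's actual data.

```python
def roi_as_published(avg_rate, late_loss, default_loss, fee):
    # The help popup's equation, minus sign inside the parens and all.
    return avg_rate - (late_loss - default_loss) - fee


def roi_sign_fixed(avg_rate, late_loss, default_loss, fee):
    # Same equation with the inner minus changed to a plus,
    # so both kinds of loss are subtracted.
    return avg_rate - (late_loss + default_loss) - fee


# Made-up inputs, all as fractions of principal.
avg_rate, late_loss, default_loss, fee = 0.12, 0.01, 0.02, 0.01

published = roi_as_published(avg_rate, late_loss, default_loss, fee)
fixed = roi_sign_fixed(avg_rate, late_loss, default_loss, fee)

# The published form overstates ROI by twice the default loss.
overstatement = published - fixed
print(f"published: {published:.2%}, fixed: {fixed:.2%}, gap: {overstatement:.2%}")
```

Whatever numbers you plug in, the gap between the two versions is always twice the default loss term.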

Next let's look at the equations for losses. They're similar, so let's just look at the first one. It has us sum some things. We'll assume we're summing over the late loans. Let's look at what's inside the sum.

50% * unpaid percentage * interest rate of the late loan

There are many things wrong with this. First, they don't say percentage of what. Second, the numbers shown in the table are clearly the fraction of principal that is late, which has nothing to do with "interest rate of the late loan". I think what they meant to say was

sum(50%* late principal) / total principal
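Here's a rough Python sketch of that calculation as I reverse engineered it, extended to cover both the 50% and 90% factors from the help text. The loan list, the status labels, and the dollar amounts are hypothetical; I'm only guessing at how their records are laid out.

```python
# Expected-default factors from the help popup:
# 50% for loans at least 15 days late, 90% for loans at least 120 days late.
# The status labels themselves are made up for this sketch.
DEFAULT_FACTORS = {"current": 0.0, "late_15": 0.5, "late_120": 0.9}


def estimated_loss_fraction(loans):
    """loans is a list of (principal, status) pairs.

    Returns estimated losses as a fraction of total principal.
    """
    total_principal = sum(principal for principal, _ in loans)
    expected_loss = sum(
        principal * DEFAULT_FACTORS[status] for principal, status in loans
    )
    return expected_loss / total_principal


# Invented sample portfolio for one credit grade.
sample_loans = [
    (1000.0, "current"),
    (2000.0, "current"),
    (500.0, "late_15"),
    (500.0, "late_120"),
]

loss = estimated_loss_fraction(sample_loans)
print(f"estimated losses: {loss:.2%} of principal")
```

Note that the result is a fraction of principal, not a fraction of ROI, and that interest rates never enter into it.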

Finally, there's an equation for processing fee. Well, it's not quite an equation. It is the definition of the Lendingclub processing fee. Unfortunately, it's in units of dollars. 1% of some dollars is going to be some other amount of dollars. The number in the table on the other hand is a percentage rather than dollars, so this equation can't represent what is in the table. Some calculation was done to get from dollars to percent. Not specified. Another problem is that they haven't told us how the dollar amount was calculated. Did they calculate the payments due during a year for each loan, multiply by 1% and then sum? The numbers in the table look too large for that. What they actually calculated is unknown. A proper equation would leave us with none of those questions.

Enough with this unhelpful help file. Let's go back to the two big errors in the actual calculation. These are: 1. The loss rate is not properly normalized. 2. They counted only principal and forgot lost interest.

To understand the normalization problem, we have to look at the units of the numbers in the table. The first line contains the average interest rate of the portfolio. That interest rate is a rate per year. The second line contains the loss rate, which they have forgotten to convert into a per year rate. They're going to subtract these two numbers, but they aren't in the same units!

To understand why this is important, imagine that you borrow some money from me at 3% per month, and lend it to Joe at 5% per year. Now you subtract 3% from 5% and think you are getting a return of 2%. Not so. You subtracted two numbers in different units. To do the calculation correctly, you would convert the 3% per month interest rate into an annual rate and then after you have two numbers in the same units you would subtract them. You would discover that you're losing a lot of money.
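If you'd like to see that borrowing example worked out, here's a quick Python sketch. It converts the monthly rate to an annual one two ways, with and without compounding; which one applies depends on the loan terms, which my example didn't specify.

```python
monthly_borrow_rate = 0.03  # what you pay me, per month
annual_lend_rate = 0.05     # what Joe pays you, per year

# The wrong way: subtract numbers in different units.
naive_spread = annual_lend_rate - monthly_borrow_rate

# Convert the monthly rate to a per-year rate first.
annual_borrow_simple = monthly_borrow_rate * 12
annual_borrow_compound = (1 + monthly_borrow_rate) ** 12 - 1

net_simple = annual_lend_rate - annual_borrow_simple
net_compound = annual_lend_rate - annual_borrow_compound

print(f"naive spread:            {naive_spread:+.2%}")
print(f"net, simple interest:    {net_simple:+.2%}")
print(f"net, compounded monthly: {net_compound:+.2%}")
```

Either way the real answer is a large negative number, not the +2% you get by ignoring units.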

That "losses" rate in Lendingclub's table is not a per year rate. It shows loans that have gone bad (late or default) during their life so far. The Lendingclub portfolio is very young. "So far" has not been much time. To convert that number to an annual rate we're going to have to increase it a lot.

I downloaded the Lendingclub database, and calculated the average age of the entire Lendingclub portfolio. The loans in the portfolio are 84 days old on average. A lot less than a year! But that's not the entire story. Loans can't possibly go bad during their first 30 days, because no payment has yet become due. At the very least, we should subtract 30 days. But there's more. Lendingclub uses "15 days late" as the threshold where a loan becomes "bad" for the purposes of our loss calculation. This event cannot occur until a loan is 45 days old. Wow. So we should really subtract 45 days from the 84 day average loan age. You get 39 days, and that's just 0.11 of a year! To normalize the "losses" that Lendingclub shows in their table, we should divide them by 0.11, which is the same as multiplying them by approximately 9. Ouch.

Look at the column for loans of grade "B". The 0.51% losses they show, when properly normalized (annualized) becomes 4.64% per year losses. Quite a difference.
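Here's the same arithmetic as a short Python sketch, so you can plug in other grades. The 84 days, the 45 days, and the 0.51% are the numbers from above; the 365-day year and the variable names are my own choices. I show the result both with the exposure rounded to 0.11 of a year, as I did above, and unrounded.

```python
DAYS_PER_YEAR = 365

average_age_days = 84     # average loan age from the downloaded database
min_days_to_go_bad = 45   # 30 days to first payment + 15 days late

# Time during which the average loan could actually have gone bad.
exposure_days = average_age_days - min_days_to_go_bad
exposure_years = exposure_days / DAYS_PER_YEAR


def annualize(loss_fraction, years_of_exposure):
    # Simple linear scaling: assumes losses accrue at a constant rate.
    return loss_fraction / years_of_exposure


grade_b_losses = 0.0051  # the 0.51% shown for grade B

annualized_rounded = annualize(grade_b_losses, round(exposure_years, 2))
annualized_exact = annualize(grade_b_losses, exposure_years)

print(f"exposure: {exposure_days} days = {exposure_years:.3f} years")
print(f"annualized (0.11 years):  {annualized_rounded:.2%}")
print(f"annualized (unrounded):   {annualized_exact:.2%}")
```

The unrounded figure comes out slightly higher than the one I quoted, which only makes the point stronger.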

I should mention that this simple normalization method makes several assumptions about the behavior of loans. It is certainly not the only way the normalization can be done. I'd be open to Lendingclub using some other method, as long as they tell us what they've done and it makes sense. Just using 39 days' worth of losses to represent a year's losses is not acceptable.

The second error in this calculation is more subtle. When a loan defaults, everybody realizes that you lose principal. You also lose the interest that you expected to get. I can hear people saying "but you didn't get it yet so how could you lose it?" Let's look at a simple example. Suppose I lend you $1 at 5% interest, and to keep this simple you're gonna pay me back in 1 year. My return will be 5% if you pay me back, or -100% if you don't. The difference between those two numbers is 105%. Using the kind of ROI formula that Lendingclub used on their stats page, I would start with the 5% interest rate I expected to receive (that's the "average rate" on the first line of their table), then I would subtract something called "losses", and I want the answer to come out to be my ROI. If I want to come out with an ROI of -100%, I'm going to have to show 105% on the losses line because that's what I need to subtract from 5% to get -100%. That 105% represents loss of both principal and interest.
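The $1 example can be checked with a few lines of Python. This is just my toy one-year, single-payment loan, not anything from Lendingclub's data, and the function name is mine.

```python
def realized_return(principal, rate, repaid):
    # One-year loan repaid in a single payment, or not at all.
    amount_received = principal * (1 + rate) if repaid else 0.0
    return (amount_received - principal) / principal


rate = 0.05

roi_if_repaid = realized_return(1.0, rate, repaid=True)
roi_if_default = realized_return(1.0, rate, repaid=False)

# What the "losses" line must show so that
# average rate - losses = actual ROI on the defaulted loan.
required_losses = rate - roi_if_default

# What you get if "losses" counts principal only.
principal_only_roi = rate - 1.0

print(f"ROI if repaid:        {roi_if_repaid:+.0%}")
print(f"ROI if default:       {roi_if_default:+.0%}")
print(f"losses needed:        {required_losses:.0%}")
print(f"principal-only 'ROI': {principal_only_roi:+.0%}")
```

Counting principal only, the formula reports a 95% loss on a loan that actually lost 100%. The missing 5 points are the lost interest.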

If you look closely at Prosper's performance table, you will see two losses lines, labelled "net defaults" (which is where they put the estimated principal loss) and "adjustment (interest and fees)" where they put the estimated interest loss.

Lendingclub left out the 2nd line.

In the early days of Prosper, they got this one wrong too. They used to explain that all you needed to do was add the ROI you wanted to obtain to the default rate you expected on a certain grade of loan, and that sum was the interest rate you should bid. That calculation ignores the lost interest term. For a long time this wrong calculation was coded into the standing orders web page.
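Here's a sketch of how far off that add-them-up rule is. I'm assuming the simplest possible model: one-year loans, a fraction of them default and pay back nothing, the rest pay principal plus interest. That model and the sample numbers are my assumptions, not Prosper's formula.

```python
def portfolio_return(rate, default_fraction):
    # Survivors repay principal plus interest; defaulters repay nothing.
    return (1 - default_fraction) * (1 + rate) - 1


def naive_bid_rate(target_roi, default_fraction):
    # The old advice: desired ROI plus expected default rate.
    return target_roi + default_fraction


def corrected_bid_rate(target_roi, default_fraction):
    # Solve (1 - d) * (1 + r) - 1 = target for r.
    return (1 + target_roi) / (1 - default_fraction) - 1


target_roi, default_fraction = 0.08, 0.05  # made-up inputs

naive = naive_bid_rate(target_roi, default_fraction)
corrected = corrected_bid_rate(target_roi, default_fraction)

print(f"naive bid:     {naive:.2%} -> ROI {portfolio_return(naive, default_fraction):.2%}")
print(f"corrected bid: {corrected:.2%} -> ROI {portfolio_return(corrected, default_fraction):.2%}")
```

Under this model the naive bid falls short of the target ROI, and the shortfall is exactly the interest lost on the defaulted loans.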

Having trashed their attempt at performance calculation, I don't want to end without saying a few things about some positives at Lendingclub. The principals seem like nice chaps. They seem very open to communication with the lender community. That's a great big positive. Like Prosper, they've also decided to make their loan database available. This is also a great positive. (It's not as extensive as Prosper's. They only have loans, no bids or info about lenders. It turns out that in Prosper's data, the information about lenders' behavior and performance is the most useful part.)

I imagine the fine folks at Lendingclub will straighten all this out. Once they do that, and their portfolio ages enough to be measurable, I'll be very interested to see how the returns numbers look. Before I invest in loans, I need some way to know (or believe, or have a hunch...) that the interest rates on the loans are reasonable. Investing thru Prosper, many of us learned that the historical default rates Prosper used to give us from Experian were inapplicable. Prosper's default rates turned out to be much higher. I would assume it's going to be the same way on Lendingclub. Lendingclub started by giving investors a set of historical default rates from Transunion. It seems likely to me that these will end up being inapplicable to the Lendingclub situation. We need good data from Lendingclub's own operation before we can decide whether these investments make sense.

The folks running Lendingclub set the interest rates for us. To do that properly they need to understand the mathematics of loans. I need to have confidence in their ability to do that math correctly. I'm not there.
