The State of Fair-Lending Testing—And How to Improve It

In the final installment of our series on fair lending, we’ll look at fair-lending testing. While we believe that our approach to fairness testing is best-in-class for the industry, standard measures of fairness have yet to be instituted and enforced by regulators. We support efforts to create more clarity and consistency in this area. Therefore, in addition to providing an overview of Upstart’s testing process, we’ll recommend five ways to improve testing for the industry as a whole. 

Upstart’s Approach to Fair-Lending Testing

Upstart’s approach to fair-lending testing was developed, in part, during a time when we operated under special oversight granted by the Consumer Financial Protection Bureau (CFPB) under a No Action Letter (NAL)—the first ever issued by the CFPB. During the NAL in 2019, the CFPB noted that our tests showed the Upstart risk model resulted in more approvals and lower APRs compared to a traditional model for all groups. We continued to enhance our fairness testing approach while under the voluntary monitorship of Relman Colfax, PLLC between 2020 and 2024.

Today, we leverage a robust and comprehensive testing process, along with advanced less discriminatory alternative model search procedures, to ensure our AI risk models are fair, accurate, and inclusive. We share the results of these tests with each of our more than 100 lending partners to provide continued transparency into the integrity of Upstart’s models from a fair lending standpoint. (For more about our approach to fairness testing, go here.)

But we, like the rest of the industry, still wrestle with important and as yet unanswered questions in relation to fair-lending testing. To remove the uncertainty left by these gaps in policy, we would recommend the following potential solutions as ways to establish a consistent standard of fairness for all lenders:

5 Ways to Improve Fair-Lending Testing

  1. Create a Benchmark Model

To assess AI’s ability to deliver greater access to credit than traditional, credit score-based approaches to lending, Upstart created a hypothetical traditional model for comparison. Of course, while we consult experts and industry participants to help develop the hypothetical traditional model, we don’t have all of the model specifications at our disposal. Therefore, it would be beneficial for regulators to identify the inputs needed to create such a hypothetical model, against which companies could objectively measure the fairness of their lending programs.

  2. Adopt a Single Proxy Methodology for Assessing Demographics

While there is a widely accepted proxy for race, Bayesian Improved Surname Geocoding (BISG), it is understood to be imperfect, as are other proxies such as the Zest Race Predictor. Because fairness tests such as the adverse impact ratio (AIR) depend on these estimated demographics, an imperfect proxy can skew their results. To solve this problem, we would encourage further research and innovation to find a better approach that could be adopted across the industry, so that AIR results and other tests offer a more apples-to-apples comparison.

  3. Establish Reporting Requirements for Personal Loans

The Home Mortgage Disclosure Act requires institutions to collect demographic data on their mortgage products, which are then made publicly available on the CFPB website. This transparency allows mortgage lenders to compare the fairness of their loan portfolios to that of others in the industry while incentivizing lenders to keep pace with their peers. Currently, there is no such reporting required for personal loans, and, in fact, there is potential risk in a creditor asking for such information. Instead, proxies are used to perform fairness testing. We recommend the creation of a similar framework for non-mortgage loans, to facilitate peer-to-peer fairness comparisons in this space and remove uncertainty in the use of proxies. 

  4. Set a Standard Disparity Threshold for Further Testing

To evaluate fairness in employment, regulators instituted the Four-Fifths Rule, under which a selection rate for one group that is less than 80 percent of the rate for the most-favored group is generally treated as evidence of adverse impact. Lending would benefit from a similar quantitative measure. For example, both Adverse Impact Ratios (AIRs) and Standardized Mean Differences (SMDs), two common ways of testing for disparities, produce results that are left open to interpretation by each institution. However, if the CFPB set an approval AIR threshold of, say, 0.8, that would prompt lenders to retest models whose results do not meet that threshold. We recognize that the CFPB is unlikely to set such a threshold, but quantitative standards or even suggestions would help to bring much-needed clarity to fairness testing.
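
To make the two measures concrete, here is a minimal sketch in Python of how an approval AIR and an SMD might be computed and compared against a threshold. The function names, the sample counts, and the choice to divide the SMD by the standard deviation of the combined population are illustrative assumptions, not Upstart's implementation or a regulatory definition. The 0.8 default simply mirrors the hypothetical threshold mentioned above.

```python
from statistics import mean, pstdev


def adverse_impact_ratio(protected_approved, protected_total,
                         control_approved, control_total):
    """Approval rate of the protected group divided by that of the control group."""
    protected_rate = protected_approved / protected_total
    control_rate = control_approved / control_total
    return protected_rate / control_rate


def standardized_mean_difference(protected_values, control_values):
    """Difference in group means, scaled by the standard deviation of the
    combined population (an assumed convention; others exist)."""
    combined = list(protected_values) + list(control_values)
    return (mean(protected_values) - mean(control_values)) / pstdev(combined)


def needs_further_testing(air, threshold=0.8):
    """Flag a model for further testing when its AIR falls below the threshold.
    The 0.8 default is hypothetical, not an established lending standard."""
    return air < threshold


# Hypothetical counts: 30 of 100 protected-group applicants approved,
# 50 of 100 control-group applicants approved.
air = adverse_impact_ratio(30, 100, 50, 100)
flagged = needs_further_testing(air)

# Hypothetical APRs (in percent) for each group.
smd = standardized_mean_difference([10.0, 12.0], [8.0, 10.0])
```

Without a published threshold, each institution must decide for itself what value of `air` or `smd` is large enough to warrant a closer look, which is the interpretive gap a standard would close.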

  5. Set a Standard for Assessing ‘Legitimate Business Need’

Similar to a standard disparity threshold, we would recommend a quantitative method for comparing “legitimate business need” against disparities in a model. In the assessment of fairness laid out by the Equal Credit Opportunity Act, fairness is decided based on three questions: (1) Is there a disparity? (2) If there is a disparity, is there a legitimate, non-discriminatory justification for it? (3) Is there a less discriminatory alternative that could be used to meet the business need? The industry tends to focus only on question one, avoiding disparities, but specifically addressing question two would go a long way toward helping lenders navigate the more complex and nuanced part of this fairness test. More specifically, regulators could establish a test that defines how much model accuracy a lender may acceptably give up in exchange for a more inclusive lending program.
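
As a rough illustration of what such a test could look like, the Python sketch below picks, from a set of candidate models, the one with the best approval AIR whose accuracy stays within a stated tolerance of the baseline. Everything here is hypothetical: the `Candidate` fields, the sample figures, and the `max_accuracy_drop` parameter stand in for whatever accuracy metric and tolerance a regulator might actually specify, and this is not a description of Upstart's own alternative-model search.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Candidate:
    name: str
    accuracy: float      # any agreed accuracy metric; higher is better
    approval_air: float  # approval adverse impact ratio; closer to 1.0 is less disparate


def pick_less_discriminatory_alternative(
    baseline: Candidate,
    candidates: Sequence[Candidate],
    max_accuracy_drop: float,
) -> Optional[Candidate]:
    """Return the candidate with the highest AIR that improves on the baseline's
    AIR while losing no more than max_accuracy_drop in accuracy, or None."""
    eligible = [
        c for c in candidates
        if baseline.accuracy - c.accuracy <= max_accuracy_drop
        and c.approval_air > baseline.approval_air
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.approval_air)


# Hypothetical figures for illustration only.
baseline = Candidate("baseline", accuracy=0.80, approval_air=0.75)
candidates = [
    Candidate("alt_a", accuracy=0.79, approval_air=0.82),
    Candidate("alt_b", accuracy=0.70, approval_air=0.95),   # too large an accuracy drop
    Candidate("alt_c", accuracy=0.795, approval_air=0.78),
]

chosen = pick_less_discriminatory_alternative(baseline, candidates, max_accuracy_drop=0.02)
none_chosen = pick_less_discriminatory_alternative(baseline, candidates, max_accuracy_drop=0.0)
```

The point of the sketch is that the outcome depends entirely on the tolerance: with a 0.02 allowance an alternative qualifies, and with no allowance none does. A shared standard for that number would let lenders answer questions two and three consistently.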

As AI grows in capability and is more widely adopted across the credit industry, today’s fair-lending testing standards will need to keep pace with this fast-moving technology. To date, Upstart has worked closely with policymakers and regulators, as well as our lending partners, to ensure our AI is more accurate, efficient, and inclusive than traditional credit models. And we are committed to continually redefining and implementing those new standards going forward—to provide access to credit for all.