Deep Dive: What’s in the black box? The challenges of explainable AI in digital finance
- In the wake of reports of discrimination and negative impacts on consumer wellbeing, regulatory scrutiny is increasing on FIs that use AI models.
- AI is central to digital finance – but do we understand how it truly works?
Artificial intelligence has taken over headlines again, with the AI Bill of Rights coming only a few months after the CFPB fined Hello Digit over the firm’s algorithm running amok and damaging consumers’ financial wellbeing.
Despite Hello Digit’s guarantee of no overdraft fees, its algorithm routinely incurred such charges on consumers’ accounts, which the company later refused to reimburse.
It wasn’t a standalone case, either. Last year, the CFPB found multiple other legal violations, prompting increased scrutiny of the use of AI in digital finance – scrutiny whose many phases are still unfolding.
In its Supervisory Highlights report last year, the CFPB flagged problems ranging from incomplete system information that led to misdirected electronic funds transfers (EFTs) to outright fair lending violations.
According to the report, “Mortgage lenders violated ECOA and Regulation B by discriminating against African American and female borrowers in the granting of pricing exceptions based upon competitive offers from other institutions.” Regulation B, which implements the ECOA, states that “a creditor shall not take a prohibited basis [including religion] into account in any system of evaluating creditworthiness of applicants.”
The same report also highlighted instances of religious discrimination wherein examiners found that lenders "improperly inquired" about a small business applicant's religion and considered it in the credit decision.
Even more common than news of AI going wrong are reports of products that have been outfitted with cutting-edge AI capabilities – everything from chatbots to process automation tools fits the bill here.
This split is interesting – the collective attention is enthralled by both AI’s potential and its harms.
Although enthusiasm towards innovation is welcome, it is only one side of the coin, especially considering the complexity of AI. An algorithm’s design can have negative consequences – not just in the code, but in the real world, too.
In many ways, the conversation has evolved past the debate on whether AI can cause harm and towards discussing issues of responsibility and accountability in order to ensure that these harms do not occur. And, of course, what to do when they do occur.
For FIs, the question is often about responsibility. Who is to blame when something goes wrong? Is it the engineer that helped build the model, is it the managing executives, or is it the investors who put money into the project?
Naturally, answers to these questions are complex and require granular information about how the model has been behaving and why. For example, to have any reasonable understanding of why an algorithm unjustly gives lower credit lines to women, you would need to know exactly what data points made the model make biased decisions in each instance.
It is harder than it sounds. AI is an umbrella term for different methodologies like KNN (k-nearest neighbor), decision trees, and neural networks. The technology that is hardest for us to untangle is neural networks.
A brief word on neural networks
This methodology estimates very complex functions (like the creditworthiness of an applicant) by solving a series of simple aX + b equations.
This is what firms like FICO do: they spend their resources building a model whose estimates come as close to the very complex creditworthiness function as possible.
Each of the circles, nodes, or “neurons” in the image above calculates aX + b and passes its answer to neurons in the next layer, which apply a similar aX + b transformation of their own. Repeated over many layers, this process makes the model very complex but also accurate.
However, in this process the data is inevitably turned into a “smoothie”, making it very difficult to discern exactly what calculations and data points (decisions) led to a specific prediction.
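The layered aX + b idea can be sketched in a few lines. The weights below are random stand-ins, nothing like a real scoring model, and real networks also pass each sum through a nonlinearity (here a ReLU), or the stacked layers would collapse into a single straight line:

```python
import numpy as np

def layer(x, a, b):
    # every neuron computes a weighted sum aX + b; the max(0, ...)
    # (a ReLU) keeps stacked layers from collapsing into one line
    return np.maximum(0, a @ x + b)

rng = np.random.default_rng(0)
x = np.array([0.7, 0.2, 0.5])            # toy applicant features

# two layers of made-up weights: 3 inputs -> 4 neurons -> 1 score
a1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
a2, b2 = rng.normal(size=(1, 4)), rng.normal(size=1)

score = layer(layer(x, a1, b1), a2, b2)
# by this point the inputs are blended – the "smoothie" – and it is
# hard to say which data point drove the final score
```

Even in this toy version, recovering which of the three inputs mattered most requires extra work; the intermediate sums carry no labels.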
The structure of neural networks makes it incredibly hard to peer inside the model, which turns it into a “black box problem” – a hard-to-forget term for the opacity of machine-learning algorithms like neural networks.
As opaque as they are, whenever there are instances of inaccurate or biased predictions, stakeholder companies are called into question about a model that makes no promises on interpretability. In bank-speak, this turns the black box nature of AI into a compliance issue.
Scott Zoldi, FICO's chief analytics officer, explains that explainability requirements have shaped the firm’s decisions over the years. "We need to explain the reasons for credit scores to help fraud analysts and also use them to establish customer and industry trust."
Since the bulk of FICO’s services is AI-dependent, it makes sense that a dedicated research team monitors and innovates on the model.
For banks, things aren’t so cut and dried.
This is where Fiddler AI comes in. The firm builds explainable AI solutions for organizations running machine learning algorithms, and recently, it has seen FIs tilt towards explainability.
One reason is regulation, explains the chief scientist at Fiddler AI, Krishnaram Kenthapadi – an experienced professional in this field, having led responsible AI efforts at AWS, LinkedIn and Microsoft.
“Whether these are consumer protection-focused regulations, like the Equal Credit Opportunity Act and the Civil Rights Act, or more risk-focused regulations like OCC audits or SR 11-7, banks are interested in understanding how their models are behaving and ensuring governance and risk management of models,” he said.
Fiddler AI’s CEO, Krishna Gade, has some experience untangling the messy predictions machine learning algorithms make. The now common “Why am I seeing this ad?” feature on Facebook was something Gade worked on during his time at Facebook.
While machine learning models can be opaque if not built for visibility from the get-go, there are mathematical techniques that can explain those predictions with reasonable confidence. Different firms have their own solutions to this problem, so let’s look at FICO and Fiddler AI to see how they’ve solved for visibility.
Looking under the hood
Fiddler AI’s core technique revolves around Shapley values – a game theory technique developed by Lloyd Shapley that can be adapted to see how particular variables or data types influence a model’s decision-making.
This is done by retrieving the model as is (a black box) and querying it multiple times with changed inputs. These changes are aggregated, and with some added mathematical voodoo, you can tell how much any model relies on a specific type of data.
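For a handful of features, this querying idea can be written out exactly, without the sampling tricks production tools need: average each feature’s marginal contribution over every possible coalition, swapping excluded features for baseline values. The model below is a hypothetical stand-in whose internals we pretend not to see – for a linear model like this one, each feature’s Shapley value works out to its coefficient times its distance from the baseline:

```python
import itertools, math

def shapley_values(model, x, baseline):
    # average each feature's marginal contribution over all coalitions,
    # weighted by how often that coalition appears in a random ordering
    n = len(x)
    phi = [0.0] * n

    def predict(subset):
        # features outside the coalition are swapped for baseline values
        return model([x[i] if i in subset else baseline[i] for i in range(n)])

    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            for s in itertools.combinations(others, r):
                w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
                phi[i] += w * (predict(set(s) | {i}) - predict(set(s)))
    return phi

# hypothetical black-box scorer we only ever query, never open
model = lambda f: 3 * f[0] + 1 * f[1] - 2 * f[2]
phi = shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
# phi recovers the coefficients: [3.0, 1.0, -2.0]
```

The exact computation is exponential in the number of features, which is why real tools approximate it by sampling coalitions instead of enumerating them.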
It also gives insight into how PII (personally identifiable information) and protected attributes like gender, race, and age affect overall model performance. These insights can help organizations fulfill compliance and regulatory requirements, such as ethical AI or responsible AI audits.
Since these approaches are even newer than AI itself, Fiddler AI often finds itself engaged at the edge of research. “We published about the Shapley framework a few years back and also on intersectional fairness,” Kenthapadi elaborated.
While Fiddler AI successively takes out variables from the input to see how each changes the predictions, FICO has its own way of solving for visibility within its neural networks.
Rather than dealing with explainability in post, the firm builds its neural networks to be interpretable from the get-go. “Our model exposes the explanation of outcomes/scores, and highlights any bias in any latent (hidden) features,” said Zoldi.
This allows the firm to set monitoring thresholds under which transactions are scored: “From this, we can derive early warnings from a monitoring perspective – letting us know when the model is ill-suited for use.” added Zoldi.
Bias analysis is central to FICO too. The same focus on latent (hidden) features allows it to track any shifts in the model that may indicate increased bias. According to Zoldi, thanks to this approach, FICO doesn’t have to wait for an incorrect credit decision to be made to arrive on-site and fix the issue.
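FICO’s latent-feature monitoring is proprietary, but the simplest form of bias tracking – comparing outcome rates across groups – can be sketched generically. The decisions and group labels below are made up for illustration:

```python
import numpy as np

def approval_rate_gap(decisions, groups):
    # difference between the highest and lowest approval rate
    # across the groups a protected attribute defines
    decisions, groups = np.asarray(decisions), np.asarray(groups)
    rates = {g: decisions[groups == g].mean() for g in np.unique(groups)}
    return max(rates.values()) - min(rates.values())

# made-up lending decisions (1 = approved) and group labels
gap = approval_rate_gap([1, 1, 0, 1, 0, 0, 1, 0],
                        ["a", "a", "a", "a", "b", "b", "b", "b"])
# gap of 0.5: group "a" approved 75% of the time, group "b" only 25%
```

A monitoring job would track a metric like this over time and alert when the gap widens, rather than waiting for a complaint.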
Given its work, history, and deep domain knowledge, it is not surprising that FICO can perform open-heart surgery on its models and build in interpretability from the start rather than retrofitting explainability at the end. For most small banks and FIs, a similar overhaul is nigh impossible, or at least out of the question for the foreseeable future.
Other organizations like Mission Lane, which provides fair-lending products, are also thinking in the same direction.
The firm’s head of data science, Jason Capehart explains, “Machine learning algorithms are a part of how we defend against fraud, reach out to people who might want to become a customer, make lending decisions, and more.”
The firm runs into performance issues as well – this is the burden that comes with powerful AI: it needs to be pruned and regularly watered with data.
This is because AI is unlike other software: its results are not stable over time. Model performance degrades as real-world conditions drift away from the data the model was trained on, a phenomenon aptly named “model drift”.
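One common way to watch for drift – not taken from Mission Lane’s stack, just a widely used metric – is the population stability index (PSI), which compares the distribution of model scores in production against the distribution seen at training time. A minimal sketch with synthetic scores:

```python
import numpy as np

def psi(expected, actual, bins=10):
    # Population Stability Index: how far the "actual" score
    # distribution has moved from the "expected" (training) one
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf    # catch out-of-range scores
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(1)
train_scores = rng.normal(0.5, 0.1, 10_000)  # scores at training time
prod_scores = rng.normal(0.6, 0.1, 10_000)   # production scores, shifted

# a common rule of thumb treats PSI above 0.2 as meaningful drift
```

A monitoring pipeline would compute PSI on a rolling window of production scores and trigger redevelopment when it crosses a threshold.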
Mission Lane’s process is dependent upon an iterative cycle of development, monitoring and redevelopment. One development requirement is acceptable service life, says Capehart. For example, that may mean limiting the scope to sufficiently stable attributes or retraining models on a regular cadence.
In the absence of clear, robust, and universal definitions of bias caused by AI, the firm’s strategy, like many others’, is to put together the best of what it has found. The firm is mindful of the USC Information Sciences Institute’s framework on 19 types of bias in AI systems. “In this vein, our primary concern is discrimination. Fair Lending regulations provide essential standards and help create shared language,” Capehart added.
Explainable to whom?
As explainability’s impact on AI’s efficiency and accuracy continues to sink in, the calls of public advocates and the attention of regulators seem, for once, to be converging with industry momentum.
The AI Bill of Rights was deeply anticipated, with hopes it would bring the industry into balance. Broadly, the Bill states that Americans should be protected against unsafe algorithms and AI systems and against discrimination by those systems, and that they retain rights over their data and the ability to opt out of AI-driven flows.
However, since its release, the reception has been mixed. While some commend the Bill for its work on issues like data minimization, others still call it “toothless” against Big Tech.
With partner banking, Big Tech is opening new doors for FIs, and FIs are returning the favor, heralding an age of new financial products with immense reach. If regulation remains ineffectual against the biggest digital finance products, how will it ensure explainability en masse?
Moreover, absent from our discussion on this topic is a focus on explainability for users and individuals. Most methods currently in use focus on making algorithms explainable to professionals already endowed with some domain knowledge about the model. However, harm is often experienced by people who are far from the algorithm – people who may even be unaware of it or how it works.
Absent again is an effort to make AI, AI-led decision-making, and its possible consequences simple and easy to understand for all. The first step to this inclusion is transparency. Considering AI can be opaque unless built otherwise, this can be a difficult nut to crack. At least now we know how to get the ball rolling.
Let’s choose the right direction.