Leveraging Data-Centric AI for Better Business Outcomes

Published: September 13, 2023 by Julie Lee

From science fiction-worthy image generators to automated underwriting, artificial intelligence (AI), big data sets and advances in computing power are transforming how we play and work. While the focus in the lending space has often been on improving the AI models that analyze data, the data that feeds into the models is just as important. Enter: data-centric AI.

What is data-centric AI?

Dr. Andrew Ng, a leader in the AI field, advocates for data-centric AI and is often credited with coining the term. According to Dr. Ng, data-centric AI is “the discipline of systematically engineering the data used to build an AI system.”1

To break down the definition, think of AI systems as a combination of code and data. The code is the model or algorithm that analyzes data to produce a result. The data is the information you use to train the model or later feed into the model to request a result.

Traditional approaches to AI focus on the code — the models. Multiple organizations download and use the same data sets to create and improve models. But today, continued focus on model development may offer a limited return in certain industries and use cases.

A data-centric AI approach focuses on developing tools and practices that improve the data.

You still pay attention to model development, but you no longer treat the data as constant. Instead, you try to improve a model’s performance by increasing data quality. This can be achieved in different ways, such as using more consistent labeling, removing noisy data and collecting additional data.2
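To make the labeling point concrete, here is a minimal Python sketch of one possible data-quality check: finding records that share the same features but carry different labels, then resolving them by majority vote. The feature names, the labels and the choice to drop tied groups are all illustrative assumptions, not a method described by Dr. Ng or Experian.

```python
from collections import Counter, defaultdict


def group_labels(records):
    """Collect every label seen for each distinct feature tuple."""
    labels_by_features = defaultdict(list)
    for features, label in records:
        labels_by_features[features].append(label)
    return labels_by_features


def find_label_conflicts(records):
    """Return feature tuples that were labeled in more than one way."""
    grouped = group_labels(records)
    return {f: labels for f, labels in grouped.items() if len(set(labels)) > 1}


def resolve_by_majority(records):
    """Keep one record per feature tuple, labeled by majority vote.

    Groups whose top two labels are tied are dropped as too noisy to trust.
    Collapsing duplicates is a simplification made for this sketch.
    """
    cleaned = []
    for features, labels in group_labels(records).items():
        ranked = Counter(labels).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            continue  # tie: no clear majority label
        cleaned.append((features, ranked[0][0]))
    return cleaned


# Hypothetical training records: (features, label)
records = [
    (("income_high", "util_low"), "good"),
    (("income_high", "util_low"), "good"),
    (("income_high", "util_low"), "bad"),
    (("income_low", "util_high"), "bad"),
    (("income_mid", "util_mid"), "good"),
    (("income_mid", "util_mid"), "bad"),
]

conflicts = find_label_conflicts(records)
cleaned = resolve_by_majority(records)
print(len(conflicts), "conflicting groups;", len(cleaned), "records kept")
```

In practice, conflicting groups are often sent back for human review rather than resolved automatically. The sketch only shows how such inconsistencies can be surfaced.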

Data-centric AI isn’t just about improving data quality when you build a model — it’s also part of the ongoing iterative process. The data-focused approach should continue during post-deployment model monitoring and maintenance.
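One common way to keep an eye on data after deployment is to compare the distribution of recent inputs against the data the model was built on. The sketch below uses the population stability index (PSI), a metric widely used in credit risk monitoring. The article does not name a specific metric, so treat PSI, the bin edges and the sample scores as assumptions chosen for illustration.

```python
import math


def bin_proportions(values, bin_edges, floor=1e-6):
    """Share of values in each bin; the last bin includes its upper edge.

    Values outside the bin edges are ignored. Empty bins are floored at a
    small positive number so the logarithm below is always defined.
    """
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= value < bin_edges[i + 1]
            if in_bin or (is_last and value == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [max(count / total, floor) for count in counts]


def population_stability_index(baseline, recent, bin_edges):
    """PSI = sum over bins of (recent - baseline) * ln(recent / baseline)."""
    base = bin_proportions(baseline, bin_edges)
    new = bin_proportions(recent, bin_edges)
    return sum((n - b) * math.log(n / b) for b, n in zip(base, new))


# Hypothetical score samples and bin edges
edges = [300, 580, 670, 740, 850]
baseline_scores = [520, 580, 610, 650, 690, 700, 720, 750, 780, 810]
recent_scores = [500, 510, 540, 560, 590, 600, 640, 700, 710, 760]

psi_same = population_stability_index(baseline_scores, baseline_scores, edges)
psi_shifted = population_stability_index(baseline_scores, recent_scores, edges)
print(f"PSI vs. itself: {psi_same:.3f}, PSI vs. recent data: {psi_shifted:.3f}")
```

A PSI of zero means the two distributions match across the bins, and larger values indicate more drift. The threshold that should trigger a data review or model recalibration is a policy choice each team sets for itself.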

Data-centric AI in lending

Organizations in multiple industries are exploring how a data-centric approach can help them improve model performance, fairness and business outcomes. For example, lenders that take a data-centric approach to underwriting may be able to expand their lending universe, drive growth and fulfill financial inclusion goals without taking on additional risk.

Conventional credit scoring models have been trained on consumer credit bureau data for decades. New versions of these models might offer increased performance because they incorporate changes in the economic landscape, consumer behavior and advances in analytics. And some new models are built with a more data-centric approach that considers additional data points from the existing data sets, such as trended data, to score consumers more accurately. However, they still rely solely on credit bureau data.

Explainability and transparency are essential components of responsible AI and machine learning (a type of AI) in underwriting. Organizations need to be able to explain how their models come to decisions and ensure they are behaving as expected.

Model developers and lenders that use AI to build credit risk models can incorporate new high-quality data to supplement existing data sets. Alternative credit data can include information from alternative financial services, public records, consumer-permissioned data, and buy now, pay later (BNPL) data that lenders can use in compliance with the Fair Credit Reporting Act (FCRA).*

The resulting AI-driven models may more accurately predict credit risk — decreasing lenders’ losses. The models can also use alternative credit data to score consumers that conventional models can’t score.
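As a rough illustration of how supplementing data can expand the scorable population, the Python sketch below merges per-consumer features from two sources and compares who can be scored before and after. The field names, the consumer IDs and the "at least two features" scorability rule are hypothetical. Real scoring criteria are more involved and are not described in this article.

```python
def merge_sources(bureau, alternative):
    """Combine per-consumer feature dicts from two sources, keyed by consumer ID."""
    merged = {}
    for consumer_id in sorted(bureau.keys() | alternative.keys()):
        merged[consumer_id] = {
            **bureau.get(consumer_id, {}),
            **alternative.get(consumer_id, {}),
        }
    return merged


def scorable_ids(features_by_consumer, min_features=2):
    """Hypothetical rule: a consumer is scorable with at least min_features fields."""
    return sorted(
        consumer_id
        for consumer_id, features in features_by_consumer.items()
        if len(features) >= min_features
    )


# Hypothetical inputs: c3 has no bureau data, c4 has no data in either source
bureau = {
    "c1": {"tradelines": 5, "utilization": 0.3},
    "c2": {"tradelines": 2, "utilization": 0.6},
    "c3": {},
}
alternative = {
    "c3": {"rental_payments_on_time": 12, "bnpl_accounts": 1},
    "c4": {},
}

before = scorable_ids(bureau)
after = scorable_ids(merge_sources(bureau, alternative))
newly_scorable = [cid for cid in after if cid not in before]
print("Newly scorable consumers:", newly_scorable)
```

Here the consumer with no bureau data becomes scorable once the alternative data is added, while the consumer with no data in either source remains unscorable.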

Infographic: From initial strategy to results — with stops at verification, decisioning and approval — see how customers travel across an Automated Loan Underwriting Journey.

Business benefit of using data-centric AI models

Financial services organizations can benefit from using a data-centric AI approach to create models across the customer lifecycle. That may be why about 70 percent of businesses frequently discuss using advanced analytics and AI within underwriting and collections.3

Many have gone a step further and implemented AI. Underwriting is one of the main applications for machine learning models today, and lenders are using machine learning to:4

  • More accurately assess credit risk.
  • Decrease model development, deployment and recalibration timelines.
  • Incorporate more alternative credit data into credit decisioning.

AI analytics solutions may also increase customer lifetime value by helping lenders manage credit lines, increase retention, cross-sell products and improve collection efforts. Additionally, data-centric AI can assist with fraud detection and prevention.

Case study: Learn how Atlas Credit, a small-dollar lender, used a machine learning model and loan automation to nearly double its loan approval rates while decreasing its credit risk losses.

How Experian helps clients leverage data-centric AI for better business outcomes

During a presentation in 2021, Dr. Ng used the 80-20 rule and cooking as an analogy to explain why the shift to data-centric AI makes sense.5 You might be able to make an okay meal with old or low-quality ingredients. However, if you source and prepare high-quality ingredients, you’re already 80% of the way toward making a great meal.

Your data is the primary ingredient for your model — do you want to use old and low-quality data?

Experian has provided organizations with high-quality consumer and business credit solutions for decades, and our industry-leading data sources, models and analytics allow you to build models and make confident decisions.

If you need a sous-chef, Experian offers services and has data professionals who can help you create AI-powered predictive analytics models using bureau data, alternative data and your in-house data.

Learn more about our AI analytics solutions and how you can get started today.

1 DataCentricAI (2023). Data-Centric AI.
2 Exchange.scale (2021). The Data-Centric AI Approach With Andrew Ng.
3 Experian (2021). Global Insights Report September/October 2021.
4 FinRegLab (2021). The Use of Machine Learning for Credit Underwriting: Market & Data Science Context.
5 YouTube (2021). A Chat with Andrew on MLOps: From Model-Centric to Data-Centric AI.
*Disclaimer: When we refer to “Alternative Credit Data,” this refers to the use of alternative data and its appropriate use in consumer credit lending decisions, as regulated by the Fair Credit Reporting Act. Hence, the term “Expanded FCRA Data” may also apply in this instance and both can be used interchangeably.
