
This is an exciting time to work in big data analytics. Here at Experian, we have more than 2 petabytes of data in the United States alone. In the past few years, because of higher data volumes, more computing power and the availability of open-source algorithms, my colleagues and I have watched excitedly as more and more companies get into machine learning. We’ve observed the growth of competition sites like Kaggle, open-source code-sharing sites like GitHub and various machine learning (ML) data repositories.

We’ve noticed that on Kaggle, two algorithms win over and over at supervised learning competitions: if the data is well-structured, teams that use Gradient Boosting Machines (GBM) seem to win, while for unstructured data, teams that use neural networks win most often. Modeling is both an art and a science, and those winning teams tend to be good at what machine learning people call feature generation and what we credit scoring people call attribute generation.

We have nearly 1,000 expert data scientists in more than 12 countries, many of whom are experts in traditional consumer risk models — techniques such as linear regression, logistic regression, survival analysis, CART (classification and regression trees) and CHAID analysis. So naturally I’ve thought about how GBM could apply in our world. Credit scoring is not quite like a machine learning contest: we have to be sure our decisions are fair and explainable and that any scoring algorithm will generalize to new customer populations and stay stable over time.

Increasingly, clients are sending us their data to see what we can do with newer machine learning techniques. We combine their data with our bureau data and even third-party data, we use our world-class attributes and develop custom attributes, and we see what comes out. It’s fun — like getting paid to enter a Kaggle competition! For one financial institution, GBM armed with our patented attributes found a nearly 5 percent lift in KS (Kolmogorov-Smirnov statistic) when compared with traditional statistics.

At Experian, we use the Extreme Gradient Boosting (XGBoost) implementation of GBM, which, out of the box, has regularization features we use to prevent overfitting. But it’s missing some features that we and our clients count on in risk scoring. Our Experian DataLabs team worked with our Decision Analytics team to figure out how to make it work in the real world. We found answers for two important issues:

Monotonicity — Risk managers count on the ability to impose what we call monotonicity: in application scoring, applications with better attribute values should score as lower risk than applications with worse values. For example, if consumer Adrienne has fewer delinquent accounts on her credit report than consumer Bill, all other things being equal, Adrienne’s machine learning score should indicate lower risk than Bill’s score.

Explainability — We were able to adapt a fairly standard “Adverse Action” methodology from logistic regression to work with GBM.

There has been enough enthusiasm around our results that we’ve turned the approach into a standard benchmarking service. We help clients appreciate the potential for these new machine learning algorithms by evaluating them on their own data. Over time, the acceptance and use of machine learning techniques will become commonplace among model developers as well as internal validation groups and regulators.
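For readers who want to see what a monotonicity constraint looks like in practice, here is a minimal sketch using the open-source XGBoost library’s monotone_constraints parameter. The data, feature names and parameter values are hypothetical illustrations, not Experian attributes or our production configuration.

```python
# Minimal sketch: enforcing monotonicity in an XGBoost risk model.
# The data and feature names below are hypothetical illustrations.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(42)
n = 5000

# Hypothetical attributes: more delinquencies / higher utilization => higher risk.
delinquent_accounts = rng.poisson(1.0, n)
utilization = rng.uniform(0, 1, n)
logit = -2.0 + 0.8 * delinquent_accounts + 1.5 * utilization
default = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([delinquent_accounts, utilization])
dtrain = xgb.DMatrix(X, label=default,
                     feature_names=["delinquent_accounts", "utilization"])

params = {
    "objective": "binary:logistic",
    "max_depth": 3,
    "eta": 0.1,
    # Regularization terms that help prevent overfitting.
    "lambda": 1.0,
    "alpha": 0.0,
    # Force predicted risk to be non-decreasing in both attributes, so a
    # consumer with fewer delinquencies never scores as higher risk,
    # all other things being equal.
    "monotone_constraints": "(1,1)",
}

model = xgb.train(params, dtrain, num_boost_round=200)
print(model.predict(dtrain)[:5])  # predicted probabilities of default
```

A constraint value of 1 forces the score to be non-decreasing in that feature, -1 forces it to be non-increasing, and 0 leaves the feature unconstrained.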
Whether you’re a data scientist looking for a cool place to work or a risk manager who wants help evaluating the latest techniques, check out our weekly data science video chats and podcasts.

How a business prices its products is a dynamic process that drives customer satisfaction and loyalty, as well as business success. In the digital age, pricing is becoming even more complex. For example, companies like Amazon may revise the price of a hot item several times per day.

Dynamic pricing models for consumer financial products can be especially difficult for at least four reasons:

A complex regulatory environment.
Fair lending concerns.
The potential for adverse selection by risky consumers and fraudsters.
The direct impact the affordability of a loan may have on both the consumer’s ability to pay it and the likelihood that it will be prepaid.

If a lender offered the same interest rate and terms to every customer for the same loan product, low-risk customers would secure better rates elsewhere, and high-risk customers would not. The end result? Only the higher-risk customers would select the product, which would increase losses and reduce profitability. For this reason, the lending industry has established risk-based pricing. This pricing method addresses the above issue, since customers with different risk profiles are offered different rates. But it’s limited.

More advanced lenders also understand the price elasticity of customer demand, because there are diverse reasons why customers decide to take up differently priced loans. Customers have different needs and risk profiles, so they react to a loan offer in different ways. Many factors determine a customer’s propensity to take up an offer — for example, the competitive environment and availability of other lenders, how time-critical the decision is, and the loan terms offered. Understanding the customer’s price elasticity allows a business to offer the ideal price to each customer to maximize profitability.

Pricing optimization is the superior method, assuming the lender has a scientific, data-driven approach to predicting how different customers will respond to different prices. Optimization allows an organization to determine the best offer for each customer to meet business objectives while adhering to financial and operational constraints such as volume, margin and credit risk. The business can assess trade-offs between competing objectives, such as maximizing revenue and maximizing volume, and determine the optimal decision for each individual customer to best meet both objectives.

In the table below, you can see five benefits lenders realize when they improve their pricing segmentation with an optimization strategy. Interested in learning more about pricing optimization? Click here to download our full white paper, Price optimization in retail consumer lending.
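To make the idea concrete, here is a minimal sketch of profit-based price optimization for a single customer, assuming a hypothetical logistic take-up (price elasticity) curve and hypothetical cost parameters. It is illustrative only, not Experian’s pricing methodology.

```python
# Minimal sketch: choosing the profit-maximizing rate for one customer,
# given an assumed take-up (price elasticity) model.
# All parameter values below are hypothetical illustrations.
import numpy as np

def take_up_probability(rate, sensitivity=80.0, reference_rate=0.12):
    """Assumed logistic take-up curve: higher rates lower the chance of acceptance."""
    return 1.0 / (1.0 + np.exp(sensitivity * (rate - reference_rate)))

def expected_profit(rate, loan_amount=10_000, funding_cost=0.05, expected_loss_rate=0.02):
    """Expected profit = take-up probability x margin, over a one-year horizon."""
    margin = (rate - funding_cost - expected_loss_rate) * loan_amount
    return take_up_probability(rate) * margin

# Search a grid of candidate rates, subject to a policy floor and ceiling.
candidate_rates = np.linspace(0.06, 0.25, 400)
profits = np.array([expected_profit(r) for r in candidate_rates])
best_rate = candidate_rates[np.argmax(profits)]

print(f"Optimal rate: {best_rate:.2%}, expected profit: {profits.max():.2f}")
```

In practice the take-up and loss curves are estimated from data for each customer segment, and the optimization runs across the whole portfolio subject to constraints on volume, margin and credit risk.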

Machine learning (ML), the newest buzzword, has swept into the lexicon and captured the interest of us all. Its recent, widespread popularity has stemmed mainly from the consumer perspective. Whether it’s virtual assistants, self-driving cars or romantic matchmaking, ML has rapidly positioned itself in the mainstream.

Though ML may appear to be a new technology, its use in commercial applications has been around for some time. In fact, many of the data scientists and statisticians at Experian are considered pioneers in the field of ML, going back decades. Our team has developed numerous products and processes leveraging ML, from our world-class consumer fraud and ID protection to credit data products like our Trended 3D™ attributes. In fact, we were just highlighted in the Wall Street Journal for how we’re using machine learning to improve our internal IT performance.

ML’s ability to consume vast amounts of data to uncover patterns and deliver results that are not otherwise humanly possible is what makes it unique and applicable to so many fields. This predictive power has now sparked interest in the credit risk industry. Unlike fraud detection, where ML is well-established and used extensively, credit risk modeling has until recently taken a cautious approach to adopting newer ML algorithms. Because of regulatory scrutiny and a perceived lack of transparency, ML hasn’t enjoyed the same broad acceptance as some of credit risk modeling’s more established techniques.

When it comes to credit risk models, delivering the most predictive score is not the only consideration for a model’s viability. Modelers must be able to explain and detail the model’s logic, or its “thought process,” for calculating the final score. This means taking steps to ensure the model’s compliance with the Equal Credit Opportunity Act, which forbids discriminatory lending practices. Federal laws also require adverse action responses to be sent by the lender if a consumer’s credit application has been declined, which means the model must be able to highlight the top reasons for a less than optimal score. And so, while ML may be able to deliver the best predictive accuracy, its ability to explain how the results are generated has always been a concern. ML has been stigmatized as a “black box,” where data mysteriously gets transformed into the final predictions without a clear explanation of how.

However, this is changing. Depending on the ML algorithm applied to credit risk modeling, we’ve found risk models can offer the same transparency as more traditional methods such as logistic regression. For example, gradient boosting machines (GBMs) are designed as a predictive model built from a sequence of decision tree submodels. The very nature of the GBM’s decision tree design allows statisticians to explain the logic behind the model’s predictive behavior. We believe model governance teams and regulators in the United States may become comfortable with this approach more quickly than with deep learning or neural network algorithms, since GBMs are represented as sets of decision trees that can be explained, while neural networks are represented as long arrays of cryptic numbers that are much harder to document, manage and understand.

In future blog posts, we’ll discuss the GBM algorithm in more detail and how we’re using its predictive power and transparency to maximize credit risk decisioning for our clients.
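To illustrate how a GBM score can be decomposed into per-attribute contributions that support adverse action reasons, here is a minimal sketch using XGBoost’s pred_contribs output. The data, feature names and reason-ranking rule are hypothetical, and this is one possible approach rather than a statement of the exact methodology used in production.

```python
# Minimal sketch: per-feature contribution "reason codes" from a trained GBM.
# Feature names and data are hypothetical; the reason-ranking logic is one
# illustrative approach, not a production adverse action methodology.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
n = 2000
feature_names = ["delinquent_accounts", "utilization", "inquiries_6mo"]
X = np.column_stack([
    rng.poisson(1.0, n),
    rng.uniform(0, 1, n),
    rng.poisson(2.0, n),
])
logit = -2.0 + 0.7 * X[:, 0] + 1.2 * X[:, 1] + 0.3 * X[:, 2]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

dtrain = xgb.DMatrix(X, label=y, feature_names=feature_names)
model = xgb.train({"objective": "binary:logistic", "max_depth": 3, "eta": 0.1},
                  dtrain, num_boost_round=100)

# pred_contribs=True returns one contribution per feature plus a bias column.
contribs = model.predict(dtrain, pred_contribs=True)

# Rank the features pushing the first applicant's score toward higher risk.
applicant = contribs[0, :-1]  # drop the bias term
order = np.argsort(applicant)[::-1]
top_reasons = [feature_names[i] for i in order if applicant[i] > 0][:2]
print("Top adverse action reasons:", top_reasons)
```

Because each contribution is tied to a named attribute, the highest positive contributors can be mapped to standard reason statements, much as point-weight differences are used with logistic regression scorecards.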

The August 2018 LinkedIn Workforce Report states some interesting facts about data science and the current workforce in the United States. Demand for data scientists is off the charts, but there is a data science skills shortage in almost every U.S. city — particularly in the New York, San Francisco and Los Angeles areas. Nationally, there is a shortage of more than 150,000 people with data science skills.

One way companies in financial services and other industries have coped with the skills gap in analytics is by using outside vendors. A 2017 Dun & Bradstreet and Forbes survey reported that 27 percent of respondents cited a skills gap as a major obstacle to their data and analytics efforts. Outsourcing data science work makes it easier to scale up and scale down as needs arise. And surprisingly, more than half of respondents said the third-party work was superior to their in-house analytics.

At Experian, we have participated in quite a few outsourced analytics projects. Here are a few of the lessons we’ve learned along the way:

Manage expectations: Everyone has their own management style, but to be successful, you must be proactively involved in managing the partnership with your provider. Doing so will keep them aligned with your objectives and prevent quality degradation or cost increases as you become more dependent on them.

Communication: Creating open and honest communication between executive management and your resource partner is key. You need to be able to discuss what is working well and what isn’t. This will help ensure your partner has a thorough understanding of your goals and objectives and can properly manage any bumps in the road.

Help external resources feel like a part of the team: When you’re working with external resources, either offshore or onshore, they are typically in another location. This can make them feel like they aren’t a part of the team and therefore not directly tied to the business goals of the project. To help bridge the gap, hold regular status meetings via video conference. Within these meetings, share the goals and objectives of the project so the team hears the message directly from you; this will make them feel more involved and give them a clear understanding of what they need to do to be successful. Being able to put faces to names, as well as having direct communication with you, will help external employees feel included.

Drive engagement through recognition programs: Research has shown that employees are more engaged in their work when they receive recognition for their efforts. While you may not be able to provide a monetary award, recognition is still a big driver of engagement. It can be as simple as recognizing a job well done during your video conference meetings, providing certificates of excellence or sending a simple thank-you card to those who are performing well. Taking the extra time to make your external workforce feel appreciated will produce engaged resources that help drive your business goals forward.

Industry training: Your external resources may have the skills needed to perform the job successfully, but they may not have specific industry knowledge geared toward your business. Work with your partner to determine where they have expertise and where you can work together to provide training, and ensure your external workforce has a solid understanding of the business line it will be supporting.
If you’ve decided to augment your staff for your next big project, Experian® can help. Our Analytics on Demand™ service provides senior-level analysts, either onshore or offshore, who can help with analytical data science and modeling work for your organization.

Federal legislation makes it legal in all 50 states to verify an individual’s identity by scanning identity documents during onboarding.

Originally posted on the Mitek blog.

The Making Online Banking Initiation Legal and Easy (MOBILE) Act officially became law on May 24, 2018, authorizing a national standard for banks to scan and retain information from driver’s licenses and identity cards as part of a customer online onboarding process, via smartphone or website. This bill, which was proposed in 2017 with bipartisan support, allows financial institutions to fully deploy mobile technology that can make digital account openings across all states seamless and cost-efficient. The MOBILE Act also stipulates that the digital image be destroyed after account opening to further ensure customer data security. As an additional security measure, section 213 of the act mandates an update to the system to confirm matches of names to Social Security numbers.

“The additional security this process could add for online account origination was a key selling point with the Equifax data breach fresh on everyone’s minds,” Scott Sargent, of counsel in the law firm Baker Donelson’s financial services practice, recently commented on AmericanBanker.com. Read the full article here.

Though digital banking and an online onboarding process have already been best practices for financial institutions in recent years, the MOBILE Act officially overrules any potential state legislation that, up to this point, has not recognized digital images of identity documents as valid. The MOBILE Act states: “This bill authorizes a financial institution to record personal information from a scan, copy, or image of an individual’s driver’s license or personal identification card and store the information electronically when an individual initiates an online request to open an account or obtain a financial product. The financial institution may use the information for the purpose of verifying the authenticity of the driver’s license or identification card, verifying the identity of the individual, or complying with legal requirements.”

Why adopt online banking? The recently passed MOBILE Act is a boon for both financial institutions and end users. The legislation:

Enables and encourages financial institutions to meet their digital transformation goals.
Makes the process safe with digital ID verification capabilities and other security measures.
Reduces the time, manual Know Your Customer (KYC) duties and costs financial institutions incur to onboard new customers.
Provides the convenient, on-demand experience that customers want and expect.

The facts:

61% of people use their mobile phone to carry out banking activity.¹
77% of Americans have smartphones.²
50 million consumers who are unbanked or underbanked use smartphones.³

The MOBILE Act doesn’t require any regulatory implementation. Banks can access this real-time electronic process directly or through vendors. Read all you need to know about the MOBILE Act here. Find out more about a better way to manage fraud and identity services.

References
1. Mobile Ecosystem Forum, MEF Mobile Money Report (https://mobileecosystemforum.com/mobile-money-report/), Feb. 5, 2018.
2. Pew Research Center, Mobile Fact Sheet (http://www.pewinternet.org/fact-sheet/mobile/), Jan. 30, 2017.
3. The Federal Reserve System, Consumers and Mobile Financial Services 2015 (https://www.federalreserve.gov/econresdata/consumers-and-mobile-financial-services-report-201503.pdf), March 2015.

With credit card openings and usage increasing, now is the time to make sure your financial institution is optimizing its credit card portfolio. Here are some insights on credit card trends:

51% of consumers obtained a credit card application via a digital channel.
42% of credit card applications were completed on a mobile device.
The top incentives when selecting a rewards card are cash back (81%), gas rewards (74%) and retail gift cards (71%).

Understanding and having a full view of your customers’ activity, behaviors and preferences can help maximize your wallet share. More credit card insight>

Identity-related fraud exposure and losses are increasing, and the underlying schemes are becoming more complex. To make better decisions on the need for step-up authentication in this dynamic environment, you should take a layered approach to the services you need. Some of these services include:

Identity verification and reverification checks for ongoing reaffirmation of your customer identity data quality and accuracy.
Targeted identity risk scores and underlying attributes designed to isolate identity theft, first-party fraud and synthetic identity.
Layered, passive or more active authentication, such as document verification, biometrics, knowledge-based authentication and alternate data sources.

Bad guys are more motivated, and they’re getting better at identity theft and synthetic identity attacks. Fraud prevention needs to advance as well. Future-proof your investments. More fraud prevention strategies to consider>

Consumer credit scores

A recent survey* released by the Consumer Federation of America and VantageScore Solutions, LLC, shows that potential borrowers are more likely to have obtained their credit score than nonborrowers. 70% of those intending to take out a consumer or mortgage loan in the next year received their credit score in the past year, compared with 57% of those not planning to borrow. Consumers who obtained at least one credit score in the past year were more likely to say their knowledge of scores is good or excellent compared with those who haven’t (68% versus 45%).

While progress is being made, there’s still a lot of room for improvement. By educating consumers, lenders can strengthen consumer relationships and reduce loss rates. It’s a win-win for consumers and financial institutions. Credit education for your customers>

Keeping your customers happy is critical to success. And while reducing fraud is imperative, it shouldn’t detract from a positive customer experience. Here are 3 fraud detection and prevention strategies that can help you reduce fraud and protect (and retain) customers:

Use customer-centric strategies — Recognizing legitimate customers online is more important than ever, particularly since the web’s built-in anonymity makes it a breeding ground for scammers and fraudsters.
Balance fraud prevention and the customer experience — When implementing security protocols, consider consumers’ fluctuating and potentially diminishing tolerance for those protocols.
Embrace new fraud protection technologies — Multilayered approaches should include data-driven, artificial intelligence–powered systems that recognize customers while keeping their transactions stress-free.

Fraud prevention shouldn’t discourage honest customers from buying, but it should instill confidence and strengthen the customer relationship. Learn more>

Believe it or not, 66% of consumers want to see some visible signs of security and barriers when accessing their accounts so they can be sure that a transaction is more secure. Other takeaways from our 2018 Global Fraud and Identity Report:

Nearly 3/4 of surveyed businesses cite fraud as growing over the past 12 months.
30% of surveyed businesses are experiencing more fraud losses year-over-year.
While 83% of businesses believe that their fraud solutions are scalable, cost is the biggest obstacle to adopting new tactics.

There’s a delicate balance in delivering a digital experience that instills confidence while allowing for easy and convenient account access. It’s not easy to deliver both — but it is possible.

Business guide to new markets

Competition is fierce. Expectations are high. Navigating a new market can be profitable — if managed strategically. Consider these actionable insights when entering a new market:

Use historical data to identify the right target population.
Identify, access and leverage the right data to gain the insights you need to make sound decisions.
Consider insights from a seasoned professional for a bigger, more accurate picture of the market.

Entering a new market isn’t without some risk. But with the right data, strategies and expertise, you can navigate new markets, reduce risk and start making profitable decisions. Learn more>

Customer Identification Program (CIP) solution through CrossCore®

Every day, I work closely with clients to reduce the negative side effects of fraud prevention. I hear the need for lower false-positive rates; maximum fraud detection in populations; and simple, streamlined verification processes. Lately, more conversations have turned toward ID verification needs for Customer Identification Program (CIP) administration. As it turns out, barriers to growth, high customer friction and high costs dominate the CIP landscape. While the marketplace struggles to manage the impact of fraud prevention, CIP routinely disrupts more than 10 percent of new customer acquisitions. Internally at Experian, we talk about this as the biggest ID problem our customers aren’t solving.

Think about this: The fight for business in the CIP space quickly turned to price, and price was defined by unit cost. But what’s the real cost? One of the dominant CIP solutions uses a series of hyperlinks to connect identity data. Every click is a new charge. Their website invites users to dig into the data — manually. Users keep digging, and they keep paying. And the challenges don’t stop there. Consider the data sources used for these solutions. The winners of the price fight built CIP solutions around credit bureau header data. What does that do for growth? If the identity wasn’t sufficiently verified when a credit report was pulled, does it make sense to go back to the same data source? Keep digging. Cha-ching, cha-ching.

Right about now, you might be feeling like there’s some sleight of hand going on. The true cost of CIP administration is much more than a single unit price. It’s many units, manual effort, recycled data and frustrated customers — and it impacts far more clients than fraud prevention does. CIP needs have moved far beyond the demand for a low-cost solution.

We’re thrilled to be leading the move to bring more robust data and decision capabilities to CIP through CrossCore®. With its open architecture and flexible decision structure, our CrossCore platform enables access to a diverse and robust set of data sources to meet these needs. CrossCore unites Experian data, client data and a growing list of available partner data to deliver an intelligent and cost-conscious approach to managing fraud and identity challenges. The next step will unify CIP administration, fraud analytics and a range of verification treatment options on the CrossCore platform as well. Spoiler alert: We’ve already taken that step.

When developing a risk model, validation is an essential step in evaluating and verifying a model’s predictive performance. There are two types of data samples that can be used to validate a model:

In-time validation or holdout sample: Random partitioning of the development sample is used to separate the data into a sample set for development and another set aside for validation.
Out-of-time validation sample: Data from an entirely different period or customer campaign is used to determine the model’s performance.

We live in a complicated world. Models can help reduce that complexity. Understanding a model’s predictive ability prior to implementation is critical to reducing risk and growing your bottom line. Learn more
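Here is a minimal sketch of how these two validation samples might be carved out of a modeling dataset, assuming a pandas DataFrame with hypothetical observation_date and default_flag columns and an arbitrary cutoff date.

```python
# Minimal sketch: building an in-time (holdout) and an out-of-time validation sample.
# The file name, column names and cutoff date are hypothetical assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split

loans = pd.read_csv("loans.csv", parse_dates=["observation_date"])

# Out-of-time: everything after the cutoff is reserved for validation.
cutoff = "2017-12-31"
in_time = loans[loans["observation_date"] <= cutoff]
out_of_time = loans[loans["observation_date"] > cutoff]

# In-time holdout: random partition of the development window,
# stratified so both sets keep the same default rate.
development, holdout = train_test_split(
    in_time, test_size=0.3, random_state=42, stratify=in_time["default_flag"]
)

print(len(development), len(holdout), len(out_of_time))
```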

As I mentioned in my previous blog, model validation is an essential step in evaluating a recently developed predictive model’s performance before finalizing and proceeding with implementation. An in-time validation sample is created by setting aside a portion of the total model development sample so predictive accuracy can be measured on data not used to develop the model. However, if few records in the target performance group are available, splitting the total model development sample into development and in-time validation samples will leave too few records in the target group for use during model development. An alternative approach to generating a validation sample is to use a resampling technique. There are many different types and variations of resampling methods. This blog addresses a few common techniques.

Jackknife technique — An iterative process whereby one observation is left out of each generated sample. So if there are N observations in the data, jackknifing calculates the model estimates on N different samples, each containing N - 1 observations. The model is applied to each sample, and an average of the model predictions across all samples is derived to generate an overall measure of model performance and prediction accuracy. The jackknife technique can be broadened to leave out a group of observations from each generated sample while giving every observation in the data set an equal opportunity for inclusion and exclusion.

K-fold cross-validation — Splits the model development sample into K subsets, generating multiple validation data sets. Each subset is held out in turn as the validation set while the model is fit on the remaining K - 1 subsets and then scored on the held-out subset. Again, an average of the predictions across the K validation samples is used to create an overall measure of model performance and prediction accuracy.

Bootstrap technique — Generates samples from the full model development data sample by drawing with replacement, producing multiple samples generally of equal size. Thus, with a total sample size of N, this technique draws random samples of size N such that a single observation can appear multiple times in a given sample while another observation may not appear in any of the generated samples. The generated samples are combined into a simulated larger data sample that then can be split into a development and an in-time, or holdout, validation sample.

Before selecting a resampling technique, it’s important to check and verify the data assumptions for each technique against the data sample selected for your model development, as some resampling techniques are more sensitive than others to violations of those assumptions. Learn more about how Experian Decision Analytics can help you with your custom model development.
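For a concrete starting point, here is a minimal sketch of the k-fold and bootstrap ideas using scikit-learn. The data file, column names and choice of model are hypothetical placeholders, and the bootstrap portion shows the common out-of-bag validation variant rather than the combine-then-split approach described above.

```python
# Minimal sketch of k-fold cross-validation and bootstrap resampling.
# The data file, column names and choice of model are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.utils import resample

data = pd.read_csv("development_sample.csv")
X, y = data.drop(columns=["default_flag"]), data["default_flag"]

# K-fold cross-validation: each fold is held out once as the validation set
# while the model is fit on the remaining K - 1 folds.
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_aucs = []
for train_idx, valid_idx in kfold.split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X.iloc[train_idx], y.iloc[train_idx])
    preds = model.predict_proba(X.iloc[valid_idx])[:, 1]
    fold_aucs.append(roc_auc_score(y.iloc[valid_idx], preds))
print("Mean k-fold AUC:", np.mean(fold_aucs))

# Bootstrap: draw samples of size N with replacement and validate on the
# observations left out of each draw (the "out-of-bag" records).
boot_aucs = []
for seed in range(50):
    boot_idx = resample(np.arange(len(X)), replace=True, random_state=seed)
    oob_mask = ~np.isin(np.arange(len(X)), boot_idx)
    model = LogisticRegression(max_iter=1000).fit(X.iloc[boot_idx], y.iloc[boot_idx])
    preds = model.predict_proba(X.iloc[oob_mask])[:, 1]
    boot_aucs.append(roc_auc_score(y.iloc[oob_mask], preds))
print("Mean bootstrap (out-of-bag) AUC:", np.mean(boot_aucs))
```

Averaging the validation metric across the folds or bootstrap draws gives a more stable estimate of model performance than a single small holdout sample.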

There’s no question today’s consumers have high expectations. As financial services companies wrestle with the laws and consumer demands, here are a few points to consider:

While digital delivery channels may be new, the underlying credit product remains the same.
With digital delivery, adhere to credit regulations, but build in enhanced policies and technological protocols.
Consult your legal, risk and compliance teams regularly.
Embrace the multitude of delivery methods, including email, text, digital display and beyond.

When using the latest technology, you need to work with the right partners. They can help you respect the data and consumer privacy laws, and that respect is the foundation on which strategies should be built. Learn more