Determining Bad Definition

One of the crucial steps in score development is determining the bad definition. You already have an “ultimate bad definition”: the outcome you specifically want to predict. The ultimate definition varies between environments, but here I’ll examine the most common one, which is default.

Whether the customer defaults or not is a binary outcome: 0 if the customer is good (has not defaulted) and 1 if the customer is bad (has defaulted). However, in most cases the percentage of the portfolio that defaults within a short time frame is very low, i.e. a meaningful default rate only appears late in the product lifetime. Therefore you need a definition that can be observed earlier and that predicts this eventual default. An example:

Bad: Ever missed a payment by 60 days within the next 12 months, or has a payment that is 30 days past due at the end of the 12th month from now.
Good: Never missed a payment by 60 days within the next 12 months and does not have a payment more than 30 days past due at the end of the 12th month from now.
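
As a minimal sketch, the example definition above could be turned into a 0/1 flag as follows. The data frame `perf` and its columns `max_dpd_12m` and `dpd_eop` are hypothetical names with made-up values, and treating exactly 30 days past due at the end of the period as bad is an assumption, since the definition leaves that boundary open.

```r
# Hypothetical performance summary, one row per account
perf <- data.frame(
  account_id  = 1:5,
  max_dpd_12m = c(0, 65, 20, 45, 30),  # worst days past due in the next 12 months
  dpd_eop     = c(0, 0, 0, 35, 30)     # days past due at the end of month 12
)

# Bad (1): ever 60+ days past due, or 30+ days past due at the end of the period
# Good (0): everything else
perf$bad <- as.integer(perf$max_dpd_12m >= 60 | perf$dpd_eop >= 30)

print(perf)
print(mean(perf$bad))  # bad rate of the sample
```

In practice the two input columns would first have to be derived from the monthly delinquency history of each account.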

In some cases you may also want to separate your goods and bads even further. This is when you also develop an indeterminate definition, covering customers who are neither very good nor very bad. By excluding these customers later, you get much more clearly distinguished good and bad customers, and therefore your good/bad odds will be higher.

Whether to use an indeterminate definition depends on the separation that comes out of your performance definition analysis.
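
A rough sketch of a three-way labelling is shown here. The thresholds (90 days ever, 60 days at end of period, 30 days at end of period) are illustrative assumptions, and `perf`, `label_account` and the column names are hypothetical.

```r
# Hypothetical per-account summary; the thresholds below are illustrative only
perf <- data.frame(
  max_dpd_12m = c(0, 95, 40, 70, 15, 25),  # worst days past due in 12 months
  dpd_eop     = c(0, 90, 35, 65, 0, 20)    # days past due at end of period
)

label_account <- function(max_dpd, dpd_eop) {
  ifelse(max_dpd >= 90 | dpd_eop >= 60, "bad",
         ifelse(dpd_eop < 30, "good", "indeterminate"))
}

perf$class <- label_account(perf$max_dpd_12m, perf$dpd_eop)
print(table(perf$class))

# Development sample without the indeterminate accounts
dev <- perf[perf$class != "indeterminate", ]
dev$bad <- as.integer(dev$class == "bad")
```

The indeterminate accounts are only left out of model development; they still receive a score once the model is applied.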

Performance Definition Analysis

Group the accounts according to their performance in the initial time frame, then check each group’s performance in the following period. See the example below:
Figure: Bad definition performance. Eop: end of period; dwp: number of days with missed payment.

We first accept that our ultimate goal is to predict the definite bads (charged-off customers) and the definite goods (fully paid off). As seen in the figure, customers who ever hit 90 days without payment in their first 12 months, or who are 60-89 days behind on payment at the end of the 12-month period, are the most likely to go to full charge-off in the next 12 months.

Conversely, customers who have no missed payment at the end of the 12 months, or who have missed less than 30 days, are very likely to stay current (no missed payment) in the next 12 months.
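
The analysis described in this section can be sketched as a simple cross-tabulation. The `accounts` data frame, its group labels and its eight rows are invented for illustration and do not reproduce the figure.

```r
# Hypothetical account-level data: status group observed in the first 12 months
# and the outcome observed in the following 12 months
accounts <- data.frame(
  first_12m = c("ever 90+ dwp", "ever 90+ dwp", "60-89 dwp at eop",
                "30-59 dwp at eop", "30-59 dwp at eop",
                "<30 dwp at eop", "<30 dwp at eop", "<30 dwp at eop"),
  next_12m  = c("charged off", "charged off", "charged off",
                "charged off", "current",
                "current", "current", "current")
)

# Counts per initial group and later outcome
counts <- table(accounts$first_12m, accounts$next_12m)

# Row percentages: share of each initial group ending in each outcome
rates <- prop.table(counts, margin = 1)
print(round(rates, 2))
```

Groups whose rows are split between the outcomes (here the 30-59 group) are the natural candidates for an indeterminate definition.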

Data sampling

This is a crucial step when deciding on the model. Never develop a score without separating the data into samples. The reason is a concept called overfitting.

Overfitting: Fitting the model to the training sample by using the variables in such a way that the dependent variable is predicted (almost) perfectly. The coefficients of the variables are forced to fit the training sample, without regard for the outliers or deficiencies of that sample. You will get a poor fit if you apply the same model to another sample.
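
To make this concrete, here is a small simulated sketch (not real portfolio data): a logistic regression is given one weakly predictive variable plus 20 pure-noise variables. All names, the seed and the sample sizes are assumptions. A model like this will typically show a clearly higher AUC on the training sample than on the validation sample.

```r
set.seed(1)  # arbitrary seed

# Simulated data: one weakly predictive variable plus 20 pure-noise variables
n <- 200
x <- as.data.frame(matrix(rnorm(n * 21), nrow = n))
names(x) <- c("signal", paste0("noise", 1:20))
x$bad <- rbinom(n, 1, plogis(-1 + 0.8 * x$signal))

train <- x[1:100, ]
valid <- x[101:200, ]

# Rank-based AUC (Mann-Whitney): chance that a random bad scores above a random good
auc <- function(score, y) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Model using every variable, including the noise
fit <- suppressWarnings(glm(bad ~ ., data = train, family = binomial))

auc_train <- auc(predict(fit, train, type = "response"), train$bad)
auc_valid <- auc(predict(fit, valid, type = "response"), valid$bad)
print(c(training = auc_train, validation = auc_valid))
```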

To prevent this, a method called k-fold cross-validation is used. It basically separates the total sample into k samples, and trains and tests the models on these samples in turn to find the model that performs best across them. I’ll show a simple version of this by separating the data into one training and one validation sample (a single holdout split rather than a full k-fold rotation). It’s important that the two data sets are separated from each other at random.
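
For comparison with the simple split, a full k-fold rotation might look like the sketch below. The data are simulated, `income` and `utilization` are made-up predictors, and accuracy at a 0.5 cutoff is used only because it is short to write; a separation measure would be the more usual choice for a score.

```r
set.seed(2)  # arbitrary seed

# Simulated sample; `income` and `utilization` are made-up predictors
n <- 500
dat <- data.frame(income = rnorm(n), utilization = rnorm(n))
dat$bad <- rbinom(n, 1, plogis(-1 - 0.5 * dat$income + 0.7 * dat$utilization))

k <- 5
fold <- sample(rep(1:k, length.out = n))  # random fold label for every row

# Share of correctly classified accounts on each held-out fold
accuracy <- numeric(k)
for (i in 1:k) {
  fit  <- glm(bad ~ income + utilization, data = dat[fold != i, ], family = binomial)
  prob <- predict(fit, newdata = dat[fold == i, ], type = "response")
  accuracy[i] <- mean((prob > 0.5) == (dat$bad[fold == i] == 1))
}

print(accuracy)
print(mean(accuracy))  # cross-validated estimate
```

Each row is used for testing exactly once and for training k - 1 times, which is what distinguishes this from a single split.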

A recommendation here: while developing and validating on the training and validation samples, keep a test sample apart. This test sample is usually taken from a different period than the training and validation samples, while representing a similar population. In such cases, the test sample is called an “out-of-time validation sample”.
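
One possible way to set such a sample aside is to split on the snapshot date, as in this sketch. The `snapshot` column, the dates, the cutoff and the simulated bad flag are all assumptions.

```r
set.seed(3)  # arbitrary seed

# Hypothetical monthly snapshots, Jan 2014 to Dec 2014, 50 accounts each
MyData <- data.frame(
  snapshot = rep(seq(as.Date("2014-01-01"), by = "month", length.out = 12), each = 50),
  bad      = rbinom(600, 1, 0.1)
)

# Hold out the last two snapshot months as the out-of-time sample
cutoff      <- as.Date("2014-11-01")
OutOfTime   <- MyData[MyData$snapshot >= cutoff, ]
Development <- MyData[MyData$snapshot <  cutoff, ]  # split this into training/validation

# Compare bad rates to check that the two populations look similar
print(c(development = mean(Development$bad), out_of_time = mean(OutOfTime$bad)))
```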

The choice of % split between the samples is entirely up to the modeller; however, a common choice is 60% training, 20% validation and 20% test. Here’s R code to do so:

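   # Selecting the training / validation / test samples (60% / 20% / 20%)
   set.seed(42)  # any fixed seed; makes the random split reproducible
   d <- sort(sample(nrow(MyData), floor(nrow(MyData) * 0.6)))
   TrainingData <- MyData[d, ]
   OtherData <- MyData[-d, ]
   # Split the remaining 40% in half: validation and test
   d2 <- sort(sample(nrow(OtherData), floor(nrow(OtherData) * 0.5)))
   ValidationData <- OtherData[d2, ]
   TestData <- OtherData[-d2, ]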
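
After splitting, it may be worth checking the sizes and bad rates of the three samples. The sketch below repeats the split on a simulated stand-in for `MyData`; the `id` and `bad` columns and the 8% bad rate are assumptions.

```r
set.seed(4)  # arbitrary seed

# Simulated stand-in for MyData
MyData <- data.frame(id = 1:1000, bad = rbinom(1000, 1, 0.08))

d  <- sort(sample(nrow(MyData), floor(nrow(MyData) * 0.6)))
TrainingData <- MyData[d, ]
OtherData    <- MyData[-d, ]
d2 <- sort(sample(nrow(OtherData), floor(nrow(OtherData) * 0.5)))
ValidationData <- OtherData[d2, ]
TestData       <- OtherData[-d2, ]

samples <- list(training = TrainingData, validation = ValidationData, test = TestData)
sizes <- sapply(samples, nrow)
print(sizes)
print(sapply(samples, function(s) mean(s$bad)))  # bad rate per sample

# No account should appear in more than one sample
stopifnot(!anyDuplicated(c(TrainingData$id, ValidationData$id, TestData$id)))
```

If the bad rates differ a lot between samples, a stratified split on the bad flag is one alternative to consider.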

The steps following this are training the models and simply applying the results to the validation sample. There are ways to intuitively avoid overfitting even while developing the model on the training sample; we’ll come to this in the modelling section. The choice of the best model depends on the model’s separation power not only on the training sample but also (and more importantly) on the validation sample.
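
As an illustration of comparing separation power, the sketch below uses the KS statistic, which is one common separation measure; the text does not prescribe a particular one. The data are simulated, and `x1`, `x2`, `m1`, `m2` and `ks_stat` are hypothetical names.

```r
# KS statistic: largest gap between the score distributions of bads and goods
ks_stat <- function(score, y) {
  s <- sort(unique(score))
  max(abs(ecdf(score[y == 1])(s) - ecdf(score[y == 0])(s)))
}

set.seed(5)  # arbitrary seed
n <- 1000
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$bad <- rbinom(n, 1, plogis(-1 + 0.9 * dat$x1 + 0.4 * dat$x2))

train <- dat[1:600, ]
valid <- dat[601:1000, ]

# Two candidate models fitted on the training sample only
models <- list(
  m1 = glm(bad ~ x1,      data = train, family = binomial),
  m2 = glm(bad ~ x1 + x2, data = train, family = binomial)
)

# KS of each model on both samples (rows: sample, columns: model)
ks <- sapply(models, function(m) c(
  training   = ks_stat(predict(m, train, type = "response"), train$bad),
  validation = ks_stat(predict(m, valid, type = "response"), valid$bad)
))
print(round(ks, 3))

# Pick the model with the best separation on the validation sample
best <- names(which.max(ks["validation", ]))
print(best)
```

A large drop from the training row to the validation row for the same model is a warning sign of overfitting.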

Note: “Test/validation” terms may vary from literature to literature.

Portfolio Trends Examination

You have to understand the portfolio you are developing the score on. Is it credit cards, short-term loans or mortgages? Is it an application score or a behavioral score? And more importantly, how will your portfolio behave in the next couple of periods?

Figure: Loss trends.

Take the graph above as an example: knowing that your losses peak at 12 months after the snapshot and stabilize at the 24-month level, you can develop a bad definition that will be able to predict the lifetime loss of the customer.

While your ultimate goal is to predict whether the customer will default or not, on average this may take place in the distant future (e.g. in 2 years). In such a case, you need to come up with a bad definition that tells you in advance that the customer really will default by the end of 24 months. This is why evaluating portfolio trends is a crucial step. It will also help during the development of the score strategy.
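
A loss curve of this kind can be examined with a few lines of R. The monthly loss rates below are invented to mimic a peak around month 12 and a plateau around month 24, and the 99% threshold for "stabilized" is an arbitrary assumption.

```r
# Hypothetical share of a vintage charged off in each month after the snapshot
months <- 1:30
marginal_loss <- c(seq(0.001, 0.012, length.out = 12),   # rising to a peak at month 12
                   seq(0.011, 0.0005, length.out = 11),  # falling until month 23
                   rep(0, 7))                            # no further losses after that

cumulative_loss <- cumsum(marginal_loss)

# Month with the highest marginal loss
peak_month <- months[which.max(marginal_loss)]

# First month at which 99% of the final cumulative loss has been reached
stable_month <- months[which(cumulative_loss >= 0.99 * max(cumulative_loss))[1]]

print(c(peak = peak_month, stable = stable_month))
```

The month where the cumulative curve flattens gives a rough idea of how long the performance window of the bad definition needs to be.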

Data Selection

Our purpose is to predict the probability that an incoming customer is a good account or a bad account. To do this, we need sets of historical data for which this classification is already known.

Sources of data

For many years the financial industry has relied on various sources of data; however, with the “big data” boom of recent years, many more sources are available now.


Customer-declared and verified information usually helps in understanding the demographics of the customer. An important note here is the preliminary exclusion of sensitive, potentially discriminatory information such as race and belief (and, in some countries, gender).
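
A minimal sketch of such a preliminary exclusion is shown here. The `applications` data frame and its columns are hypothetical, and the actual list of excluded fields depends on local regulation.

```r
# Hypothetical application data
applications <- data.frame(
  months_at_address = c(34, 51, 27),
  income            = c(42000, 58000, 31000),
  race              = c("a", "b", "c"),
  religion          = c("x", "y", "z"),
  gender            = c("f", "m", "f")
)

# Fields excluded up front; this list is an assumption, not a legal reference
sensitive <- c("race", "religion", "gender")
candidates <- applications[, !(names(applications) %in% sensitive), drop = FALSE]

print(names(candidates))
```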

Credit bureau data is usually highly reliable, depending on the country. In most developing markets, the bureau is still growing and is usually negative-only (it includes only customers’ negative information, such as missed payments, defaults, etc.). A positive bureau refers to a set of data that includes customers’ positive information as well, such as utilization, credit lines, number of trades, etc.

In some countries there is more than one bureau company, which makes it necessary to develop an optimum bureau strategy.

If the score will be built on existing customers, the customer’s previous performance within the financial institution is highly predictive. This is usually even more predictive than bureau data, since internal customers represent a profile that the institution is already targeting, while bureau data represents the market as a whole.

In recent years, with the vast amount of data available, using social media information has also become a trend. For example, if the friends you interact with most have high credibility, this can raise your score as well.

Another type of data that has recently become available is the customer’s online behavior. This could be the content of the URLs the customer has been visiting, which can indicate the type of transactions the customer makes or the type of profile the customer belongs to.

Data selection

The selected data should be representative of the portfolio we are building the score model on. The set of data should be selected keeping the “data sampling” stage in mind. What I usually recommend is to select a period of historical data that takes portfolio trends and seasonality into account, and to exclude the last couple of snapshot periods. These excluded snapshots should be added as a “validation sample” after the development stage. For more details, see the Data sampling section.



Steps of Credit Scoring

The steps of credit scoring vary with the environment you’re building the score in. The implementation and credit environment define your variable possibilities, your data sampling, the models you can use and, obviously, your final output.

On this website, I will try to lay out a process flow that can be adapted to such differences in environments. Though I am more familiar with SAS Enterprise Miner in my practical work, I will write the code in R, as it is free to use and offers a wide range of flexible packages.

So, whatever the environment, we can follow the path below. These are the steps:

data selection

portfolio trends

bad definition

data sampling

model building

model testing

score strategy

score maintenance