
Building the Kabbage Small Business Revenue Index

Kabbage's Data Science team outlines the considerations in developing the Kabbage Index

By Vijay Sathish, Hirsh Gaikwad and Chris Tang

We created the Kabbage Small Business Revenue Index to leverage our rich, unique datasets as an indicator of revenue trends in small businesses. In this article, we discuss our thinking behind its design and development. 

Our Motivation

Small businesses represent 99% of all businesses and account for nearly 50% of all jobs and of non-agricultural GDP in the U.S.[1] While the definition of a small business varies by industry[2], 89% of the 28 million small businesses in the U.S. have fewer than 20 employees[3].

Yet there is very little public data tracking the ongoing financial performance of small businesses today. The Russell 2000 index represents the stock performance of 2,000 publicly traded small-cap companies in the U.S., but as of December 31, 2017, its median market cap was $861 million[4]. This is a poor proxy for the population that we serve at Kabbage, and it's a poor proxy for small businesses hoping to compare their growth to others.

Kabbage draws from millions of live data connections with its customers to analyze revenue trends. Of those customers, 83% have 10 or fewer employees and generate a median annual revenue of $280,000. With ongoing data connectivity to the systems small businesses use to run their companies, we were in a position to build a meaningful Index that shows the growth of actual small businesses.

Overall Design

The Index reports the median revenue growth of U.S. small businesses, rather than the average, which is susceptible to outliers. Because businesses vary widely in how they recognize revenue, Kabbage analyzes the transactions observed in connected bank accounts to estimate monthly revenue. This method reflects the cash basis of recording revenue rather than the accrual basis, since bank accounts record transactions only as they are realized, not as they are booked.

What is Revenue?

Simply defined, revenue is money generated by a business in return for sales of goods or services. In practice, calculating revenue is not nearly as simple. Loans, transfers across accounts, and interest earned are examples of cash inflows that are not business revenue and must be separated out. While we continuously develop and improve sophisticated revenue estimation metrics within Kabbage, we also want measures that are simple, consistent, and interpretable.

To do this, we created a measure called the Median Revenue Estimate (MRE), computed for each business i at each time period t.

For each observation period, we consider all credit transactions in a business's bank account within a 60-day window. For example, the MRE for a business in September 2019 considers credits from August and September 2019. We believe this makes for a robust statistical metric that is stable over a long period of time and correlates well with business revenue, creating the basis for a strong Index.
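As a rough illustration of the windowing described above, the sketch below collects the credits from the observation month and the month before it, then summarizes them. It assumes the summary is the median credit amount; that is our reading of the name "Median Revenue Estimate", not a published definition. The function name, the `(date, amount)` input shape, and the sample figures are all hypothetical.

```python
from datetime import date
from statistics import median


def median_revenue_estimate(credits, year, month, aggregate=median):
    """Summarize one business's credits over a two-month window.

    credits:   list of (date, amount) credit transactions (hypothetical shape)
    aggregate: ASSUMPTION, defaults to the median credit amount
    """
    # The window is the observation month plus the preceding calendar month.
    prev = (year, month - 1) if month > 1 else (year - 1, 12)
    window = {(year, month), prev}
    amounts = [amt for d, amt in credits if (d.year, d.month) in window]
    # No credits in the window means there is nothing to estimate.
    return aggregate(amounts) if amounts else None


# Made-up transactions for one business.
credits = [
    (date(2019, 7, 30), 900.0),   # July: outside the September window
    (date(2019, 8, 5), 120.0),
    (date(2019, 8, 20), 80.0),
    (date(2019, 9, 3), 200.0),
    (date(2019, 9, 17), 5000.0),  # a large one-off credit
]
mre_sep = median_revenue_estimate(credits, 2019, 9)
```

Under this assumption the single large credit barely moves the estimate, which matches the stated goal of a stable metric.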

Computing Revenue Growth

Next, we calculate a monthly rate of change in MRE for each small business.

Across the population of individual business growth rates, the median is reported monthly as the overall population growth rate. Given the large fluctuations that can occur on a month-to-month basis in small business revenues, the median was again applied to diminish the susceptibility to outliers, finally giving the monthly update to the Index.

The starting point for our Index is January 2017, with an initial Kabbage Index Value (KIV) of 100 points. From there, each month's median growth rate is applied inductively to update the KIV on a month-to-month basis going forward.
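The two steps above can be sketched as follows. This is a minimal illustration under two assumptions that the text implies but does not spell out here: the per-business rate of change is the simple percentage change, (MRE_t - MRE_{t-1}) / MRE_{t-1}, and the Index compounds as KIV_t = KIV_{t-1} * (1 + median growth). Function names and sample values are hypothetical.

```python
from statistics import median


def growth_rates(mre_prev, mre_curr):
    """Per-business month-over-month growth in MRE (assumed: percentage change)."""
    rates = {}
    for biz, prev in mre_prev.items():
        curr = mre_curr.get(biz)
        # Skip businesses without a usable MRE in both months.
        if curr is None or prev is None or prev <= 0:
            continue
        rates[biz] = (curr - prev) / prev
    return rates


def update_index(kiv_prev, rates):
    """Compound the previous Index value by the median growth rate (assumed form)."""
    return kiv_prev * (1.0 + median(rates.values()))


# Made-up MREs for three businesses in two consecutive months.
mre_jan = {"a": 100.0, "b": 200.0, "c": 50.0}
mre_feb = {"a": 110.0, "b": 190.0, "c": 100.0}

rates = growth_rates(mre_jan, mre_feb)   # a: +10%, b: -5%, c: +100%
kiv_feb = update_index(100.0, rates)     # starts from the base value of 100
```

Business c doubles its MRE, yet the Index moves only by the median rate of 10%, which is the outlier resistance the median is chosen for.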

It is important to note that we are measuring revenue changes as a median growth percentage across the population, not as a change in the median revenue dollar value month over month. This decision was made because of survivorship and connectivity considerations noted below. Further, Kabbage can only observe accounts that the businesses choose to connect to our platform, so there may be some revenue that we cannot observe.

Industry and Index Segmentation

Many small businesses struggle to place themselves into industry categories. To assign each business an appropriate industry, we aggregate industry information from a variety of sources, including internal data, publicly available data, and self-reported values from customers.

To help identify subtrends within specific industries or regions, we also break out and publish the Index by top industries and U.S. states.

Within each industry segment and state, we also build in measures to make sure that the sample sizes are large enough to hold statistical weight. We constantly monitor our connectivity within each individual grouping to ensure that the metrics are not compromised by sample size issues.


Accounting for Biases

Selection Bias

All data have biases due to how they are collected, labeled and analyzed. The Index can only measure the revenue of Kabbage customers. Thus, our marketing efforts and the incoming funnel of businesses can impose selection bias. Fortunately, Kabbage serves businesses in nearly every industry, and as such has a portfolio that broadly reflects the national distribution of small businesses in the U.S.

However, since the mix of industries applying to Kabbage may not necessarily be the same as that of the U.S. overall, we reweight all indices to the number of small businesses recorded by the SBA (Small Business Administration) in each state and industry. At the time of writing, the Index employed SBA results from 2016 through 2018.
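The reweighting mechanics are not detailed here, so the following is only one way such an adjustment could work: give each business a weight equal to the population count for its state and industry cell divided by the number of sampled businesses in that cell, then take a weighted median of growth rates. The function names, cell labels, and population counts are illustrative assumptions, not SBA figures.

```python
from collections import Counter


def weighted_median(values, weights):
    """Smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cum = 0.0
    for value, weight in pairs:
        cum += weight
        if cum >= total / 2:
            return value
    raise ValueError("no observations")


def reweighted_median_growth(rates, cells, population):
    """rates: business -> growth; cells: business -> (state, industry);
    population: (state, industry) -> count of small businesses (assumed input)."""
    sample = Counter(cells[b] for b in rates)
    values = [rates[b] for b in rates]
    # Each business stands in for population/sample businesses in its cell.
    weights = [population[cells[b]] / sample[cells[b]] for b in rates]
    return weighted_median(values, weights)


# Made-up sample that over-represents retail relative to the population.
rates = {"a": 0.10, "b": 0.12, "c": 0.08, "d": -0.02}
cells = {"a": ("GA", "Retail"), "b": ("GA", "Retail"),
         "c": ("GA", "Retail"), "d": ("GA", "Construction")}
population = {("GA", "Retail"): 1000, ("GA", "Construction"): 3000}

adjusted = reweighted_median_growth(rates, cells, population)
```

In this toy sample the unweighted median is driven by the three retail businesses, while the reweighted median follows construction, which makes up most of the assumed population.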

Survivorship Bias and Account Connectivity

Public research shows about half of all small businesses stay in business five years or longer, and about one-third of establishments survive 10 years or longer[3]. It is common for businesses to close, go bankrupt, or lose connectivity to their data due to operational issues. Each month, we re-evaluate whether or not we have a full financial snapshot of a business for that month. We look at the total number and amount of transactions, the number of days on which a transaction occurred, as well as other heuristics and checks to make sure that we are including the right businesses in the Index calculation. Whenever a business misses these thresholds, it is dropped from the Index calculation, but it may be included again if the criteria are met in the future.
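A monthly eligibility check of this kind might look like the sketch below. The three signals come from the paragraph above, but the class, the function, and every threshold value are hypothetical placeholders; the actual cutoffs and additional heuristics are not given.

```python
from dataclasses import dataclass


@dataclass
class MonthlyActivity:
    """One business's observed bank activity for a month (hypothetical shape)."""
    n_transactions: int
    total_amount: float
    active_days: int


def is_eligible(activity, min_txns=5, min_amount=500.0, min_days=3):
    """True if the month looks like a full financial snapshot.

    All thresholds are made-up defaults for illustration only.
    """
    return (
        activity.n_transactions >= min_txns
        and activity.total_amount >= min_amount
        and activity.active_days >= min_days
    )


# Eligibility is re-evaluated every month, so a business can drop out and return.
history = {
    "2019-07": MonthlyActivity(n_transactions=40, total_amount=12000.0, active_days=18),
    "2019-08": MonthlyActivity(n_transactions=2, total_amount=150.0, active_days=1),
    "2019-09": MonthlyActivity(n_transactions=35, total_amount=9800.0, active_days=16),
}
included = {month: is_eligible(a) for month, a in history.items()}
```

Here the business is excluded for August, when connectivity appears to lapse, and included again in September.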

This type of survivorship bias is prevalent even in well-known indices like the S&P 500. Just as companies may fall out of the Index, new companies are added each month. The net effect is that the pool of businesses used for the Index calculation increases over time, and since we report the median revenue growth of businesses, our Index is robust to such changes in the pool.

Alternative Approaches Considered

Early in the development cycle, we considered building a regression to fit the observed growth. The key motivation was that there could be cohort-specific idiosyncrasies due to several factors including, but not limited to, updates in our marketing strategies, underwriting models and other changes that could affect the makeup of the incoming funnel in any given month. We postulated that a regression would allow us to separate out these effects using cohort-specific regressors.

However, we quickly realized the problem with any regression-based approach using historical data: when the regression is re-fit each month, the coefficients all update, thus changing the Index values from previous months that have already been published. If the historical Kabbage Index Values changed every month, there would be no way to keep track of performance over time. Furthermore, by reporting "smoothed" values instead, we would potentially understate meaningful movements in the data.

In light of these deficiencies, we discarded the regression approach in favor of the simpler but more robust approach described above. That said, we would love to hear from anyone who has alternative ideas on how revenue indices can be constructed!

Ultimately, we hope the Kabbage Small Business Revenue Index will provide small businesses with a baseline to compare their own growth against that of similarly sized companies, and provide historical context to identify upcoming trends in their industry or geography. We also hope that the Index will help spur more analysis in this area. Small businesses are the lifeblood of economies across the globe and deserve greater representation, analysis and tools to build the best companies possible.


[1] “Small Business GDP: Update 2002-2010,” U.S. Small Business Administration, 2012,
