Business Concepts for Data Scientists — Subscriptions (Part III)

Jarus Singh
8 min read · Jan 30, 2021

This is Part III in my Business Concepts for Data Scientists series, which focuses on subscriptions, more specifically, businesses that run on subscription models. Click the link below to go to my previous post on marketing, and from there you can get to the first post, which has an introduction to the series and a focus on finance and economics.

The concepts below will be useful for data scientists who plan to work, or already work, at organizations with subscription models. I think reviewing these before job interviews could be particularly useful. When interviewing at places that rely on subscription business models, the interviewer may assume that you are already familiar with these concepts. If you aren’t, there’s no guarantee that they’ll explain them in an easy-to-understand way when asked, and even if they do, it puts more pressure on you as the candidate to wrap your head around these concepts while also solving the data science portion of their problem. They’ll also be useful to sprinkle into conversations with the non-data science folks you may interview with, like product, marketing, and finance folks. In addition, data scientists who want to grow into management roles will need a strong understanding of these concepts in order to scope projects and interface with company leadership. In this article, I define terms and give relevant examples of how data scientists at organizations with subscription models might use them on the job. This list is by no means exhaustive. Please leave a comment or let me know on Twitter if you have suggested additions!

Photograph of a stack of Vogue magazines.
Magazines, an industry that relies on subscription models. Photo by Charisse Kenion on Unsplash

Subscription Concepts for Data Scientists

  • Churn (Rate): Churn is probably the most mentioned concept when it comes to subscription businesses, and a particularly relevant one for data scientists. When I google “subscription churn”, the top hits include “model”, “prediction”, and “average”. A user churns when they stop subscribing to a given service. The churn rate is the percentage of subscribers in a given period who are no longer subscribing in the next period. Data scientists may be asked to calculate churn, visualize trends and flag anomalies, predict it in future periods, and model how various business initiatives (e.g. lowering price, adding additional features) can affect it.
  • Voluntary vs. Involuntary Churn: When I imagine examples of customers churning, I think of everything from those opting out with the quick click of an “unsubscribe” button to those driven to madness by Kafkaesque service calls with Comcast (e.g. Ryan Block). This is referred to as voluntary churn. But there’s another way that customers can churn: the method they use for payment may no longer be valid. This is involuntary churn. Whether it’s a credit card that has expired, or a debit or gift card without a large enough balance, sometimes customers lose their subscriptions without lifting a finger. The tactics businesses employ to reduce voluntary vs. involuntary churn are different, so it’s worth thinking about them separately. In my experience, most of the time when the type of churn is unspecified, the person using the term is referring to voluntary churn.
  • Retention (Rate): If a subscriber doesn’t churn, and they instead continue subscribing from one period to the next, they’re retained. One doesn’t need to put on their Advanced Mathematics Cap to see that: (1-churn rate) = retention rate.
  • Churn Predictors: A good question an interviewer could give to prospective data science candidates, especially senior or lead ones who will be expected to solve difficult problems with little to no guidance, would be: “given what you know about our business, what variables would you want in order to model customer churn?” I’d recommend breaking the variables you list into a few different categories. The first, and probably most important, is engagement related. Highly engaged users are less likely to churn, although there may be some exceptions, like the person who has decided they’ll extract maximum value out of a service before churning. (“Three days left in my Netflix subscription to finish Star Trek: Deep Space 9? Better put on the sweatpants and make some popcorn.”) The second is demographic. It’s possible that adding these dimensions improves the forecast. For example, controlling for a given level of engagement, younger subscribers may be more likely to churn than older ones because they have lower incomes: with a limited budget, they need to get more value out of a subscription to continue subscribing. The last is acquisition related. When did they sign up? How did we get these subscribers in the first place? Did they navigate to our website or app and sign up there? Were they referred by another user? Did we get them through a marketing campaign?
  • Winback: Just because a subscriber churned, it doesn’t mean that they’re gone from your service forever. Like wooing an ex-lover, a company can get them to subscribe again, also known as winning them back. Winbacks can be a great way to increase subscriptions as you’re bringing back people who already have some knowledge and experience of your products. A binary winback flag for subscribers: 0 if they’ve never churned, and 1 if they have and were won back, can be an insightful way of partitioning your subscriber base, as these groups may exhibit different churn and engagement behaviors.
  • Renewal: Most services from companies I’ve used have a credit card backed auto-renewal strategy, where customers are charged automatically at the beginning of their next subscription period. Not all do though, and some even offer both auto- and non-auto-renewing options. One example I encountered recently that is technically auto-renewing, but doesn’t feel like it: I signed up for a site’s subscription using a gift card. If at the next billing cycle my gift card balance is too low (spoiler alert: it is), I’ll churn. I’d argue that this is ultimately an auto-renewing arrangement, but the low likelihood of renewal means that users like me should be treated separately in data analysis and modeling.
  • Subscription Billing Cycle: How long are your subscriptions valid for? Typical period lengths are a week (e.g. recurring grocery delivery), a month (e.g. most other services) and a year (many, but not all, services offer this at a discount alongside a monthly option).
  • Subscription Price: Speaking of discounts, what do these subscriptions even cost in the first place? Simple business models have one, invariant price (e.g. $9.99 monthly). Others may be more complicated, like offering a low promotional price for a few months and then charging more in subsequent ones. They may or may not allow customers to get those lower promotional prices again if those customers threaten to churn, for example by opting to unsubscribe on a website and being routed to a webpage with the deal before being able to unsubscribe. Data scientists can do experimentation or causal inference on their subscriber base to create the best subscription pricing strategy for their customers.
  • Subscription Tier: Often, companies offer multiple tiers of service. Online publications may restrict content except for those who pay for higher subscription tiers. Some streaming services have limited ads at lower cost subscriptions and no ads at higher ones. As with price, data scientists can try to estimate the value of various features through survey research or causal inference in order to help a company best structure its subscription offerings.
  • Monthly/Annual Recurring Revenue (MRR/ARR): These are the revenue earned from subscriptions in a month or a year. The term “recurring” is used because this is revenue that the company is quite likely to generate reliably each period, as opposed to, say, ad revenue, which may be more seasonal. This doesn’t mean the revenue is guaranteed though. Subscribers can always churn!
  • Average Revenue per User (ARPU): For companies that only offer subscriptions at one price and have no other means of generating revenue, the average revenue per user (ARPU) is simply the subscription price (number of subscribers * price per subscription = total subscription revenue; total subscription revenue / number of subscribers = price per subscription). For companies that offer subscriptions at different prices, usually because they offer different tiers of service, this number is an average of the tier prices weighted by the number of subscribers in each tier. After reading the next bullet, it should be clear that a company’s ARPU can change over time even though the number of subscribers it has is constant.
  • Upgrade/Downgrade: Subscribers don’t just subscribe and churn; they can also move between different tiers of service. When they move to more premium, higher price tiers, they’re upgrading. If they do the opposite, they’re downgrading. This concept explains why the number of subscribers can remain constant while ARPU changes over time: a shift in the distribution of subscribers across tiers shifts ARPU. In addition to the classic data science tasks (calculation/reporting, visualization, prediction and inference), data scientists can also be asked to do attribution analysis on how expected upgrade and downgrade behavior drives downstream metrics. For instance, say a company misses its revenue target by 3% by the end of the year. Was this because fewer subscribers upgraded than expected? More folks downgraded? Was there more churn, or fewer newly acquired subscribers? This attribution analysis shows company leadership the most pressing areas to address.
  • Trials (Free or Paid): Many subscription businesses let prospective customers try their services for free (free trial) or at a discount (paid trial). When those in trials become subscribers, they’re converted. Obviously, businesses want to maximize this conversion. When working with this data, data scientists should know how long these trials usually last (and if the lengths can vary). If embedded in a product team, they should be able to report on actions that users take within trials, and estimate which ones best lead to conversion. These findings can inform product strategy.
  • Trial Conversion Rate: During a given period, the percentage of those in trials who become subscribers.
  • Subscription+? Business Models: Not all businesses solely rely on subscriptions for revenue. Delivery apps allow subscribers to lower delivery fees. Streaming services may have advertisements, and therefore generate ad revenue, on their lower priced subscriptions. Subscription box services may allow their subscribers to purchase additional items in their weekly or monthly shipments for a discount. Data scientists, especially those interviewing at these companies, should have a basic understanding of the other ways in which these companies generate revenue. How does subscription behavior, such as churn or upgrades/downgrades, affect other areas of revenue? In the delivery business example, it’s possible that less is earned from the monthly subscription fee + reduced delivery fees, but this is compensated for by customers ordering more and being less likely to churn in the long term.
  • Non-Profit Subscriptions(?): The subscription business model is not limited to for-profit enterprises. Many non-profits have subscription-like fundraising plans, where they bill their donors a flat amount monthly. Data science can be used in this domain as well. Non-profits will want to optimize what “tiers” of subscription they offer (i.e. how much to donate monthly), what comes with each of those tiers (e.g. a public radio tote bag), and how to retain or “upgrade” their “customers” (i.e. keeping donors donating or increasing their monthly donations).
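
To make the arithmetic behind a few of the definitions above concrete (churn rate, retention rate, trial conversion rate, MRR, and ARPU), here is a minimal Python sketch. Everything in it is illustrative rather than a standard implementation: the Tier class, the function names, and the tier names, prices, and subscriber counts are all made up, and it assumes every subscriber is billed monthly at a single list price per tier with no other revenue sources.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Tier:
    """A hypothetical subscription tier, billed monthly at one list price."""
    name: str
    monthly_price: float
    subscribers: int


def churn_rate(subscribers_at_start: int, churned: int) -> float:
    """Fraction of a period's subscribers who no longer subscribe in the next period."""
    if subscribers_at_start <= 0:
        raise ValueError("subscribers_at_start must be positive")
    return churned / subscribers_at_start


def retention_rate(subscribers_at_start: int, churned: int) -> float:
    """Retention rate = 1 - churn rate."""
    return 1.0 - churn_rate(subscribers_at_start, churned)


def trial_conversion_rate(users_in_trial: int, converted: int) -> float:
    """Fraction of those in trials during a period who become subscribers."""
    if users_in_trial <= 0:
        raise ValueError("users_in_trial must be positive")
    return converted / users_in_trial


def mrr(tiers: Sequence[Tier]) -> float:
    """Monthly recurring revenue: price times subscribers, summed over tiers."""
    return sum(t.monthly_price * t.subscribers for t in tiers)


def arpu(tiers: Sequence[Tier]) -> float:
    """Average revenue per user: tier prices weighted by subscriber counts."""
    total_subscribers = sum(t.subscribers for t in tiers)
    if total_subscribers <= 0:
        raise ValueError("need at least one subscriber")
    return mrr(tiers) / total_subscribers


# Made-up numbers: 1,000 subscribers before and after, but 100 of them upgrade.
before = [Tier("basic", 10.0, 800), Tier("premium", 20.0, 200)]
after = [Tier("basic", 10.0, 700), Tier("premium", 20.0, 300)]

print(f"churn rate:       {churn_rate(1000, 50):.1%}")
print(f"retention rate:   {retention_rate(1000, 50):.1%}")
print(f"trial conversion: {trial_conversion_rate(200, 60):.1%}")
print(f"MRR before/after:  {mrr(before):.2f} / {mrr(after):.2f}")
print(f"ARPU before/after: {arpu(before):.2f} / {arpu(after):.2f}")
```

In this example the subscriber count stays at 1,000 while ARPU moves from 12.00 to 13.00, purely because 100 subscribers shifted from the basic tier to the premium one. A real calculation would also need to handle promotional prices, annual plans, and mid-period changes, which this sketch ignores.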

Congratulations on making it to the end of this article. I hope you found it helpful! If you enjoyed it, feel free to follow me on Medium (a free subscription to my future content!). As a reward, here’s an easter egg: Amos from The Expanse dwelling on his favorite concept, The Churn.

A black and white photo of Amos Burton from the TV show The Expanse next to white text that says “THE CHURN”.
Image from u/Danemon on Reddit

Jarus Singh

Director, FP&A @Adobe. Mentor @Springboard. Bridging the gap between business and data teams. Opinions are my own. #rstats