Time and duration data appear in many business and marketing settings; the duration of unemployment is a familiar example. Suppose we take a random sample of N people who lose their jobs and later find new employment, and we record the number of days or weeks each person was without a job: t1, t2, …, tN. This article focuses on modeling such continuous-time duration data. Duration data typically arise from “how long” or “when” questions, and measure the time individuals remain in a given state. Typical examples in marketing and customer analytics include: “Is the customer still alive?” (the “buy till you die” approach), new product diffusion forecasting, the timing of transactions in the stock market, and the length of stay in a job.
First, we need the likelihood expression for continuous-time duration data. Let T denote a non-negative random variable representing the waiting time until an event occurs, and assume T is exponentially distributed with rate parameter λ. The probability that the event has occurred by time t, that is, that the individual has left the state by then, is
F(t | λ) = P(T ≤ t | λ) = 1 − e^(−λt)    (Eq. 1)
This is the cumulative distribution function of the random variable T. The probability of still being in the state at time t is the complement, e^(−λt).
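As a quick illustration of the exponential waiting-time probability, here is a minimal Python sketch. The function names and the rate of 0.1 events per week are illustrative assumptions, not values taken from the article.

```python
import math

def exponential_cdf(t, rate):
    """P(T <= t) for an exponential waiting time with the given rate."""
    if t < 0:
        return 0.0
    return 1.0 - math.exp(-rate * t)

def exponential_survival(t, rate):
    """P(T > t): probability the event has not yet occurred at time t."""
    return 1.0 - exponential_cdf(t, rate)

# Hypothetical example: an event rate of 0.1 per week.
rate_per_week = 0.1
p_by_week_10 = exponential_cdf(10, rate_per_week)           # 1 - e^(-1), about 0.632
p_after_week_10 = exponential_survival(10, rate_per_week)   # e^(-1), about 0.368
```

The two quantities always sum to one, which is why the same model can be read either as "time until the event" or as "time spent in the current state".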
Next, we look at two classic applications in customer analytics.
(1) Customer dropout in a transaction flow
A basic marketing question is how long we can expect a customer to stay with us, given what we know about the relationship to date. In a non-contractual setting (a supermarket, for instance), shoppers can become inactive, with probability P(T ≤ t) of having dropped out by time t, and customers are heterogeneous. The model is usually set up as follows:
- Each customer has an unobserved “lifetime” of length T, which is exponentially distributed with dropout rate λ.
- Dropout rates vary across customers and follow a Gamma distribution.
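The two assumptions above can be sketched as a small simulation: draw a dropout rate for each customer from a Gamma distribution, then draw that customer's lifetime from an exponential distribution with that rate. The parameter values (shape 2, rate 10, horizon 5) and the function name are hypothetical choices for illustration only.

```python
import math
import random

def simulate_dropout_share(n_customers, horizon, shape, rate, seed=0):
    """Share of simulated customers whose lifetime ends by `horizon`."""
    rng = random.Random(seed)
    dropped = 0
    for _ in range(n_customers):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        dropout_rate = rng.gammavariate(shape, 1.0 / rate)
        # Each customer's lifetime is exponential with their own rate.
        lifetime = rng.expovariate(dropout_rate)
        if lifetime <= horizon:
            dropped += 1
    return dropped / n_customers

# Hypothetical parameter values, chosen only for illustration.
shape, rate, horizon = 2.0, 10.0, 5.0
heterogeneous_share = simulate_dropout_share(100_000, horizon, shape, rate)

# For comparison: everyone shares the mean dropout rate shape / rate.
mean_rate = shape / rate
homogeneous_share = 1.0 - math.exp(-mean_rate * horizon)
```

With the same mean rate, the heterogeneous population shows a lower dropout share by the horizon than the homogeneous one, because low-rate customers stay active for a long time. This is one reason the Gamma mixing step matters in practice.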
(2) New product trial prediction
Before a new product is launched, a temporary trial offer lets customers try it out. The company can then gauge the product’s chance of success by analyzing data from a random sample of customers. As in the previous case, the model can be set up as follows:
- Time-to-trial follows an exponential distribution with parameter λ, which denotes an individual’s trial rate.
- Trial rates vary across customers and follow a Gamma distribution.
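To capture heterogeneity across customers, we let the rate parameter λ follow a Gamma distribution with probability density function
g(λ | r, α) = α^r λ^(r−1) e^(−αλ) / Γ(r)    (Eq. 2)
where r is the shape parameter and α is the scale parameter (in this form α multiplies λ in the exponent, so it acts as a rate, or inverse scale). Averaging the exponential probability 1 − e^(−λt) over this distribution gives the probability that the event has occurred by time t for a randomly chosen customer:
P(T ≤ t) = ∫ (1 − e^(−λt)) g(λ | r, α) dλ = 1 − (α / (α + t))^r    (Eq. 3)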
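The following sketch checks the closed-form exponential-gamma probability, 1 − (α / (α + t))^r, against a direct numerical integration of the exponential probability over the Gamma density. It assumes the standard shape/rate form of the Gamma density; the parameter values and function names are hypothetical.

```python
import math

def gamma_pdf(lam, r, alpha):
    """Gamma density with shape r and rate alpha (assumed parameterization)."""
    return alpha ** r * lam ** (r - 1) * math.exp(-alpha * lam) / math.gamma(r)

def eg_cdf(t, r, alpha):
    """Closed-form P(T <= t) for the exponential-gamma mixture."""
    return 1.0 - (alpha / (alpha + t)) ** r

def eg_cdf_numeric(t, r, alpha, upper=5.0, steps=20_000):
    """Midpoint-rule approximation of the mixing integral over [0, upper].

    The defaults suit r >= 1 and a density that is negligible beyond `upper`.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h
        total += (1.0 - math.exp(-lam * t)) * gamma_pdf(lam, r, alpha)
    return total * h

# Hypothetical parameter values, chosen only for illustration.
r, alpha, t = 2.0, 10.0, 5.0
closed_form = eg_cdf(t, r, alpha)      # 1 - (10 / 15)^2 = 5 / 9
numeric = eg_cdf_numeric(t, r, alpha)
```

The two values agree to several decimal places, which is a useful sanity check before using the closed form inside a likelihood.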
In this article, we model the product trial process using duration data models. Table 1 reports, for a panel of 1499 households, the cumulative number that had made a trial purchase by the end of each week over a 24-week period. In total, 101 households had made a purchase within 24 weeks of receiving promotional offers.
Table 1 Cumulative Households Adopting a New Product
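As with other probability models in customer analytics, we can use maximum likelihood estimation to obtain the model parameters. The log-likelihood function for this problem is
LL(r, α) = Σ_{t=1..24} n_t · ln P(t−1 < T ≤ t) + (N − Σ_{t=1..24} n_t) · ln P(T > 24)    (Eq. 4)
where n_t is the number of households making their first purchase in week t and N = 1499 is the panel size, so the last term covers the 1499 − 101 = 1398 households that had not tried the product by week 24.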
P(t1 < T ≤ t2) follows from Eq. 3: it equals P(T ≤ t2) minus P(T ≤ t1). Packages in SAS, R, and Python can be used to solve this optimization numerically. Using the estimated parameters, marketing professionals can predict the product’s performance one year after the official launch. The expected cumulative number of households that have purchased the new product by time t is
E[N(t)] = N · P(T ≤ t) = N · [1 − (α / (α + t))^r]    (Eq. 5)
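The estimation and forecasting steps can be sketched in plain Python. The weekly counts here are synthetic and noise-free, generated from hypothetical parameters r = 0.05 and α = 8; they are not the Table 1 figures. The optimizer is a simple compass search over (log r, log α), used as a dependency-free stand-in for the numerical routines in SAS, R, or Python packages; all function names are assumptions.

```python
import math

def eg_survival(t, r, alpha):
    """P(T > t) under the exponential-gamma model."""
    return (alpha / (alpha + t)) ** r

def log_likelihood(r, alpha, new_triers, panel_size):
    """Grouped-data log-likelihood with right-censoring at the last week."""
    ll = 0.0
    for week, n_t in enumerate(new_triers, start=1):
        # Probability of a first purchase during this week.
        p_week = eg_survival(week - 1, r, alpha) - eg_survival(week, r, alpha)
        ll += n_t * math.log(p_week)
    # Households with no trial by the end of the window are right-censored.
    n_censored = panel_size - sum(new_triers)
    ll += n_censored * math.log(eg_survival(len(new_triers), r, alpha))
    return ll

def fit_eg(new_triers, panel_size, r0=1.0, alpha0=10.0, tol=1e-6, max_iter=100_000):
    """Maximize the log-likelihood by compass search over (log r, log alpha)."""
    def objective(x):
        return log_likelihood(math.exp(x[0]), math.exp(x[1]), new_triers, panel_size)

    x = [math.log(r0), math.log(alpha0)]
    best = objective(x)
    step = 1.0
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                candidate = list(x)
                candidate[i] += delta
                value = objective(candidate)
                if value > best:
                    x, best, improved = candidate, value, True
        if not improved:
            step /= 2.0  # shrink the step once no axis move helps
    return math.exp(x[0]), math.exp(x[1]), best

# Synthetic, noise-free weekly counts from hypothetical "true" parameters.
panel_size = 1499
true_r, true_alpha = 0.05, 8.0
weekly_new_triers = [
    panel_size * (eg_survival(w - 1, true_r, true_alpha) - eg_survival(w, true_r, true_alpha))
    for w in range(1, 25)
]

r_hat, alpha_hat, ll_hat = fit_eg(weekly_new_triers, panel_size)

# Expected cumulative number of triers one year (52 weeks) after launch.
forecast_week_52 = panel_size * (1.0 - eg_survival(52, r_hat, alpha_hat))
```

Working in log-parameters keeps r and α positive without explicit constraints. With real data, the counts would be the week-to-week differences of the cumulative figures, and a library optimizer would normally replace the hand-written search.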
This article gives only a brief review of the continuous-time duration model. The duration model is closely linked to survival analysis, where the event of interest is termed “death” and the waiting time is termed “survival time”. A key feature of duration data is that observations are often censored: for some units, the event of interest had not yet occurred when the data were collected or analyzed. In the case discussed above, households with no recorded trial purchase by the end of week 24 are right-censored.
A more substantial extension of the “exponential-gamma” model is to incorporate the effects of observed covariates, including customer attributes, attitudes and behavior, along with marketing activities. There are two ways of handling covariate effects. One is to incorporate them explicitly through hazard functions, which are discussed in depth in the survival analysis literature. This requires a survival function: the probability that a customer with given observed characteristics is still in the initial state at time t. The other is to segment customers and fit the model separately to each segment.
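One common way to make the first approach concrete, shown here only as a sketch under stated assumptions, is a proportional-hazards form in which a customer with time-invariant covariates x has individual rate λ·exp(β·x). Averaging over the Gamma distribution of λ then gives the survival function (α / (α + exp(β·x)·t))^r. The coefficient value and the binary "received a coupon" covariate are hypothetical, as is the function name.

```python
import math

def eg_survival_with_covariates(t, r, alpha, betas, covariates):
    """P(T > t | x) when the individual rate is lambda * exp(beta . x).

    Assumes time-invariant covariates and Gamma(r, alpha) mixing over lambda.
    """
    linear_index = sum(b * x for b, x in zip(betas, covariates))
    # Covariates rescale time: a positive index makes the event arrive sooner.
    effective_time = math.exp(linear_index) * t
    return (alpha / (alpha + effective_time)) ** r

# Hypothetical values, chosen only for illustration.
r, alpha = 0.05, 8.0
betas = [0.4]  # assumed effect of a binary covariate, e.g. received a coupon

baseline = eg_survival_with_covariates(24, r, alpha, betas, [0.0])
exposed = eg_survival_with_covariates(24, r, alpha, betas, [1.0])
```

With all covariates at zero the function reduces to the plain exponential-gamma survival probability, and a positive coefficient lowers the probability of still being in the initial state at time t. The segment-based alternative needs no new formula: the same model is simply fitted once per segment.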