☝️This complicated formula for the central limit theorem can help you get to know your customers better.

**Let’s See How**

Have you ever wondered how businesses can measure customer satisfaction without surveying every single customer? The answer lies in a statistical result known as the central limit theorem. As a data-driven business leader, you're always looking for new ways to improve your company's customer experience (CX). The central limit theorem is one powerful tool you may not have considered: it applies across a wide range of fields, including customer experience analysis.

The central limit theorem states that, regardless of the underlying distribution of a population (as long as its variance is finite), the distribution of sample means approaches a normal distribution as the sample size grows. In other words, even if a population does not follow a normal distribution, the means of many sufficiently large random samples taken from that population will form an approximately bell-shaped curve.
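To see this in action, here is a minimal sketch that uses only Python's standard library. The exponential "population", the sample size of 50, and the 2,000 repetitions are illustrative assumptions, not values from the survey analyzed later. Skewness is used as a rough measure of how far a distribution is from a symmetric bell shape.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical, strongly right-skewed population: exponential values.
population = [random.expovariate(1.0) for _ in range(100_000)]

def skewness(values):
    """Skewness: 0 for a symmetric distribution, about 2 for an exponential one."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

# Draw 2,000 random samples of size 50 and keep the mean of each one.
sample_means = [statistics.fmean(random.sample(population, 50)) for _ in range(2_000)]

print(f"skewness of the population:   {skewness(population):.2f}")
print(f"skewness of the sample means: {skewness(sample_means):.2f}")
```

The population is heavily skewed, while the sample means should come out much closer to symmetric, which is what the theorem predicts.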

But how does this relate to CX? Imagine a business that wants to measure customer satisfaction for a specific product or service. Surveying every single customer may be impractical or costly, so the business could instead take a random sample of customers and calculate the average satisfaction score from that sample. According to the central limit theorem, the average satisfaction scores from many such samples will be approximately normally distributed around the true average, which lets the business estimate the overall satisfaction score of all customers and quantify how far off that estimate is likely to be.

This allows businesses to understand customer satisfaction and identify areas for improvement without surveying every single customer. By combining statistical sampling techniques with the central limit theorem, businesses can make data-driven decisions that improve CX.

But the central limit theorem is not just about sample size and accurate estimates. It also highlights the importance of understanding the behavior of customer data. By understanding the underlying distribution of customer satisfaction data, businesses can take a more targeted approach to CX analysis and identify areas for improvement more effectively.

**Let’s look at the mathematical statement behind this idea.**

Let X1, X2, …, Xn be a random sample of n independent and identically distributed random variables with mean μ and standard deviation σ. Then, as n approaches infinity, the distribution of the sample mean (X̄) approaches a normal distribution with mean μ and standard deviation σ/√n. Written in terms of the standardized sample mean:

$$
\lim_{n \to \infty} P\left( \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \le z \right) = \Phi(z)
$$

where P denotes probability, Φ is the standard normal cumulative distribution function, z is a standard normal deviate (a z-score), and X̄ is the sample mean. This formula shows how the central limit theorem can be used to estimate the population mean μ from a sample mean X̄, by calculating the probability that the difference between X̄ and μ is within a certain range.
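To make the symbols concrete, here is a small worked example using Python's standard library. The values of σ, n, and the margin are made-up assumptions chosen only for illustration. The sketch computes the approximate probability that X̄ lands within a given margin of μ.

```python
from math import sqrt
from statistics import NormalDist

sigma = 1.2    # assumed population standard deviation of the scores
n = 100        # sample size
margin = 0.2   # how close we want the sample mean to be to mu

standard_error = sigma / sqrt(n)       # sigma / sqrt(n)
z = margin / standard_error            # the margin expressed in standard errors
phi = NormalDist().cdf                 # standard normal CDF, the Phi in the text
prob_within_margin = phi(z) - phi(-z)  # P(-z <= Z <= z)

print(f"standard error = {standard_error:.3f}, z = {z:.2f}")
print(f"P(|sample mean - mu| <= {margin}) is about {prob_within_margin:.3f}")
```

With these assumed numbers the probability comes out at roughly 90%. A larger sample shrinks the standard error and pushes that probability up.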

While this formula may seem complex, it highlights the power of the central limit theorem in statistical analysis and its relevance to businesses looking to improve their customer experience.

**Let’s try to understand this using an example**

Let’s say an e-commerce company wants to improve its customer satisfaction scores by improving the user interface (UI) of its website. To do this, the company decides to survey its customers, asking them to rate how happy they are with different parts of the UI, such as how easy it is to navigate, how quickly pages load, and how smooth the checkout process is.

The company takes a sample of 100 customer responses and figures out the average level of satisfaction for each part. But it’s not clear if these sample means are representative of the whole population or if they just come from random differences in the sample.

To answer this question, the company can use the central limit theorem to calculate the standard error of the mean, which gives a measure of the sampling error. By doing this, the company can estimate how much the sample means are likely to differ from the population means. This lets them figure out if the differences in satisfaction scores are statistically significant.
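As a rough sketch of that calculation, the snippet below computes the standard error and a 95% confidence interval for one aspect's mean rating. The 100 ratings are randomly generated stand-ins (an assumption, not the company's data), and the interval relies on the normal approximation that the central limit theorem justifies.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, stdev

random.seed(7)

# Hypothetical sample: 100 satisfaction ratings on a 1-5 scale.
ratings = random.choices([1, 2, 3, 4, 5], weights=[5, 10, 20, 40, 25], k=100)

sample_mean = fmean(ratings)
standard_error = stdev(ratings) / sqrt(len(ratings))  # s / sqrt(n)

# 95% confidence interval using the normal approximation from the CLT.
z_95 = NormalDist().inv_cdf(0.975)  # about 1.96
ci_low = sample_mean - z_95 * standard_error
ci_high = sample_mean + z_95 * standard_error

print(f"mean = {sample_mean:.2f}, standard error = {standard_error:.3f}")
print(f"95% CI for the population mean: [{ci_low:.2f}, {ci_high:.2f}]")
```

If the intervals for two aspects of the UI do not overlap, that is a strong hint the difference between them is more than sampling noise.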

To illustrate this, the company can use a dataset with satisfaction scores for each part of the website’s UI, along with demographic data such as age, gender, and location. By comparing the sample means and their standard errors, the company can see which parts of the UI score lowest, and for which groups of customers, and focus on improving those parts first.

For example, say the data shows that customers in a certain age group are much less happy with the checkout process. The company can use this information to improve the checkout process for this group, for example by making it easier or giving them more ways to pay.
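One way to check whether such a gap is more than random variation is a two-sample z-test, which again leans on the normal approximation for sample means. The sketch below is only an illustration: both groups of checkout ratings are randomly generated, and the group sizes and rating weights are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, variance

random.seed(11)

# Hypothetical checkout ratings (1-5) for one age group and for all other customers.
group_a = random.choices([1, 2, 3, 4, 5], weights=[15, 25, 30, 20, 10], k=80)
group_b = random.choices([1, 2, 3, 4, 5], weights=[5, 10, 25, 35, 25], k=120)

diff = fmean(group_a) - fmean(group_b)

# Standard error of the difference between two independent sample means.
se_diff = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))

z = diff / se_diff
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value

print(f"difference in means = {diff:.2f}, z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the lower score in the first group is unlikely to be a sampling accident, which would justify prioritizing checkout improvements for that group.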

**Now, let’s back up our idea with some hard evidence.**

You can download this data from here. The analysis is done in Python, using a Jupyter Notebook or Google Colab.

First, import the necessary libraries into Jupyter Notebook.

```
# Packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stats
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")
np.random.seed(42)
```

Now, let’s read the dataset we are dealing with:

```
# Population
df = pd.read_csv('/content/survey - Sheet2.csv')
# Printing the dataset
df
```

https://flo.uri.sh/visualisation/12804106/embed

Extracting rating for the “Ease of Navigation” Column from the dataset

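```
# Extracting rating for "Ease of Navigation" Column from the dataset
df_nav = df[['Ease of Navigation']]
print(df_nav.head())
# Plotting the distribution of the ratings using the Seaborn library
# (histplot replaces the deprecated distplot)
sns.histplot(df_nav['Ease of Navigation'], kde=True)
plt.show()
```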

Now let’s draw 1,000 random samples of size 30 from these ratings and compute the mean of each one:

```
# Sample size for each draw
samp_size = 30
# Draw 1,000 random samples and compute the mean rating of each one
sample_means = [
    df_nav['Ease of Navigation'].sample(samp_size).mean() for _ in range(1000)
]
# Storing the mean values in a Series using Pandas
sample_means = pd.Series(sample_means)
# Verifying the total number of sample means collected
len(sample_means)
```

```
# Plotting the distribution of the sample means
sns.histplot(sample_means, kde=True)
plt.show()
```
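Beyond the shape of the plot, the theorem also predicts the spread of the sample means: it should be close to σ/√n. The sketch below checks this for a few sample sizes. Because the survey file may not be at hand, it uses a randomly generated population of 1-5 ratings as a stand-in (an assumption), and only the standard library.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(42)

# Hypothetical stand-in for the survey column: 5,000 ratings on a 1-5 scale.
population = random.choices([1, 2, 3, 4, 5], weights=[5, 10, 20, 40, 25], k=5_000)
mu, sigma = fmean(population), pstdev(population)

results = {}
for n in (10, 30, 100):
    means = [fmean(random.sample(population, n)) for _ in range(1_000)]
    # (mean of the sample means, their observed spread, spread predicted by the CLT)
    results[n] = (fmean(means), pstdev(means), sigma / sqrt(n))
    print(f"n={n:>3}: mean of means={results[n][0]:.3f} (mu={mu:.3f}), "
          f"observed sd={results[n][1]:.3f}, sigma/sqrt(n)={results[n][2]:.3f}")
```

The observed spread should track σ/√n closely and shrink as n grows, which is why larger samples give tighter estimates of overall satisfaction.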

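The distribution of the 1,000 sample means should look roughly bell-shaped and centered near the population mean, even if the raw ratings themselves are not normally distributed. Overall, this example shows how the central limit theorem can be used to make data-driven decisions that improve the customer experience and, in the end, lead to increased customer loyalty and revenue for the business.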

**Conclusion**

In conclusion, the central limit theorem is a fascinating concept that has the potential to revolutionize the way businesses approach CX analysis. By harnessing the power of statistical sampling and understanding the behavior of customer data, businesses can gain valuable insights into customer satisfaction and make data-driven decisions that improve their CX. So the next time you’re analyzing customer data, remember the surprising implications of the central limit theorem and take your CX analysis to the next level.
