Customer Segmentation - Using k-means

About: Customer segmentation is a popular application of unsupervised learning. Using clustering, companies identify segments of customers in order to target their potential user base. They divide customers into groups according to common characteristics such as gender, age, interests, and spending habits, so that they can market to each group effectively.

We use K-means clustering to segment the customers. Before clustering, we visualize the gender and age distributions and then analyze annual incomes and spending scores.

Our dataframe 'customer' has 5 columns: CustomerID, Gender, Age, Annual Income, and Spending Score.

We have a data set with 200 rows and 5 columns.

This clearly shows that there are no NULL values in our dataframe.

This gives us the mean, standard deviation, min, max, Q1, Q2 (median), and Q3 for all numeric attributes.
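The inspection steps above (shape, NULL check, summary statistics) can be sketched as follows. This is a minimal sketch, not the original notebook code: the data here is randomly generated as a stand-in, and the column names are assumed to match the ones listed above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: values are random, not the original dataset.
rng = np.random.default_rng(0)
n = 200
customer = pd.DataFrame({
    "CustomerID": np.arange(1, n + 1),
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Age": rng.integers(18, 71, size=n),
    "Annual Income": rng.integers(15, 138, size=n),
    "Spending Score": rng.integers(1, 100, size=n),
})

n_rows, n_cols = customer.shape        # (rows, columns)
null_counts = customer.isnull().sum()  # number of missing values per column
summary = customer.describe()          # count, mean, std, min, quartiles, max (numeric columns only)

print(n_rows, n_cols)
print(null_counts)
print(summary)
```

Note that `describe()` skips the non-numeric Gender column by default, which is why the summary covers only the numeric attributes.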

Visualizing various Distributions

This shows that the ages of the customers in our data range from 10 to 80 years.

This plot gives a clearer view of the number of customers at each age. We can also see that 11 customers are 32 years old, which is the most frequent age.
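The count behind such a plot can be checked numerically with `value_counts`. The ages below are hypothetical values chosen for illustration; they are not taken from the dataset.

```python
import pandas as pd

# Hypothetical ages, for illustration only.
ages = pd.Series([32, 32, 32, 19, 19, 45, 60, 27], name="Age")

age_counts = ages.value_counts()        # customers per age, most frequent first
most_common_age = age_counts.idxmax()   # age with the highest count
most_common_n = age_counts.max()        # how many customers have that age

print(most_common_age, most_common_n)
```

The same two calls work for the income and spending score columns.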

The five-point summary (minimum, Q1, median, Q3, maximum) gives us a clear picture of how customers are distributed by age.
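A five-point summary can also be computed directly rather than read off a plot. A small sketch with made-up ages (not the real data), using pandas' default linear interpolation for quantiles:

```python
import pandas as pd

# Made-up ages, for illustration only.
ages = pd.Series([18, 22, 30, 36, 49, 70], name="Age")

# Minimum, Q1, median, Q3, maximum
five_point = ages.quantile([0.0, 0.25, 0.5, 0.75, 1.0])
print(five_point)
```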

This violin plot shows that we have a higher number of female customers in the age group around 30 years.
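A numeric counterpart to a per-gender violin plot is a grouped summary. The rows below are hypothetical and only illustrate the call; the column names are assumed to match our dataframe.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
customer = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Male", "Male"],
    "Age": [29, 31, 30, 45, 52],
})

# Count and spread of Age within each gender
age_by_gender = customer.groupby("Gender")["Age"].agg(["count", "median", "min", "max"])
print(age_by_gender)
```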

This shows that customer incomes in our data range from 0k to 150k.

This plot gives a clearer view of the number of customers at each income level. We can also see that the most frequent incomes are 54k and 78k, with 12 customers each.

The five-point summary gives us a clear picture of how customers are distributed by income.

This violin plot shows that we have a higher number of male customers among those with higher incomes.

This shows that the spending scores in our data range from -20 to 120.

This plot gives a clearer view of the number of customers at each spending score. We can also see that 8 customers have a spending score of 42, which is the most frequent value.

The five-point summary gives us a clear picture of how customers are distributed by spending score.

This violin plot shows that we have a higher number of female customers, and that most of them have a spending score around 50.

This plot clearly shows that we have more female customers than male customers.

From this plot we see that income and spending score are well correlated with each other, whereas age and spending score are not strongly correlated.
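Pairwise correlations like these can be read from a Pearson correlation matrix. The sketch below uses synthetic data with an invented relationship (spending score falling with age), so its numbers do not reproduce the plot discussed above; it only shows the mechanics.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
age = rng.integers(18, 71, size=200)
income = rng.integers(15, 138, size=200)
# Invented relationship for illustration: spending falls with age, plus noise.
spending = 100 - age + rng.normal(0, 10, size=200)

df = pd.DataFrame({"Age": age, "Annual Income": income, "Spending Score": spending})

corr = df.corr()  # Pearson correlation between every pair of numeric columns
print(corr.round(2))
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.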

Cluster based on Annual Income and Spending Score

This scatter plot shows the distribution of customers by income, spending score, and gender. We can clearly see customer clusters in this plot.

The elbow plot shows that the curve flattens out after 5 clusters, so we take 5 as the optimum number of clusters.
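The elbow method can be sketched as follows: fit K-means for a range of k values and record the within-cluster sum of squares (`inertia_` in scikit-learn). The data here is synthetic, with five blobs whose centers are made up as a stand-in for income versus spending score; the range 1 to 10 is also an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Synthetic 2-D data: five well-separated blobs with made-up centers.
centers = np.array([[20, 20], [20, 80], [55, 50], [90, 20], [90, 80]])
X = np.vstack([c + rng.normal(0, 4, size=(40, 2)) for c in centers])

inertias = []
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)  # within-cluster sum of squares for this k

# Reduction in inertia gained by each additional cluster
drops = -np.diff(inertias)

for k, wcss in zip(range(1, 11), inertias):
    print(k, round(wcss, 1))
```

Plotting `inertias` against k gives the elbow curve. The elbow is the k after which `drops` becomes small compared with the earlier reductions.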

Based on the above clustering, we can clearly say that there are five cluster segments based on the customers' Annual Income and Spending Score. We named them Low budget, Spenders, Average, Savers, and Best.
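Fitting the final model with the chosen number of clusters might look like the sketch below. It again uses synthetic blobs with made-up centers rather than the real income and spending score columns. The integer labels K-means returns are arbitrary, so a name such as "Savers" has to be attached by inspecting each centroid.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Synthetic stand-in for (Annual Income, Spending Score); centers are made up.
centers = np.array([[20, 20], [20, 80], [55, 50], [90, 20], [90, 80]])
X = np.vstack([c + rng.normal(0, 4, size=(40, 2)) for c in centers])

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)         # cluster index (0-4) for every customer
centroids = kmeans.cluster_centers_    # one (income, score) center per cluster
sizes = np.bincount(labels)            # number of customers in each cluster

print(np.round(centroids, 1))
print(sizes)
```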

Cluster based on Age and Spending Score

This scatter plot shows the distribution of customers by age, spending score, and gender. We can clearly observe that older customers do not have higher spending scores.

The elbow plot shows that the curve flattens out after 4 clusters, so we take 4 as the optimum number of clusters.

Based on the above clustering, we can clearly say that there are four cluster segments based on the customers' Age and Spending Score. We named them Regular Customers, Usual Customer, Young Targets, and Old Targets.
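One way to decide which name fits which cluster is to profile each cluster by its size and its mean age and spending score. The sketch below assumes synthetic data with four made-up groups; it is not the original notebook code.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
# Synthetic (Age, Spending Score) groups; the centers are made up for illustration.
centers = [(25, 80), (25, 30), (45, 50), (65, 30)]
rows = [(a + rng.normal(0, 3), s + rng.normal(0, 3)) for a, s in centers for _ in range(30)]
df = pd.DataFrame(rows, columns=["Age", "Spending Score"])

model = KMeans(n_clusters=4, n_init=10, random_state=0)
df["Cluster"] = model.fit_predict(df[["Age", "Spending Score"]])

# Size and average profile of every cluster
profile = df.groupby("Cluster").agg(
    customers=("Age", "count"),
    mean_age=("Age", "mean"),
    mean_spending=("Spending Score", "mean"),
)
print(profile.round(1))
```

A cluster with a low mean age and a high mean spending score would, for example, be a natural candidate for a label like "Young Targets".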

And, Project is over!!!

Completed by: Ganpat Patel

Email: ganpat.patel.012@gmail.com