machine learning algorithms questions

Question

hideonbush 26

Hi,

I am planning to use k-means as the clustering algorithm for a project. However, I am aware that k-means has certain shortcomings when it comes to finding the optimal groups.

Could you please explain its limitations and provide a detailed example?

Thanks

YutongTie-MSFT 53,966 Reputation points Moderator

2022-03-21T10:05:33.383+00:00

Hello @hideonbush

Thanks for reaching out with your algorithm selection question. K-means has both pros and cons, but which of them matter depends on the scenario. Could you please share more details about your project so that we can give a more specific answer?

Regards,
Yutong

Accepted answer

Answer 1

Hello @hideonbush again,

As a general overview of k-means, please refer to the pros and cons below. If you can share more details about your project and how you plan to develop it, I can say more:

Pros:

K-means is simple, flexible, and computationally efficient.
The clustering results are easy to adjust and interpret, and easier to explain than the output of a neural network.
Because each iteration only compares every point with the k centroids, the algorithm scales well when segmenting large datasets.
An instance can change cluster (move to another cluster) when the centroids are recomputed.
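To illustrate the last point, here is a minimal sketch of one assign/recompute round on toy one-dimensional data. The helper names `assign` and `recompute` and the data values are assumptions made up for this illustration, not part of any library.

```python
from math import dist

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]

def recompute(points, labels, centroids):
    """Move each centroid to the mean of its members (unchanged if it has none)."""
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            updated.append(old)
    return updated

# Toy data and starting centroids (hypothetical values).
points = [(0.0,), (1.0,), (4.0,), (9.0,), (10.0,)]
centroids = [(0.0,), (5.0,)]

labels_before = assign(points, centroids)        # the point 4.0 starts in cluster 1
centroids = recompute(points, labels_before, centroids)
labels_after = assign(points, centroids)         # after the update it is nearer cluster 0

print(labels_before)
print(labels_after)
```

The point at 4.0 is first closest to the centroid at 5.0, but once the centroids move to the means of their members it ends up closer to the other cluster and switches.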

Cons:

It is not guaranteed to find the optimal set of clusters, and the number of clusters (k) must be decided before the analysis. How many clusters to use is left to the discretion of the researcher, which involves a combination of common sense, domain knowledge, and statistical tools. Too many clusters tell you little, because the groups become very small and too numerous to interpret.
When running the analysis, the k-means algorithm randomly selects k starting points from which to develop clusters. This can be good or bad depending on where the algorithm happens to begin. From there, the cluster centers are recalculated until they stop changing (or change very little), so a poor start can leave the algorithm stuck in a suboptimal solution.
The order of the data input can have an impact on the final results, for example in implementations that take the initial centers from the first records or that update the centers one record at a time.
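Since a detailed example was requested, the sketch below shows the starting-point problem on four made-up points at the corners of a wide rectangle. The function `kmeans`, the data, and the two sets of starting centroids are assumptions chosen for illustration; it is plain Lloyd-style k-means written from scratch, not a library implementation.

```python
from math import dist

def kmeans(points, start, max_iter=100):
    """Plain k-means from the given starting centroids.

    Returns (labels, centroids, inertia), where inertia is the sum of
    squared distances from each point to its assigned centroid.
    """
    centroids = list(start)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:      # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:               # an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    inertia = sum(dist(p, centroids[label]) ** 2 for p, label in zip(points, labels))
    return labels, centroids, inertia

# Four corners of a 4 x 1 rectangle (hypothetical data).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Start A: one centroid on each short side -> left/right split.
labels_a, _, inertia_a = kmeans(points, [(0.0, 0.0), (4.0, 0.0)])

# Start B: both centroids on the same short side -> bottom/top split.
labels_b, _, inertia_b = kmeans(points, [(0.0, 0.0), (0.0, 1.0)])

print(labels_a, inertia_a)
print(labels_b, inertia_b)
```

Both runs converge, but start B settles on a bottom/top split whose inertia is far larger than that of the left/right split found from start A. Common mitigations are to run the algorithm from several random starts and keep the best result, or to use a smarter seeding scheme such as k-means++; scikit-learn's `KMeans`, for instance, exposes these through its `n_init` and `init` parameters.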

Hope this helps!

Regards,
Yutong

Please kindly accept the answer if you find it helpful, thanks.
