梦梦 · April 22, 2024

What is the difference between steps 1 and 3 in part b?

NO.PZ2022120201000009

The question is as follows:

a. What does K stand for in K-means clustering?

b. Explain the steps in using the K-means clustering algorithm.

c. In practice, the algorithm is often carried out with several different initial values for the centroids. How would you choose between clusters that result from different initial choices for the centroids?

Explanation:

a. K is the number of centroids, or equivalently, the number of clusters. It is a parameter specified a priori, before any data points are assigned to clusters.

b. 1. Specify the number of centroids, K, and choose a distance measure (e.g., the Euclidean or Manhattan distance).

2. Scale the features using either standardization or normalization.

3. Select K points at random from the training data to be the centroids.

4. Allocate each data point to its nearest centroid.

5. Given the points allocated to each centroid, redetermine the appropriate location of the centroids.

6. If the centroids have moved from their locations in the previous iteration, return to step 4 and repeat the allocation and update. If the positions of the centroids have not changed, stop.
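The six steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes Euclidean distance, standardization with the population standard deviation, and tuples as data points. The function names, the max_iter safeguard, the handling of empty clusters, and the toy data are all assumptions added for the example.

```python
import math
import random


def standardize(points):
    """Step 2: rescale every feature to zero mean and unit standard deviation."""
    n = len(points)
    n_features = len(points[0])
    means = [sum(p[j] for p in points) / n for j in range(n_features)]
    # A constant feature has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((p[j] - means[j]) ** 2 for p in points) / n) or 1.0
        for j in range(n_features)
    ]
    return [
        tuple((p[j] - means[j]) / stds[j] for j in range(n_features))
        for p in points
    ]


def euclidean(a, b):
    """Step 1 (second half): the chosen distance measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, distance=euclidean, seed=0, max_iter=100):
    """Steps 3 to 6. k is fixed by the caller (step 1, first half)."""
    rng = random.Random(seed)
    # Step 3: pick K distinct data points at random as the starting centroids.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(max_iter):  # max_iter is only a safeguard (assumption)
        # Step 4: allocate each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Step 5: move each centroid to the mean of the points allocated to it.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Assumption: an empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        # Step 6: stop once no centroid has moved; otherwise go back to step 4.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Made-up data: the second feature is on a ten times larger scale than the first.
raw = [(0, 0), (1, 10), (2, 20), (100, 1000), (101, 1010), (102, 1020)]
scaled = standardize(raw)
centroids, labels = k_means(scaled, k=2)
print(labels)
```

Note that step 1 appears only as the arguments k and distance, while step 3 is the single rng.sample call that turns that number into K concrete starting positions.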

c. You could select the centroids where the total inertia was the lowest, as this would represent the choice of centroid positions that best fitted the feature data.
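As a rough illustration of part c, inertia can be computed as the sum of squared Euclidean distances from each point to its closest centroid, and the run with the smallest value is kept. The data points and the two candidate centroid sets below are made up for the example and do not come from the question.

```python
def inertia(points, centroids):
    """Sum of squared Euclidean distances from each point to its closest centroid."""
    return sum(
        min(sum((x - c) ** 2 for x, c in zip(p, centroid)) for centroid in centroids)
        for p in points
    )


# Made-up data: two tight groups of two points each.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

# Hypothetical centroid sets, standing in for the results of two different starts.
runs = {
    "run_a": [(0.0, 0.5), (10.0, 10.5)],  # one centroid inside each group
    "run_b": [(0.0, 0.0), (5.0, 6.0)],    # second centroid stranded between groups
}

# Keep the run whose centroids give the lowest total inertia.
best = min(runs, key=lambda name: inertia(points, runs[name]))
for name, cents in runs.items():
    print(name, inertia(points, cents))
print("selected:", best)
```

Comparing inertia this way only makes sense between runs that use the same K and the same scaled data, since inertia generally falls as K grows.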

Teacher, I don't quite understand the difference between steps 1 and 3 in part b.

1 answer

李坏_品职助教 · April 23, 2024

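Hi, and hello to the hard worker who never gives up:


The question is about K-means clustering.


Step 1 of part b fixes the number of centroids, K, and chooses a method for measuring distance (for example, the Euclidean distance).

Step 3 of part b randomly selects K of the sample points to serve as the centroids.


So step 1 only decides how many centroids there are, while step 3 randomly decides where those centroids start.

The centroid positions are only random at the beginning. They are then adjusted step by step until the points around each centroid are all relatively close to it, and only then does the algorithm stop.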
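A small sketch of the distinction, using made-up data and a hypothetical helper named initial_centroids: K is a single number fixed by the analyst, whereas the starting positions are a random draw of K data points that can change from one seed to another.

```python
import random

# Made-up sample points for illustration.
data = [(1.0, 2.0), (1.5, 1.8), (5.0, 8.0), (8.0, 8.0), (1.0, 0.6), (9.0, 11.0)]

# Step 1: the analyst decides how many centroids there will be.
k = 2


def initial_centroids(points, k, seed):
    """Step 3: draw k distinct data points at random as starting centroids."""
    return random.Random(seed).sample(points, k)


# Same K, two different random starts.
start_a = initial_centroids(data, k, seed=1)
start_b = initial_centroids(data, k, seed=2)
print(k, start_a)
print(k, start_b)
```

Both draws contain exactly k points taken from the data, but which points are drawn depends on the seed, which is why the algorithm is often rerun from several starts.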

----------------------------------------------
Even if the sun is not coming toward us, we are heading toward it. Keep going!

梦梦 · April 25, 2024

Got it, thank you, teacher.
