How to get the plot of clusters in this Python code? ValueError: x and y must have same first dimension
【Posted】: 2016-11-08 14:52:23
【Question】:
Error:
Traceback (most recent call last):
  File "/Users/ankitchaudhari/PycharmProjects/Learn/datascience/gg.py", line 33, in <module>
    plt.plot(a, k)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/pyplot.py", line 3154, in plot
    ret = ax.plot(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/__init__.py", line 1812, in inner
    return func(ax, *args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/axes/_axes.py", line 1424, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/axes/_base.py", line 386, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/axes/_base.py", line 364, in _plot_args
    x, y = self._xy_from_xy(x, y)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/matplotlib/axes/_base.py", line 223, in _xy_from_xy
    raise ValueError("x and y must have same first dimension")
ValueError: x and y must have same first dimension
How can I get the plot of the clusters in this Python code?
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
data = np.array([[1, 2],
                 [5, 8],
                 [1.5, 1.8],
                 [8, 8],
                 [9, 11],
                 [1, 0.6],
                 [2, 2]])
k = np.array([2,3,4,5,6,7])
df = pd.DataFrame(data)
df
def kmeans(data, k):
    labels = KMeans(n_clusters=k).fit_predict(data)
    return labels
sse = 0
for i in k:
    label = kmeans(data, i)
    cluster_mean = df.mean()
    d = np.zeros([], dtype=float)
for j in range(len(label)):
    sse += sum(pow((data[j]) - cluster_mean, 2))
    a = np.append(d, sse)
plt.scatter(a, k)
plt.show()
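The resulting plot does not show all the points of the clusters. a and k do not contain the same number of values, which becomes a problem when plotting them as a curve. Can someone help me?
Thanks.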
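The length mismatch described above can be reproduced with NumPy alone. This is a minimal sketch, not the asker's full script: the number assigned to `sse` and the list `sse_per_k` are placeholders introduced here for illustration. It shows that `np.append` on a fresh 0-d array always returns two values, however many entries `k` has.

```python
import numpy as np

k = np.array([2, 3, 4, 5, 6, 7])

# Pattern from the question: d is a fresh 0-d array, so np.append
# flattens it to one element and adds one more, giving two values.
d = np.zeros([], dtype=float)
sse = 123.4  # placeholder value, not a real SSE
a = np.append(d, sse)
print(a.shape, k.shape)  # (2,) (6,)

# Collecting one value per entry of k keeps both lengths aligned.
sse_per_k = []
for i in k:
    sse_per_k.append(float(i))  # placeholder standing in for the SSE at k = i
sse_per_k = np.array(sse_per_k)
print(sse_per_k.shape == k.shape)  # True
```

Since `a` has 2 elements and `k` has 6, `plt.plot(a, k)` has nothing to pair the remaining values with, which is what the `ValueError` reports.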
【Comments】:
Which points are shown? Which are omitted?
【Answer 1】:
Your indentation is broken
sse = 0
for i in k:
    label = kmeans(data, i)
    cluster_mean = df.mean()
    d = np.zeros([], dtype=float)
# for i in k has finished here
# label, cluster_mean and d frozen in their last state
for j in range(len(label)):
    sse += sum(pow((data[j]) - cluster_mean, 2))
    a = np.append(d, sse)
Basically, sse and a are only computed for the last i in k. Start the j loop inside the i loop:
sse = 0
for i in k:
    label = kmeans(data, i)
    cluster_mean = df.mean()
    d = np.zeros([], dtype=float)
    # same indentation as loop body!
    for j in range(len(label)):
        sse += sum(pow((data[j]) - cluster_mean, 2))
        a = np.append(d, sse)
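Nesting the loops is necessary, but `a = np.append(d, sse)` is still rebuilt from a fresh `d` on every pass, so `a` would keep two elements while `k` has six. One possible way to get one value per `k` is sketched below. It assumes the goal is an elbow-style plot of the within-cluster SSE, so it measures distances to each point's own cluster centre rather than to `df.mean()` as in the question. The names `sse_for_k`, `plot_elbow`, `ks` and `sse_values`, and the settings `n_init=10` and `random_state=0`, are choices made for this sketch and do not come from the original post.

```python
import numpy as np
from sklearn.cluster import KMeans

data = np.array([[1, 2], [5, 8], [1.5, 1.8], [8, 8],
                 [9, 11], [1, 0.6], [2, 2]])
ks = np.array([2, 3, 4, 5, 6, 7])


def sse_for_k(points, n_clusters):
    """Sum of squared distances from each point to its own cluster centre."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(points)
    centres = model.cluster_centers_[model.labels_]  # centre assigned to each point
    return float(((points - centres) ** 2).sum())


# One SSE value per entry of ks, so x and y have the same length.
sse_values = np.array([sse_for_k(data, int(n)) for n in ks])


def plot_elbow(x, y):
    """Plot SSE against the number of clusters (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported here so the computation runs without a display
    plt.plot(x, y, marker="o")
    plt.xlabel("number of clusters k")
    plt.ylabel("SSE")
    plt.show()
```

Calling `plot_elbow(ks, sse_values)` draws the curve. scikit-learn also exposes this quantity on a fitted model as `inertia_`, which may be simpler than computing it by hand.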