Labfans是一个针对大学生、工程师和科研工作者的技术社区。 | 论坛首页 | 联系我们(Contact Us) |
![]() |
![]() |
#1 |
高级会员
注册日期: 2019-11-21
帖子: 3,006
声望力: 66 ![]() |
![]()
我正在尝试对一个矩阵进行聚类(大小:20057x2)。
T = clusterdata(X,cutoff); 但是我得到这个错误: ???使用==> pdistmex时出错记不清。输入HELP MEMORY作为您的选项。211 = => pdist中的错误 Y = pdistmex(X',dist,additionalArg);==>链接错误139 Z =连锁mex(Y,方法,pdistArg);==> clusterdata在88时出错Z =链接(X,linkageargs {1},pdistargs);==> kmeansTest中的2错误T = clusterdata(X,1);有人能帮我吗。我有4GB的ram,但认为问题出在其他地方。 回答: 正如其他人所提到的,层次聚类需要计算成对的距离矩阵,该矩阵太大而无法容纳您的情况。 尝试改用K-Means算法: numClusters = 4; T = kmeans(X, numClusters); 或者,您可以选择数据的随机子集并将其用作聚类算法的输入。接下来,您将聚类中心计算为每个聚类组的平均值/中位数。最后,对于子集中未选择的每个实例,您只需计算其与每个质心的距离,并将其分配给最接近的质心。 下面是一个示例代码来说明上述想法: %# random data X = rand(25000, 2); %# pick a subset SUBSET_SIZE = 1000; %# subset size ind = randperm(size(X,1)); data = X(ind(1:SUBSET_SIZE), :); %# cluster the subset data D = pdist(data, 'euclid'); T = linkage(D, 'ward'); CUTOFF = 0.6*max(T(:,3)); %# CUTOFF = 5; C = cluster(T, 'criterion','distance', 'cutoff',CUTOFF); K = length( unique(C) ); %# number of clusters found %# visualize the hierarchy of clusters figure(1) h = dendrogram(T, 0, 'colorthreshold',CUTOFF); set(h, 'LineWidth',2) set(gca, 'XTickLabel',[], 'XTick',[]) %# plot the subset data colored by clusters figure(2) subplot(121), gscatter(data(:,1), data(:,2), C), axis tight %# compute cluster centers centers = zeros(K, size(data,2)); for i=1:size(data,2) centers(:,i) = accumarray(C, data(:,i), [], @mean); end %# calculate distance of each instance to all cluster centers D = zeros(size(X,1), K); for k=1:K D(:,k) = sum( bsxfun(@minus, X, centers(k,:)).^2, 2); end %# assign each instance to the closest cluster [~,clustIDX] = min(D, [], 2); %#clustIDX( ind(1:SUBSET_SIZE) ) = C; %# plot the entire data colored by clusters subplot(122), gscatter(X(:,1), X(:,2), clustIDX), axis tight ![]() ![]() 更多&回答... |
![]() |
![]() |