- NMF
W basis
H coefficients
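For reference, the roles of W and H can be written out as follows. The symbols V, n, m and r are not in the notes and are assumed here: V is the non-negative data matrix, and r is the rank being chosen. The Frobenius-norm objective shown is one common choice, not the only one.

```latex
V \approx W H, \qquad
V \in \mathbb{R}_{\ge 0}^{n \times m},\quad
W \in \mathbb{R}_{\ge 0}^{n \times r} \ \text{(basis)},\quad
H \in \mathbb{R}_{\ge 0}^{r \times m} \ \text{(coefficients)}

\min_{W \ge 0,\; H \ge 0} \ \lVert V - W H \rVert_F^2
```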
-
Rank selection
Figure
- Sparseness
Residuals and the residual sum of squares (RSS) need no explanation
-
dispersion
- In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.[1] Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range. In other words, a concept similar to variance and standard deviation.
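A minimal sketch of the three dispersion measures named above, using only Python's standard `statistics` module. The function name `dispersion_summary` and the two sample lists are made up for illustration; both lists have the same mean but different spread.

```python
import statistics


def dispersion_summary(values):
    """Return sample variance, sample standard deviation, and interquartile range."""
    # quantiles(n=4) returns the three quartile cut points (Q1, median, Q3).
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "variance": statistics.variance(values),
        "std_dev": statistics.stdev(values),
        "iqr": q3 - q1,
    }


# Illustrative data only: same mean (10), different spread.
tight = [9, 10, 10, 10, 11]
wide = [2, 6, 10, 14, 18]

tight_summary = dispersion_summary(tight)
wide_summary = dispersion_summary(wide)
print(tight_summary)
print(wide_summary)
```

All three measures come out larger for `wide` than for `tight`, which is what "more dispersed" means here.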
-
silhouette
Silhouette refers to a method of interpretation and validation of consistency within clusters of data. The technique provides a succinct graphical representation of how well each object lies within its cluster.[1]
The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.
The silhouette can be calculated with any distance metric, such as the Euclidean distance or the Manhattan distance.
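A sketch of the per-object silhouette value as described above: a is the mean distance to the other members of the object's own cluster (cohesion), b is the smallest mean distance to any other cluster (separation), and the value is (b - a) / max(a, b). The function name `silhouette_values`, the toy points, and the convention of returning 0 for a single-member cluster are assumptions made for this example.

```python
import math


def silhouette_values(points, labels, distance=math.dist):
    """Return one silhouette value per point, each in the range [-1, 1]."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Assumed convention: a single-member cluster scores 0.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(distance(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(distance(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Toy data: two obvious groups, near x = 0 and near x = 10.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_values(points, [0, 0, 1, 1])  # matches the real groups
bad = silhouette_values(points, [0, 1, 0, 1])   # splits each real group
print(good)
print(bad)
```

The default distance is Euclidean (`math.dist`); a Manhattan distance can be passed instead, for example `distance=lambda p, q: sum(abs(x - y) for x, y in zip(p, q))`.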
-
cophenetic correlation
- Description
http://blog.sciencenet.cn/home.php?mod=space&uid=3406804&do=blog&quickforward=1&id=1175517
-
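Summary of rank selection methods
Choose the smallest rank at which the cophenetic correlation coefficient begins to decrease
Choose the point just before the largest change in the cophenetic-versus-rank curve
Choose the rank at which the RSS curve shows an inflection point
- Choose the smallest rank at which the decrease in residuals for the observed data is greater than the decrease for random data
Update: no fully deterministic method can automatically determine the rank; it is usually judged from both the reproducibility of the clustering and the residuals.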
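The two cophenetic-based rules can be sketched as small helpers over a list of per-rank coefficients. The function names and the numbers below are hypothetical, made up for illustration and not measured results. The RSS-based rules are not covered. Since no rule is fully reliable on its own, the output is best read as a candidate to inspect, not a decision.

```python
def first_drop(ranks, coph):
    """Smallest rank after which the cophenetic coefficient decreases, or None."""
    for i in range(len(ranks) - 1):
        if coph[i + 1] < coph[i]:
            return ranks[i]
    return None


def before_largest_change(ranks, coph):
    """Rank just before the largest absolute change between consecutive ranks."""
    changes = [abs(coph[i + 1] - coph[i]) for i in range(len(coph) - 1)]
    return ranks[changes.index(max(changes))]


# Hypothetical values, for illustration only.
ranks = [2, 3, 4, 5, 6]
coph = [0.98, 0.99, 0.97, 0.80, 0.78]

print(first_drop(ranks, coph))
print(before_largest_change(ranks, coph))
```

On this made-up curve the two rules disagree (one stops at the first small dip, the other at the large fall), which illustrates why the choice is usually cross-checked against the residuals.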