How to read NMF rank metric plots

Mubu notes

  • NMF
  • W basis

  • H coefficients
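To make the W/H notation concrete, here is a minimal sketch of the factorization V ≈ W·H using Lee-Seung multiplicative updates for the Frobenius loss. The function name `nmf_multiplicative`, the toy data, and the iteration count are illustrative assumptions, not something taken from the notes or from a specific NMF package.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, seed=0, eps=1e-10):
    """Factor a non-negative matrix V (features x samples) as V ~ W @ H.

    W (features x rank) is the basis matrix, H (rank x samples) holds the
    coefficients. Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy non-negative data built from 3 hidden factors (made up for the example).
toy_rng = np.random.default_rng(1)
V = toy_rng.random((20, 3)) @ toy_rng.random((3, 12))

W, H = nmf_multiplicative(V, rank=3)
rss = float(np.sum((V - W @ H) ** 2))  # residual sum of squares of the fit
print(W.shape, H.shape, rss)
```

The rank passed in here is exactly the quantity that the metrics below are meant to help choose.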

  • Rank selection


  • Sparseness
  • Residuals and the residual sum of squares (RSS) need no explanation
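For reference, both quantities fit in a few lines. The sparseness below is Hoyer's definition, which is one common choice and, as far as I know, the one used by the R NMF package. Treat that as an assumption about which sparseness the plot shows. The helper names are hypothetical.

```python
import numpy as np

def hoyer_sparseness(x):
    """Hoyer sparseness: 0 when all entries are equal, 1 when only one is non-zero."""
    x = np.abs(np.asarray(x, dtype=float)).ravel()
    n = x.size
    l2 = np.sqrt(np.sum(x ** 2))
    if n < 2 or l2 == 0:
        return 0.0  # not defined for these cases; 0 is an arbitrary convention here
    return float((np.sqrt(n) - x.sum() / l2) / (np.sqrt(n) - 1))

def residual_sum_of_squares(V, W, H):
    """Sum of squared entries of the residual matrix V - W @ H."""
    residual = np.asarray(V, dtype=float) - np.asarray(W, dtype=float) @ np.asarray(H, dtype=float)
    return float(np.sum(residual ** 2))

dense = hoyer_sparseness([1, 1, 1, 1])   # all entries equal
sparse = hoyer_sparseness([0, 0, 3, 0])  # a single non-zero entry

# Tiny hand-made factorization: W @ H = [[1, 2], [1, 2]], so the residual is [[0, 0], [2, 2]].
toy_rss = residual_sum_of_squares([[1, 2], [3, 4]], [[1], [1]], [[1, 2]])
print(dense, sparse, toy_rss)
```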

  • dispersion

    • In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.[1] Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range. In other words, it is a concept similar to variance and standard deviation.
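In NMF rank-survey plots specifically, the curve labelled dispersion is, to my understanding, usually the dispersion coefficient of the consensus matrix (Kim and Park, 2007) rather than a plain variance. This is an assumption about which plot these notes describe. A minimal sketch, with a hypothetical helper name and hand-made consensus matrices:

```python
import numpy as np

def dispersion_coefficient(consensus):
    """Mean of 4 * (C_ij - 1/2)^2 over all entries of an n x n consensus matrix.

    Equals 1 when every entry is 0 or 1 (perfectly reproducible clustering)
    and 0 when every entry is 0.5 (completely ambiguous clustering).
    """
    C = np.asarray(consensus, dtype=float)
    n = C.shape[0]
    return float(np.sum(4.0 * (C - 0.5) ** 2) / n ** 2)

# Hand-made examples, not results from a real NMF run.
crisp = np.array([[1, 1, 0],
                  [1, 1, 0],
                  [0, 0, 1]])
fuzzy = np.full((3, 3), 0.5)

crisp_score = dispersion_coefficient(crisp)
fuzzy_score = dispersion_coefficient(fuzzy)
print(crisp_score, fuzzy_score)
```

Read this way, a higher value means the sample clustering is more reproducible across runs at that rank.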
  • silhouette

    • Silhouette refers to a method of interpretation and validation of consistency within clusters of data. The technique provides a succinct graphical representation of how well each object lies within its cluster.[1]

    • The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.

    • The silhouette can be calculated with any distance metric, such as the Euclidean distance or the Manhattan distance.
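The cohesion/separation definition above can be sketched directly. This is a small, unoptimized version using Euclidean distance. The function name and toy points are made up for illustration, and it assumes at least two clusters.

```python
import numpy as np

def silhouette_values(X, labels):
    """Per-point silhouette s = (b - a) / max(a, b) with Euclidean distance.

    a: mean distance to the other points of the same cluster (cohesion)
    b: smallest mean distance to the points of any other cluster (separation)
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    s = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False  # do not count the point itself
        if not same.any():
            continue  # singleton cluster: silhouette is taken as 0
        a = D[i, same].mean()
        b = min(D[i, labels == k].mean()
                for k in np.unique(labels) if k != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

# Two well-separated pairs of points (toy data).
X = [[0, 0], [0, 1], [10, 0], [10, 1]]
good = silhouette_values(X, [0, 0, 1, 1])  # labels follow the real groups
bad = silhouette_values(X, [0, 1, 0, 1])   # labels cut across the groups
print(good, bad)
```

With labels that follow the two groups, the values are close to +1. With labels that cut across them, they turn negative, which matches the interpretation given above.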

  • cophenetic correlation

    • Description

http://blog.sciencenet.cn/home.php?mod=space&uid=3406804&do=blog&quickforward=1&id=1175517
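As a rough sketch of how the cophenetic correlation is usually obtained in this setting: build a consensus matrix from the cluster labels of several runs, convert it to distances (1 − consensus), cluster hierarchically, and correlate the tree-implied distances with the original ones. The helper names, the average-linkage choice, and the toy label runs are assumptions. In a real analysis the labels would come from repeated NMF runs, for example the index of the largest coefficient in each column of H.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform

def consensus_matrix(label_runs):
    """Fraction of runs in which each pair of samples shares a cluster label."""
    runs = np.asarray(label_runs)
    same_cluster = (runs[:, :, None] == runs[:, None, :]).astype(float)
    return same_cluster.mean(axis=0)

def cophenetic_coefficient(consensus):
    """Cophenetic correlation of an average-linkage tree built on 1 - consensus."""
    dist = 1.0 - np.asarray(consensus, dtype=float)
    np.fill_diagonal(dist, 0.0)
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method="average")
    coef, _ = cophenet(tree, condensed)
    return float(coef)

# Toy label runs over 6 samples (made up; label names may differ between runs).
stable_runs = [[0, 0, 0, 1, 1, 1],
               [1, 1, 1, 0, 0, 0],
               [0, 0, 0, 1, 1, 1]]
unstable_runs = [[0, 0, 0, 1, 1, 1],
                 [0, 0, 1, 1, 1, 0],
                 [0, 1, 0, 1, 0, 1]]

stable = cophenetic_coefficient(consensus_matrix(stable_runs))
unstable = cophenetic_coefficient(consensus_matrix(unstable_runs))
print(stable, unstable)
```

Runs that always agree give a consensus matrix of zeros and ones and a coefficient of 1. Runs that disagree give a lower value.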

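  • Summary of rank-selection methods

    • Choose the smallest rank at which the cophenetic correlation coefficient starts to decrease

    • Choose the point just before the largest change in the curve of cophenetic correlation versus rank

    • Choose the rank at which the RSS curve shows an inflection point

    • Choose the smallest rank at which the decrease in residuals for the observed data becomes smaller than the decrease in residuals for randomized data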

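Update: no fully deterministic method can pick the number of factors automatically. In practice the rank is judged from both the reproducibility of the clustering and the residuals.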
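The two cophenetic-based rules of thumb can be written down as small helpers, mainly to make their wording unambiguous. This is one possible reading of those rules. The function names and the numbers are hypothetical, not results from a real rank survey. Given the caveat above, the output is a starting point for inspection rather than an answer.

```python
def first_rank_before_decrease(ranks, cophenetic):
    """Smallest rank whose cophenetic coefficient is followed by a lower value."""
    for i in range(len(ranks) - 1):
        if cophenetic[i + 1] < cophenetic[i]:
            return ranks[i]
    return ranks[-1]  # the curve never decreases over the surveyed range

def rank_before_largest_drop(ranks, cophenetic):
    """Rank located just before the largest single-step drop of the curve."""
    drops = [cophenetic[i] - cophenetic[i + 1] for i in range(len(ranks) - 1)]
    return ranks[drops.index(max(drops))]

# Hypothetical survey values, for illustration only.
ranks = [2, 3, 4, 5, 6]
cophenetic = [0.99, 0.995, 0.97, 0.90, 0.91]

by_first_decrease = first_rank_before_decrease(ranks, cophenetic)
by_largest_drop = rank_before_largest_drop(ranks, cophenetic)
print(by_first_decrease, by_largest_drop)
```

On these made-up values the two rules disagree (3 versus 4), which is a small illustration of why the rank is usually judged from several curves together.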
