In speaker identification and speaker verification, wrong classifications can result from a high similarity between speakers that is represented in the speaker models. These similarities can be explored using the application of cluster analysis.
In speaker detection, every speaker is represented as a Gaussian Mixture Model (GMM). By using a dissimilarity measure for these models (e.g. cross-entropy), cluster analysis can be applied. Hierarchical agglomerative clustering methods are able to show structures in the form of a dendrogram.
Structures in speech corpora can be visualized and can therefore be used to select groups of highly similar or dissimilar speakers. The investigation of the structures concerning the aspect of misclassification can lead to model generation improvements.