Presidential Fellows Articles and Research

Speaker Utterances Tying Among Speaker Segmented Audio Documents Using Hierarchical Classification: Towards Speaker Indexing of Audio Databases

Sylvain Meignier, LIA - Laboratoire Informatique d'Avignon
Jean-François Bonastre, LIA - Laboratoire Informatique d'Avignon
Ivan Magrin-Chagnolleau, Chapman University

Document Type

Conference Proceeding

Publication Date

2002

Abstract

Speaker indexing of an audio database consists of organizing the audio data according to the speakers present in the database. It is composed of three steps: (1) segmentation by speakers of each audio document; (2) speaker tying among the various segmented portions of the audio documents; and (3) generation of a speaker-based index. This paper focuses on the second step, the speaker tying task, which has not been addressed in the literature. The result of this task is a classification of the segmented acoustic data into clusters; each cluster should represent one speaker. This paper investigates hierarchical classification approaches for speaker tying. Two new discriminant dissimilarity measures and a new bottom-up algorithm are also proposed. The experiments are conducted on a subset of the Switchboard database, a conversational telephone database, and show that the proposed method achieves very satisfactory speaker tying among various audio documents, with a good level of purity for the clusters, but with a number of clusters significantly higher than the number of speakers.
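
As a rough illustration of the general idea of bottom-up (agglomerative) clustering and of cluster purity, here is a minimal sketch. It is not the algorithm or the dissimilarity measures proposed in the paper, which the abstract does not specify. The average-linkage rule, the stopping threshold, the function names, and the toy one-dimensional "features" are all assumptions made for the example.

```python
from itertools import combinations

def average_linkage(a, b, dist):
    """Mean pairwise dissimilarity between two clusters (assumed linkage rule)."""
    return sum(dist(x, y) for x in a for y in b) / (len(a) * len(b))

def bottom_up_clustering(items, dist, threshold):
    """Merge the two closest clusters until no pair is closer than threshold."""
    clusters = [[item] for item in items]
    while len(clusters) > 1:
        (i, j), best = min(
            (((i, j), average_linkage(clusters[i], clusters[j], dist))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters

def purity(clusters, speaker_of):
    """Fraction of items that belong to the majority speaker of their cluster."""
    total = sum(len(c) for c in clusters)
    majority = 0
    for c in clusters:
        labels = [speaker_of(x) for x in c]
        majority += max(labels.count(label) for label in set(labels))
    return majority / total

# Hypothetical toy data: (true speaker, 1-D feature) per segment.
segments = [("A", 0.0), ("A", 0.2), ("B", 5.0), ("B", 5.3), ("A", 1.5)]
clusters = bottom_up_clustering(
    segments,
    dist=lambda x, y: abs(x[1] - y[1]),  # assumed dissimilarity
    threshold=0.5,                        # assumed stopping criterion
)
print(len(clusters), purity(clusters, speaker_of=lambda s: s[0]))
```

With a conservative threshold, this toy run yields pure clusters but more clusters than speakers, which is the same kind of trade-off the abstract reports.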

Comments

This is a pre-copy-editing, author-produced PDF of an article presented at the ISCA International Conference on Spoken Language Processing (ICSLP 2002). This article may not exactly replicate the final published version.

Recommended Citation

Sylvain Meignier, Jean-François Bonastre, Ivan Magrin-Chagnolleau. Speaker utterances tying among speaker segmented audio documents using hierarchical classification: towards speaker indexing of audio databases. ISCA International Conference on Spoken Language Processing (ICSLP 2002), 2002, Denver, CO, United States. pp. 577-580. ⟨hal-01434586⟩

Chapman University Digital Commons
