Recommended Citation

Published in A Speaker Odyssey - The Speaker Recognition Workshop: Crete, Greece, June 18, 2001.

NOTE: At the time of publication, the author Xiaozheng Zhang was not yet affiliated with Cal Poly.

Abstract

In the information age, privacy and personalization are at the forefront of today's society. As such, biometrics is viewed as an essential component of current and evolving technological systems. Consumers demand unobtrusive and non-invasive approaches. In our previous work, we demonstrated a speaker verification system that meets these criteria. However, fielded systems face additional constraints: recognition transactions are often performed in adverse environments and across diverse populations, which necessitates robust solutions.

We propose a multimodal approach that builds on our current state-of-the-art speaker verification technology. To maintain the transparent nature of the speech interface, we focus on optical sensing technology to provide the additional modality, giving us an audio-visual person recognition system. For the audio domain, we use our existing speaker verification system. For the visual domain, we focus on lip motion.

The visual processing method makes use of both color and edge information, combined within a Markov random field (MRF) framework, to localize the lips. Geometric features are extracted and input to a polynomial classifier for the person recognition process. A late integration approach, based on a probabilistic model, is employed to combine the two modalities. The system is tested on the XM2VTS database, with additive white Gaussian noise (AWGN) added in the audio domain over a range of signal-to-noise ratios.
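The noise condition mentioned above can be illustrated with a short sketch. This is not the authors' test code; it only shows the standard way to scale white Gaussian noise so that a signal reaches a target signal-to-noise ratio. The function names `add_awgn` and `measured_snr_db`, the synthetic sine tone, and the 10 dB target are all assumptions chosen for illustration.

```python
import math
import random


def add_awgn(signal, snr_db, rng=None):
    """Return a noisy copy of `signal` with white Gaussian noise at `snr_db` dB.

    Hypothetical helper: the noise variance is chosen so that
    10 * log10(signal_power / noise_power) equals `snr_db` in expectation.
    """
    rng = rng or random.Random()
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean signal and its noisy version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Illustrative input: a 440 Hz tone sampled at 16 kHz (assumed values,
# standing in for a real speech recording).
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_awgn(clean, snr_db=10.0, rng=random.Random(0))
```

Calling `measured_snr_db(clean, noisy)` gives a value close to the requested 10 dB; repeating the call to `add_awgn` with different `snr_db` values produces the kind of SNR sweep the abstract describes.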

Disciplines

Electrical and Computer Engineering

Copyright

2001 International Speech Communication Association.

Included in

Electrical and Computer Engineering Commons

URL: https://digitalcommons.calpoly.edu/eeng_fac/273

Using Lip Features for Multimodal Speaker Verification
