TY - JOUR
T1 - The impact in forensic voice comparison of lack of calibration and of mismatched conditions between the known-speaker recording and the relevant-population sample recordings
AU - Morrison, Geoffrey Stewart
N1 - © 2017, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.
PY - 2018/2/1
Y1 - 2018/2/1
N2 - In a 2017 New South Wales case, a forensic practitioner conducted a forensic voice comparison using a Gaussian mixture model – universal background model (GMM-UBM). The practitioner did not report the results of empirical tests of the performance of this system under conditions reflecting those of the case under investigation. The practitioner trained the model for the numerator of the likelihood ratio using the known-speaker recording, but trained the model for the denominator of the likelihood ratio (the UBM) using high-quality audio recordings, not recordings which reflected the conditions of the known-speaker recording. There was therefore a difference in the mismatch between the numerator model and the questioned-speaker recording versus the mismatch between the denominator model and the questioned-speaker recording. In addition, the practitioner did not calibrate the output of the system. The present paper empirically tests the performance of a replication of the practitioner’s system. It also tests a system in which the UBM was trained on known-speaker-condition data and which was empirically calibrated. The performance of the former system was very poor, and the performance of the latter was substantially better.
KW - Forensic voice comparison
KW - Automatic speaker recognition
KW - GMM-UBM
KW - Likelihood ratio
KW - Validation
KW - Calibration
KW - Admissibility
UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85039739231
UR - http://linkinghub.elsevier.com/retrieve/pii/S0379073817305406
U2 - 10.1016/j.forsciint.2017.12.024
DO - 10.1016/j.forsciint.2017.12.024
M3 - Article
SN - 0379-0738
VL - 283
SP - e1-e7
JO - Forensic Science International
JF - Forensic Science International
ER -