George Zimmerman being sworn in. Picture courtesy of the NY Times.

The George Zimmerman / Trayvon Martin case is gearing up for trial, and with the state prosecutor’s introduction of audio voice ID experts, it is surely one of the more controversial forensic audio cases we’ve seen in the last few years. The defense claims these state experts are using unproven and nonscientific techniques. Using these controversial techniques, the state wishes to prove who cried for help during the 911 call.

For those who aren’t aware of the case, George Zimmerman was on a neighborhood watch patrol when he spotted 17-year-old Trayvon Martin walking through the neighborhood in the rain. Zimmerman claims Martin was acting suspiciously, and because there had been a recent string of burglaries in the neighborhood, he decided to follow him. An altercation ensued, and Zimmerman fired one fatal shot into Martin’s chest, which he claims was in self-defense.

While the incident was in progress, a neighbor called 911, and in the background of that call, we can clearly hear an individual screaming for “help.” We know that only two people were involved in the altercation: Trayvon Martin and George Zimmerman. So how can we figure out which one of them was calling for help? The prosecution is claiming that their voice identification experts can uncover the answer through their use of audio enhancement and audio analysis software.

Just like fingerprints, every person’s voice is unique: no two voices are the same. We distinguish between fingerprints through close-up visual observation. With voice, you can likewise look at the audio signal up close to distinguish differences.

With high-tech tools, a piece of audio can be loaded into software and converted to a visual representation of the signal over a timeline. You can study that digital representation of the sound to observe how an individual pronounces words and even the individual sounds within them.
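To give a rough idea of what "converting audio to a visual signal" involves, here is a minimal sketch of a spectrogram computed with a short-time Fourier transform. It is not the software used by any expert in this case; the function name, the frame sizes, and the synthetic 440 Hz tone standing in for a recording are all assumptions made for illustration.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=512, hop=256):
    """Return (times, freqs, magnitudes) for a 1-D audio signal.

    Illustrative only: frame_len and hop are arbitrary choices.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One magnitude spectrum per frame: rows are time, columns are frequency.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate
    return times, freqs, magnitudes

# Synthetic stand-in for a recording: one second of a 440 Hz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

times, freqs, mags = spectrogram(tone, sample_rate)
# The strongest frequency bin should sit near the tone's pitch.
peak_hz = freqs[mags.mean(axis=0).argmax()]
```

Plotting `mags` with time on one axis and frequency on the other gives the kind of picture an analyst studies. A real voice produces a far more complicated pattern than this single tone.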

For example, every person says the word “couch” differently. So, assuming you can gather a clear sample of two people saying “couch,” when you forensically compare the two people’s pronunciation of the word, you’re able to see how the audio samples differ from one another.
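As a very simplified illustration of comparing two samples, the sketch below averages each sample's frequency content and scores how closely the two averages match. This is a toy measure, not an accepted forensic method; the function names and the synthetic tones standing in for two speakers are hypothetical.

```python
import numpy as np

def average_spectrum(signal, frame_len=512, hop=256):
    """Average magnitude spectrum over overlapping windowed frames."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    frames = np.stack([signal[s : s + frame_len] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

def spectral_similarity(a, b):
    """Cosine similarity of two average spectra (1.0 means same shape)."""
    sa, sb = average_spectrum(a), average_spectrum(b)
    return float(np.dot(sa, sb) / (np.linalg.norm(sa) * np.linalg.norm(sb)))

# Synthetic stand-ins for two speakers: tones with different pitches.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
voice_a = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 240 * t)
voice_b = np.sin(2 * np.pi * 300 * t) + 0.5 * np.sin(2 * np.pi * 600 * t)

same = spectral_similarity(voice_a, voice_a)       # a sample against itself
different = spectral_similarity(voice_a, voice_b)  # two distinct "voices"
```

With clean, distinct inputs like these, the two scores are easy to tell apart. Real recordings that are short, noisy, or shouted give scores that are much harder to interpret.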

But in order to do a comparison such as this, one has to have recordings of multiple words said by the people under review. Not only that, but it’s important to note that words said under duress would not give the same results as words said during relative calm.

In the Zimmerman/Martin case, not only are there very few words on the 911 call, but they are said under extreme duress. In addition, the sound of those few words is of very poor quality!

Although the prosecution’s expert claims the yelling is not from Zimmerman, I believe it is impossible for this to be supported in court. Maybe in the future, new technology and techniques will be available, but not today. And the defense audio experts — who are, by the way, much more respected in this field than the prosecution audio experts — agree with me.

It will be interesting to see the specifics of the voice experts’ testimony during this hearing and how the judge rules on whether the prosecution can have this audio expert testify in front of a jury during trial.