Audio testing insights: DXOMARK’s audio clips

Last October, together with the launch of our brand-new audio quality benchmark, we published a few articles outlining how DXOMARK tests audio playback and recording in smartphones. In this article, we’ll take a deeper look at our rigorous protocol by exploring the audio clips we use for evaluating the recording quality of each tested device.

Let’s start with an important question: why are audio clips necessary? Why not simply film a concert, record a meeting, or call a friend?

Real-life situations are by definition non-reproducible.

While all these situations are indeed representative of typical smartphone use cases, they don’t qualify as valid testing environments. For every device to be evaluated in an unbiased and impartial manner, it’s essential to expose each smartphone to strictly identical sound scenes within perfectly controlled environments, allowing our Audio team to obtain robust and consistent test results.

To meet our protocol’s highly specific needs, we decided to create our own proprietary audio clips. In addition to the Outdoor and Electronic Concert environments, our team conceived three sound scenes (Urban, Office, and Home), each of them designed to evaluate precise attributes in particular scenarios.

The Office audio clip is a mix of various sound elements found in a generic open space.

The Office audio clip, with its small open-space chatter, its mouse and keyboard clicks, and its people walking by, is dedicated to the Meeting scenario. It focuses on signal-to-noise ratio and sound envelope, background artifacts, objective and perceptual loudness, as well as numerous timbre, spatial, and artifacts attributes. We can use each audio clip in multiple scenarios: for instance, we use the Urban recording for testing both Life Video (videos filmed with the rear camera) and Selfie Video recording quality.
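To make the signal-to-noise ratio mentioned above more concrete, here is a minimal sketch of the textbook definition: the level of the wanted voice relative to the level of the background, expressed in decibels. It is not DXOMARK’s measurement procedure, which this article does not detail. The function names `rms` and `snr_db`, the test tones, and the 48 kHz sample rate are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 48_000  # assumed sample rate, for illustration only


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB: 20 * log10(rms(signal) / rms(noise))."""
    signal_rms = rms(signal)
    noise_rms = rms(noise)
    if noise_rms == 0:
        return math.inf
    if signal_rms == 0:
        return -math.inf
    return 20 * math.log10(signal_rms / noise_rms)


def tone(freq_hz, amplitude, seconds=1.0):
    """Synthetic sine tone standing in for a real recording."""
    n = int(SAMPLE_RATE * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]


# Hypothetical stand-ins: a "voice" ten times stronger than the "chatter".
voice = tone(440, 0.5)
chatter = tone(1000, 0.05)
print(round(snr_db(voice, chatter), 1))  # prints 20.0
```

In a real recording the voice and the background cannot be separated this cleanly, so the two inputs would have to come from segments where only one of them is present.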

Each clip is a mix of background sounds, recorded with a HEAD acoustics eight-microphone array in various Parisian locations (the banks of the Seine, rue de Rivoli’s intense traffic, rue de la Huchette’s touristic effervescence and musicians, rue Montorgueil’s myriad shops and restaurants, and obviously, DXOMARK’s offices), and vocal clips recorded with a measurement microphone in the anechoic chamber of the well-known IRCAM institute.

Recording in IRCAM’s anechoic chamber

Vocals are a crucial element for evaluating speech intelligibility, so our team thought them through carefully, coming up with four different timbres (two male, two female) with different accents interweaving at various angles and amplitudes, plus two additional troublemaking voices used only for interference purposes. As for the words, they come from the Harvard sentences, a set of 72 lists of 10 phonetically balanced sentences each, in which each of English’s 44 phonemes appears at the same frequency as it does in the language of Shakespeare.

Now that we’ve discussed how these audio clips were created, let’s dive into the heart of the matter and see how we use them for evaluating smartphone recording quality. Follow us into DXOMARK’s offices: a modern building, a vast hall, an elevator, a corridor, an entrance door, another corridor, another door, a lab, an acoustic door... and here we are, in the listening room.

Tested device placed in the center of the listening room

Why, come on in, don’t be shy! Here’s where we play the audio clips for the device’s microphones in a very precise and reproducible way: once we have meticulously placed the smartphone at the center of the auditorium, we play back the background at 360° through 8 calibrated speakers, while synchronously playing back vocals through 8 other dedicated speakers.
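One common way to keep two groups of speakers in sync is to merge both stems into a single multichannel buffer, so that every speaker receives its sample in the same frame. The sketch below illustrates that idea only; the article does not say how DXOMARK’s playback chain is built, and the function name `merge_stems` and the channel ordering (background first, vocals second) are assumptions.

```python
def merge_stems(background, vocals):
    """Combine two multichannel stems into one list of frames.

    Each stem is a list of channels, and each channel is a list of samples.
    Shorter channels are padded with silence so all channels share one length.
    Each returned frame holds one sample per speaker, background channels first.
    """
    channels = list(background) + list(vocals)
    length = max(len(ch) for ch in channels)
    padded = [list(ch) + [0.0] * (length - len(ch)) for ch in channels]
    return [tuple(ch[i] for ch in padded) for i in range(length)]


# Hypothetical data: 8 background channels of 4 samples, 8 vocal channels of 3.
background = [[float(c)] * 4 for c in range(8)]
vocals = [[10.0 + c] * 3 for c in range(8)]

frames = merge_stems(background, vocals)
print(len(frames), len(frames[0]))  # prints 4 16
```

Because both stems travel in the same frames, they cannot drift apart during playback, which is the property a reproducible test needs.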

We then compare the recorded file to recordings from other smartphones made under the exact same conditions. So that you can hear the difference, we prepared an extract of the same audio clip recorded with three smartphones: our current top scorer for recording, a second one that’s somewhere in the middle of the pack, and a third one that’s among the least capable phones we have tested so far for audio recording.

Can you recognize them, based on the audio files included in the reviews? On that subject, you may wonder what to listen for in these extracts, in which you can make out the first sentence from the 20th Harvard Sentences list (“The fruit of a fig tree is apple-shaped!”):

In this Home clip extract, voices should remain clear and natural, not canned or nasal. Changes in vocal intensity shouldn’t induce sudden drops in volume (caused by temporal artifacts such as overcompression), and sources should be precisely localizable. Finally, the background shouldn’t overpower the voices, thus leaving speech perfectly intelligible. Lots of elements to listen for in just a few seconds!
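For readers who would like to look for sudden volume drops in a recording of their own, here is a crude heuristic sketch: compute a short-term level for consecutive frames and flag any frame that is much quieter than the one before it. This is not how DXOMARK scores temporal artifacts, which relies on the protocol described in this article. The function names, the 10 ms frame length, and the 6 dB threshold are assumptions picked for illustration.

```python
import math


def envelope_db(samples, frame_len):
    """Short-term RMS level (in dB relative to full scale) per non-overlapping frame."""
    levels = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        power = sum(s * s for s in frame) / frame_len
        levels.append(10 * math.log10(max(power, 1e-12)))  # floor avoids log(0)
    return levels


def sudden_drops(levels_db, threshold_db=6.0):
    """Indices of frames more than threshold_db quieter than the previous frame."""
    return [i for i in range(1, len(levels_db))
            if levels_db[i - 1] - levels_db[i] > threshold_db]


# Hypothetical signal: a 1 kHz tone at 48 kHz whose fourth 10 ms frame dips in level.
SAMPLE_RATE = 48_000
FRAME_LEN = 480  # 10 ms at 48 kHz
signal = []
for n in range(5 * FRAME_LEN):
    amplitude = 0.1 if 3 * FRAME_LEN <= n < 4 * FRAME_LEN else 0.5
    signal.append(amplitude * math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE))

levels = envelope_db(signal, FRAME_LEN)
print(sudden_drops(levels))  # prints [3]
```

A natural pause in speech would also trigger this detector, so in practice it can only point to passages worth listening to again.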

Feel free to guess in the comments section which smartphones were used in the comparison clip, and to tell us which other specific aspects of our audio protocol you’d like explained.
