THE FACE OF YOUR VOICE (2019) by Frederik De Wilde
Privacy and Surveillance — An exploration at the intersection of art, science, and technology
Every person carries a singular acoustic signature: the voice.
We use it to exchange information, to tell stories, to persuade and to confess. Yet beyond language, the voice encodes identity.
Today, with contemporary AI systems, even a brief recording can be sufficient to authenticate, classify, and identify a speaker.
What once seemed speculative—echoing the ambitions of companies such as Lernout & Hauspie—has become commonplace. Voice activates assistants like Siri. Increasingly, it does more than respond; it reveals.
The Face of Your Voice pushes this premise further: Can a face be reconstructed from a voice sample alone?
Through artificial intelligence, raw acoustic data is translated into information, then transformed into visual form. A neural network—trained like an artificial nervous system—learns correlations between speech patterns and facial features. From tone, cadence, and resonance, it generates a portrait: age, gender, ethnicity—rendered as a frontal face with a neutral expression. A canonical human mask derived from data.
The artworks in this series are generated from voice samples of prominent public figures: Jeff Bezos, Elon Musk, Eminem, and Kim Kardashian. Their reconstructed faces emerge not from photographic input, but from acoustic data alone—highlighting both the power and the ambiguity of algorithmic inference. The resulting portraits oscillate between resemblance and abstraction, between recognition and projection.
The project raises urgent questions.
If a voice can be reverse-engineered into a face, where does identity reside? A voice cannot simply be exchanged like a password. And yet technologies increasingly manipulate and synthesize it. As voice interfaces become frictionless—natural language replacing buttons and screens—technology recedes into the background. We speak, and the world responds. But when does convenience become surveillance? When does representation become misrepresentation? Can innovation evolve in step with ethics?
The Face of Your Voice situates itself within these tensions.
It extends the research of Speech2Face: Learning the Face Behind a Voice, developed at the Massachusetts Institute of Technology, which demonstrated that deep neural networks trained on millions of online videos can infer facial characteristics directly from audio. Because the system uses self-supervised learning, leveraging the natural co-occurrence of faces and speech in internet footage, it learns without explicit labeling.
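The idea of learning from co-occurrence can be illustrated with a toy sketch. It is not the Speech2Face implementation: the data is synthetic, the "voice encoder" is a single linear map rather than a deep network, and the names `make_paired_clips` and `train_voice_encoder` are hypothetical. What it shows is the training signal itself: the face features seen alongside each audio clip serve as the target, so no human-written label is needed.

```python
import numpy as np


def make_paired_clips(n_clips, audio_dim, face_dim, seed=0):
    """Simulate clips in which a voice and a face occur together.

    Assumption for illustration only: face features depend on audio
    features through a hidden linear map plus a little noise.
    """
    rng = np.random.default_rng(seed)
    audio = rng.normal(size=(n_clips, audio_dim))
    hidden_map = rng.normal(size=(audio_dim, face_dim))
    face = audio @ hidden_map + 0.1 * rng.normal(size=(n_clips, face_dim))
    return audio, face


def train_voice_encoder(audio, face, steps=200, lr=0.5):
    """Fit a linear voice-to-face map by gradient descent.

    The only supervision is the face feature vector that co-occurs
    with each audio clip; there are no manual annotations.
    """
    weights = np.zeros((audio.shape[1], face.shape[1]))
    losses = []
    for _ in range(steps):
        residual = audio @ weights - face            # prediction error
        losses.append(float(np.mean(residual ** 2)))
        grad = 2.0 * audio.T @ residual / residual.size
        weights -= lr * grad                         # gradient step
    return weights, losses


if __name__ == "__main__":
    audio, face = make_paired_clips(n_clips=256, audio_dim=8, face_dim=4)
    weights, losses = train_voice_encoder(audio, face)
    print(f"initial loss: {losses[0]:.3f}, final loss: {losses[-1]:.3f}")
```

In this sketch the loss falls as the map learns to predict the face features from audio alone. A real system would replace the linear map with a deep network and the synthetic vectors with features extracted from video frames and speech.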
This artistic continuation translates scientific inquiry into embodied experience. It asks not only what can be inferred, but what it means. For now, we are left with an uncanny encounter. We listen to ourselves—and see what the machine hears.
Perhaps the face it finds resembles us. Perhaps it does not.
The Face of Your Voice is the result of a collaboration with Tae-Hyun Oh (Massachusetts Institute of Technology).
The project was featured in Neural under the title “The Face of Your Voice 3D, from the Verbal to the Physiognomic”, situating the work within critical discourse on media art, AI, and digital culture.