Senior Audio / ML Engineer (Local TTS / On-Device)
VoiceWunder – Fully Remote | Start: July 2026
About the Project
VoiceWunder has built a native ARA2 plugin for Pro Tools and a UXP plugin for Premiere Pro, launching soon. The plugins currently run on ElevenLabs v3, and we are developing a high-performance local multi-model TTS engine for professional Dialogue & ADR workflows.
We have an existing production plugin with active professional users.
Your Mission
Build a local TTS / STS system aimed at matching or exceeding current top-tier cloud engines in quality and speed, running natively in our plugins.
Core Responsibilities
- Design and implement a multi-model inference router / orchestration layer
- Integrate and optimize state-of-the-art open-source TTS models
- Deliver high-quality TTS, STS, voice cloning, emotional expression, prosody control, voice design & remix
- Implement an integrated denoising pipeline
- Optimize performance heavily for Apple Silicon (MLX) and NVIDIA GPUs
- Ensure real-time performance suitable for professional DAW environments
- Fine-tune models using high-quality studio recordings
- Collaborate closely with our lead ARA2/JUCE developer
Requirements
- Strong experience with modern local TTS models and on-device inference
- Deep knowledge of model optimization, quantization, streaming and low-latency audio generation
- Experience with Apple Silicon (MLX) and/or CUDA is a strong plus
- Passion for building production-grade, expressive speech synthesis systems
Compensation
- Competitive fixed-price compensation for an 8-month project
- Milestone-based payments
What We Offer
- High-impact role: your work will be core to our Local Pro product
- Direct collaboration with experienced plugin developers and audio editors
- Fully remote & flexible schedule
How to Apply
If this sounds like a good fit, please send your resume, a short cover letter, and a portfolio or examples of relevant work to: [email protected]. Please use the subject line: “Senior Audio / ML Engineer Application”