Meta Unveils SAM Audio, Pioneering Unified AI Model for Sound Segmentation

Details

On December 16, 2025, Meta introduced SAM Audio, billed as the first unified AI model capable of segmenting sounds from complex audio mixtures.
The model lets users isolate audio with three types of prompt: text (such as "dog barking"), visual (selecting objects within a video), and span (highlighting specific time segments). The prompts can be used separately or in combination.
SAM Audio can be explored in the Segment Anything Playground, an interactive platform where users can experiment with sample files or their own audio and video, and the model is also available for direct download.
The release extends Meta's Segment Anything collection, which began with image segmentation (SAM, 2023), into the audio domain.
Applications include music production, podcasting, film and television post-production, scientific analysis, and accessibility support, offering a versatile alternative to fragmented or single-function audio separation tools.

Impact

With SAM Audio, Meta aims to redefine audio editing by bringing foundation models into creative workflows. The launch challenges incumbent audio software providers and could accelerate innovation in audio separation tools. The model’s combined text, visual, and span prompting positions Meta at the forefront of unified, multimodal AI automation in the media industry.
