Details

  • Meta has introduced SAM 3 and SAM 3D, the latest models in its Segment Anything line, and opened them to hands-on public use through the new web-based Segment Anything Playground.
  • SAM 3 lets users detect, segment, and track objects in both images and videos using flexible text and visual prompts, and accepts open-vocabulary descriptions (short noun phrases such as "striped red umbrella") that earlier Segment Anything models could not handle; a minimal usage sketch follows this list.
  • SAM 3D comes in two variants: SAM 3D Objects, for high-quality object and scene reconstruction, and SAM 3D Body, for detailed estimation of human pose and body shape; both produce 3D output from a single 2D image.
  • Both models will power practical tools across Meta products, including improved video editing in the Edits app, enhancements to Meta AI's Vibes, and the View in Room shopping feature on Facebook Marketplace, which lets buyers preview items in their own space.
  • Meta is releasing SAM 3's model weights, a new evaluation dataset built in collaboration with artists, and technical research outlining performance breakthroughs, including a presence head that decouples object recognition (deciding whether the prompted concept appears at all) from localization (finding where it is); a sketch of this idea also follows the list.
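
To make the prompting workflow concrete: SAM 3 takes an image (or a video) plus a short noun phrase and returns a mask, box, and confidence score for every matching instance. The sketch below illustrates that flow under assumed names; `sam3`, `build_sam3`, and `Sam3ImagePredictor` are hypothetical placeholders for illustration, not Meta's published API.

```python
# Minimal sketch of SAM 3's open-vocabulary prompting workflow.
# NOTE: `sam3`, `build_sam3`, and `Sam3ImagePredictor` are hypothetical
# names used for illustration; consult Meta's released code for the real API.
from PIL import Image

from sam3 import build_sam3, Sam3ImagePredictor  # hypothetical package

# Load the released checkpoint (Meta is publishing the model weights).
model = build_sam3(checkpoint="sam3.pt")
predictor = Sam3ImagePredictor(model)

image = Image.open("street_scene.jpg")
predictor.set_image(image)

# Open-vocabulary text prompt: a short noun phrase, not a fixed class label.
result = predictor.predict(text="striped red umbrella")

# Unlike earlier SAM releases, SAM 3 returns *all* matching instances,
# each with its own mask, bounding box, and confidence score.
for mask, box, score in zip(result.masks, result.boxes, result.scores):
    print(f"instance at {box} with confidence {score:.2f}")
```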
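
The presence head can be read as factoring each detection score into an image-level recognition term ("does the prompted concept appear anywhere?") and a per-query localization term ("is this particular candidate the right place?"). The PyTorch sketch below is a deliberately simplified illustration of that factorization, not SAM 3's actual architecture.

```python
import torch
import torch.nn as nn


class PresenceFactoredScorer(nn.Module):
    """Toy illustration of a presence head: one image-level score for
    'the prompted concept occurs somewhere' multiplies the per-query
    localization scores, so individual query heads no longer have to
    solve recognition themselves. Simplified sketch, not SAM 3's design."""

    def __init__(self, dim: int = 256):
        super().__init__()
        # A learned presence token attends over image features (recognition).
        self.presence_token = nn.Parameter(torch.randn(1, 1, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.presence_mlp = nn.Linear(dim, 1)
        # The per-query head only scores "is this query well localized?".
        self.local_mlp = nn.Linear(dim, 1)

    def forward(self, image_feats: torch.Tensor, query_feats: torch.Tensor):
        # image_feats: (B, N, dim) flattened image tokens
        # query_feats: (B, Q, dim) object-query embeddings
        b = image_feats.shape[0]
        pooled, _ = self.attn(
            self.presence_token.expand(b, -1, -1), image_feats, image_feats
        )
        presence = torch.sigmoid(self.presence_mlp(pooled))        # (B, 1, 1)
        localization = torch.sigmoid(self.local_mlp(query_feats))  # (B, Q, 1)
        # Final score factorizes: p(match) = p(present) * p(localized | present)
        return (presence * localization).squeeze(-1)               # (B, Q)
```

Because the presence term is computed once per image, individual queries compete only on localization quality, which is the separation of recognition from localization that the research highlights.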

Impact

With this launch, Meta sets a new standard in open-vocabulary computer vision, pairing 2D segmentation with single-image 3D reconstruction in a way that spans creative, commercial, and industrial uses. By releasing these capable, easily accessible models openly, Meta accelerates the pace of innovation for AR, robotics, and content creation, and pressures rivals like OpenAI and Google to advance their own segmentation tools in response.