Introducing the Segment Anything Model (SAM)
Meta has launched an AI model called the ‘Segment Anything Model’ (SAM) aimed at making it easier for researchers and web developers to analyze images. The tool allows users to create “cutouts” or segments of any item in an image by clicking on a point or drawing a box around the object. The applications for this technology range from research and creative editing to enhancing virtual reality experiences.
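To make the two prompt types concrete, here is a toy sketch in plain Python. It is not SAM and does not use Meta's code: a simple flood fill stands in for the point prompt, and a bounding-box filter stands in for the box prompt. The names `mask_from_point`, `mask_from_box`, and `cutout`, and the tiny grid "image", are all hypothetical and exist only for illustration.

```python
from collections import deque

def mask_from_point(image, point):
    """Toy stand-in for a point prompt: flood-fill from the clicked pixel
    over 4-connected neighbours that share its value. (Not SAM's method.)"""
    rows, cols = len(image), len(image[0])
    r0, c0 = point
    target = image[r0][c0]
    mask = [[False] * cols for _ in range(rows)]
    mask[r0][c0] = True
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and image[nr][nc] == target:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask

def mask_from_box(image, box, background=0):
    """Toy stand-in for a box prompt: keep non-background pixels inside
    the box, given as (top, left, bottom, right) with inclusive bounds."""
    top, left, bottom, right = box
    return [
        [top <= r <= bottom and left <= c <= right and px != background
         for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]

def cutout(image, mask, fill=0):
    """Keep only the masked pixels; everything else becomes `fill`."""
    return [
        [px if keep else fill for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]

# A 4x5 "image": 0 is background, 1 and 2 are two separate objects.
IMAGE = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 0],
]

POINT_MASK = mask_from_point(IMAGE, (1, 1))    # "click" on object 1
BOX_MASK = mask_from_box(IMAGE, (1, 4, 2, 4))  # "draw a box" around object 2
POINT_CUTOUT = cutout(IMAGE, POINT_MASK)
BOX_CUTOUT = cutout(IMAGE, BOX_MASK)
```

In both cases the result is a per-pixel mask that isolates one object, which is the kind of "cutout" the article describes. A real model such as SAM predicts that mask from learned image features rather than from exact pixel equality.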
Largest Segmentation Dataset
The company open-sourced its computer vision model, which was trained on 1.1 billion segmentation masks and 11 million images licensed from a large photo company. Meta worked with 130 human annotators based in Kenya to create the dataset through a combination of manual and automatic labeling of millions of images. The model runs in real time in a browser, so people can use it without advanced AI infrastructure or large data capacity.
Integration and Future Applications
Object recognition and computer vision technologies are already built into devices such as surveillance cameras, drones, and autonomous vehicles, as well as into various image editing software. Meta aims to encourage users to build on top of its generalized model for specific use cases in fields like biology and agriculture. Moreover, the tool is well suited to virtual reality spaces such as Meta’s online VR game Horizon Worlds, as it can be used for gaze-based detection of objects through VR and AR headsets.
Potential Limitations
Paul Powers, CEO and founder of Physna, points out that a computer vision model trained on a database of 2D images has limitations, such as difficulty detecting and selecting objects that appear in varied orientations or are partly obscured. As a result, it may not accurately identify non-standardized objects through AR/VR headsets, or detect partially covered objects in public spaces for autonomous vehicles.