July 31, 2023

Combining Segment Anything and Human Expertise

Meta AI's Segment Anything Model (SAM) generates segmented images with impressive accuracy. Check out our breakdown of SAM.

In this article, we show you how to combine SAM and human validation to label home interior images.

Use Case: Home Interior Image Semantic Segmentation

Imagine you’re building an augmented reality filter for your living room, allowing you to visualize how a new white couch would fit into the space. To achieve this, you need a machine learning model capable of accurately isolating and segmenting each object and surface in the living room. The key to building such a model is semantic segmentation, which labels every pixel in the image and assigns each segmented object a class, such as table or chair.

However, this is a tedious process. To address this, we turn to SAM for a semi-automated solution.

Let’s explore how we can apply this approach to segment the image below.

1. Segmentation Using SAM

We first use SAM to automatically segment the image. It generates the output below:

SAM can generate the segmented masks in less than 3 minutes. The same manual effort takes up to 2 hours, which is 40 times slower. Try it yourself here.

SAM did well segmenting objects such as sofas, tables, floor mats, and wall features. Nevertheless, there are instances where SAM falls short, particularly in cases of over-segmentation: it may split continuous surfaces, such as the wooden floor or the wall art, into multiple disjointed parts.

Over-segmentation arises from SAM's approach of inferring objects based on small areas of the image, using a 32 x 32 grid of points. Each point attempts to predict a set of valid object masks, which can lead to unnecessary divisions. For instance, if a point lies on the arm of a chair, SAM may distinguish it as a separate object from the chair itself.
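To make the grid-prompting idea concrete, here is a minimal NumPy sketch of how a 32 × 32 grid of prompt points can be laid out over an image. This is an illustration of the sampling pattern only, not SAM's actual implementation; the function name and image dimensions are ours.

```python
import numpy as np

def build_point_grid(points_per_side: int, height: int, width: int) -> np.ndarray:
    """Evenly spaced (x, y) prompt points over an image, similar in spirit
    to the grid SAM's automatic mode prompts with (32 points per side)."""
    # Offset by half a cell so points sit at cell centers, not image edges.
    offset = 1.0 / (2 * points_per_side)
    fractions = np.linspace(offset, 1.0 - offset, points_per_side)
    xs, ys = np.meshgrid(fractions * width, fractions * height)
    return np.stack([xs.ravel(), ys.ravel()], axis=-1)  # shape (N, 2)

grid = build_point_grid(32, height=768, width=1024)
print(grid.shape)  # (1024, 2): one (x, y) prompt per grid cell
```

Each of these 1,024 points is fed to the model as a prompt, and every prompt can yield several candidate masks, which is why a single chair can end up carved into arm, seat, and backrest segments.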

2. Human Validation to Enhance Segmentation

Recognizing SAM's shortcomings, we introduce a layer of human validation to refine the segmentation output. Human annotators play a crucial role in addressing over-segmented masks, such as odd floor tiles in the image mentioned earlier.

Additionally, human validation allows us to improve the tracing of objects' boundaries, ensuring more precise segmentation.

3. Assigning Classes and Grouping Instances

The next step involves assigning classes to the segmented objects. Human annotators classify each segment as a chair, table, curtain, or other relevant categories.

We then address instances where a single object is split into multiple segments due to obstructions, like the rug blocked by the sofa, coffee table, and chairs. By grouping these separate rug segments, we represent them as a single object.
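Mechanically, grouping can be as simple as merging the binary masks of the split segments into one instance mask. The sketch below uses toy NumPy arrays as stand-ins for real SAM masks; the function name and data are illustrative.

```python
import numpy as np

def group_segments(masks: list) -> np.ndarray:
    """Merge several binary masks of the same object (e.g. rug pieces
    split by the sofa and coffee table) into a single instance mask."""
    merged = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        merged |= m.astype(bool)  # a pixel belongs to the rug if any piece covers it
    return merged

# Toy example: two disjoint "rug" pieces on a 4x4 image.
piece_a = np.array([[1, 1, 0, 0]] * 4)
piece_b = np.array([[0, 0, 0, 1]] * 4)
rug = group_segments([piece_a, piece_b])
print(rug.sum())  # 12 pixels covered by the merged rug mask
```

After merging, the grouped segments share one instance ID (and hence one color in the visualization), even though their pixels are not connected.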

The final output is shown below. Notice how the different parts of the rug now share the same color after grouping.


SAM has the potential to speed up annotations by 40X, but it may struggle to segment certain objects effectively. With a layer of human validation, we can ensure the output meets quality requirements.

The synergy between AI tools like SAM and human expertise makes data labeling faster and more accurate.

4X your labeling speed with our Human-AI hybrid workflow. Contact us today to learn more.

Bryce Wilson
Data Engineer at Black.ai

Consistent support

If there's one thing that makes SUPA stand out, it's their commitment to providing consistent support throughout the data labeling process. The team actively and efficiently engaged with us to ensure any ambiguity in the dataset was cleared up.

Jonas Olausson
Data Engineer at Black AI
The best interface for self-service labeling

Everything from uploading data to seeing it labeled in real time was really cool. This is just way simpler to use compared to Amazon Sagemaker and LabelBox. I was also very impressed with how the platform delivered exactly what we needed in terms of label quality.

Sravan Bhagavatula
Director of Computer Vision at Greyscale AI
Launch a revised batch within hours

I was also able to view the labels as they were being generated, which gave me quick feedback about the label quality, rather than waiting for the whole batch. This replaced my standard manual QA process using external tools like Voxel51's FiftyOne, as the labels were clear and easy to parse through in real time.

Sparsh Shankar
Associate ML Engineer at Sprinklr
Really quick

The annotators were really quick. I would upload and 5 minutes later - 10 images done. I checked 5 minutes later - 100 images done.

Puneet Garg
Head of Data Science at Carousell
Good quality judgments

The team at [SUPA] has been very professional & easy to work with since we started our collaboration in 2019. They've provided us with good quality judgments to train, tune, and validate our Search & Recommendations models.

Book a demo

Let us walk you through the entire data labeling experience, from setup to export

Schedule a chat