January 25, 2023

How to scale up your data labeling pipeline in 24 hours

You’ve been sitting at your desk for the last 3 hours, straining to zoom in and out to label all the classes on the image, and you’ve got 2,354 images to go. You start to wonder, isn’t there a better way to do this?

Naturally, you Google “quick way to annotate many images”, or “how to automate image labeling”. Some companies offer an annotation tool, which doesn’t solve your problem because you still have to annotate the images yourself. You then discover annotation services, which provide a workforce to annotate your data. However, you realise that engaging these companies is a long-winded process that involves sitting through sales calls, demos, and contract negotiations.

You also come across some data labeling platforms that have both an annotation tool and a workforce component, but engaging the workforce doesn’t seem straightforward. You wonder why you have to work with multiple companies just to annotate some images.

Why can’t there be a solution where a labeling platform already has a workforce plugged in, so you can use it to manage your data labeling projects without labeling the data yourself?

We asked the same question and that’s why we built SUPA BOLT, allowing anyone to scale data labeling operations within 24 hours.

This is how it can be done.

Sign up

Sign up process (5 mins)
  • Sign up with email, fill in name and company

Project Setup

Create new project (5 mins)
  • Fill in project name, description, select project type
Upload data (5 mins)
  • Upload 100 images from local storage or S3
Set up labels (5 mins)
  • Set up all the classes, using bounding boxes, polygons, or nested attributes
Write instructions (5 mins - 1 hour)
  • Embed your existing instructions
  • Or write simple instructions using the template provided

Pilot Project - 1st Iteration

Start first iteration of 100 images and get back output (3 hours)
  • Start project and view annotations in real time
Review first iteration (30 mins)
  • Review annotations via View Tasks and Quality Assurance
  • Optional: Add feedback loop flow
Improve instructions (30 mins)
  • Add more examples
  • Clarify rules
  • Add edge cases

Pilot Project - 2nd Iteration

Start second iteration of 100 images and get back output (3 hours)
  • Upload another 100 images via local upload or S3
Review second iteration (30 mins)
  • Review annotations via View Tasks and Quality Assurance
  • Optional: Add feedback loop flow
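
As a rough sanity check, the time estimates listed so far can be added up. The sketch below is only an illustration: the durations are copied from the steps above, while the step names, the list structure, and the helper functions are assumptions made for this example and are not part of BOLT.

```python
# Durations in minutes as (name, low, high), copied from the step lists above.
# The structure and names here are illustrative, not a BOLT API.
PILOT_STEPS = [
    ("Sign up", 5, 5),
    ("Create new project", 5, 5),
    ("Upload data", 5, 5),
    ("Set up labels", 5, 5),
    ("Write instructions", 5, 60),
    ("First iteration output", 180, 180),
    ("Review first iteration", 30, 30),
    ("Improve instructions", 30, 30),
    ("Second iteration output", 180, 180),
    ("Review second iteration", 30, 30),
]


def total_minutes(steps):
    """Return the (low, high) total duration in minutes for the given steps."""
    low = sum(step[1] for step in steps)
    high = sum(step[2] for step in steps)
    return low, high


def as_hours(minutes):
    """Format a number of minutes as hours and minutes."""
    hours, mins = divmod(minutes, 60)
    return f"{hours}h {mins:02d}m"


# Time until the first batch of annotations comes back (first six steps).
first_low, first_high = total_minutes(PILOT_STEPS[:6])
print("First output:", as_hours(first_low), "to", as_hours(first_high))

# Time until both pilot iterations have been reviewed.
pilot_low, pilot_high = total_minutes(PILOT_STEPS)
print("Pilot complete:", as_hours(pilot_low), "to", as_hours(pilot_high))
```

Under these estimates, the first annotated batch arrives roughly three and a half to four and a half hours after sign-up, and both pilot iterations are reviewed within about nine hours.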

Scale up - Start a large-scale project

Ready to scale to 10,000 images on the same or next day
  • Upload 10k images, start project

Export data

Get back output (12-24 hours)
  • View tasks and check out Analytics in real-time
  • Expect the first output within a few hours
  • Export data at any time

Conclusion

As you can see from the steps above, within 3-5 hours, you’re able to get back your first iteration of annotated data. After that, you’re able to scale up your labeling project with thousands of images and export the output in the next few hours.

This means within 24 hours, you’re able to start training your machine learning model with a large annotated dataset, without going through sales calls, demos, and other hassles.

Try BOLT yourself to experience the quick turnaround and train your machine learning model in no time, without labeling a single image yourself.
