How SUPA Helped A GenAI Company Curate and Label 200,000 Design Assets in 90 Days

Problem

Obtaining a diverse, labeled dataset of ~200k vector images, ready for model ingestion and covering broad user needs, within a tight deadline

Solution

A customizable platform and expert workforce optimized for smooth workflows and high-quality data delivery

Result

Completed three workflows (sourcing, labeling, and sketching) for the 200k-image project within 90 days

SUPA partnered with a US-based Generative AI company (the Client) focused on refining AI models to generate design assets for diverse sectors, including game design, architecture, e-commerce, and marketing. 

The Problem: Obtaining a diverse dataset for model training

The Client needed to curate and label a varied dataset of 200,000 vector images to train their model within a short period. Because the model had to produce design assets in diverse art styles, the corpus had to incorporate a wide variety of imagery split across multiple categories, e.g. watercolor art styles.

To do so, the Client needed to complete three workflows in a very short time:

  • Sourcing:  Curating 200k vector images with strict & abstract criteria
  • Labeling:  Segmenting and labeling these images with relevant terms
  • Sketching: Drafting sketches of the sourced images, ranging from child-like to top-tier artist quality

The Client initially engaged multiple vendors, which proved challenging to manage and produced output that lacked diversity. This approach also led to quality issues and cost the Client a lot of time.

The Solution: Diverse human output via SUPA’s trained expert workforce

To overcome this challenge, the Client switched to SUPA as their vendor for sourcing and labeling this dataset. GenAI is a very new industry, so experimentation is often necessary, and workflows frequently had to be amended, reworked, or even abandoned at short notice. SUPA matched these needs, scaling its workforce up and down as required, through:

  1. Fast iteration and experimentation: SUPA enabled quick design and launch of the project via daily iterations. This allowed SUPA to resolve quality issues early and eliminated the issues the Client had faced with other vendors.
  2. Trained expert labelers: SUPA leveraged a large on-demand workforce for sourcing, labeling, and sketching thousands of images daily. In just a week, the workforce was scaled to:
      • 15-person sourcing team sourcing 5k images daily
      • 50-person labeling team labeling 5k images daily
      • 50-person sketching team sketching 5k images daily
  3. Sourcing expertise: The sheer diversity of GenAI projects made hiring labelers with the relevant domain expertise quite challenging. However, SUPA was able to leverage its network of experts to source the relevant labelers very quickly. For example, the Client’s sketching workflow needed a team of skilled graphic designers, which SUPA was able to source in a week.
  4. Customized platform and solution: SUPA’s engineers customized their proprietary platform to track workflows, ensure smooth image flow, and export to the Client’s required format. The customization also allowed data cleansing and quality checks.

The Results: Timely completion and diverse dataset

All 200k images were successfully sourced, labeled, and sketched within the stipulated three months. Rework was required for only 3.5% of the total images, highlighting SUPA’s commitment to excellence.
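
As a rough sanity check on the figures quoted in this case study, the short Python sketch below works through the arithmetic. It assumes that each team sustained the stated rate of 5k images per day and that the three workflows ran largely in parallel; neither assumption, nor the function names, comes from the project itself.

```python
import math

TOTAL_IMAGES = 200_000   # dataset size stated in the case study
DAILY_RATE = 5_000       # images per team per day, as stated
REWORK_PER_MILLE = 35    # 3.5% written per thousand to keep integer math
DEADLINE_DAYS = 90       # the stipulated three months


def days_needed(total: int, per_day: int) -> int:
    """Working days for one team to process `total` images (hypothetical helper)."""
    return math.ceil(total / per_day)


def rework_count(total: int, per_mille: int) -> int:
    """Number of images needing rework at the given per-thousand rate."""
    return total * per_mille // 1000


if __name__ == "__main__":
    days = days_needed(TOTAL_IMAGES, DAILY_RATE)
    print(f"Days per workflow at full rate: {days}")
    print(f"Fits within deadline: {days <= DEADLINE_DAYS}")
    print(f"Images reworked: {rework_count(TOTAL_IMAGES, REWORK_PER_MILLE)}")
```

Under these assumptions, each workflow needs about 40 working days at full rate, which leaves room within the 90-day window for ramp-up and iteration, and the 3.5% rework rate corresponds to roughly 7,000 images.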

Our partnership with the Client stands as a testament to our flexibility, adaptability, and commitment to quality. The project's success underscores our capability to align with client needs and deliver on large-scale projects, even when faced with technical challenges.
