Carousell is one of Asia's largest online marketplaces for new and secondhand goods. The models trained by its team power search relevance, product recommendations, inventory management, and other core product capabilities. These models require large amounts of high-quality labeled data so that millions of listings can be tagged correctly.
The team previously relied on a legacy labeling service that crowdsourced annotation, but found that it failed to meet their quality requirements. Carousell had to dedicate significant time and resources to QC workflows to improve the quality of data from the vendor. Despite these efforts, the training data still wasn't up to par. This strained internal resources and reduced the team's ability to scale quickly.
To overcome this challenge, Carousell switched to SUPA as its AI data platform for search and recommendation training data. SUPA offered a software-first approach to data annotation and provided the team with a streamlined consensus labeling workflow. SUPA also brought in a team of expert annotators with retail experience to rate search queries against corresponding products from the platform, reducing the need for extra supervision from Carousell's data scientists.
A few months into partnering with SUPA, Carousell saw significant improvements in label quality and higher motivation among its team members. This came amid an acceleration in e-commerce adoption during the COVID-19 pandemic, as shoppers moved online. With SUPA's help, Carousell was able to quickly scale to meet the rising demands of its customers.
“In e-commerce, search is everything – if you can’t find it, you can’t buy it. At Carousell we continuously strive to provide the best possible shopping experience for our users, and creating effective search and recommendation algorithms is a critical part of it. SUPA’s evaluations of our search data have added significant clarity and value to our search and recommendation models.” - Puneet Garg, Head of Data Science and Engineering at Carousell
“Everything from uploading data to seeing it labeled in real time was really cool. It's just far simpler to use than Amazon SageMaker and Labelbox. I was also very impressed with how the platform delivered exactly what we needed in terms of label quality.
I was also able to view the labels as they were being generated, which gave me quick feedback on label quality rather than having to wait for the whole batch. This replaced my standard manual QA process using external tools like Voxel51's FiftyOne, as the labels were clear and easy to review in real time.”