Towards AI

The leading AI community and content platform focused on making AI accessible to all. Check out our new course platform: https://academy.towardsai.net/courses/beginner-to-advanced-llm-dev

Responses (5)

Super useful article!
To further improve your dataset, you could also try running outlier detection, which is very simple with cleanlab:
```
from cleanlab.outlier import OutOfDistribution

# data_embeddings: 2D array of feature embeddings, one row per example
ood = OutOfDistribution()
outlier_scores = ood.fit_score(features=data_embeddings)
```
https://
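
One possible follow-up, as a minimal sketch: rank the examples by score and look at the lowest few by hand. This assumes lower scores mean more atypical examples, which is how cleanlab's documentation describes these scores. The helper name `lowest_scoring_indices` and the sample scores are hypothetical, standing in for the real `outlier_scores`.

```python
def lowest_scoring_indices(scores, k=5):
    """Return the indices of the k smallest scores, most atypical first."""
    # Sort positions by their score and keep the first k.
    return sorted(range(len(scores)), key=lambda i: scores[i])[:k]


# Hypothetical scores standing in for the output of ood.fit_score(...)
example_scores = [0.91, 0.12, 0.85, 0.40, 0.97, 0.05]

# Indices of the three examples to inspect manually first
to_review = lowest_scoring_indices(example_scores, k=3)
print(to_review)  # [5, 1, 3]
```

With real data, `example_scores` would be replaced by the array returned from the snippet above, and the returned indices would point back into the dataset.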

--

As noted, the 4-percentage-point accuracy gain in this article was achieved without even using the best approach to improving a dataset. The best approach would be to manually inspect each example cleanlab has identified as having a label issue and decide…

--

Great read, notebook, and congrats on the model improvement!

--