Think the cleanlab library is just for dealing with label errors? Think again!
We just released major new functionality in cleanlab v2.3, and we want this library to provide everything needed to practice data-centric AI. With the v2.3 release, cleanlab can now automatically:
- find mislabeled data + train robust models
- detect outliers and out-of-distribution data
- estimate consensus + annotator-quality for multi-annotator datasets
- suggest which data is most informative to (re)label next (active learning)
Read more about what’s new in our blog post: https://cleanlab.ai/blog/cleanlab-2.3
A core cleanlab principle is to take the outputs or representations from an already-trained ML model and apply algorithms that automatically estimate various data issues, so that the data can be improved to train a better version of that model. The library works with almost any ML model (no matter how it was trained) and any type of data (image, text, tabular, audio, etc.).
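To make that principle concrete, here is a minimal pure-Python sketch of one simple way to turn a trained model's predicted class probabilities into a ranked list of likely label errors. It is an illustration only, not cleanlab's API or its actual algorithm: the function names, the 0.5 threshold, and the toy data are all assumptions made for this example. It scores each example by the probability the model assigns to its given label (often called a self-confidence score) and flags the low-scoring ones.

```python
# Illustrative sketch only: hypothetical helper names, not part of cleanlab.

def self_confidence_scores(labels, pred_probs):
    """Return, for each example, the model's predicted probability of its given label.

    labels:     list of integer class indices (the possibly noisy given labels)
    pred_probs: list of per-class probability lists from an already-trained model
                (ideally out-of-sample, e.g. obtained via cross-validation)
    """
    return [probs[label] for label, probs in zip(labels, pred_probs)]


def rank_likely_label_issues(labels, pred_probs, threshold=0.5):
    """Return indices of examples scoring below `threshold`, most suspicious first.

    The threshold is an arbitrary choice for this sketch.
    """
    scores = self_confidence_scores(labels, pred_probs)
    flagged = [i for i, score in enumerate(scores) if score < threshold]
    return sorted(flagged, key=lambda i: scores[i])


# Toy data: four examples, two classes.
labels = [0, 1, 1, 0]
pred_probs = [
    [0.90, 0.10],  # model agrees with given label 0
    [0.20, 0.80],  # model agrees with given label 1
    [0.95, 0.05],  # model strongly disagrees with given label 1
    [0.40, 0.60],  # model mildly disagrees with given label 0
]

issues = rank_likely_label_issues(labels, pred_probs)
print(issues)  # indices of examples worth reviewing, most suspicious first
```

The sketch needs only labels and predicted probabilities, so it does not care what model produced them or what kind of data the model was trained on.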
Start using open-source tools to improve your ML data and practice data-centric AI: https://github.com/cleanlab/cleanlab/