Small-Text is a Python library for active learning. It integrates with Hugging Face Transformers and scikit-learn.

SetFit is one of the key models recommended both in the documentation and by Argilla.

Initialization

Initialization matters because the first few labeled data points should be informative enough to train a useful first version of the model.

Initialization strategies are documented here
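
To make the idea concrete, here is a minimal sketch of one possible initialization scheme: drawing a class-balanced random sample from the pool. It is written in plain Python and is not Small-Text's own implementation. The function name `balanced_initial_indices` is hypothetical, and the sketch assumes that labels for the pool are already known, which is typically only true in a simulated experiment.

```python
import random
from collections import defaultdict


def balanced_initial_indices(labels, n_samples, seed=0):
    """Pick n_samples pool indices with roughly equal counts per class.

    Hypothetical helper for illustration only; assumes labels are known.
    """
    rng = random.Random(seed)

    # Group pool indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle each class pool so the draw within a class is random.
    pools = []
    for label in sorted(by_class):
        pool = list(by_class[label])
        rng.shuffle(pool)
        pools.append(pool)

    # Round-robin over the classes until enough indices are drawn
    # or every class pool is exhausted.
    chosen = []
    while len(chosen) < n_samples and any(pools):
        for pool in pools:
            if pool and len(chosen) < n_samples:
                chosen.append(pool.pop())
    return chosen


# Toy pool: 90 examples of class 0 and 10 examples of class 1.
labels = [0] * 90 + [1] * 10
initial = balanced_initial_indices(labels, n_samples=10)
```

On an imbalanced pool like this one, a purely random draw of ten examples could easily contain none of the rare class, whereas the balanced draw gives the first model examples of both classes.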

Query Strategies

The query strategy is an important hyperparameter in active learning: it decides which examples we should annotate next.

Possible strategies are documented here
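
As an illustration of what a query strategy does, the sketch below implements least-confidence sampling, a common uncertainty-based strategy, in plain Python. It is not Small-Text's implementation. The function name `least_confidence_query` and the toy probabilities are made up for this example, and it assumes the current model can output class probabilities for each unlabeled example.

```python
def least_confidence_query(probabilities, n):
    """Return the indices of the n examples the model is least sure about.

    probabilities: one list of class probabilities per unlabeled example.
    Hypothetical helper for illustration only.
    """
    # Confidence of a prediction = probability of the most likely class.
    confidence = [max(p) for p in probabilities]

    # Rank examples from least to most confident and keep the first n.
    ranked = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return ranked[:n]


# Toy predicted probabilities for four unlabeled examples (two classes).
probs = [
    [0.98, 0.02],  # very confident
    [0.55, 0.45],  # uncertain
    [0.70, 0.30],  # fairly confident
    [0.51, 0.49],  # most uncertain
]
selected = least_confidence_query(probs, n=2)
print(selected)  # [3, 1]
```

The two selected examples are the ones closest to the decision boundary, so annotating them is expected to be more informative than annotating examples the model already classifies with high confidence.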