Contrastive training is a technique for exploiting relationships in the data to effectively increase the size of a dataset. It is used to train SetFit models and Sentence Transformers, and it can be applied to most tasks that can be approached in an NLI style.
Imagine we have a classification task where the labels are mutually exclusive: we want to know whether sentences are positive or negative.
In a traditional classification setup we would use the sentence as our input X and the class as our output y, and each example (Sentence, Label) could only be used once.
If we reframe this task as a contrastive, NLI-style problem, we can instead feed the sentence and the label in as a pair of inputs and optimise the model on a similarity score between them.
We can use a neural architecture like the one illustrated in this diagram from the sbert documentation:
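In code, this pairwise setup might look roughly like the sketch below. It is a minimal example assuming the `sentence-transformers` library: each `InputExample` pairs a sentence with a label string, and `CosineSimilarityLoss` pushes the embeddings of matching pairs together and of non-matching pairs apart. The checkpoint name and the example sentences are placeholders, not taken from the original text.

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Both the sentence and the label text are encoded by the same model;
# training pushes their cosine similarity towards the target score
# (1.0 when the label matches the sentence, 0.0 when it does not).
model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder checkpoint

train_examples = [
    InputExample(texts=["I loved this film", "positive"], label=1.0),
    InputExample(texts=["I loved this film", "negative"], label=0.0),
    InputExample(texts=["What a waste of time", "positive"], label=0.0),
    InputExample(texts=["What a waste of time", "negative"], label=1.0),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=4)
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)
```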
This approach also allows us to re-use our training examples.
| Example | Label |
|---|---|
| S1 | Pos |
| S2 | Neg |
| S3 | Pos |
| S4 | Neg |
Becomes:
| Example | Label | Relationship |
|---|---|---|
| S1 | Pos | 1 |
| S1 | Neg | 0 |
| S2 | Pos | 0 |
| S2 | Neg | 1 |
| .. | … | … |
The number of training pairs scales as N × L: each of the N original examples is paired with each of the L labels.
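As an illustration, this expansion from (sentence, gold label) examples into contrastive pairs takes only a few lines of Python. The helper below is hypothetical, not part of any library, and uses the toy S1–S4 examples from the tables above.

```python
from itertools import product

def expand_to_pairs(examples, labels):
    """Expand (sentence, gold_label) examples into (sentence, label, score) pairs.

    Each sentence is paired with every label; the score is 1.0 when the label
    is the gold one and 0.0 otherwise, giving N * L pairs from N examples.
    """
    return [
        (sentence, label, 1.0 if label == gold else 0.0)
        for (sentence, gold), label in product(examples, labels)
    ]

examples = [("S1", "Pos"), ("S2", "Neg"), ("S3", "Pos"), ("S4", "Neg")]
pairs = expand_to_pairs(examples, ["Pos", "Neg"])
print(len(pairs))  # 4 examples x 2 labels = 8 contrastive pairs
```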