Contrastive training is a technique for exploiting relationships between data points in order to increase the effective size of a dataset. It is used to train SetFit models and Sentence Transformers, and it can be applied to most tasks that can be approached in an NLI style.
Imagine we have a classification task where the labels are mutually exclusive: we want to know whether sentences are positive or negative.
In a traditional classification setup we would use the sentence as our input X and the class as our output y, and each example (Sentence, Label) could only be used once.
If we reframe this task as a contrastive NLI-style problem, we can instead feed in the sentence and the label as the two inputs and optimise the model on a similarity score between them.
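As a minimal sketch of this setup (assuming the sentence-transformers library; the model name, example sentences, and hyperparameters are purely illustrative), we can encode the sentence and the label text with a shared bi-encoder and push their cosine similarity towards 1 for the correct label and 0 otherwise:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# A small pretrained bi-encoder (model name is illustrative)
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Each example pairs a sentence with a candidate label string; the target is
# 1.0 when the label is correct and 0.0 when it is not
train_examples = [
    InputExample(texts=["The film was wonderful", "positive"], label=1.0),
    InputExample(texts=["The film was wonderful", "negative"], label=0.0),
    InputExample(texts=["I want my money back", "positive"], label=0.0),
    InputExample(texts=["I want my money back", "negative"], label=1.0),
]

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=4)

# Cosine similarity between the two embeddings is regressed towards the target
train_loss = losses.CosineSimilarityLoss(model)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1, warmup_steps=10)
```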
We can use a neural architecture like the one illustrated in this diagram from the SBERT documentation:
This approach also allows us to re-use our training examples.
| Example | Label |
|---|---|
| S1 | Pos |
| S2 | Neg |
| S3 | Pos |
| S4 | Neg |
Becomes:
| Example | Label | Relationship |
|---|---|---|
| S1 | Pos | 1 |
| S1 | Neg | 0 |
| S2 | Pos | 0 |
| S2 | Neg | 1 |
| … | … | … |
With mutually exclusive labels, every example is paired with every candidate label, so the number of training pairs scales as N × L, where N is the number of examples and L is the number of labels.
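To make this expansion concrete, here is a small sketch of the pairing step (the function name and toy data are placeholders, matching the tables above):

```python
def make_contrastive_pairs(examples, labels):
    """Expand (sentence, gold_label) examples into (sentence, candidate_label,
    relationship) rows: 1 if the candidate matches the gold label, else 0."""
    pairs = []
    for sentence, gold in examples:
        for candidate in labels:
            pairs.append((sentence, candidate, 1 if candidate == gold else 0))
    return pairs

examples = [("S1", "Pos"), ("S2", "Neg"), ("S3", "Pos"), ("S4", "Neg")]
labels = ["Pos", "Neg"]

pairs = make_contrastive_pairs(examples, labels)
print(len(pairs))  # N x L = 4 x 2 = 8 pairs
```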