Pipeline Progress Bars
To add a progress bar to a zero-shot pipeline in Hugging Face, you can use the tqdm trange
function to iterate over batch start indices and feed each batch to the pipeline in a for loop:
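import pandas as pd
import torch
from tqdm.auto import trange
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

df = pd.read_csv('some_data.csv')
input_column = 'text'
results = []
batch_size = 32
zero_shot_labels = ['Positive', 'Negative', 'Neutral']

# Assumed example values; the original took these from config.model.name, task, and device defined elsewhere.
model_name = 'facebook/bart-large-mnli'
task = 'zero-shot-classification'
device = 'cuda' if torch.cuda.is_available() else 'cpu'

model = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name)
pipe = pipeline(task=task, model=model, tokenizer=tokenizer, device=device)

# Step through the DataFrame one batch at a time; trange draws the progress bar.
for start_idx in trange(0, len(df), batch_size, desc="Batches"):
    batch = df.iloc[start_idx:start_idx + batch_size][input_column].tolist()
    result = pipe(batch, zero_shot_labels, batch_size=batch_size)
    results.append(result)  # predictions for this batch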
A pattern used in Sentence Transformers is to expose a show_progress
flag and pass disable=not show_progress to trange,
so that the bar is suppressed when the user does not want progress output.
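
A minimal sketch of that show_progress pattern might look like the following. The names run_in_batches and process_batch are hypothetical and used only for illustration; process_batch stands in for a call such as pipe(batch, zero_shot_labels), and a toy function replaces it here so the example runs without downloading a model.

```python
from tqdm.auto import trange


def run_in_batches(items, process_batch, batch_size=32, show_progress=True):
    """Apply process_batch to successive slices of items.

    process_batch is a hypothetical stand-in for a call such as
    pipe(batch, zero_shot_labels); it must return one result per input.
    """
    results = []
    # tqdm's disable=True switches the bar off, so invert the user-facing flag.
    for start_idx in trange(0, len(items), batch_size,
                            desc="Batches", disable=not show_progress):
        batch = items[start_idx:start_idx + batch_size]
        results.extend(process_batch(batch))
    return results


# Toy usage: upper-case strings in batches of two, with the bar switched off.
texts = ["a", "b", "c", "d", "e"]
outputs = run_in_batches(
    texts,
    lambda batch: [t.upper() for t in batch],
    batch_size=2,
    show_progress=False,
)
```

Defaulting show_progress to True keeps the bar visible in interactive use, while callers running in logs or tests can pass show_progress=False to silence it.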