Text Classification

The Task

The Text Classification Task fine-tunes the model to predict probabilities across a set of labels given input text.

Datasets

Currently supports the XNLI, GLUE and emotion datasets, or custom input files.

Input: I don't like this at all!

Model answer: {"label": "angry", "score": 0.8}
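To make the "label" and "score" fields concrete, here is a minimal sketch of how raw class scores (logits) could be turned into an answer of that shape with a softmax. The label names and logit values are hypothetical, and this is not the library's own implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}


# Hypothetical label set and logits, for illustration only.
labels = ["angry", "happy", "sad"]
logits = [2.0, 0.1, 0.3]

prediction = top_prediction(logits, labels)
print(prediction)
```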

Training

Use this task when you would like to fine-tune Transformers on a labeled text classification task. For this task, you can rely on most Transformer models as your backbone.

python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion # can be swapped to xnli or glue

Swap to GPT backbone:

python train.py task=nlp/text_classification dataset=nlp/text_classification/emotion backbone.pretrained_model_name_or_path=gpt2

We report the Precision, Recall, Accuracy and Cross Entropy Loss for validation. Find all options available for the task here.
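For readers unfamiliar with these metrics, the sketch below computes them in plain Python on a tiny made-up batch. Macro averaging for precision and recall is an assumption here; the averaging mode the task actually uses is not stated on this page.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted class matches the target."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_precision_recall(y_true, y_pred, num_classes):
    """Per-class precision and recall, averaged with equal class weight."""
    precisions, recalls = [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(precisions) / num_classes, sum(recalls) / num_classes


def cross_entropy(y_true, probs):
    """Mean negative log-probability assigned to the correct class."""
    return -sum(math.log(p[t]) for t, p in zip(y_true, probs)) / len(y_true)


# Made-up targets and predicted probabilities for three classes.
y_true = [0, 1, 1, 2]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
y_pred = [max(range(3), key=row.__getitem__) for row in probs]

acc = accuracy(y_true, y_pred)
precision, recall = macro_precision_recall(y_true, y_pred, num_classes=3)
loss = cross_entropy(y_true, probs)
print(acc, precision, recall, loss)
```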

Text Classification Using Your Own Files

To use custom files, each file should contain newline-delimited JSON objects, one object per example, with a label and a text field:

{
    "label": 0,
    "text": "I'm feeling quite sad and sorry for myself but I'll snap out of it soon."
}

python train.py task=nlp/text_classification dataset.cfg.train_file=train.json dataset.cfg.validation_file=valid.json
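One way to produce such files is sketched below with Python's standard json module, writing one object per line. The file names match the command above; the helper name write_jsonl and every record except the first are made up for illustration, as is the meaning of each integer label.

```python
import json


def write_jsonl(path, records):
    """Write one JSON object per line (newline-delimited JSON)."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


train_records = [
    {"label": 0, "text": "I'm feeling quite sad and sorry for myself but I'll snap out of it soon."},
    {"label": 1, "text": "What a wonderful surprise, thank you!"},  # hypothetical example
]
valid_records = [
    {"label": 1, "text": "This made my day."},  # hypothetical example
]

write_jsonl("train.json", train_records)
write_jsonl("valid.json", valid_records)
```

Each line of the resulting files can be parsed on its own with json.loads, which is a quick way to check a file before training.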

Text Classification Inference Pipeline (experimental)

By default we use the sentiment-analysis pipeline, which requires an input string.

For Hydra to parse your input argument correctly, if the input contains any special characters you must either wrap the entire argument in single quotes, like '+x="my, sentence"', or escape the special characters. See escaped characters in unquoted values.

python predict.py task=nlp/text_classification +checkpoint_path=/path/to/model.ckpt '+x="I dont like this at all!"'

You can also run prediction using a default HuggingFace pre-trained model:

python predict.py task=nlp/text_classification '+x="I dont like this at all!"'

Or run prediction on a specified HuggingFace pre-trained model:

python predict.py task=nlp/text_classification backbone.pretrained_model_name_or_path=bert-base-cased '+x="I dont like this at all!"'
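When the prediction command is assembled from a script, the two quoting layers described above (double quotes for Hydra, single quotes for the shell) can be built programmatically. The sketch below uses the standard shlex module; the helper names hydra_text_override and predict_command are hypothetical, and the sketch deliberately rejects text containing double quotes or backslashes rather than guessing at escaping rules.

```python
import shlex


def hydra_text_override(key, text):
    """Build an override that adds `key` with a double-quoted string value."""
    if '"' in text or "\\" in text:
        raise ValueError("this sketch does not handle double quotes or backslashes")
    return f'+{key}="{text}"'


def predict_command(text, checkpoint_path=None):
    """Return the predict.py invocation as a list of arguments."""
    args = ["python", "predict.py", "task=nlp/text_classification"]
    if checkpoint_path is not None:
        args.append(f"+checkpoint_path={checkpoint_path}")
    args.append(hydra_text_override("x", text))
    return args


args = predict_command("I dont like this at all!", "/path/to/model.ckpt")

# shlex.quote adds the outer single quotes needed when the command is
# pasted into a POSIX shell.
shell_line = " ".join(shlex.quote(arg) for arg in args)
print(shell_line)
```

If the list is passed directly to subprocess.run(args) instead of being pasted into a shell, no shell is involved, so only the inner double quotes for Hydra are required.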