Multiple Choice

The Task

The Multiple Choice task requires the model to choose one answer from a set of options, given a question and optional context.

Similar to the text classification task, the model is fine-tuned on multi-class classification and outputs a probability for each possible answer. This is useful when the model must select from a set of answers based on a context or question, and the set of answers can vary from example to example. In contrast, use the text classification task if the answers are the same for every example and do not need to be supplied as input during training.
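As a rough illustration of "probabilities across all possible answers", the sketch below turns one score per candidate answer into a probability distribution with a softmax and picks the most likely option. The scores are made-up numbers, and the function name `option_probabilities` is hypothetical; this is not the library's implementation, only a minimal picture of the idea.

```python
import math


def option_probabilities(scores):
    """Softmax over one score per candidate answer (hypothetical helper)."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# The answer options can differ for every question, which is why they are
# passed in alongside the scores rather than fixed in advance.
options = ["Blue", "Green", "Red"]
scores = [2.0, 0.5, -1.0]  # assumed model scores, one per option

probs = option_probabilities(scores)
predicted = options[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the probabilities are computed over whatever options accompany the question, the same model can handle questions with different answer sets.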

Datasets

The task currently supports the RACE and SWAG datasets, as well as custom input files. A typical example looks like this:

Question: What color is the sky?
Answers:
    A: Blue
    B: Green
    C: Red

Model answer: A

Training

python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race # can use the swag dataset instead

To swap to a GPT-2 backbone:

python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race backbone.pretrained_model_name_or_path=gpt2

We report cross-entropy loss, precision, recall and accuracy on the validation set. Find all options available for the task here.
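For readers unfamiliar with these metrics, here is a minimal sketch of how they could be computed from per-option probabilities and the index of the correct option. The function name `validation_metrics` is hypothetical, and macro-averaging precision and recall over option positions is an assumption; the averaging the library actually uses is not stated here.

```python
import math


def validation_metrics(probabilities, labels):
    """Compute illustrative validation metrics (hypothetical helper).

    probabilities: one list of per-option probabilities per example.
    labels: index of the correct option for each example.
    """
    n = len(labels)
    # Predicted option = position of the highest probability.
    predictions = [max(range(len(p)), key=p.__getitem__) for p in probabilities]

    # Cross-entropy: mean negative log-probability of the correct option.
    loss = -sum(math.log(p[y]) for p, y in zip(probabilities, labels)) / n
    accuracy = sum(pred == y for pred, y in zip(predictions, labels)) / n

    # Assumption: macro-average precision and recall over option positions.
    classes = sorted(set(labels) | set(predictions))
    precisions, recalls = [], []
    for c in classes:
        true_pos = sum(pred == c and y == c for pred, y in zip(predictions, labels))
        predicted_c = sum(pred == c for pred in predictions)
        actual_c = sum(y == c for y in labels)
        precisions.append(true_pos / predicted_c if predicted_c else 0.0)
        recalls.append(true_pos / actual_c if actual_c else 0.0)

    return {
        "cross_entropy": loss,
        "accuracy": accuracy,
        "precision": sum(precisions) / len(classes),
        "recall": sum(recalls) / len(classes),
    }


# Made-up probabilities for four questions with three options each.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
labels = [0, 1, 0, 2]
metrics = validation_metrics(probs, labels)
print(metrics)
```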

Multiple Choice Using Your Own Files

To train on your own data, provide training and validation files in CSV or JSON format, as described below.

The expected format varies from dataset to dataset, since both the input columns and the pre-processing may differ. To keep things simple, we use the RACE dataset format and override the files that are loaded.

Below is a JSON record to use as input data.

{
    "article": "The man walked into the red house but couldn't see where the light was.",
    "question": "What colour is the house?",
    "options": ["White", "Red", "Blue"],
    "answer": "Red"
}
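A record like the one above can be checked and written from Python before training. The sketch below is only an illustration: the helper names `validate_record` and `write_records` are hypothetical, the rule that the answer must appear among the options is inferred from the example record, and writing one JSON object per line is an assumption about the file layout rather than a documented requirement.

```python
import json
import os
import tempfile

# Keys taken from the example record above.
REQUIRED_KEYS = ("article", "question", "options", "answer")


def validate_record(record):
    """Raise ValueError if a record does not match the example's shape."""
    missing = [key for key in REQUIRED_KEYS if key not in record]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    # Assumption inferred from the example: the answer is one of the options.
    if record["answer"] not in record["options"]:
        raise ValueError("answer must be one of the options")
    return record


def write_records(path, records):
    """Write records one JSON object per line (assumed layout)."""
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(validate_record(record)) + "\n")


record = {
    "article": "The man walked into the red house but couldn't see where the light was.",
    "question": "What colour is the house?",
    "options": ["White", "Red", "Blue"],
    "answer": "Red",
}

# Write to a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    train_path = os.path.join(tmp_dir, "train.json")
    write_records(train_path, [record])
    with open(train_path, encoding="utf-8") as handle:
        loaded = [json.loads(line) for line in handle]
```

Using `json.dumps` avoids hand-written syntax slips such as a missing comma between fields.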

We then override the dataset files on the command line, which lets us keep using the data transforms defined for the RACE dataset.

python train.py task=nlp/multiple_choice dataset=nlp/multiple_choice/race dataset.cfg.train_file=train.json dataset.cfg.validation_file=valid.json

Multiple Choice Inference

There is currently no Hugging Face pipeline available for this model. Feel free to open an issue or PR if you require this functionality.
