Using BART (a text summarization model) with Hugging Face


  • BART is a sequence-to-sequence model commonly used for document summarization
  • Like BERT, it is built on the Transformer architecture
  • Unlike BERT, it has an encoder-decoder structure
    • This is because it is intended for text generation

This page walks through the steps to run a BART summarization example.


  1. Install transformers

```sh
pip install transformers
```

  2. Run the summary

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Load the pretrained BART model and its matching tokenizer
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')

ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
# truncation=True makes max_length take effect for inputs longer than 1024 tokens
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, truncation=True, return_tensors='pt')

# Generate the summary with beam search (beam width 4, at most 5 tokens)
summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
# Decode the generated token IDs back into text
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
```

As of 2021/01/18, the output was "MyMy friends." The result is this short because max_length=5 caps the generated sequence at five tokens.
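
For a more typical summary, one option is a checkpoint that has been fine-tuned for summarization. The sketch below is an assumption and not part of the original tutorial: it swaps in `facebook/bart-large-cnn` (a BART checkpoint fine-tuned on CNN/DailyMail) and a larger `max_length`. The helper name `summarize` and its default values are hypothetical.

```python
def summarize(text, model_name="facebook/bart-large-cnn", max_length=60, num_beams=4):
    """Return a summary of `text` (hypothetical helper, defaults are assumptions)."""
    # Imported here so that defining the helper does not require transformers yet
    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained(model_name)
    model = BartForConditionalGeneration.from_pretrained(model_name)

    # Truncate the input to the 1024 tokens the model can read
    inputs = tokenizer([text], max_length=1024, truncation=True, return_tensors="pt")

    summary_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        num_beams=num_beams,
        max_length=max_length,
        early_stopping=True,
    )
    # Only one input was passed, so decode the first (and only) result
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Calling `print(summarize(article_text))` with a few paragraphs of text should return a short summary. The first call downloads the model weights, so it can take a while.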


## Where I got stuck
I got an error when the installed PyTorch version did not match the one that transformers expects. Updating PyTorch with the following command solved the problem:

```sh
pip install -U torch
```
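
Before upgrading, it can help to see which versions are actually installed. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the two packages whose versions need to be compatible
for name in ("torch", "transformers"):
    print(name, installed_version(name) or "not installed")
```

This reads the package metadata without importing the libraries, so it also works when the import itself is what fails.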

## Tips
max_length is the maximum length of the generated token sequence, and num_beams is the beam width used in beam search.

Adjust max_length to control how long the generated text can be.

When generating long text, use a wider beam; otherwise the search tends to get stuck in a local optimum and produce unnatural sentences.

Both parameters trade output quality against computation time, so it is better to start with small values.
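
To make the effect of the two parameters concrete, here is a toy beam search in plain Python. It is only an illustration, not how transformers implements `generate`: `TOY_MODEL` is a made-up table of next-token probabilities, and here `max_length` simply counts generated tokens.

```python
import math

# Hypothetical toy "model": maps a prefix (tuple of tokens) to next-token probabilities
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"z": 0.9, "w": 0.1},
}


def beam_search(model, num_beams, max_length):
    """Return (best sequence, its log-probability, number of candidates scored)."""
    beams = [((), 0.0)]  # each beam is (prefix, cumulative log-probability)
    scored = 0
    for _ in range(max_length):
        candidates = []
        for prefix, logp in beams:
            next_probs = model.get(prefix)
            if not next_probs:
                # No continuation defined: keep the hypothesis as it is
                candidates.append((prefix, logp))
                continue
            for token, p in next_probs.items():
                candidates.append((prefix + (token,), logp + math.log(p)))
                scored += 1
        # Keep only the num_beams most probable hypotheses
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    best_seq, best_logp = beams[0]
    return best_seq, best_logp, scored


for width in (1, 2):
    seq, logp, scored = beam_search(TOY_MODEL, num_beams=width, max_length=2)
    print(width, seq, round(math.exp(logp), 2), scored)
# 1 ('a', 'x') 0.33 4
# 2 ('b', 'z') 0.36 6
```

With a beam width of 1 (greedy search) the locally best first token "a" is chosen and the more probable sequence starting with "b" is missed. A width of 2 finds it, at the cost of scoring more candidates.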

## Finally
If anything is unclear, please leave a comment!

