Transformers Spanish Summarizer

Hi everyone,

I am starting a project to summarise Spanish text and would like to use Transformers. I am not sure which model to start with or how to approach the problem. I imagine I would need to fine-tune a model on my dataset, but I do not fully understand how to pick one: some models support multiple languages, and some are better suited to particular tasks.

Thanks for your help!


This is how I would start.

Good luck!! Happy learning!!


Hi Charles,

If you want to do extractive summarization, you could start with a pre-trained Spanish BERT model. A good place to look is the Hugging Face model hub.

For abstractive summarization, I don’t know if there is any pre-trained model available in Spanish. If you have a large training corpus and the resources, you could try training a model from scratch. To my knowledge, BART and T5 have shown some promising results for summarization.

If you aren’t worried about actually building the model yourself, you could use a library like OpenNMT or Fairseq to train a summarization model. It could also serve as a baseline if you later decide to build your own.

I just discovered a live project. It seems like a fun one to work on if you are new to summarization (if not, please ignore).

Hi Stefan,
Could you suggest how to get started with a pre-trained Spanish BERT model for extractive summarization?
Thank you so much in advance.

Hi Wilfredo,

I would start with a pre-trained Spanish language model and then fine-tune it on an extractive summarization dataset. I’m not sure which datasets exist in Spanish for this task, but this one could be interesting.
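
To make the extractive idea a bit more concrete, here is a minimal sketch of the selection step only, without any fine-tuning: score each sentence against the whole document and keep the top few in their original order. Everything in it is an assumption for illustration. The function names (`split_sentences`, `bow_embed`, `extractive_summary`) are hypothetical, the sentence splitter is naive, and `bow_embed` is a plain bag-of-words stand-in, not BERT. With a real Spanish model you would swap in its sentence embeddings and adapt `cosine` to dense vectors.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]  # sparse vector: token -> weight


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bow_embed(sentence: str) -> Vector:
    # Stand-in embedding (NOT BERT): lowercase bag-of-words counts.
    # \w matches accented letters, so Spanish words stay intact.
    words = re.findall(r"\w+", sentence.lower())
    return dict(Counter(words))


def cosine(a: Vector, b: Vector) -> float:
    # Cosine similarity between two sparse vectors.
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def extractive_summary(
    text: str,
    n_sentences: int = 2,
    embed: Callable[[str], Vector] = bow_embed,
) -> List[str]:
    sentences = split_sentences(text)
    if len(sentences) <= n_sentences:
        return sentences

    vectors = [embed(s) for s in sentences]

    # Document vector: element-wise sum of the sentence vectors.
    doc: Counter = Counter()
    for vec in vectors:
        doc.update(vec)

    # Rank sentences by similarity to the document vector.
    scores = [cosine(vec, doc) for vec in vectors]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(top[:n_sentences])  # restore original order

    return [sentences[i] for i in keep]


if __name__ == "__main__":
    texto = (
        "El gato duerme en el sofá. El gato come pescado. "
        "Mañana lloverá mucho."
    )
    for sentence in extractive_summary(texto, n_sentences=2):
        print(sentence)
```

This is an unsupervised ranking, so it is closer to a baseline than to the fine-tuned model described above. A fine-tuned extractive model would instead learn a per-sentence keep/drop score from labelled summaries, but the final "pick the top sentences and keep their order" step would look much the same.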