LegItBART: a summarization model for Italian legal documents

Artificial Intelligence and Law:1-31 (forthcoming)

Abstract

The ever-increasing volume of electronic legal documents calls for effective, language-specific summarization and headline generation techniques to make legal content more accessible and easier to use. In the context of Italian law, existing summarization models are either extractive or focused on producing long-form abstractive summaries. As a result, the generated summaries either have low readability or are ill-suited to summarizing common legal documents such as norms. This paper proposes LegItBART, a new abstractive summarization model. It leverages a BART-based sequence-to-sequence architecture that is specifically pre-trained on Italian legal corpora. To enable the generation of concise summaries and headlines, we release two new annotated datasets tailored to the Italian legal domain, namely LawCodes and LegItConcepts. To handle input documents that exceed the maximum token length, such as verbose norms, codes, or legal articles, we also extend BART by integrating a global-sparse-local attention mechanism. We empirically analyze the performance of different combinations of pre-trained and fine-tuned models. The results show that using a mix of general-purpose and domain-specific pre-training yields relevant improvements in summarization performance. The fine-tuned version of LegItBART outperforms all the tested baselines, even those with a significantly larger number of model parameters.
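
The abstract does not describe how the global-sparse-local attention mechanism is built, so the following is only a minimal illustrative sketch of the general idea behind such patterns, not the LegItBART implementation. It assumes a Longformer-style layout in which each token attends to a local window, a few designated global tokens attend to (and are attended by) every position, and an optional strided pattern adds sparse long-range links. The names build_attention_mask, window, global_positions, and stride are hypothetical and chosen for illustration.

```python
def build_attention_mask(seq_len, window, global_positions, stride=0):
    """Return a seq_len x seq_len boolean mask.

    mask[i][j] is True when token i may attend to token j.
    Assumed pattern (illustrative only, not taken from the paper):
      - local:  |i - j| <= window
      - global: i or j is one of the designated global positions
      - sparse: i and j are a multiple of `stride` apart (disabled if stride == 0)
    """
    if seq_len < 0 or window < 0 or stride < 0:
        raise ValueError("seq_len, window and stride must be non-negative")
    global_set = set(global_positions)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            local = abs(i - j) <= window
            is_global = i in global_set or j in global_set
            sparse = stride > 0 and (i - j) % stride == 0
            mask[i][j] = local or is_global or sparse
    return mask


def density(mask):
    """Fraction of allowed (query, key) pairs; 1.0 means full attention."""
    n = len(mask)
    if n == 0:
        return 0.0
    return sum(sum(row) for row in mask) / (n * n)


if __name__ == "__main__":
    # Toy example: 8 tokens, window of 1, token 0 acting as a global token.
    toy = build_attention_mask(8, window=1, global_positions=[0])
    for row in toy:
        print("".join("x" if allowed else "." for allowed in row))
    print(f"density: {density(toy):.2f}")
```

Under these assumptions, the number of allowed pairs grows roughly linearly with sequence length for a fixed window and a fixed number of global tokens, rather than quadratically as in full attention, which is the usual motivation for using such patterns on long inputs.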


Analytics

Added to PP
2025-02-24
