Abstract
The ever-increasing volume of electronic legal documents calls for effective, language-specific summarization and headline generation techniques to make legal content more accessible and easier to use. In the context of Italian law, existing summarization models are either extractive or focused on generating long-form abstractive summaries. As a result, the generated summaries suffer from low readability or are ill-suited to summarizing common legal documents such as norms. This paper proposes LegItBART, a new abstractive summarization model that leverages a BART-based sequence-to-sequence architecture specifically pre-trained on Italian legal corpora. To enable the generation of concise summaries and headlines, we release two new annotated datasets tailored to the Italian legal domain, namely LawCodes and LegItConcepts. To handle input documents that exceed the maximum token length, such as verbose norms, codes, or legal articles, we also extend BART with a global-sparse-local attention mechanism. We empirically analyze the performance of different combinations of pre-training and fine-tuning strategies. The results show that mixing general-purpose and domain-specific pre-training yields significant improvements in summarization performance. The fine-tuned version of LegItBART outperforms all tested baselines, even those with a significantly larger number of model parameters.