Abstract
The Harold Pinter Archive at the British Library contains correspondence dating from 1977 to 2008, including his email archive. In total, the correspondence strand of the archive contains c. 20,000 paper letters and c. 3,500 emails. This project used data analytics (in Python) and network visualisation (in Gephi) to interrogate the ways in which digital and analogue correspondence function together within Pinter’s literary archive. The accompanying paper reflects on what this analysis might mean for archivists, curators and researchers working with hybrid correspondence collections in the context of Digital Humanities and Artificial Intelligence-based research methodologies. The paper analyses what an email is, both materially and functionally: what its constituent parts are, and in what ways it is like and unlike a physical letter. It also discusses how the highly structured nature of email data can be exploited using computational techniques, and the limitations of these approaches when applied to so-called “dark archives”. Network visualisations are used to represent correspondence visually in order to shed light on the activities it records, particularly literary collaboration and administration. The paper concludes by considering the implications of these analyses for repositories and researchers collecting and using hybrid correspondence collections. The Python code produced for the project is collection-agnostic and available under a Creative Commons Licence, meaning that any collecting repository holding email archives in MBOX format can use it to extract GDPR-compliant metadata from their collections and visualise them in Gephi.
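
To illustrate the kind of workflow described above, the following is a minimal sketch, not the project's own code, of how header metadata might be read from an MBOX file with Python's standard library and written out as an edge list that Gephi can import. The function names (`extract_edges`, `write_gephi_edge_list`) and the file paths are hypothetical. The sketch reads only the From, To and Cc headers and never touches message bodies; it does not pseudonymise addresses and makes no claim about GDPR compliance, which would depend on a repository's own assessment.

```python
import csv
import mailbox
from collections import Counter
from email.utils import getaddresses


def extract_edges(mbox_path):
    """Count sender -> recipient pairs using header fields only (no bodies)."""
    edges = Counter()
    for message in mailbox.mbox(mbox_path):
        # Headers may be str or Header objects, so normalise with str().
        senders = getaddresses([str(h) for h in message.get_all("From", [])])
        recipient_headers = message.get_all("To", []) + message.get_all("Cc", [])
        recipients = getaddresses([str(h) for h in recipient_headers])
        for _, sender in senders:
            for _, recipient in recipients:
                if sender and recipient:
                    edges[(sender.lower(), recipient.lower())] += 1
    return edges


def write_gephi_edge_list(edges, csv_path):
    """Write a Source,Target,Weight CSV for Gephi's spreadsheet importer."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Source", "Target", "Weight"])
        for (source, target), weight in sorted(edges.items()):
            writer.writerow([source, target, weight])


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    write_gephi_edge_list(extract_edges("correspondence.mbox"), "edges.csv")
```

Each row of the resulting CSV is a directed edge from a sender to a recipient, weighted by the number of messages exchanged, which is one straightforward way to turn the structured fields of email into a correspondence network.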