Universiteit Leiden

LUCDH Lunchtime Speaker Series: MacBERTh: A Historically Pre-Trained Language Model for English (1450-1950)

Wednesday 2 March 2022
12:00 (noon) - 13:00
On Campus: Digital Lab P.J. Veth 1.07 / Online: Kaltura Live Rooms
Please register via lucdh@hum.leidenuniv.nl
Image: MacBERTh (credit: Fonteyn)

MacBERTh: A Historically Pre-Trained Language Model for English (1450-1950)

Join us for our first LUCDH lunchtime talk of the New Year, presented by Dr. Lauren Fonteyn and Enrique Manjavacas Arevalo, on Wednesday, 2 March 2022, 12:00 – 13:00 (postponed from 2 February).

Location: on-campus in the Digital Lab P.J. Veth 1.07 or online via Kaltura Live Rooms

Researchers who interpret and analyse historical textual material are well aware that languages change over time, and that the way in which concepts and discourses of class, gender, norms and prestige function differs across time periods. It is therefore important that textual and linguistic material from the past is not interpreted from a present-day point of view, which is why NLP models pre-trained on present-day language data are less than ideal candidates for the job. In this talk, Fonteyn and Manjavacas Arevalo present MacBERTh, a transformer-based language model pre-trained on historical English, and exhaustively assess its benefits on a large set of relevant downstream tasks. Their experiments highlight that, despite some differences across target time periods, models pre-trained from scratch on historical language outperform models pre-trained on present-day language and later adapted to historical language.

More information on MacBERTh can be found on the project website: https://macberth.netlify.app/

To register, please email lucdh@hum.leidenuniv.nl

We very much hope that you can join this live event in the Digital Lab in P.J. Veth 1.07. However, we will also be live-streaming on Kaltura, so please let us know whether you will be attending in person or would like Kaltura Live Room login details.
