GLAM Workbench

Presentation by developer Tim Sherratt, at IIPC RSS webinar: Tim Sherratt: Jupyter notebooks for web archives, 05 August, 2020, courtesy of IIPC

GLAM Workbench is a tool for accessing data from archival services so that it can be extracted and processed in more detail, for example by creating overviews of collection metadata or by harvesting large numbers of search results.

GLAM stands for Galleries, Libraries, Archives, and Museums.

GLAM Workbench was developed by Tim Sherratt, originally with a focus on Australian news media archives and collections. As web archives were added (as described in the introduction video above), the scope became more international.

Data are retrieved and processed in Jupyter notebooks, that is, documents that combine computer code with explanations of what the code is doing. You may have to make minor changes to the code in the notebooks to adapt them to your research purpose (e.g. to work with other web archives, to the extent that those archives are compatible), but this is easily done and is explained in the notebooks.
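As a rough illustration of the kind of minor change meant here, the sketch below shows a notebook-style cell in which the archive to query is a single setting. It is not taken from the GLAM Workbench notebooks: the names CDX_ENDPOINTS, ARCHIVE, build_cdx_query and rows_to_dicts are hypothetical. The one endpoint listed is the Internet Archive's Wayback CDX server, and the address for any other archive would have to be looked up in that archive's own documentation.

```python
from urllib.parse import urlencode

# Hypothetical settings table: one entry per archive the notebook can query.
# Only the Internet Archive's Wayback CDX endpoint is listed here; entries for
# other archives are an assumption you would need to verify yourself.
CDX_ENDPOINTS = {
    "ia": "https://web.archive.org/cdx/search/cdx",
}

# The single value a researcher would change to point at another archive.
ARCHIVE = "ia"


def build_cdx_query(page_url, archive=ARCHIVE, limit=10):
    """Return a CDX search URL that lists captures of page_url."""
    params = {"url": page_url, "output": "json", "limit": limit}
    return f"{CDX_ENDPOINTS[archive]}?{urlencode(params)}"


def rows_to_dicts(rows):
    """Convert CDX JSON output (a header row followed by data rows) to dicts."""
    if not rows:
        return []
    header, *records = rows
    return [dict(zip(header, record)) for record in records]
```

In a notebook, the URL returned by build_cdx_query would be fetched with an HTTP library and the decoded JSON passed to rows_to_dicts. Keeping the archive choice in one variable is what makes the adjustment a one-line edit.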

The cloud service "Binder" sets up and runs the GLAM Workbench Jupyter notebooks in a remote computing environment. It is recommended for beginners, with one important limitation:

Please note: Binder logs the user off after ten minutes of inactivity, and work cannot be saved and resumed later. All results must therefore be created and downloaded within a single session. Other options are available: Reclaim Cloud is a paid cloud service where progress can be saved for later, and the notebooks can also be run locally on one's own computer using "Docker". All options are described in the Getting started section under "Running notebooks".

Important: Please also note that using cloud solutions may not be compliant with GDPR.

Extensive help is provided for GLAM Workbench, including but not limited to the following recommended resources:

Especially recommended as an introduction to using GLAM Workbench with web archives is the full presentation video at the top of this page, from the IIPC Research Speaker Series, 26 November 2020, "Tim Sherratt: Jupyter notebooks for web archives".

List of presentation videos and slideshows:

Introduction page - a highly recommended overview:

Help section with detailed "First steps" guide and help videos:

Getting started Jupyter notebook page (from the "First steps" guide above):

Introduction to web archives (with several use case examples):
(At the bottom of this page, the following sections may help you decide how best to proceed: Using Binder, Using Reclaim Cloud, Using Docker.)

The service variants (Binder, the alternative Reclaim Cloud, and a local solution with Docker) are listed, described, and accessed from the "Getting started" section: