The J. Paul Getty Trust 2014 Report

The Getty Research Institute
Thomas Gaehtgens, Director

The Getty Research Institute is dedicated to furthering knowledge and advancing understanding of the visual arts and their various histories through its expertise, active collecting program, public programs, institutional collaborations, exhibitions, publications, digital services, and a residential scholars program. Its Research Library and special collections of rare materials and digital resources serve an international community of scholars and the interested public. The Institute's activities and scholarly resources guide and sustain each other and together provide a unique environment for research, critical inquiry, and scholarly exchange.

Front cover and pages from the Spanish explorer Hernán Cortés's account of his expedition to Mexico, Praeclara Ferdinãdi, published in Nuremberg in 1524, as viewed in the Getty Research Portal.

For several decades, key electronic resources developed by the Getty Research Institute (GRI) and research conducted by senior staff have played a leading role in the development and implementation of technology-related tools for research, dissemination, and publication in the realm of art, architecture, and material culture. The GRI is home to the first major art-historical research project to make use of computer technology, the Getty Provenance Index®, and it is internationally recognized for the Getty Vocabularies, a series of multilingual electronic thesauri that are used for cataloguing, research, and retrieval. The GRI has continued such innovation by recently launching an initiative to make these vocabularies available in the form of Linked Open Data (LOD), a data format that is seen as a key element of the Semantic Web: the structured linking of web-based information to enable users anywhere to find, share, and combine information more easily. The Institute has also launched the Getty Research Portal™, an online search platform that provides global access to digitized art history texts, as well as its German Sales Catalogs database, which offers an indispensable resource for provenance research of Nazi-looted art during World War II.

A key strategic goal of the GRI is to play a leadership role in the burgeoning field of digital art history, serving a wide range of users around the world and effecting fundamental changes in the discipline. The GRI continues to devote significant resources to this end, including the expertise, experience, and creativity of its staff; the huge research databases it has created and continues to maintain; and its strong international partnerships. Such resources help to produce tools and methods that have the potential to cause a paradigm shift in the ways that art-historical research is conducted and published, as well as how information, knowledge, and digital assets are disseminated and shared.


The GRI's collections are at the heart of everything it does. The general library contains more than one million volumes, and the special collections of rare and unique materials are so vast that they are measured in linear feet (currently totaling 23,640 feet of archives, manuscripts, and rare photos, and excluding rare books). For decades the GRI has managed its library holdings and research databases in state-of-the-art systems. In 2014, a new "next-generation" online discovery system, Primo, went live. This system replaces the traditional online library catalogue and enables users to search across the GRI's many and varied resources from a single entry point, with the ability to seamlessly page books, request items on interlibrary loan, and access full digital copies of books, journals, and a portion of the GRI's vast archival collections and photo archive. The Getty Vocabularies are maintained in a sophisticated production and publication system that was custom-built by a team of software architects from Getty Information Technology Services (ITS).

This use of information technology in the service of the humanities stretches back to the Getty's earliest days. Recognizing the potential of such technology to transform art-historical scholarship and democratize access to information, founding Getty President Harold M. Williams and Director of Program Planning Nancy Englander established in 1983 the Art History Information Program (AHIP), which was later renamed the Getty Information Institute (GII). In doing so, they created the first large-scale program specifically devoted to the use of information technology for the study of the visual arts and architecture.

Initially, AHIP focused on developing large databases of art-historical information such as the Getty Provenance Index and the first of its electronic thesauri, the Art & Architecture Thesaurus (AAT®). In some cases, AHIP took over and expanded upon projects that had been initiated by other institutions, such as Williams College's Bibliography of the History of Art (BHA) and Columbia University's Avery Index to Architectural Periodicals, large database projects for abstracts of the literature of art history and architectural journals. When the GII was disbanded in 1999, key projects like these were incorporated into the GRI—the most logical place, given its emphasis on research and dissemination of knowledge. Several key former Information Institute senior staff joined the GRI to carry on their work. Thus the GRI continues the legacy of leadership in the digital humanities that began at AHIP in the early 1980s.


The Getty Vocabularies
The Getty Vocabulary Program is one of the GRI's longest-running and most successful technology-based projects. In 2007 the Getty Vocabularies were awarded the prestigious Computerworld Honors award in the Media, Arts and Entertainment category; previous winners in this category include Disney's Pixar and the Turner Broadcasting Network. Collaboratively compiled by an expert team of database editors based at the GRI, software architects from Getty ITS, and partners from international cultural institutions, the vocabularies merge art-historical subject expertise with technology to create a suite of tools that facilitates the accurate description and discoverability of information and images relating to art, architecture, and related subjects.

The GRI's electronic thesauri are available free of charge on the web and are among the Institute's most heavily used research tools, with more than 1.2 million searches conducted last year by users from around the globe. The datasets of the Getty Vocabularies have also been licensed by hundreds of nonprofit and commercial entities for incorporation into their own information systems. Three vocabularies are well established: the AAT, the Getty Thesaurus of Geographic Names (TGN®), and the Union List of Artist Names (ULAN®). They are recognized the world over by cultural heritage institutions, publishers, researchers, and the interested public as authoritative sources for information on art, architecture, and material culture. The fourth GRI vocabulary database, the Cultural Objects Name Authority (CONA™), is in the first stages of implementation. CONA provides academically rigorous data and images of works of art and architecture from all time periods and geographic regions.

As part of the GRI's strategic goal to make art history truly multicultural, a number of projects with partner institutions have been completed or are in progress to translate the AAT into a number of languages, including Chinese, Dutch, French, German, Italian, Portuguese, and Spanish. Translation of these tools not only expands the use of the electronic thesauri by cultural heritage institutions around the world, it also facilitates information retrieval across languages and cultures. The GRI's multilingual initiatives (coordinated by the International Terminology Working Group, led by the GRI) are helping to create an international environment for the study of cultural heritage.

Vocabulary Program and ITS staff are working towards two strategic goals in the near future. The first is to represent as many works of art as possible in CONA. This activity can be thought of as a huge "crowd-sourced" project, to which museums, libraries, and archives will contribute information about works of art that will then be vetted and edited by the GRI's team of editors and technically aggregated by ITS staff.

The second goal is a three-year initiative to release all of the Getty Vocabularies as LOD. This project encodes every data element and concept in the vocabulary databases as a unique Internet link. The LOD format makes it possible for the rich, authoritative data in the GRI's electronic thesauri to be processed semantically by computers, thus creating a dynamic concept-based tool that will enhance retrieval of art-historical information across cultures and technical environments in a way that researchers have never experienced before. The first GRI dataset released as LOD, in February 2014, was the Art & Architecture Thesaurus; response from the international museum and art documentation community has been significant.
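To make the idea of "every concept as a unique link" concrete, the following is a minimal sketch of how a multilingual thesaurus concept might be expressed as linked data and queried across languages. The two `skos` property identifiers come from the W3C SKOS standard; the `example.org` concept links, the sample labels, and the helper function names are illustrative assumptions and are not taken from the Getty Vocabularies or their published data model.

```python
# Standard W3C SKOS property identifiers.
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"

# Hypothetical concept links (assumption: real vocabulary links differ).
CONCEPT_PAINTINGS = "http://example.org/vocab/1001"
CONCEPT_VISUAL_WORKS = "http://example.org/vocab/1000"

# Each statement is a (subject, predicate, object) triple.
# Label objects are (text, language code) pairs.
TRIPLES = [
    (CONCEPT_PAINTINGS, SKOS_PREF_LABEL, ("paintings", "en")),
    (CONCEPT_PAINTINGS, SKOS_PREF_LABEL, ("schilderijen", "nl")),
    (CONCEPT_PAINTINGS, SKOS_PREF_LABEL, ("pinturas", "es")),
    (CONCEPT_PAINTINGS, SKOS_BROADER, CONCEPT_VISUAL_WORKS),
    (CONCEPT_VISUAL_WORKS, SKOS_PREF_LABEL, ("visual works", "en")),
]


def find_concept(term, triples):
    """Return the concept link whose label matches `term` in any language."""
    needle = term.casefold()
    for subject, predicate, obj in triples:
        if predicate == SKOS_PREF_LABEL and obj[0].casefold() == needle:
            return subject
    return None


def labels(concept, triples):
    """Return a {language: label} mapping for one concept."""
    return {
        obj[1]: obj[0]
        for subject, predicate, obj in triples
        if subject == concept and predicate == SKOS_PREF_LABEL
    }


def broader(concept, triples):
    """Return the links of the concepts directly above `concept`."""
    return [
        obj
        for subject, predicate, obj in triples
        if subject == concept and predicate == SKOS_BROADER
    ]
```

Because the Dutch and English labels point at the same link, `find_concept("schilderijen", TRIPLES)` and `find_concept("Paintings", TRIPLES)` resolve to the same concept, which is the property that allows retrieval across languages.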

The GRI's electronic thesauri have been referred to as "the gold standard" for art documentation and online searching. They are powerful multilingual, multicultural tools that can be used for cataloguing, research, and retrieval. They are an example of the GRI's commitment to using information technology in the service of the international research community as well as that of any interested user with access to the Internet.

Global Access to the Literature of Art History: the Getty Research Portal
Access to the foundational literature of art history and to rare books has traditionally been a nearly insurmountable challenge for most advanced researchers, as well as for undergraduate and graduate students. Geographic distances, combined with the rarity and fragility of many materials, made art-historical research an arduous and expensive undertaking. Realizing that information technology and particularly the Internet had evolved sufficiently to make it possible to break down many of the traditional barriers to art-historical research materials, a small group of major libraries and research institutes (including the Avery Library at Columbia University, the University Library of Heidelberg, and the French National Institute of Art History), led by the GRI and with funding from the Samuel H. Kress Foundation, came together in 2011 for a series of meetings to brainstorm how the web could be used to provide free, integrated access to full digitized copies of holdings from a wide variety of institutions. The answer was to aggregate the descriptive metadata for holdings from the contributing institutions into a single, "one-stop-shopping" web gateway, through which users can search for art-historical literature and rare books held by different institutions around the world, as if they all resided in a single, unified database.

The GRI built the technical infrastructure for the Getty Research Portal™ and hosts the website where users can simultaneously search all of the participating institutions' contributed records from a single point of entry. Each record that is contributed to the portal contains a link back to a complete, fully downloadable digital copy of the particular item, hosted at the website of the holding institution.
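The aggregation model described above can be sketched in a few lines: only descriptive metadata is pooled, and each record keeps a link back to the copy hosted by the holding institution. This is an illustrative sketch only. The record fields, the library names, the titles, and the `example.org` URLs are hypothetical and do not describe the portal's actual schema or software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortalRecord:
    """Descriptive metadata only; the digital copy stays with the contributor."""
    title: str
    year: int
    contributor: str
    url: str  # link back to the full copy at the holding institution


# Hypothetical contributions from two institutions.
LIBRARY_A = [
    PortalRecord("A Treatise on Painting", 1721, "Library A",
                 "https://library-a.example.org/items/17"),
    PortalRecord("Lives of the Painters", 1850, "Library A",
                 "https://library-a.example.org/items/42"),
]
LIBRARY_B = [
    PortalRecord("Treatise on Architecture", 1755, "Library B",
                 "https://library-b.example.org/books/9"),
]


def aggregate(*contributions):
    """Pool the contributed metadata records into one searchable list."""
    index = []
    for records in contributions:
        index.extend(records)
    return index


def search(index, keyword):
    """Case-insensitive title search across every contributor at once."""
    needle = keyword.casefold()
    return [record for record in index if needle in record.title.casefold()]


INDEX = aggregate(LIBRARY_A, LIBRARY_B)
```

A call such as `search(INDEX, "treatise")` returns matching records from both hypothetical libraries, each carrying the URL of the institution that hosts the digitized volume.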

Page spread from the Japanese writer, moralist, and politician Matsudaira Sadanobu's Shūko jisshu, vol. 5 (1800), a rare book from the GRI's collections, as viewed in the Getty Research Portal.

Launched in mid-2012, the portal has been praised as a game-changing online resource that provides free access to art-historical literature in all languages, published before the US copyright date of 1923. There is also a substantial and growing amount of literature in the portal that dates from after 1923, depending upon the laws in the contributing countries. With this important research tool, the GRI honors its mission to expand and democratize access to art-historical knowledge for users regardless of their geographic location or academic status. The portal, to which new assets are constantly added in the form of contributions from libraries and other research institutions around the world, has already had a significant impact on art-historical scholarship, forever altering the way that research is conducted and revolutionizing worldwide access to the collections of major libraries.

The portal currently provides access to approximately 40,000 books at more than a dozen libraries and research institutes around the world. Ten or even five years ago, researchers never could have dreamed of "owning" their own copies of the classic texts of art history, much less copies of rare books of which there are only a few in existence. Now they can search carefully vetted art-historical material from the great libraries of the world from a single entry point, and download full digital copies of the works they find.

In the coming years, the GRI will focus on further growing and broadening the content of the portal, with particular emphasis on working with contributing institutions from non-Western countries.

Getty Provenance Index®
The first research database developed at the Getty was the Provenance Index; in fact, it was one of the very first projects at any cultural institution to systematically use information technology in the service of advanced humanities research. The Provenance Index databases, which have grown to six in number over the years, are heavily used all over the world. They are indispensable for certain kinds of research, especially since provenance—the history of ownership of a work of art—has become a major issue in the news, in art history, in the art market, and in the acquisition process of museums worldwide.

Detail of a spread from Goupil stock book no. 13 (1891–1895), showing an entry for the painting "Maitresse de Beaudelaire" (Baudelaire's Mistress, Reclining) by Édouard Manet on June 12, 1893.

The Getty Provenance Index is a collection of databases offering online access to source material for research on the history of collecting and art markets. A pioneering project in the digital humanities, the Provenance Index was founded more than thirty years ago. And, in an arena where projects appear and disappear seemingly overnight, this groundbreaking initiative has maintained its relevance and viability for decades, continuing to operate and evolve from the period before the personal computer was invented (the first Provenance Index workstations were all connected to a single mainframe computer) up until the age of "big data." The Provenance Index now stands at a crossroads, having reached the limits of what its current technical infrastructure can do. The GRI is considering how to fundamentally re-architect and re-envision both the production system and the online presence of the Provenance Index databases in order to broaden their scope and to enable them to respond to the new kinds of research questions being asked today.

Built up record by record over decades of manual inputting, the Provenance Index currently contains more than 1.5 million records extracted from archival inventories, auction catalogues, and dealer stock books spanning the late sixteenth century to the mid-twentieth century. As a huge repository of scrupulously edited and carefully structured data, the Provenance Index is recognized as a unique and extraordinary asset for a broad range of multidisciplinary research, encompassing social and economic history as well as the history of collecting and of the art market. It also provides a rigorous, meticulously thought-out model for recording and interpreting data related to the provenance of works of art.

During the 1980s and '90s, the Provenance Index published its data in a series of printed volumes. In 1996 the data began to be released as a group of publicly available databases on the Internet. Initially, usage was relatively low, but as more users became aware of the resource, the number of searches grew steadily, and by 2012 an average of 35,000 searches were being logged each quarter. The number of searches suddenly jumped to 75,000 during the first quarter of 2013 and has remained at that level ever since. The dramatic increase in traffic was driven by the GRI's ambitious and strategic initiative to cover all extant German-language auction catalogues dating from the Nazi era. While that crucial and traumatic period in the history of art is heavily researched, sales information for this time of organized looting on an unprecedented scale had been sparse and difficult to access before the GRI's German Sales Catalogs, 1930–45 project, a collaboration with the University Library of Heidelberg and the Kunstbibliothek Berlin, went online.

The German Sales project also served as a test case to experiment with state-of-the-art data acquisition technology aimed at accelerating the growth rate of the Provenance Index, after decades of exclusively manual input of data into the system. Optical character recognition (OCR) software, combined with a computer program for automated parsing specially developed by GRI Information Systems staff, made it possible to deal with more than one million database records generated from the catalogues of this fifteen-year period; 250,000 records were edited and released online in 2013. In just two years, the database was populated with an amount of information equivalent to what would have required decades of manual inputting. While the use of OCR is standard procedure in many digitization projects today, parsing information automatically is far from straightforward, especially when dealing with highly idiosyncratic historical texts. The custom-built software had to learn how to analyze the structure of each block of text in the lot description, how to determine what data elements were present, and where data elements began and ended. The German Sales project required significant software-design expertise as well as in-depth knowledge of the particular data, as the accompanying illustration shows.
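As a rough illustration of why automated parsing is harder than OCR alone, the sketch below splits one OCR'd lot description into data elements with a single fixed pattern. Everything here is an assumption made for illustration: the lot-entry layout, the field names, and the sample text are hypothetical, and a regular expression is a far simpler technique than the custom software described above, which had to cope with many idiosyncratic layouts.

```python
import re

# Assumed layout (hypothetical): "<lot>. <artist>. <title>. <medium>. <h> x <w> cm."
LOT_PATTERN = re.compile(
    r"^(?P<lot>\d+)\.\s+"        # lot number
    r"(?P<artist>[^.]+)\.\s+"    # artist name, up to the next period
    r"(?P<title>[^.]+)\.\s+"     # title of the work
    r"(?P<medium>[^.]+)\.\s+"    # medium
    r"(?P<height>\d+)\s*x\s*(?P<width>\d+)\s*cm\.?$"  # dimensions in cm
)


def parse_lot(text):
    """Return the data elements of one lot entry, or None if it does not fit."""
    match = LOT_PATTERN.match(text.strip())
    if match is None:
        return None  # would be routed to an editor for manual review
    record = match.groupdict()
    for field in ("lot", "height", "width"):
        record[field] = int(record[field])
    return record


SAMPLE = "123. Max Mustermann. Landschaft mit Fluss. Öl auf Leinwand. 45 x 60 cm."
PARSED = parse_lot(SAMPLE)
```

Entries that deviate from the assumed layout, such as `"Gemälde, unsigniert"`, return `None` in this sketch. That is the point at which a fixed pattern breaks down and where subject knowledge of the particular catalogues becomes necessary.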

As exemplified by the German Sales project, the GRI's current strategy is to embed the work of the Provenance Index in international projects that promote research and generate direct feedback loops on newly released data. A two-year collaboration with the National Gallery, London, which added more than 100,000 British sales records from the late eighteenth century to the Provenance Index databases, culminated in a conference held at the National Gallery: "London and the Emergence of a European Art Market (c. 1780–1820)." An edited volume from the conference is in progress and reflects the methodological synergy of art-historical case studies and data-driven socioeconomic analysis that is made possible by these "big data" initiatives. While art-historical narratives tend to revolve around biographical studies of individual agents such as artists, dealers, and collectors, "big provenance data" is particularly suited to the analysis and visualization of the aggregate behavior of groups of people, making it possible to trace social networks that cross national borders, as well as the flow of objects through time and space. To facilitate this kind of data-intensive, computationally oriented research, a simple data downloading feature was implemented as a first step to unlock this rich data pool for innovative use by external researchers.
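The kind of aggregate analysis mentioned above can be sketched with downloaded sales records treated as a simple network of sellers and buyers. The records, names, and field names below are invented for illustration and do not reflect the structure of the Provenance Index download.

```python
from collections import Counter

# Hypothetical sales records (assumption: real records carry many more fields).
SALES = [
    {"seller": "Dealer A", "buyer": "Collector X", "city": "London"},
    {"seller": "Dealer A", "buyer": "Collector X", "city": "London"},
    {"seller": "Dealer A", "buyer": "Collector Y", "city": "Paris"},
    {"seller": "Dealer B", "buyer": "Collector X", "city": "London"},
]


def transfer_counts(records):
    """Count how often objects passed from each seller to each buyer."""
    return Counter((r["seller"], r["buyer"]) for r in records)


def degree(records):
    """Number of distinct trading partners for every person in the network."""
    partners = {}
    for r in records:
        partners.setdefault(r["seller"], set()).add(r["buyer"])
        partners.setdefault(r["buyer"], set()).add(r["seller"])
    return {name: len(others) for name, others in partners.items()}


def sales_by_city(records):
    """Aggregate the flow of objects by place of sale."""
    return Counter(r["city"] for r in records)
```

Counting links between people rather than describing any one of them is what distinguishes this approach from the biographical case study: the same few functions apply unchanged whether the input holds four records or a hundred thousand.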

The GRI leadership is keenly aware that much more needs to be done in order to be able to respond to the needs of twenty-first-century researchers in its stewardship and deployment of a uniquely important set of assets and data-driven research methodologies. There are plans to convene an interdisciplinary steering committee to explore possibilities for the implementation of a completely new system-architecture and user interface for the Provenance Index, one that will enable the creation of sophisticated visualizations, data linking, and mapping tools that can be used by the GRI as well as external collaborators. One thing needed for the new system is the ability to "open up" the Provenance Index to enable experts around the world to contribute and share knowledge online by annotating individual records in the databases. The GRI is in a unique position to implement this kind of "expert crowd sourcing" not only in the Provenance Index databases but also in its electronic thesauri and digitized collections, making it possible to enrich existing data and to generate new knowledge by combining subject expertise with the latest "Web 2.0" technologies. This is clearly an area where the GRI can play a leadership role in the international art history community.


Getty Scholars' Workspace
A cornerstone of the GRI's Digital Art History program, the Getty Scholars' Workspace™ provides an online environment for conducting collaborative, interdisciplinary research projects and creating born-digital publications. It allows researchers working in different locations around the world to engage with digital facsimiles of primary source materials and to capture the multiple perspectives that are (or should be) part and parcel of the art-historical dialogue. The goal of this project is to release a flexible, robust open-source electronic toolset (including tools for annotation of texts and images, essay authoring, and bibliography and timeline building) with accompanying technical and methodological manuals, to be shared with the international research community. By using and freely sharing this custom-built digital research environment, the GRI seeks to create a paradigm shift in the way that art-historical research is conducted and published and to break with the tradition of the single authorial voice. It is also part of the Getty's Open Content philosophy of making both digital assets of collection materials and the data produced in the course of GRI research projects freely available to users worldwide. Slated for release to the international research community in late 2015, the Scholars' Workspace will also be a key element of the Getty Digital Seminar™.

Getty Digital Seminar
As the digital humanities evolve, the GRI realizes that it is crucial not only to develop new technological tools but also to foster the cultural transformations that will enable art history—a discipline that has been much slower than the hard sciences to embrace information technology and information sharing—to take full advantage of the possibilities offered by such tools. It is this challenge that inspired the GRI to propose an innovative model for the teaching of art history in the digital age: the Getty Digital Seminar. It is the GRI's hope that this new digital teaching environment will bring together colleagues from different backgrounds and further the understanding of other art histories; inspire cross-cultural and cross-disciplinary research; and enable collaboration of both senior scholars and the next generation of emerging scholars.

Over the next two years, the GRI will design the model for this environment, launch the first seminar (focusing on a classic art-historical text, Heinrich Wölfflin's Principles of Art History, in a new English translation commissioned by the GRI), and begin planning a second seminar. The goal is to create a conceptual and practical framework that can serve as a model that not only the GRI but the broader field of academic art history can adopt. Different from the standard MOOC (Massive Open Online Course) model, which tends to rely on a single professor teaching at a distance to a huge, widely distributed (and often not particularly engaged) audience, the Getty Digital Seminar will allow professors to maintain autonomy and enjoy an in-person relationship with the students in their own classes.

Importantly, however, the concept aligns with the collaborative aims of the digital humanities to bring scholars and students into a wider, more globally oriented conversation. Access to enriched digital resources and electronic facsimiles of primary sources will bring into the course a wealth of material (texts, documents, images, videos) and considerable opportunities for international discussion and debate. The participatory nature of the traditional graduate seminar is preserved but greatly enhanced by its inclusion in a network of seminars on the same topic held in Chicago, Kyoto, Leiden, Los Angeles, São Paulo, and Zurich, and by access to a wealth of documents, images, and primary source materials in digital form. Students will be able to partner on research with their peers in other universities around the globe, and the course will conclude with an online student colloquium.

"Big Data"
Following a well-established model from the hard sciences, the GRI has begun to make its large, rich datasets available in their entirety, free of charge, in the form of Linked Open Data, as mentioned previously in reference to the Art & Architecture Thesaurus. Hundreds of thousands of database records from the Getty Provenance Index and the Getty Vocabularies have also been made available to a variety of research projects that are exploring computer analysis and manipulation of huge datasets of cultural heritage information. An August 2014 article in the renowned journal Science, written by a team led by Maximilian Schich of the University of Texas at Dallas, is based on analysis of huge datasets from the GRI's Union List of Artist Names and Getty Thesaurus of Geographic Names databases. Big data is clearly an area where the GRI can play a leading role in collaboration with other institutions, since it has a number of large datasets of meticulously structured data from its various research databases.

Digital Publications
The earliest born-digital publication hosted by the GRI, dating from the late 1980s, is Categories for the Description of Works of Art, a complex, evolving document that would have been impractical to publish in print form. The "Introduction to..." series (including Introduction to Metadata, Introduction to Controlled Vocabularies, and Introduction to Art Image Access) is a pioneering group of titles that address technical topics relevant to digital art history; this series has been made available in both print and electronic form for many years.

Sample web page from the digital publication Pietro Mellini's Inventory in Verse, 1681, produced in the Getty Scholars' Workspace.

A key element of the vision for the Getty Scholars' Workspace is that born-digital publications will result from the collaborative work that takes place within the digital working environment. The first such publication, Pietro Mellini's Inventory in Verse, 1681, is slated for release soon. By devoting the same kind of scholarly and editorial rigor to its digital publications as it does to its print publications (digital publications will be given ISBNs, catalogued in the GRI's library system, and contributed to the major bibliographic utilities such as WorldCat), the GRI hopes to set a model for a field in which, up to now, digital resources produced by academics have not been given the same importance as print publications. In addition to breaking with the single-author, single-viewpoint model that has characterized the vast majority of art-historical publications to date, the GRI hopes to effect a sea change in the "reward system" of academic art history by applying the same high standards and according the same importance to digital publications as it does to print publications. The GRI also seeks to make its digital publications not simply digital versions of printed books but more interactive, nonlinear publications that take advantage of the design and information-delivery options that are possible in the digital realm.


The GRI has a decades-long history of engagement with and a leadership role in the digital humanities and digital art history. It will continue to develop tools and methods aimed at enabling the study of art history to take advantage of information technology, in order to pose new questions, offer multiple perspectives, and revivify a discipline that otherwise risks remaining largely elitist and increasingly irrelevant. By paving the way for new modes of research, spearheading cross-cultural and cross-disciplinary initiatives, and continuing to pursue a policy of open content (in the form of data, images, and multimedia digital files related to GRI collections and research projects), the GRI can help to fundamentally change and democratize the study of the arts, architecture, and material culture.