Introduction to Vocabularies


4. What, Why and How of Vocabularies

Vocabularies as knowledge bases
Vocabularies: Types and formats
The role of authority work
Vocabulary building

Vocabularies as knowledge bases

A vocabulary is a body of knowledge represented by language. It answers the question: "How do we talk (or write) about this particular subject area?" Glossaries, dictionaries, thesauri, and word lists are all examples of vocabularies. Most vocabularies focus on a special subject area (e.g., a glossary of geographical terms) or audience (e.g., a dictionary for the architecture and construction trades).

Structured vocabularies are collections of words and phrases (called terminology) that are structured to show relationships between terms and concepts. One of the tasks of a structured vocabulary is to allow better retrieval, whether in a card catalog or a computerized database. For example, a vocabulary for furniture would show that there is a relationship among the three terms bookcase, book case, and book-case. In this example, the relationship is quite simple: they are spelling variations for the same concept, a piece of case furniture with shelves for books. These vocabularies may be applied as "controlled vocabularies," where a given term (such as the "descriptor" or "preferred term") is used consistently to represent a given concept.
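
The bookcase example can be sketched in a few lines of Python. This is only an illustration: the choice of "bookcase" as the preferred term and the names PREFERRED_TERM and normalize are assumptions made for the sketch, not part of any published vocabulary.

```python
# Hypothetical controlled vocabulary: every spelling variant points to one
# preferred term. Choosing "bookcase" as the preferred term is an assumption.
PREFERRED_TERM = {
    "bookcase": "bookcase",
    "book case": "bookcase",
    "book-case": "bookcase",
}


def normalize(term: str) -> str:
    """Return the preferred term for a variant spelling.

    Raises KeyError when the term is not in the vocabulary, so that an
    uncontrolled term is noticed instead of silently entering the database.
    """
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    if key not in PREFERRED_TERM:
        raise KeyError(f"{term!r} is not in the controlled vocabulary")
    return PREFERRED_TERM[key]


print(normalize("Book-Case"))  # bookcase
```

Because every variant resolves to the same descriptor, records cataloged under any of the three spellings can be retrieved together.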

Why do we need vocabularies? Because language is ever-changing, nuanced, and complex. The very characteristics that make language so wonderfully expressive can cause ambiguity and confusion in documentation and, ultimately, hamper access to materials in databases. Here are a few examples of how language can cause confusion:

  • National and regional differences
    A particular type of rectangular, gable-roofed barn is called a Connecticut Barn in the United States. The same type of barn is called an English Barn in Great Britain.

  • Historical and contemporary names
    The nation that is today called Iran was, before 1935, called Persia.

  • Indigenous vs. culturally inappropriate terms
    Both KhoiKhoi and Hottentot have been used to refer to a group of people in Southern Africa. In early 20th-century Western texts, these people were called Hottentot. Today, KhoiKhoi is preferred, and Hottentot is considered culturally inappropriate.

  • Linguistic differences
    The Italian artist Tiziano is called Titian in English and Titien in French.

Structured vocabularies are especially designed to identify and make these connections among terms by managing synonyms and disambiguating homographs, resulting in improved results for the database searcher. In this way, the terms in a vocabulary serve as a knowledge base for the materials in the database. Vocabularies are most effective when used together with other standards, especially data structure and data content standards.
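
As a rough illustration of the synonym side of this idea, the sketch below expands a search term to all of its equivalents before matching records. The equivalence sets reuse the examples above; the RECORDS list and the function names are hypothetical, and homograph disambiguation is not covered.

```python
# Equivalence sets taken from the examples above.
EQUIVALENTS = [
    {"Iran", "Persia"},
    {"Tiziano", "Titian", "Titien"},
    {"Connecticut Barn", "English Barn"},
]

# Hypothetical catalog records: (record id, term used by the cataloger).
RECORDS = [
    (1, "Persia"),
    (2, "Iran"),
    (3, "Titian"),
    (4, "Titien"),
]


def expand(term):
    """Return every term the vocabulary treats as equivalent to `term`."""
    for group in EQUIVALENTS:
        if term in group:
            return group
    return {term}  # unknown terms only match themselves


def search(term):
    """Return ids of records indexed under the term or any equivalent."""
    wanted = expand(term)
    return [record_id for record_id, indexed in RECORDS if indexed in wanted]


print(search("Iran"))     # [1, 2]
print(search("Tiziano"))  # [3, 4]
```

A searcher who types "Iran" also retrieves the record cataloged under "Persia", which is the improved result described above.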

Read more about how vocabularies work in Chapter 5.

Read more about how standards work together in Chapter 3.


Vocabularies: Types and formats

A wide range of controlled vocabularies has been developed to help describe and access art and material culture information. Many of these vocabularies were created and are maintained by research institutes, national and international cultural organizations, and professional societies and associations. They can be used individually or together, depending on the type of material being described.

The following are examples of the types of terms found in controlled vocabularies available for describing cultural heritage:

  • personal names
    in the Union List of Artist Names you will find "Georgia O’Keeffe"
  • geographic place names
    in the Getty Thesaurus of Geographic Names you will find "Botswana"
  • corporate names
    in the Library of Congress Name Authority File you will find "Metropolitan Museum of Art (New York, N.Y.)"
  • object names
    in the Art & Architecture Thesaurus you will find "scroll paintings"
  • iconographic subjects and themes
    in ICONCLASS you will find the "education of Cupid by Venus and Mercury"
  • genre terms
    in the Thesaurus for Graphic Materials II: Genre and Physical Characteristic Terms you will find "political cartoons"
  • multi-lingual terms
    in the Multilingual Egyptological Thesaurus you will find the term "pottery" in English, "Keramik" in German, and "céramique" in French.

Controlled vocabularies also come in a variety of formats to fit different practices, systems, and local needs, as listed below:

  • Subject Heading Lists are compilations of headings usually displayed in alphabetical order. Headings are words, phrases, or combinations of words and modifiers that combine separate concepts into what is called a "string." The LCSH is an example of a subject heading list.

The following example is excerpted from the Library of Congress Subject Headings (LCSH), 18th edition, 1995:

Portrait prints (May Subd. Geog.)
   UF Engraved portraits
   BT Prints
      --17th century (May Subd. Geog.)
      --18th century (May Subd. Geog.)
      --19th century (May Subd. Geog.)
Portrait prints, American (May Subd. Geog.)
   UF American portrait prints
Portrait prints, British (May Subd. Geog.)
   UF British portrait prints
Portrait prints, Chinese (May Subd. Geog.)
   UF Chinese portrait prints
Portrait prints, European (May Subd. Geog.)
   UF European portrait prints
Portrait prints, French (May Subd. Geog.)
   UF French portrait prints
Portrait prints, German (May Subd. Geog.)
   UF German portrait prints
Portrait sculpture (May Subd. Geog.)
   BT sculpture
   NT Portraits, Group
      --18th century
      --19th century
      --20th century
      --South Dakota
Portrait sculpture, African (Not Subd. Geog.)
   UF African portrait sculpture
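
To show how the reference codes in this display carry structure, here is a minimal sketch that parses a few lines of the excerpt into records. It assumes the usual reading of the codes (UF = used for, BT = broader term, NT = narrower term) and the indentation seen above; the parse function and field names are made up for the illustration and are not an official LCSH format.

```python
# A few lines copied from the excerpt above.
EXCERPT = """\
Portrait prints (May Subd. Geog.)
   UF Engraved portraits
   BT Prints
      --17th century (May Subd. Geog.)
Portrait prints, American (May Subd. Geog.)
   UF American portrait prints
"""

# Assumed meaning of the reference codes.
CODES = {"UF": "used_for", "BT": "broader", "NT": "narrower"}


def parse(text):
    """Turn the indented display into a list of heading records."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not line[0].isspace():
            # An unindented line starts a new heading.
            entries.append({"heading": stripped, "used_for": [],
                            "broader": [], "narrower": [],
                            "subdivisions": []})
        elif stripped.startswith("--"):
            # A subdivision of the current heading.
            entries[-1]["subdivisions"].append(stripped[2:])
        else:
            # A reference line such as "UF Engraved portraits".
            code, _, value = stripped.partition(" ")
            entries[-1][CODES[code]].append(value)
    return entries


for entry in parse(EXCERPT):
    print(entry["heading"], "| UF:", entry["used_for"])
```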

  • A Thesaurus is a compilation of terms representing single concepts. A thesaurus displays relationships among terms by creating what is called a "semantic network." Thesauri are usually displayed as a hierarchy. Most thesauri display three types of relationships among terms: hierarchical (whole/part or genus/species), equivalence (synonyms), and associative (related terms). Thesauri are referred to as structured vocabularies, but in recent years this term also has been used to describe any vocabulary with a structure, even if it is not based on the above-mentioned thesaural relationships.

Visit the Art & Architecture Thesaurus to view an example of a thesaurus hierarchy display.
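
The three thesaural relationships can also be pictured as a small data structure. The sketch below is an assumption-laden illustration: the Concept class, the tiny three-level hierarchy, and the related term "bookshelves" are invented for the example and do not reproduce the actual AAT hierarchy.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    preferred: str
    broader: Optional[str] = None                       # hierarchical
    variants: list = field(default_factory=list)        # equivalence
    related: list = field(default_factory=list)         # associative


# Hypothetical mini-thesaurus, keyed by preferred term.
THESAURUS = {c.preferred: c for c in [
    Concept("furniture"),
    Concept("case furniture", broader="furniture"),
    Concept("bookcase", broader="case furniture",
            variants=["book case", "book-case"],
            related=["bookshelves"]),
]}


def hierarchy_path(term):
    """Walk the broader-term links from a term up to the top of the hierarchy."""
    path = []
    current = term
    while current is not None:
        path.append(current)
        current = THESAURUS[current].broader
    return list(reversed(path))


def narrower_terms(term):
    """List the concepts whose broader term is `term`."""
    return [c.preferred for c in THESAURUS.values() if c.broader == term]


print(" > ".join(hierarchy_path("bookcase")))
# furniture > case furniture > bookcase
```

Storing only the broader-term link and deriving narrower terms from it keeps the two directions of the hierarchy from drifting out of sync.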

  • Classifications organize a body of knowledge into conceptual categories. Classification schemes like Revised Nomenclature and ICONCLASS are intended to be used as organizational schemes for collections. Sometimes such classifications serve double duty when catalogers extract individual terms and use them as data values in a field, outside the context of the rest of the classification scheme. For example, a museum may use the individual term "costume" from Revised Nomenclature without adopting its ten-category classification scheme to organize the museum’s collection.

The following is a section from The Revised Nomenclature for Museum Cataloging: A Revised and Expanded Version of Robert G. Chenhall’s System for Classifying Man-Made Objects (Walnut Creek, CA: AltaMira Press, 1995), p. 9:

Category: 2: FURNISHINGS
          BEDDING
                    BAG, SLEEPING
                    BEDSPREAD
                    BLANKET
                    BOLSTER
                               |
                               |
                    COMFORTER
                    Counterpane ... use BEDSPREAD
                    COVER, BOLSTER
                               |
                               |
                    MATTRESS
                               |
                               |
                    PILLOW
                    PILLOWCASE
                               |
                    SHEET

  • Term Lists are most often created by individual organizations and reflect the scope of the institution’s collection. Many of these local lists include terms from other controlled vocabulary resources. Sometimes an organization will collaborate with a similar collection to create a joint term list. Term lists can be a rich resource for unique, regional, historical, or very specific terminology. In some organizations the term list also functions as the authority file.

Visit the SPIRO on-line visual database at the University of California at Berkeley (http://www.mip.berkeley.edu/spiro/) to view examples of term lists.


The role of authority work

Authority work, in which terms and names are verified and validated, is a critical part of documentation practice. The concept originated in the library cataloging domain in the days of manual card catalogs and indexes, when strict consistency was necessary for even minimal access. Today authority work has extended to other information management communities, and its processes and procedures have benefited greatly from computerization. The development and application of standard controlled vocabularies is a significant outcome of authority work.

Authority work is defined by the following characteristics:

  • Authority files are compilations of authorized terms or headings used by a single organization or consortium in cataloging, indexing, or documentation. Authority files are strictly maintained as terms are applied and often include associated information about the term or subject heading. This associated information can include: synonyms (e.g., "see references"), related or associated terms (e.g., "see also references"), and original sources (e.g., a note that the term was found in a particular dictionary).

Here is an authority file record from the Library of Congress Name Authority File (NAF) for the author Umberto Eco. Note the variant names (Eko, Umberto, etc.) and the sources of the information (Notes):

[Figure: NAF authority record for Umberto Eco]

(Source: Library of Congress Authorities, http://authorities.loc.gov.)

  • Authority control is a system of procedures that maintains consistent information in database records. These procedures include recording terms and validating them against the authority files. The purpose of authority control is to ensure that the database searcher can collocate like material and relate it to other material in the database. Today authority control is important in the online environment for making searching easier for users and improving precision in searching.

  • An authority file is a controlled vocabulary, but not all controlled vocabularies are authority files. This is because the main purpose of an authority file is to regulate usage in a particular database. In fact, you will find that some authority files use multiple structured vocabularies as sources. For example, a historical society may use both AAT and LCSH as sources of terms for its subject authority files. Most authority files also include "local terms," originating from the institution itself.

  • Authority files are an integral part of most automated information systems, but you will find differing levels of implementation depending on the system. One of the most useful implementations makes the authority file available as a resource for catalogers and interactive in the search interface, to assist users as they query the database.

  • Authority work procedures may be automated, but the intellectual processes needed to create quality authority files are still best accomplished by humans. This work may include: verification of the proposed term or name in authoritative sources, such as dictionaries, monographs, or (if relevant) historical sources; research of synonyms, such as variant spellings; establishing relationships to other terms/names in the authority file; and creation of an authority record to be added to the database. Authority work at the local level is often expensive and time-consuming, and as data sharing becomes more prevalent, shared authority files are being explored.
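
The characteristics above can be drawn together in a small sketch of an authority file with a validation step. Everything here is illustrative: the AuthorityRecord and AuthorityFile classes, the status strings, and the source note are hypothetical, and the heading "Eco, Umberto" is assumed for the example (only the variant "Eko, Umberto" appears in the text above).

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    heading: str                                        # authorized form
    see_from: list = field(default_factory=list)        # "see" references (variants)
    see_also: list = field(default_factory=list)        # related headings
    sources: list = field(default_factory=list)         # where the heading was verified


class AuthorityFile:
    def __init__(self):
        self._by_heading = {}
        self._by_variant = {}

    def add(self, record):
        """Register a record under its heading and each of its variants."""
        self._by_heading[record.heading] = record
        for variant in record.see_from:
            self._by_variant[variant] = record

    def validate(self, entered):
        """Authority control step: return (authorized heading, status)."""
        if entered in self._by_heading:
            return entered, "authorized"
        if entered in self._by_variant:
            return self._by_variant[entered].heading, "variant replaced"
        # Unknown names are flagged for a person to research.
        return None, "needs authority work"


names = AuthorityFile()
names.add(AuthorityRecord(
    heading="Eco, Umberto",              # assumed authorized form
    see_from=["Eko, Umberto"],
    sources=["hypothetical source note"],
))

print(names.validate("Eko, Umberto"))   # ('Eco, Umberto', 'variant replaced')
print(names.validate("Calvino, Italo")) # (None, 'needs authority work')
```

The unknown name is not added automatically; it is only flagged, which mirrors the point that verification and research are still best done by a person.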


Vocabulary building

Vocabularies are available for many different subject areas and audiences, but there are gaps in coverage, especially for specialized areas. If you embark on building your own vocabulary, there are several good resources in the form of publications, training workshops, and academic courses. Here are a few suggestions to get you started:

Recommendations for building new vocabularies

  • Build on existing work. Some established vocabularies, like the AAT and LCSH, offer opportunities to enhance specific areas. For example, the Mystic Seaport Museum staff recently researched and added terminology for vessel types to the AAT.
  • Incorporate maintenance into your plan. For a structured vocabulary to be effective, it needs to accommodate changes in the language over time.
  • Adhere to national and international standards produced by NISO, ISO and other standards organizations.
  • Find partners and collaborate with other like-minded groups. For example, the MDA supports terminology working groups, such as the Ethnographic Terminology Working Group, which pool resources to create vocabularies.
  • Get training. Schools of library and information science, cultural heritage management programs, and professional workshops are all sources for training in thesaurus construction and authority work.
  • Follow established methodologies. For example, the J. Paul Getty Trust has published guidelines for forming language equivalents to enable multi-lingual vocabulary building.

Read more about vocabularies in the Readings section.

Go to a list of vocabulary resources and tools for cultural heritage in the Tools section.


 
     
The J. Paul Getty Trust
© J. Paul Getty Trust