Patricia Harpring
Managing Editor, Getty Vocabulary Program
Getty Research Institute
The appetite of end-users, hungry for images, is rarely sated. Images
are notoriously difficult to retrieve with accuracy, as is evident to
anyone who has searched for images on the World Wide Web. Retrieval of
appropriate images depends on intelligent indexing, which one might call
the "language" of retrieval; in turn, good indexing depends on proper
methodology and suitable terminology. In this essay, I address the underpinnings
of indexing by exploring the use of metadata schemas1 and
controlled vocabularies to describe, catalogue, and index works of art
and architecture, and images of them. I also discuss issues relating
to data structure, cataloguing rules, vocabulary control, and retrieval
strategies, which are central components of good subject access.
What Is "Subject"?
Categories for the Description of Works of Art (CDWA) characterizes "subject" very
broadly as follows:
The subject matter of a work of art (sometimes referred
to as its content) is the narrative, iconic, or non-objective
meaning conveyed by an abstract or a figurative composition.
It is what is depicted in and by a work of art. It
also covers the function of an object or architecture
that otherwise has no narrative content.
CDWA describes a metadata element set that can be used to describe or
catalogue many types of objects and works of architecture in a single
information system. In the interest of providing access across all catalogued
objects by all of the critical fields (the "core" categories), CDWA advises
that the Subject Matter category should always be indexed, even
when the object seems to have no "subject" in the traditional sense.
In other words, in CDWA all works of art and architecture have subject
matter.
Even though the subject matter of a work of art may
also be referred to in the Titles or Names category of
CDWA, a thorough description and indexing of the
subject content should be done separately in the Subject Matter category.
A title does not always describe the subject of the work. More importantly,
noting the subject of a work of art in a set of fields or metadata elements
dedicated specifically to subject ensures that the subject is consistently
recorded and indexed in the same place, using the same conventions for
all objects in the database. The title of the photograph in figure 7, Chez
Mondrian, Paris, does not convey a basic description of the subject
of the photograph. Its subject could be described as "an interior space
with a stairway, doorway, table, and a vase with flowers."
The
subject matter of a work may be narrative, but
other types of subjects may also be included. A
narrative subject is one that comprises a story
or sequence of events. Examples of narrative subjects
are The Slaying of the Nemean Lion and The
Capture of the Wild Boar of Mount Erymanthus,
which are both episodes in the Labors of Herakles
series. Subject matter that does not tell a
story could be, for example, a painting or sculpture
of a genre scene, such as a young woman bathing.
For a portrait, the subject can be a named sitter;
for a sketch, an elevation for the facade of a
building; for a pot or other vessel, its geometric
decoration or its function; for a mosque or synagogue,
its function as a place of worship. Subject matter
can also take the form of implied themes or attributes
that come to light through interpretation. For
example, a brass doorknob with an embossed lion's
head can express meaning beyond the depiction of
an animal; it may suggest the householder's strength
or confer protection on the house.
In
a scholarly discussion of subject matter, various
areas of subject analysis are often woven together
into a seamless whole. It is useful, however, to
consider them separately when indexing a work of
art. One level of subject analysis could include
an objective description of what is depicted; for
example, in the Sodoma drawing in figure 8, the
words "human male," "nude," "drapery" describe
the image in general terms. An identification of
the subject would be "resurrected Christ." The
image could be further analyzed, noting that the
iconography represents "salvation" and "rebirth."
In CDWA, subject matter
is analyzed according to a method based on the work
of Erwin Panofsky.2 Panofsky identified
three main levels of meaning in art: pre-iconographic description,
iconographic identification, and iconographic interpretation or "iconology." Three
sets of subcategories under the category Subject Matter
in CDWA reflect this traditional art-historical approach
to subject analysis, but in a somewhat simplified and
more practical application of the principles, one better
suited to indexing subject matter for purposes of retrieval.
(Panofsky was writing decades before the advent of
computer databases of art-historical information and
the proliferation of resources on the World Wide Web.)
The following three levels of subject analysis are
defined in CDWA:
Subject Matter - Description. A description
of the work in terms of the generic elements of the
image or images depicted in, on, or by it.
Subject Matter - Identification. The
name of the subject depicted in or on a work of
art: its iconography. Iconography is the named
mythological, fictional, religious, or historical
narrative subject matter of a work of art, or its
non-narrative content in the form of persons, places,
or things.
Subject Matter - Interpretation. The
meaning or theme represented by the subject matter
or iconography of a work of art.
These
three levels of subject analysis can be illustrated
in Andrea Mantegna's Adoration of the Magi (pl.
4). A generic description of Mantegna's painting
would point out the elements recognizable to any viewer,
regardless of his or her level of expertise or knowledge:
it depicts "a woman holding a baby, with a man located
behind her, and three men located in front of her." Possible
indexing terms to describe the scene could be "woman," "baby," "men," "vessels," "porcelain
vessel," "coins," "metal vessel," "costumes," "turbans," "hats," "drapery," "fur," "brocade," "haloes." The
next level of subject analysis is identification,
which is often the only level of access cataloguing
institutions routinely provide. The painting depicts
a known iconographic subject that is recognizable to
someone familiar with the tradition of Western art
history: "Adoration of the Magi." The iconography is
based on the story recounted in the New Testament (Matthew
2), with embellishments from other sources. The proper
names of the protagonists are Balthasar, Melchior,
Caspar, Mary, Jesus, and Joseph; these names should
also be listed as part of the identifiable subject.
The
third level of subject analysis is interpretation,
where the symbolic meaning of the iconography is
discussed. For example, the Magi represent the Three
Ages of Man (Youth, Middle Age, Old Age), the Three
Races of Man, and the Three Parts of the World (as
known in the fifteenth century: Europe, Africa, Asia).
The gifts of the Magi are symbolic of Christ's kingship
(gold), divinity (frankincense), and death (myrrh,
an embalming spice). The older Magus kneels and has
removed his crown, representing the divine child's
supremacy over earthly royalty. The journey of the
Magi symbolizes conversion to Christianity. Details
related to the subject, as depicted specifically
in this painting, could include Mantegna's composition
of figures and objects, all compressed within a shallow
space in imitation of ancient Roman reliefs.
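The three levels can be sketched as a simple record structure. The following is a minimal illustration in Python, not a CDWA implementation: the class name `SubjectMatter`, its field names, and the `all_terms` helper are assumptions made for the example, and the terms are a selection from those suggested above for the Mantegna painting.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class SubjectMatter:
    """Hypothetical container for the three levels of subject analysis."""
    description: List[str] = field(default_factory=list)     # generic elements
    identification: List[str] = field(default_factory=list)  # named iconography
    interpretation: List[str] = field(default_factory=list)  # meaning or theme

    def all_terms(self) -> Set[str]:
        """Pool the three levels into one set of access points."""
        return (set(self.description)
                | set(self.identification)
                | set(self.interpretation))


adoration = SubjectMatter(
    description=["woman", "baby", "men", "vessels", "coins", "turbans", "haloes"],
    identification=["Adoration of the Magi", "Balthasar", "Melchior", "Caspar",
                    "Mary", "Jesus", "Joseph"],
    interpretation=["Three Ages of Man", "kingship", "divinity", "death",
                    "conversion to Christianity"],
)

# A search can draw on any level, because all terms are pooled for access.
print("Mary" in adoration.all_terms())
print("woman" in adoration.all_terms())
```

Keeping the levels as separate lists preserves the distinction made during analysis, while `all_terms` reflects the idea that end-users should be able to reach the work through any of them.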
Even when
a work of art or architecture has no overt figurative
or narrative content, as with abstract art, architecture,
or decorative arts, subject matter should still be
indexed in the appropriate metadata element or database
field. In the case of a work of abstract art, such as John
M. Miller's Prophecy (fig. 9), visual elements
of the composition can be listed, including the following: "abstract," "lines," "space," "diagonal." The
symbolic meaning, as stated by the artist, should also
be included. In this case, the artist's work was inspired
by a fifteenth-century prayer book.3 This
aspect of the subject could be listed as follows: "Jean
Fouquet," "Hours of Simon de Varie," "Madonna and child," "patron," "kneeling," "inward
reflection," "moment in flux."
It may
seem something of a stretch to designate subject
matter for decorative arts and architecture, where no recognizable
figure or symbolic interpretation is possible. For
the sake of consistency, however, and always keeping
end-user retrieval in mind, it is useful to note
subject matter for these types of objects as well. The subject
of a carpet, such as the one shown in figure 10,
could be design elements and symbols of the patron for whom
it was made, such as "flowers," "fruit," "acanthus
leaf scrolls," "sunflower," "Sun King," "Louis XIV." The
subject of a Renaissance drug jar, such as the one
shown in figure 11, could be its function, as well
as its decoration, which is intended to evoke the
exotic East, even though the characters of the script
are invented and nonsensical: "drugs," "medicines," "pharmacy," "storage," "Middle
East," "China," "Islamic knot work," "Kufic script," "Chinese
calligraphy," "alphabet." Indexing terms for describing
the subject matter of the pair of globes in figure
12 could be "Earth," "heavens," "geography." The
subject of a building, such as the J. Paul Getty
Museum (fig. 13), could be the building's function
and critical design elements: "art museum," "space," "square," "axes," "reflection," "shadow."
Since information
about art is often uncertain or ambiguous, there may
be multiple interpretations for the subject of a particular
work. Given that interpretations of subjects can change
over time and that more than one interpretation may
exist at one time, the history of the interpretation
of the work should also be noted. For example, the
sitter in Jacopo Pontormo's Portrait of a Halberdier (fig.
14) is sometimes identified as the Florentine duke
Cosimo de' Medici, but he is more often considered
to be the young nobleman Francesco Guardi. An "unbiased," objective
description would identify the sitter simply as a "halberdier" or "soldier." The
subject matter of this painting should be accessible
by any
of these subject designations. It is important to have
a data structure that allows for this kind of variety
and flexibility.
Structure
to Allow Subject Access
Among the key decisions that must be made to provide
subject access to images is selection of the appropriate
format or metadata schema. Indeed, a suitable data
structure is essential for creating good end-user access
to images. The data structure must include all necessary
fields; it must allow repeating fields as appropriate;
and it must include links or otherwise accommodate
the particular relationships that are inherent between
museum objects and works of architecture (or their
visual surrogates) and the subjects depicted in them.
The
data structure for subject access must be contained
within an overall workable data structure for the
objects being described or catalogued. To successfully
create a versatile, useful information system on
art and architecture, several critical issues must
be addressed. The institution or cataloguing project
must decide what is being catalogued: museum objects,
groups of objects, buildings, or visual documents
(surrogate images) of those objects or buildings.
Other decisions are critical to the format and structure
of the system: Which metadata elements or fields
are critical? Are there additional optional fields
that are desirable but not necessary for retrieval?
Which fields should be repeating? Which fields should
be populated with controlled vocabulary terms? Should
there be linked authorities?
CDWA
specifies fields for various attributes of an object
record, including a set of fields for subject identification
in the category Subject Matter.4 This
set of fields is repeatable, and includes a field
for a free-text description of the subject, as well
as fields for indexing terms. For the fourth-century
B.C.E. Greek amphora shown in figure 15, the free-text
description of the subject might be the following: "Side
A: Athena Promachos; Side B: Nike crowning the victor,
with the judge on the right and the defeated opponent
on the left." The important elements of the subject
are then indexed with controlled vocabulary terms
to provide reliable retrieval; for example, the indexing
terms for this object might be "human male," "human
female," "nudes," "Greek mythology," "Athena Promachos," "Nike," "judge," "competition," "game," "games," "athlete," "prize," "festival," "victory." Ideally,
all three levels of subject matter (description,
identification, and interpretation) should be analyzed
and indexed for access, although the terms should
be stored in the same table for end-user retrieval.5 A
sample descriptive record for the amphora, formulated
according to CDWA guidelines, is shown below (core
categories are indicated with asterisks).
Display versus
Indexing
For an information system to be effective, information
for display and information intended for search and
retrieval must be distinguished. A field for display
is all that the end-user sees. Information critical
for research must, however, also be properly indexed
in fields to allow adequate retrieval. The field for
description or display can provide a clear, coherent
text that identifies or explains the subject. As I
have already pointed out, art information can often
be ambiguous or even seemingly contradictory. In the
display field, uncertainty and ambiguity can be expressed
in a way that is intelligible to end-users; words such
as "probably" and "possibly" may be used. For example,
the subject for one Dosso Dossi painting (see pl. 3)
could be described in a display field as follows: "Mythological
scene, uncertain subject; probably represents 'love'
and 'lust,' personified with central figures that are
possibly Pan, Echo, Terra, and an unidentified goddess." The
indexing fields would use controlled vocabulary to
ensure reliable, consistent access to the same information.
All terms representing all possible interpretations
should be included for access; for the Dossi painting,
the terms could include "Greek mythology," "love," "lust," "cupids," "landscape," "nude," "human
female," "flowers," "Pan," "satyr," "nymph," "Echo," "Terra," "elderly
female," "armor," "goddess."
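The separation between display and indexing can be sketched in code. This is a minimal illustration under stated assumptions: `ObjectRecord`, `subject_display`, `subject_index`, and `search` are hypothetical names, and the record label simply stands in for the Dossi painting discussed above.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ObjectRecord:
    label: str
    subject_display: str           # free text shown to the end-user
    subject_index: FrozenSet[str]  # controlled terms used only for retrieval


def search(records: List[ObjectRecord], term: str) -> List[ObjectRecord]:
    """Return records whose indexing terms include `term` (case-insensitive)."""
    wanted = term.casefold()
    return [r for r in records
            if any(t.casefold() == wanted for t in r.subject_index)]


dossi = ObjectRecord(
    label="Dosso Dossi painting (pl. 3)",
    subject_display=("Mythological scene, uncertain subject; probably represents "
                     "'love' and 'lust,' personified with central figures that are "
                     "possibly Pan, Echo, Terra, and an unidentified goddess."),
    subject_index=frozenset({
        "Greek mythology", "love", "lust", "cupids", "landscape", "nude",
        "human female", "flowers", "Pan", "satyr", "nymph", "Echo", "Terra",
        "elderly female", "armor", "goddess"}),
)

hits = search([dossi], "echo")
for record in hits:
    # The hedged wording lives only in the display text.
    print(record.label, "->", record.subject_display)
```

In this sketch, qualifiers such as "probably" and "possibly" never enter the index, so a search for "Echo" finds the painting while the display text still conveys the uncertainty.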
Specificity versus Inclusivity
In the Dosso Dossi painting, the indexing terms include
all likely interpretations of the subject matter. This
is the approach taken by a knowledgeable cataloguer
who can be specific in listing the possible subjects.
A different approach must be used when the cataloguer
does not know the subject due to lack of information; that
is, when the information is possibly "knowable" but
simply "not known" because the particular cataloguer
does not have the time or means to do the research.
In such cases, it is advisable to list terms that are
broad and accurate rather than to be specific at the
risk of being inaccurate. If the cataloguer is not
familiar with the scholarly literature addressing the
likely purpose of the maiolica jar shown in figure
11, the cataloguer is better off calling it a "vessel" or
even a "container" rather than guessing that it may
be a "drug jar." For the eighteenth-century French
woodcarving shown in figure 16, the cataloguer should
not try to surmise the allegorical meaning of the work
if he or she does not have research or documentation
to support the supposition. In such a case, the cataloguer
could resort to performing only the first level (description)
of subject analysis, naming the objects clearly seen
in the piece: "flowers," "medallion," "bird," "nest." Only
if there is credible supporting evidence should indexing
terms relating to the allegory (for example, "Constitution
of 1791," "French Revolution," "French monarchy," "death," "National
Assembly," "failure," "ending") be added.
Repeating
Fields
Repeating fields refers to a data structure in which
there are multiple occurrences of a given field, so
that multiple terms or data values may be recorded
efficiently. CDWA suggests which fields or metadata
elements should be repeating. Obviously, the field
for Subject Matter should be repeatable. Repeating
fields can store indexing terms for all three levels
of subject analysis; although these aspects of the
subject are analyzed separately, retrieval is more
efficient if they are stored together. Multiple interpretations
of the subject can also be indexed and recorded in
this set of fields.
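In a relational system, a repeating field is commonly modeled as a separate table with one row per term. The sketch below uses Python's built-in `sqlite3` module; the table and column names (`object`, `subject_term`, `level`) and the helper `objects_with_subject` are assumptions made for illustration, not names prescribed by CDWA. The terms are those discussed earlier for Pontormo's Portrait of a Halberdier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    -- One row per term: this table plays the role of the repeating field.
    CREATE TABLE subject_term (
        object_id INTEGER NOT NULL REFERENCES object(id),
        term      TEXT NOT NULL,
        level     TEXT NOT NULL CHECK (level IN
                  ('description', 'identification', 'interpretation'))
    );
""")
conn.execute("INSERT INTO object VALUES (1, 'Portrait of a Halberdier')")
conn.executemany(
    "INSERT INTO subject_term VALUES (1, ?, ?)",
    [("halberdier", "description"),
     ("soldier", "description"),
     ("Cosimo de' Medici", "identification"),   # one interpretation
     ("Francesco Guardi", "identification")],   # a competing interpretation
)


def objects_with_subject(term):
    """Find objects by any subject term, regardless of analysis level."""
    rows = conn.execute(
        "SELECT o.label FROM object o "
        "JOIN subject_term s ON s.object_id = o.id "
        "WHERE s.term = ? COLLATE NOCASE",
        (term,),
    )
    return [label for (label,) in rows]


print(objects_with_subject("francesco guardi"))
print(objects_with_subject("soldier"))
```

All three levels and both competing identifications sit in the same table, so a single query reaches the work by any of them; the `level` column records how each term was arrived at without splitting retrieval across tables.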
Authorities
CDWA describes a set of relational tables that includes
information about the object along with links to tables
that hold information about the subject in a Subject
Identification Authority. There are also links to other
authorities. In this context, an "authority" is
a separate file in which important information indirectly
related to the objects being described can be recorded.
A "link" may be made between the appropriate field
in the object record and the relevant authority record.
The relationship of authorities to object records in
an information system is presented in the following
entity-relationship diagram:
An authority for subjects provides an efficient way
to record preferred and variant names, broader concepts,
and related information regarding subjects. The information
need be entered only once in the authority record rather
than in each object record related to that subject.
For some subject information, authorities may be efficiently
constructed by using previously compiled data.6 The
fields in the CDWA's Subject Identification Authority
are Subject Type, Preferred Subject Name, Variant Subject
Names, Dates, Earliest Date, Latest Date, Indexing
Terms, Related Subjects, Relationship Type, Name of
Related Subject, Remarks, and Citations.
The Subject
Identification Authority7 contains fields
for the preferred, or most commonly known, name of
the subject, as well as variant names by which the
subject may also be known; variant names in multiple
languages could also be included. Many subjects may
be known by multiple names, all of which are useful
to include as access points for search and retrieval.
Using such a controlled vocabulary or classification
system ensures that synonyms are available for end-user
access. For example, "Three Kings" and "Three Wise
Men" are variant names for the "Magi"; "stag beetle" and "pinching
bug" are synonyms for an insect of the family "Lucanidae." Because
the cataloguer or indexer has no way of knowing which
form or forms end-users will choose in searching, as
many variant forms as possible (or reasonable) should
be included. The following sample subject authority
record offers several name variants for the preferred
name "Herakles": "Hercules," "Heracles," "Ercole," "Hercule," "Hércules." Using
an authority or controlled vocabulary ensures that
all these synonyms can be used in search and retrieval.
- Subject Type: mythological character, Greek
and Roman
- Subject Name: Herakles
- Variant Subject Names: Hercules, Heracles,
Ercole, Hercule, Hércules
- Display Dates: story developed in Argos, but
was taken over at early date by Thebes; literary
sources are late, though earlier texts may be
surmised.
- Earliest: 1000 Latest:
9999 (date ranges for searching)
- Indexing terms: Greek hero, king, strength,
fortitude, perseverance, labors, labours, Nemean
lion, Argos, Thebes
- Related Subjects: Labors of Herakles, Zeus,
Alcmene, Hera
- Remarks: Probably based on actual historical
figure, a king of ancient Argos. The legendary
figure was the son of Zeus and Alcmene, granddaughter
of Perseus. Often a victim of jealous Hera. Episodes
in his story include the Labors of Herakles.
In art and literature Herakles is depicted as
an enormously strong, muscular man, generally
of moderate height. His characteristics include
being a huge eater and drinker, very amorous,
generally kind, but with occasional outbursts
of brutal rage. He is often depicted with characteristic
weapons, a bow or a club; he may wear or hold
the skin of a lion. In Italy he may be portrayed
as a god of merchants and traders, related to
his legendary good luck and ability to be rescued
from danger.
- Citations: Grant, Michael, and John Hazel,
Gods and Mortals in Classical Mythology (Springfield,
Mass.: G & C Merriam, 1973); Encyclopedia
Britannica Online, "Heracles" (Accessed 06/02/2001)
Other fields are also
useful in providing access. In the sample subject authority
record for Herakles, a note (corresponding to the Remarks
category in CDWA) describes the iconography associated
with Herakles and some of the ways in which this figure
may appear in works of art. Terms that allow researchers
to find all similar subjects must be indexed as well;
such indexing provides access to the record (and thus
to objects linked to it). In the sample record, examples
for Herakles could appear in the "indexing terms" field: "Greek
hero," "king," "strength," "fortitude," "perseverance," "Labors," "Labours," "Nemean
lion," "Argos," "Thebes." They include places, events,
and characters related to the iconography of Herakles,
as well as abstract attributes symbolized by the Greek
hero (for example, "strength" and "fortitude"). The
subject authority can also contain a date field, noting
the time frame when the subject may have been developed
or when it was first documented. In addition, links
to other subject authority records may be useful; the
record for Herakles is linked to the records of other
protagonists related to the iconography of this mythological
figure, namely "Hera" and the "Nemean lion." There
can also be a field for listing sources for more information
about the subject.
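The way an authority record turns variant names into access points can be sketched as follows. This is an illustrative Python example only: `SubjectAuthority`, `build_name_index`, `find_objects`, and the object identifiers are hypothetical, while the names and terms come from the sample record for Herakles and the Magi example above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SubjectAuthority:
    preferred: str
    variants: List[str] = field(default_factory=list)
    indexing_terms: List[str] = field(default_factory=list)

    def names(self) -> List[str]:
        return [self.preferred, *self.variants]


def build_name_index(authorities: List[SubjectAuthority]) -> Dict[str, SubjectAuthority]:
    """Map every preferred or variant name (case-folded) to its authority record."""
    index = {}
    for auth in authorities:
        for name in auth.names():
            index[name.casefold()] = auth
    return index


herakles = SubjectAuthority(
    preferred="Herakles",
    variants=["Hercules", "Heracles", "Ercole", "Hercule", "Hércules"],
    indexing_terms=["Greek hero", "king", "strength", "fortitude"],
)
magi = SubjectAuthority(preferred="Magi", variants=["Three Kings", "Three Wise Men"])
name_index = build_name_index([herakles, magi])

# Object records link to the authority by preferred name only.
# The object identifiers are invented for this example.
object_links = {"object-001": "Herakles", "object-002": "Magi"}


def find_objects(query: str) -> List[str]:
    """Resolve any name form to its authority, then follow links to objects."""
    auth = name_index.get(query.casefold())
    if auth is None:
        return []
    return sorted(obj for obj, subj in object_links.items()
                  if subj == auth.preferred)


print(find_objects("Ercole"))
print(find_objects("three wise men"))
```

The variant names are entered once, in the authority record, yet every object linked to that record becomes retrievable under all of them.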
Hierarchical
Relationships
Layne stresses in her essay in this volume the power
that vocabularies and classification systems with
syndetic structures can have for indexing and retrieval.
Thus it may be desirable to design an information
system that allows for hierarchical relationships
for subjects. One way to maintain distinctions among
related iconographic themes efficiently is to create
a data structure that makes it possible to link records.
For example, the episodes of the Labors of Herakles
could be linked hierarchically to the general record
for Herakles and to even broader concepts such as
classical mythology or Greek heroic legends,8 as
shown in the following example from the ICONCLASS
system.
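Independently of any particular classification system, the idea of linking subject records hierarchically can be sketched in a few lines of Python. The labels below are taken from this essay and are not ICONCLASS notation; the placement of "Greek heroic legends" beneath "classical mythology" and the names `BROADER`, `ancestors`, and `narrower` are assumptions made for the example.

```python
# child -> parent links between subject records (labels, not ICONCLASS codes)
BROADER = {
    "The Slaying of the Nemean Lion": "Labors of Herakles",
    "The Capture of the Wild Boar of Mount Erymanthus": "Labors of Herakles",
    "Labors of Herakles": "Herakles",
    "Herakles": "Greek heroic legends",
    "Greek heroic legends": "classical mythology",  # assumed ordering
}


def ancestors(subject):
    """Walk upward from a subject, collecting each broader concept in turn."""
    chain = []
    while subject in BROADER:
        subject = BROADER[subject]
        chain.append(subject)
    return chain


def narrower(subject):
    """All subjects at any depth beneath `subject`, in alphabetical order."""
    return sorted(s for s in BROADER if subject in ancestors(s))


print(ancestors("The Slaying of the Nemean Lion"))
print(narrower("Herakles"))
```

With links of this kind, a search on a broad concept can be expanded to include every narrower subject beneath it, so that a request for "Herakles" also retrieves works indexed only under an individual Labor.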
Vocabularies
Published controlled vocabularies that have gained
a degree of acceptance in the visual resources and
art-historical communities can be used to record
terms for subject matter. If an authority for subject
identification is being created for a particular
collection or body of material, such controlled vocabularies
can be used to "populate" the authority file.
No
single authority can provide adequate subject access
for most collections. Typically, institutions will
have to create an authority for local use, one
compiled, whenever possible, from existing controlled
vocabularies. A number of vocabularies are currently
available for "populating" local authority files.
The ICONCLASS system has proven to be a powerful
tool for recording and providing access to iconographic
themes, particularly for Western art.9 This
system, developed in the Netherlands and now in use
in many countries and institutions, contains textual
descriptions of subject matter in art, organized
by alphanumeric codes that can be arranged in hierarchies.
The Art & Architecture Thesaurus (AAT)
is a source of terms for describing architectural
subjects or objects (for example, "onion dome," "cathedral," "columns").
The Library of Congress's Thesaurus for Graphic
Materials (TGM), like the AAT, is useful for
populating authority files for object type or medium,
but it can also provide terms for subject authorities.
The Getty Thesaurus of Geographic Names (TGN)
can provide the names of places depicted in or symbolized
by art objects, as can the Library of Congress
Subject Headings (LCSH). The Union List of
Artist Names (ULAN) and the Library of Congress
Name Authority File (LCNAF) can provide preferred
and variant names for portraits or self-portraits
of artists, as well as for the creators of works
of art and architecture.
Other
useful vocabularies or term lists could be added
to local authorities. Subjects that would be useful
for many image collections might include non-Western
iconography, Latin names of plants and animals,
proper names of people who are not artists (for
which the LCNAF would be a good source), events,
actions, and abstract concepts (for example, emotions).
Conclusion: The Ultimate Goal Is Retrieval
Obviously, the reason for designing appropriate
data structures and devoting considerable time and
labor to indexing subjects in visual works is to
provide good search and retrieval for the images
being catalogued or indexed. Therefore, it is crucial
to consider current and future retrieval needs of
the particular institution and of its various types
of users before beginning a cataloguing or indexing
project. It is important to keep in mind that the
system designed for cataloguing is unlikely to be
the same system that will be used for retrieval by
the public, so the data created in the editorial
or cataloguing system must be exported or "published" to
a second system. A certain level of retrieval is
required even within a cataloguing system, however,
so that cataloguers and their supervisors can check
and organize their work. I think it is safe to say
that if data is well organized and catalogued according
to recognized standards and using the appropriate
vocabularies, "re-purposing" it for various projects
and migrating it to new systems in the future (which
is inevitable) can be relatively routine tasks. People
and institutions that are designing information systems
should be aware that data can be compliant with multiple
standards at the same time. Consulting a metadata
standards crosswalk can aid in designing appropriate
data structures and cataloguing rules so that data
can be re-purposed and published in a variety of
ways but recorded only once.10
In providing
retrieval, it is important to remember that subjects
are typically requested in combination with a variety
of other elements, including the date or date span
of the creation of a work, an artist's name, an artist's
nationality, the medium or material of a work of
art, and the type of object.11 Furthermore,
multiple subjects may be requested at once. Finally,
end-users can range from the general public to art
historians and other experts. Information systems
should allow versatile retrieval for various audiences
with different needs and levels of experience.
If Subject
Matter and other core metadata elements are well
indexed, versatile retrieval is possible. If a search
is done on the iconographical theme "Adoration of
the Magi," the results are those in figure 17. The
search could then be narrowed by adding another criterion:
for example, narrowing the results to only manuscript
illuminations of this event (via the Object/Work-Type
metadata element) would retrieve the last three
images in the top row. If the objects have also been
indexed by individual characters and elements of
the scene and by broad themes, users could ask numerous
questions. If a user asked to see all images of "Mary
and Jesus," the images in the first and second rows
would be among the results, including scenes of "Madonna
and Child," the "Coronation of the Virgin," the "Pietà," and
the "Crucifixion." If a user asked to see images
of "mother and child," the last row would be added
to the results.
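Retrieval that combines subject with other elements can be sketched as a simple filter. In this Python illustration the four records, their identifiers, and the `search` function are invented for the example and do not correspond to the images in figure 17; only the subject terms and the Object/Work-Type criterion follow the discussion above.

```python
# Invented sample records; "subjects" holds the pooled indexing terms.
records = [
    {"id": "obj-1", "work_type": "painting",
     "subjects": {"Adoration of the Magi", "Mary", "Jesus", "mother and child"}},
    {"id": "obj-2", "work_type": "manuscript illumination",
     "subjects": {"Adoration of the Magi", "Mary", "Jesus", "mother and child"}},
    {"id": "obj-3", "work_type": "painting",
     "subjects": {"Madonna and Child", "Mary", "Jesus", "mother and child"}},
    {"id": "obj-4", "work_type": "sculpture",
     "subjects": {"mother and child"}},
]


def search(records, subjects=(), work_type=None):
    """Return ids of records that carry every requested subject term and,
    if a work type is given, match it as well."""
    wanted = set(subjects)
    return [r["id"] for r in records
            if wanted <= r["subjects"]
            and (work_type is None or r["work_type"] == work_type)]


print(search(records, ["Adoration of the Magi"]))
print(search(records, ["Adoration of the Magi"],
             work_type="manuscript illumination"))
print(search(records, ["Mary", "Jesus"]))
print(search(records, ["mother and child"]))
```

The broader the term, the larger the result set: the named theme retrieves two records, the named characters three, and the generic "mother and child" all four, which mirrors the widening results described above.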
As Colum
Hourihane points out in the next essay, subject matter
is one of the two main criteria end-users employ
in searching for images of works of art. Careful
consideration and application of standards and controlled
vocabularies are critical to success in providing
good end-user access to artworks via their subject
matter.
Notes