Purpose
The Getty Thesaurus of Geographic Names ® (TGN), the Art & Architecture Thesaurus ® (AAT), the Union List of Artist Names ® (ULAN), and the Cultural Objects Name Authority ® (CONA) (in development) are structured vocabularies that can be used to improve access to information about art, architecture, and material culture.
- Cataloging: They may be used as data value standards at the point of documentation or cataloging. In this context, they may be used as a controlled vocabulary or authority by the cataloger or indexer; they provide preferred names/terms and synonyms for people, places, and things. They also provide structure and classification schemes that can aid in documentation.
- Retrieval: They may be used as search assistants in database retrieval systems. They are knowledge bases that include semantic networks that show links and paths between places; these relationships can make retrieval more successful.
- Research tools: They may be utilized as research tools, valuable because of the rich information and contextual knowledge that they contain.
Target audience: The four Getty vocabularies are intended
to provide terminology and other information about the objects,
artists, concepts, and places important to various disciplines that
specialize in art, architecture and material culture. The TGN includes
names and associated information about places. Places in TGN include
administrative political entities (e.g., cities, nations)
and physical features (e.g., mountains, rivers). Current
and historical places are included.
The primary users of the Getty vocabularies include museums, art libraries, archives, visual resource collection catalogers, bibliographic projects concerned with art, researchers in art and art history, and the information specialists who are dealing with the needs of these users. In addition, a significant number of users of the Getty vocabularies are students or members of the general public.
Accessing the vocabularies: Catalogers and indexers who
use the vocabularies typically access them in two ways: By using
them as implemented in a collection management system (either purchased
off-the-shelf through a vendor or custom-built for their local requirements),
or by using the online databases on the Getty Web site. The databases
made available on the Web site are intended to support limited research
and cataloging efforts. Companies and institutions interested in
regular or extensive use of the Getty vocabularies should explore
licensing options by contacting the Getty Vocabulary Program at
vocab@getty.edu. Implementers who wish to provide vocabularies to end-users or use them in search engines may license the vocabularies in XML or relational tables, which are released annually. The data is also available via Web services, where it is updated every two weeks. The licensed files include no user interface.
It is planned that the Getty vocabularies will be released as Linked Open Data in 2013/2014.
Comprehensiveness and updates: The TGN is a compiled resource;
it is not comprehensive. A minimal TGN record contains a numeric
ID, a name, a place in the hierarchy, and a place type. The TGN
grows through contributions. Current areas of TGN development include 1) updating the modern hierarchy of administrative divisions, 2) adding archaeological sites, World Heritage Site names, and other historical sites, with a focus on Asian, Pre-Columbian, Middle Eastern, and other sites, and 3) building historical hierarchies for historical nations and empires.
History of the TGN
Work on the TGN began in 1987, when the Getty created a department
dedicated to compiling and distributing terminology, at that time
called the Vocabulary Coordination Group. The AAT was already being
managed by the Getty at that time, and the Getty attempted to respond
to requests from the creators of art information for additional
controlled vocabularies for artists' names (ULAN) and geographic
names (TGN). The development of TGN was informed by an international
study completed by the Thesaurus Artis Universalis (TAU), a working
group of the Comité International d'Histoire de l'Art (CIHA),
and by the consensus reached at a colloquium held in 1991, attended
by the spectrum of potential users of geographic vocabulary in cataloging
and scholarship of art and architectural history and archaeology.
The initial core of the TGN was compiled from thousands of geographic
names in use by various Getty cataloging and indexing projects,
enlarged by information from U.S. government databases, and further
enhanced by the manual entry of information from published hard-copy
sources. The TGN grows and changes via contributions from the user
community and editorial work of the Getty Vocabulary Program.
The basic principles under which the TGN is constructed and maintained
were established by the AAT and also employed for the ULAN: Its
scope includes terminology needed to catalog and retrieve information
about the visual arts and architecture; it is constructed using
national and international standards for thesaurus construction;
it comprises a hierarchy with tree structures corresponding to the
current and historical worlds; it is based on terminology that is
current, warranted for use by authoritative literary sources, and
validated by use in the scholarly art and architectural history
community; and it is compiled and edited in response to the needs
of the user community.
TGN was founded under the management of Eleanor Fink (head of what
was then called the Vocabulary Coordination Group, and subsequently
Director of the Art History Information Program, later called the
Getty Information Institute). TGN has been constructed over the
years by numerous members of the user community and an army of dedicated
editors, under the supervision of several managers. Technical support
for the TGN was provided by the Getty. TGN was first published in
1997 in machine-readable files. Given the growing size and frequency
of changes and additions to the TGN, hard-copy publication was deemed
to be impractical. It is currently published in both a searchable
online Web interface and in data files available for licensing.
The data for the TGN is compiled and edited in an editorial system
that was custom-built by Getty technical staff to meet the unique
requirements of compiling data from many contributors, building
complex and changing polyhierarchies, merging, moving, and publishing
in various formats. Final editorial control of the TGN is maintained
by the Getty Vocabulary Program, using well-established editorial
rules.
The current manager
of the Getty vocabularies is Patricia Harpring, Managing Editor. Administratively, the Vocabulary Program resides under the GRI Collection Management and Description Division (David
Farneth, Head). Other GRI departments in this division are General Collection Cataloging,
Special Collections Cataloging, Digital Services, the Registrar’s Office, Institutional
Records and Archives, and Conservation and Preservation. The Vocabulary Program works with Art History Documentation (Murtha Baca, Head) to foster foreign language
translations of the vocabularies, maintain national and international
partnerships, and oversee licensing and marketing.
Scope and Structure
TGN is a structured vocabulary currently containing around 2,035,195
names and other information about places. Names for a place may
include names in the vernacular language, English, other languages,
historical names, and names in natural order and inverted order.
Among these names, one is flagged as the preferred name.
TGN is a thesaurus, compliant with ISO and NISO standards for thesaurus
construction; it contains hierarchical, equivalence, and associative
relationships. Note that TGN is not a GIS (Geographic Information
System). While many records in TGN include coordinates, these coordinates
are approximate and are intended for reference only.
The focus of each TGN record is a place. There are around 1,431,380
places in the TGN. In the database, each place record (also called
a subject) is identified by a unique numeric ID. Linked to
the record for the place are names, the place's parent or
position in the hierarchy, other relationships, geographic coordinates,
notes, sources for the data, and place types, which are terms
describing the role of the place (e.g., inhabited place and
state capital). The temporal coverage of the TGN ranges from
prehistory to the present and the scope is global.
More about scope and structure: The TGN is a hierarchical
database; its trees branch from a root called Top of the TGN
hierarchies (Subject_ID: 1000000). Currently there are two TGN facets, World and Extraterrestrial Places. Under the facet World, places are arranged in hierarchies generally representing the
current political and physical world, although some historical nations
and empires are also included. There may be multiple broader contexts,
making the TGN polyhierarchical.
Information in the Record (Fields)
- Language: Most fields in TGN records are written in English.
However, the structure of the TGN supports multilinguality; names and scope notes may be written and flagged in multiple
languages. The overall record-preferred name is written in the Roman alphabet. All data is in Unicode.
- Diacritics: The TGN names and other fields contain dozens
of different diacritics, expressed as codes (e.g., $00) in the
data files. The TGN diacritical codes are mapped to Unicode. The mapping is
distributed with the licensed data files. These codes should be
translated into the proper diacritical mark for end-users. A Unicode version of the data is now also available. In
Web displays, it may be impossible to display all diacritics.
If a box or illegible sign displays instead of a character in a name or term, this
means that your system cannot display the Unicode character represented.
You may view the full name or term with correct diacritics by
using Vista, Mac OS 10.5, or often by pasting the word into an
MS Word document.
- Fields: The TGN fields (i.e., discrete pieces of data)
are described below. Data dictionaries for the licensed files
are available at http://www.getty.edu/research/tools/vocabularies/obtain/download.html.
- Subject ID
Unique numeric identification for the TGN record. Each place in
the TGN database is uniquely identified by a numeric ID that serves
to link the names and all other pertinent information to the place
record. The ID is generally permanent. Occasionally an ID may
change due to the record being merged with another record; in
such cases, the new IDs are included in the licensed files, and
a mapping between defunct and new IDs is provided to licensees.
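The merge behavior described above can be sketched as a small lookup. This is a minimal illustration only: the dictionary layout and the IDs are hypothetical, and the actual mapping distributed to licensees has its own format. The loop follows a chain of links in case a record has been merged more than once, which is an assumption rather than something the text states.

```python
def resolve_subject_id(subject_id, merged_into):
    """Follow defunct-to-new ID links until a live ID is reached.

    merged_into is a hypothetical dict {defunct_id: new_id}; the real
    mapping ships with the licensed files in its own format.
    """
    seen = set()
    while subject_id in merged_into:
        if subject_id in seen:  # guard against a malformed, circular mapping
            raise ValueError(f"circular merge chain at {subject_id}")
        seen.add(subject_id)
        subject_id = merged_into[subject_id]
    return subject_id


# Made-up IDs for illustration only.
merges = {9000001: 9000002, 9000002: 9000003}
print(resolve_subject_id(9000001, merges))  # 9000003
print(resolve_subject_id(7000000, merges))  # 7000000 (never merged)
```

A local system that stores Subject IDs could run its stored IDs through such a function after each new release.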
- Record Type
Type designation that characterizes the TGN record. Record types include the following:
- Label
Brief text identification of the place, concatenated from the
preferred Name, parent string, and preferred Place Type. Whereas
the Subject ID identifies the place in the database, the Label
serves as an easily legible heading to identify the place for
end-users. In the TGN Online display (an entry in a results list
display is illustrated below), the Label is displayed with the
hierarchy icon (to the left of the Label) in order to permit the
end-user to go to the hierarchy to browse for places.
Note that the above Label illustrates the parent string in descending
order, which is useful to allow sorting among homographs in results
lists. For other displays, it will be more user-friendly to display
the parents in ascending order.
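As a rough sketch of the concatenation described above, the function below assembles a Label from a preferred name, a parent string, and a preferred place type. The function name, argument layout, and punctuation are assumptions made for illustration; they are not taken from the TGN Online display or the licensed files.

```python
def build_label(preferred_name, parents_ascending, preferred_place_type,
                descending=True):
    """Concatenate name, parent string, and place type into one heading.

    parents_ascending lists the parents from nearest to farthest.
    The punctuation used here is an assumption, not the TGN Online format.
    """
    parents = list(parents_ascending)
    if descending:
        parents.reverse()  # farthest parent first, useful for sorting homographs
    return f"{preferred_name} ({', '.join(parents)}) ({preferred_place_type})"


parents = ["Los Angeles county", "California", "United States"]
print(build_label("Los Angeles", parents, "inhabited place"))
# Los Angeles (United States, California, Los Angeles county) (inhabited place)
print(build_label("Los Angeles", parents, "inhabited place", descending=False))
# Los Angeles (Los Angeles county, California, United States) (inhabited place)
```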
- Note
Called the Descriptive Note, a note that describes the
history, physical location of the place, or the importance of
the place relative to art and architecture. Many, but not all,
TGN records include a note. The example below is for the Roman
Empire.
- Coordinates
Geographic coordinates indicating the position of the place, expressed
in degrees/minutes and decimal fractions of degrees. Latitude
(Lat.) is the angular distance north or south of the equator,
measured along a meridian. Longitude (Long.) is the angular distance
east or west of the Prime Meridian at Greenwich, England. Bounding
coordinates and elevation may also be included (as in the example
for Great Lakes Region below).
NOTE: TGN is not a GIS: it is a thesaurus. While many records
in TGN include coordinates, these coordinates are approximate
and are intended for reference ("finding purposes")
only.
Geographic coordinates in TGN typically represent a single point,
corresponding to a point in or near the center of the inhabited
place, political entity, or physical feature. For linear features
such as rivers, the point represents the source of the feature.
Coordinates are expressed in degrees and minutes (as are used in
atlases); they are also expressed in decimal fractions of degrees.
In decimal fractions, west longitude and south latitude are expressed
as negative numbers (as in the example for Rio de Janeiro
below).
The primary sources of coordinates in TGN are two large U.S. government databases: the USGS (United States Geological Survey) and the NGA (National Geospatial-Intelligence Agency, formerly NIMA). It can be assumed that these sources use the WGS84 standard.
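The relationship between the two coordinate notations described above can be shown with a short conversion. This is a minimal sketch: the function name and the use of a hemisphere letter are assumptions, and the sample values for Rio de Janeiro are approximate figures chosen for illustration rather than values copied from a TGN record.

```python
def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes plus N/S/E/W to signed decimal degrees.

    South latitude and west longitude come out negative, as described above.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not 0 <= minutes < 60:
        raise ValueError("minutes must be in the range 0 to 59")
    value = degrees + minutes / 60
    return -value if hemisphere in ("S", "W") else value


# Approximate, illustrative coordinates for Rio de Janeiro.
print(round(dm_to_decimal(22, 54, "S"), 4))  # -22.9
print(round(dm_to_decimal(43, 14, "W"), 4))  # -43.2333
```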
- Names
Names and appellations referring to the place, including a preferred
name and variant names. All names in a record (i.e., all names
linked by a single Subject ID) are considered equivalents
(i.e., synonyms). A TGN record may contain the vernacular
and English names of the place, variant names in other languages,
and historical names. One name is flagged as the preferred name,
which is the indexing form of the name most often found in scholarly
or authoritative publications. In the example below, all names
refer to the same place, Munich, Germany.
- Term ID: Numeric ID that identifies the name in the database
(e.g., in the example above, Munich has the following Term_ID:
140499). Term IDs are unique; homographs have different
IDs. The Term_ID may be hidden from end-users.
- Display order of the names
Names are arranged in a particular order by the editors. The preferred
name is positioned first in a list of names for the place, with
the preferred English (if any) and other commonly-used current
names at the top of the list. Historical names appear at the bottom
of the list, sorted in reverse chronological order, if applicable
(see the example of Alexandria, Egypt below).
Implementers should sort the names by the Display_order
number, which is included in the data files, but typically hidden
from end-users.
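A minimal sketch of the sorting rule above, assuming each name has been loaded as a dictionary with a `display_order` key. That layout and the order numbers are hypothetical; only the field name Display_order comes from the text.

```python
def names_in_display_order(names):
    """Return the names sorted by their editor-assigned Display_order."""
    return sorted(names, key=lambda n: n["display_order"])


# Order numbers here are made up for illustration.
names = [
    {"name": "Munich", "display_order": 2},
    {"name": "Monaco di Baviera", "display_order": 3},
    {"name": "München", "display_order": 1},
]
for entry in names_in_display_order(names):
    print(entry["name"])  # the display_order value itself is not shown
```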
- Flags for the Names
In the TGN data, there are various flags associated with each
name. In displays for the end-user, some of the flags may be suppressed.
For example, in the display below, the Vernacular flag is displayed
as a capital letter "V" in parentheses following the name; the
capital letter should be linked to an explanation of what the
flag means.
Preferred Name
The flag preferred following a name indicates that
the name is the so-called preferred name for the
record. (The flag non-preferred is hidden in the
display.)
Each record has one and only one default preferred name,
flagged in order to provide a default name for the hierarchical
and other displays (see also Language of the Names below).
The preferred name is the name most commonly used in the
vernacular (local) language (as in the example below,
transliterated from Arabic).
The preferred name for physical features is the inverted
form of the name, when applicable. This is the name that
would be preferred in alphabetical lists.
Display Name
There may be a name flagged Display, meaning that
this name should be used in horizontal displays (such
as a label or results list) where confusion may result
from using the preferred name. For example, when the name
of a city is the same as the name of a county, the name
of the county should include the word county for
clarity in a horizontal display. In the example below,
Los Angeles county is a display name, meaning that
this name should be used in a horizontal display when
it is a parent with the name of the city, as in the following:
Los Angeles (Los Angeles county, California,
United States). If the name is flagged Index,
this is the name that should be used in alphabetical lists.
The Preferred Name is the default Index Name and generally
not flagged Index.
Values:
Y = Yes (i.e., this is the Display Name)
I = Index
N = No
NA = Not Applicable
Other Flags
Indicates various characteristics of the Name.
Values:
O = Official name
P = Pseudonym
PN = Provisional name
S = Site name
A = Abbreviation
ISO3L = ISO 3-letter code (International Organization for Standardization)
ISO2L = ISO 2-letter code (International Organization for Standardization)
ISO3N = ISO 3-number code (International Organization for Standardization)
ISO2N = ISO 2-number code (International Organization for Standardization)
USPS = US Postal Service code
FIPS = FIPS code (Federal Information Processing Standards)
NA = Not Applicable
LC flag
Also called the AACR Flag. Currently this flag is usually set to NA in TGN. Where it is used, it flags names that correspond
to Library of Congress Subject Headings.
Values:
NA = Not Applicable
Y = Yes
Name Type flag
Indicates the type of name or term.
Values:
N = Noun form
A = Adjectival form
B = Both noun and adjectival
Historical flag
Indicates if the name is current or historical.
Values:
C = Current
H = Historical
B = Both current and historical
U = Unknown
NA = Not Applicable
Vernacular flag
Indicates if the name is in the vernacular (local) language,
or some other language. There may be multiple vernacular
names. See also Language of the Names below.
Values:
V = Vernacular
O = Other
U = Undetermined
Part of Speech
In TGN, primarily used to flag adjectival name forms (e.g., Italian for Italy); the name field in TGN usually contains proper nouns.
Values:
U = Undetermined
N = Noun
A = Adjectival
B = Both
- Dates for the Names
Dates comprise a Display Date, which is a note referring
to a date or other information about the name, and Start Date
and End Date, which are years that delimit the span of
time referred to in the Display Date. Start and End Dates index
the Display Date for retrieval, but are hidden from end-users.
The dates refer to usage of the name and are not necessarily tied to
the dates of existence of the place.
Start and End Dates are years in the proleptic Gregorian calendar,
which is the calendar produced by extending the Gregorian calendar
to dates preceding its official introduction. Dates BCE are expressed
as negative numbers. If the name is currently used in literature
to refer to the place, the End Date is 9999.
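The indexing convention above (negative years for BCE, 9999 for names still in use) lends itself to simple range checks. The sketch below assumes the Start and End Dates have been loaded as plain integers; the function names and the sample years are hypothetical.

```python
CURRENT_END_DATE = 9999  # End Date used when the name is still in use


def name_in_use(start_date, end_date, year):
    """Return True if the year falls within the indexed span of the name."""
    return start_date <= year <= end_date


def is_still_used(end_date):
    """Return True if the End Date marks the name as currently used."""
    return end_date == CURRENT_END_DATE


# Made-up span: a name used from 300 BCE to 641 CE.
print(name_in_use(-300, 641, -100))  # True
print(name_in_use(-300, 641, 1900))  # False
print(is_still_used(9999))           # True
```

Because the Display Date is free text, retrieval of this kind would rely on the indexed years rather than on parsing the note.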
- Language of the Names
If the vernacular language for a place is not English and there
is an English name for the place, the English name will be included
and flagged (as noted below). In addition, some TGN records currently
include names with other language designations. A single name
may have multiple language designations because it may have the
same spelling in multiple languages.
Languages are derived from a controlled list, which includes the
name of the language and a numeric code (e.g., French / 70271).
The code is hidden from end-users.
- Preferred flag for a given language
A "P" following the language in the examples indicates
that this is the preferred name in that language. In the TGN,
the preferred name (descriptor) is by default the preferred
name in the vernacular language. If there is an English equivalent,
it will be flagged. For example, the preferred English spelling
is marked with a "P" (English -P) in the example
above. For a given language, there is only one preferred name,
although there may be multiple non-preferred names in that language.
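One way an implementer might use the per-language preferred flag is to pick a display name for a requested language and fall back to the record-preferred name otherwise. The sketch below is an assumption about how such a lookup could work; the dictionary keys and the flag assignments in the sample data are hypothetical, not the layout of the licensed files.

```python
def preferred_name_for(names, language):
    """Return the preferred name in a language, else the record-preferred name."""
    for entry in names:
        for lang in entry["languages"]:
            if lang["language"] == language and lang["preferred"]:
                return entry["name"]
    for entry in names:  # fallback: the overall preferred name for the record
        if entry["record_preferred"]:
            return entry["name"]
    return None


# Illustrative data; the flag assignments are assumptions.
names = [
    {"name": "München", "record_preferred": True,
     "languages": [{"language": "German", "preferred": True}]},
    {"name": "Munich", "record_preferred": False,
     "languages": [{"language": "English", "preferred": True}]},
]
print(preferred_name_for(names, "English"))  # Munich
print(preferred_name_for(names, "Spanish"))  # München
```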
- Language status
Flag indicating loan words. Values are Undetermined, NA, Loan Term. Given that most names in TGN are not translated into other languages, this flag is generally set to NA in TGN.
- Qualifier
Currently, qualifiers are rarely used in the TGN. A qualifier
is a word or phrase used to distinguish between homographs or
other confusing names. In the TGN data files, the Qualifier is stored in a separate
field, associated with the Language designation for the name.
- Hierarchical Positions / Parent ID
The hierarchy in the TGN refers to the method of structuring
and displaying the places within their broader contexts. Place
records in the TGN typically have a whole/part relationship (rather
than genus/species relationship). Hierarchies are built by using
the Parent_ID, which is linked to each Subject_ID; the Parent_ID
is hidden from end-users.
For end-users, the Hierarchical Position is typically indicated
in a display that shows broader contexts or parents of
the concept. In a vertical Hierarchy Display, whole/part relationships
are indicated with indention, as in the example below. (See also
Place Types below for further discussion of the main political
and administrative divisions of the hierarchy.)
In horizontal displays, the parents should be included in either
ascending or descending order. Displaying parents in descending
order is helpful to allow sorting among homographs in a results
list (as illustrated below), while displaying parents in ascending
order is more user-friendly in other displays.
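The parent strings discussed above can be built by walking the Parent_ID links upward from a record. The following is a minimal sketch that follows a single parent per record; the IDs are made up, and the two dictionaries stand in for whatever tables an implementer has loaded from the data files.

```python
def parent_chain(subject_id, parent_of, name_of):
    """Return parent names from the nearest parent up to the root."""
    chain = []
    current = parent_of.get(subject_id)
    while current is not None:
        chain.append(name_of[current])
        current = parent_of.get(current)
    return chain


# Hypothetical IDs; only the place names are real.
name_of = {1: "World", 2: "Europe", 3: "Italy", 4: "Tuscany"}
parent_of = {4: 3, 3: 2, 2: 1}  # child ID -> parent ID

ascending = parent_chain(4, parent_of, name_of)
print(", ".join(ascending))            # Italy, Europe, World
print(", ".join(reversed(ascending)))  # World, Europe, Italy
```

The ascending form suits a full record display, while the reversed (descending) form sorts homographs together in a results list.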
- Multiple parents
The TGN is polyhierarchical. Each Subject_ID may be linked to
multiple Parent_IDs. If there are multiple parents, one is marked
as preferred. In displays, the preferred parent
is listed first or otherwise designated. The example below illustrates
the display of parents in a Full Record Display for Bermuda.
Unusual or complicated relationships of inhabited places to their
parents are represented in the TGN hierarchy through the
polyhierarchy. When a place has different political and physical
parents, the polyhierarchy is employed (as in the example above).
If the area of an inhabited place crosses administrative boundaries
(e.g. as happens in the USA when a city belongs to two counties),
the inhabited place appears under both administrative subdivisions.
Likewise, if jurisdiction over an area is disputed between two
nations, that area would appear as part of both nations. In the
full hierarchical view, it is recommended that implementers indicate
relationships to non-preferred parents with an "[N]", as illustrated
below.
- Physical Features Crossing National Boundaries: In the
TGN, if a physical feature crosses a boundary, it is placed under
the next highest level in the hierarchy. In other words, the river
or mountain range is placed under the level of the hierarchy that
entirely contains it. For example, the Amazon river crosses national
boundaries, so it is placed under the next highest level, the
continent of South America.
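The placement rule above amounts to finding the lowest level of the hierarchy shared by every place the feature passes through. The sketch below illustrates that idea only; it is not a description of the Getty editorial system, and the simplified hierarchy and function names are assumptions.

```python
def ancestors(place, parent_of):
    """Return the place followed by its parents up to the root."""
    chain = [place]
    while place in parent_of:
        place = parent_of[place]
        chain.append(place)
    return chain


def smallest_containing_level(places, parent_of):
    """Return the lowest hierarchy level that contains all given places."""
    if not places:
        return None
    chains = [ancestors(p, parent_of) for p in places]
    for candidate in chains[0]:  # walk upward from the first place
        if all(candidate in chain for chain in chains[1:]):
            return candidate
    return None


# Simplified, illustrative hierarchy.
parent_of = {"Peru": "South America", "Brazil": "South America",
             "South America": "World"}
print(smallest_containing_level(["Peru", "Brazil"], parent_of))  # South America
print(smallest_containing_level(["Brazil"], parent_of))          # Brazil
```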
- Sort order in the hierarchy
Siblings in the hierarchies are usually arranged alphabetically.
However, they are sometimes arranged by another logical order,
for example, in chronological order, as in the example below.
For siblings at any level, implementers should build displays
using the Sort_order, followed by an alphabetical sort. (In an
alphabetical display all Sort_order designations are "1," and
will therefore be sorted alphabetically in the second sort.) The
Sort_order number is hidden from end-users.
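The two-level sort described above can be expressed as a single compound key. This is a minimal sketch assuming each sibling is a dictionary with `sort_order` and `name` keys; the sample siblings and their numbers are made up for illustration.

```python
def sort_siblings(siblings):
    """Sort by Sort_order first, then alphabetically by name."""
    return sorted(siblings, key=lambda s: (s["sort_order"], s["name"].casefold()))


# Alphabetical case: every Sort_order is 1, so the name decides.
alphabetical = [
    {"name": "Umbria", "sort_order": 1},
    {"name": "Lazio", "sort_order": 1},
    {"name": "Toscana", "sort_order": 1},
]
print([s["name"] for s in sort_siblings(alphabetical)])
# ['Lazio', 'Toscana', 'Umbria']

# Hypothetical chronological case: the editors' numbers decide.
chronological = [
    {"name": "Late period", "sort_order": 3},
    {"name": "Early period", "sort_order": 1},
    {"name": "Middle period", "sort_order": 2},
]
print([s["name"] for s in sort_siblings(chronological)])
# ['Early period', 'Middle period', 'Late period']
```

A plain `casefold` comparison ignores language-specific collation of diacritics; a production display would likely need a locale-aware sort.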
- Historical flag for the Parent
Indicates if the link between the child and its parent is current
or historical. Most relationships in the TGN are flagged Current;
if the flag is Current, it is generally not displayed to
end-users unless there is a Display Date. If the flag is Historical,
it is displayed (e.g., "H" in the example below). Other flags
could be used in future versions of the TGN.
- Dates for the parent
Dates comprise a Display Date, which is a note referring
to a date or other information about the link between a child
and its parent, and Start Date and End Date, which
are years that delimit the span of time referred to in the Display
Date. Start and End Dates index the Display Date for retrieval,
but are hidden from end-users. The example below illustrates a
historical relationship between Nubia and historic
Egypt.
Start and End Dates are years in the proleptic Gregorian calendar,
which is the calendar produced by extending the Gregorian calendar
to dates preceding its official introduction. Dates BCE are expressed
as negative numbers. If the relationship extends to the current
time, the End Date is 9999.
- Hierarchy Relationship Type
Indicates the type of relationship between a hierarchical child and its parent, expressed in the jargon of controlled vocabulary standards. An example of a whole/part relationship is Tuscany is a part of Italy (TGN). An example of genus/species relationship is calcite is a type of mineral (AAT). An example of the instance relationship is Rembrandt van Rijn is an example of a Person (ULAN). Most hierarchical relationships in TGN are Whole/Part.
- Place Types
Words or phrases describing a role or characteristic of the place
(e.g., inhabited place, cultural center). Places in the
TGN can be either physical or political entities. They include
physical features such as continents, rivers, and mountains; and
political entities, such as empires, nations, states, districts,
townships, cities, and neighborhoods. The place type in
the TGN is a term that characterizes a significant aspect of the
place, including its role, function, political anatomy, size,
or physical characteristics. Place types are indexing terms based
on the structured vocabulary of the AAT where possible. The example
below illustrates place types for Marrakesh, Morocco.
- Place types for main political divisions: Place types
are used to mark the significant administrative levels of the
hierarchy. Given that there is no predictable number of levels
in the TGN hierarchy, certain place types are used in order to
allow some users to create hierarchies that have a set number
of levels, when necessary (i.e., some users have systems that
require a set number of levels). These designations are intended
to work with inhabited places, but not necessarily with physical
features. The place type is either the preferred place type or
the place type in position number 2. See also the discussion of Hierarchical
Positions above. The divisions are the following, in descending
order:
- continent - (preferred place type)
- primary political unit - (place type in position #2, for
nations, empires, etc.)
- first level subdivision - (place type in position #2)
- second level subdivision - (place type in position #2)
- inhabited place or deserted settlement - (preferred place
type)
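For systems that require a set number of levels, the division place types listed above can be used to map a variable-depth chain of parents onto fixed slots. The sketch below is one possible reading of that approach: the data layout is hypothetical, the place types in the sample are illustrative, and only the first two place types of each place are inspected.

```python
LEVEL_INDEX = {
    "continent": 0,
    "primary political unit": 1,
    "first level subdivision": 2,
    "second level subdivision": 3,
    "inhabited place": 4,
    "deserted settlement": 4,
}


def fixed_levels(chain):
    """Map a root-first chain of places onto five fixed slots.

    Each place is a dict with 'name' and 'place_types' (in display order).
    Levels with no matching place stay None.
    """
    slots = [None] * 5
    for place in chain:
        for place_type in place["place_types"][:2]:  # preferred or position 2
            index = LEVEL_INDEX.get(place_type)
            if index is not None:
                slots[index] = place["name"]
                break
    return slots


# Illustrative chain; the place types are assumptions, not copied from TGN.
chain = [
    {"name": "Europe", "place_types": ["continent"]},
    {"name": "Italy", "place_types": ["nation", "primary political unit"]},
    {"name": "Tuscany", "place_types": ["region", "first level subdivision"]},
    {"name": "Siena", "place_types": ["inhabited place", "city"]},
]
print(fixed_levels(chain))
# ['Europe', 'Italy', 'Tuscany', None, 'Siena']
```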
- Preferred flag for Place Types
One place type is flagged preferred for each place, to
provide a default when creating displays. Preferred following
a place type (as seen in the examples) indicates that this is
the place type that should appear with the place in displays.
- Display order for Place Types
Place types are arranged in a particular order by the editors.
Implementers should sort the place types by the Display_order
number, which is included in the data files, but typically hidden
from end-users.
- Dates for Place Types
Dates comprise a Display Date, which is a note referring
to a date or other information about the place relative to the
place type (e.g., for the place type inhabited place for
Delhi, India below), and Start Date and End Date,
which are years that delimit the span of time referred to in the
Display Date. Start and End Dates index the Display Date for retrieval,
but are hidden from end-users.
Start and End Dates are years in the proleptic Gregorian calendar.
Dates BCE are indexed with negative numbers. If the place type
is still applicable to the current place, the End Date is 9999.
- Related Places
Associative relationships to other places in the TGN, particularly
any important ties or connections between places, excluding hierarchical
whole/part relationships. The example below illustrates related
places in the record for South Sea Islands.
Each reference comprises a relationship type plus a link to the
related entity. For end-user displays, the related entity should
be represented by the preferred name, place type, parent string
(simply World in the examples above), and subject ID for
the related place.
- Relationship Type
A term or phrase characterizing the relationship between the place
at hand and the linked place. In the example above, the Relationship
Type in the record for South Sea Islands indicates that
in TGN this place is distinguished from Oceania and the
Pacific Islands. Relationship Types are reciprocal (that
is, linked to both records), drawn from a controlled list that
comprises the controlled phrase and a numeric code, as illustrated
below. The codes are hidden from end-users.
Code | Focus Entity | Related Code
3001 | distinguished from | 3001
3301 | ally of | 3301
3411 | successor of | 3412
3412 | predecessor of | 3411
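The reciprocal codes listed above can be used to derive the link that appears in the related place's record. This is a minimal sketch using only the four codes shown in this section; the function name, the returned dictionary, and the IDs in the example are hypothetical.

```python
# code -> (controlled phrase, code used in the related record)
RELATIONSHIP_TYPES = {
    3001: ("distinguished from", 3001),
    3301: ("ally of", 3301),
    3411: ("successor of", 3412),
    3412: ("predecessor of", 3411),
}


def reciprocal(focus_id, code, related_id):
    """Return the link as it would appear in the related place's record."""
    _, related_code = RELATIONSHIP_TYPES[code]
    phrase, _ = RELATIONSHIP_TYPES[related_code]
    return {"focus": related_id, "code": related_code,
            "phrase": phrase, "related": focus_id}


# Made-up IDs: place 1 is the successor of place 2.
print(reciprocal(1, 3411, 2))
# {'focus': 2, 'code': 3412, 'phrase': 'predecessor of', 'related': 1}
```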
- Historical flag for the Related Place
Indicates if the link between the related places is current or
historical. Generally, if the flag is set to Current, it is not
displayed to end-users; if it is set to Historical, it is displayed
(as in the example for Florence, Italy below).
- Dates for the Related Place
Dates comprise a Display Date, which is a note referring
to a date or other information about the relationship between
the two places, and Start Date and End Date, which
are years that delimit the span of time referred to in the Display
Date. Start and End Dates index the Display Date for retrieval,
but are hidden from end-users. The example below illustrates a
related place in the record for Florence, Italy.
Start and End Dates are years in the proleptic Gregorian calendar.
- Contributors
The institutions or projects that contributed information to the
TGN record. In order to give due credit to the contributing institution,
it is required that implementers display a reference to the contributor
to end-users.
References to contributors are drawn from a controlled list comprising
a numeric ID, a brief name, and a full name. The end-user must
have access to the brief name and the full name. The Brief Name
is the initials, abbreviations, or acronyms for the contributing
projects or institutions (in square brackets in the display below).
Contributors may be linked to the record in three ways: with the
names, with the record as a whole (subject), and with the
note (scope note). In the example below, end-users may
click on the initials of the contributor in the Full Record Display,
which produces a fuller description of the contributor name.
- Sources
The TGN record generally includes the bibliographic sources for
the names. Most names were found in authoritative publications
on the given topic or in standard general reference works, including
dictionaries and encyclopedias. In order to give due credit to
published sources, it is required that implementers display a
reference to the published source to end-users.
References to sources are drawn from a controlled list comprising
a numeric ID, a brief citation, and a full citation. The end-user
must have access to the brief citation and the full citation.
Sources may be linked to the record in three ways: with the names,
with the record as a whole (subject), and with the note
(scope note). In the example below, end-users may click
on the brief citation in the Full Record Display, which displays
a full citation for that source.
- Page Number
A reference to a volume, page, date of accessing a Web site, or
heading reference in a source (shown in black following the blue
brief citations in the above example).
- Revision History
The editorial history of each TGN record is captured in the Revision
History, which identifies when records and names have been added,
edited, merged, etc. The Revision History is included with the
licensed files, but hidden from end-users. This information allows
implementers to update the TGN in their system with each new release.
Sample Record
