Introduction to Metadata


Glossary

1:1 principle
[Dublin Core terminology] The principle whereby related but conceptually different entities, for example a painting and a digital image of the painting, are described by separate metadata records.

administrative metadata
Metadata used in managing and administering information resources, e.g., location or donor information.

algorithm
A formula for solving a problem. An algorithm is a set of steps in a very specific order, such as a mathematical formula or the instructions in a computer program.

authentication
A human or machine process that verifies that an individual, computer, or information object is who or what it purports to be.

back-end database
A database that contains and manages data for an information system, distinct from the presentation or interface components of that system.

CGI script
A computer program, most frequently written in C, Perl, or a shell scripting language, that uses the Common Gateway Interface (CGI) standard and provides an interactive interface between a user or an external computer application and a World Wide Web server. CGI scripts are most commonly used to develop forms that allow users to submit information to a Web server.

content model
A schema that defines data (including metadata) structures, including the types of elements, subelements, and values they can contain.

content standard
Standard authorities or sets of rules that determine the vocabulary, syntax, or format of what is entered into a data or metadata element, e.g., Art & Architecture Thesaurus, Library of Congress Subject Headings, Anglo-American Cataloging Rules, or Archives, Personal Papers, and Manuscripts.

crosswalk
A chart or table that represents the semantic mapping of fields or data elements in one data standard to fields or data elements in another standard that has a similar function or meaning. Crosswalks enable heterogeneous databases to be searched simultaneously with a single query as if they were a single database (semantic interoperability) and to effectively convert data from one metadata standard to another. See also metadata mapping below. Also known as field mapping.
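As a minimal sketch of the idea, a crosswalk can be held as a lookup table and applied to a record. The source field names ("artist", "object_title", "date_made", "accession_no") and the sample values are hypothetical, chosen only for illustration; the target names are Dublin Core elements listed elsewhere in this glossary.

```python
# Hypothetical crosswalk: local collection field -> Dublin Core element.
CROSSWALK = {
    "artist": "creator",
    "object_title": "title",
    "date_made": "date",
}


def apply_crosswalk(record, crosswalk):
    """Convert a record to the target standard, field by field."""
    converted = {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is None:
            # No equivalent element in the target standard: the value is dropped.
            continue
        # Several source fields may map to one target, so collect values in a list.
        converted.setdefault(target, []).append(value)
    return converted


# Illustrative record; the values are placeholders, not real catalog data.
source_record = {
    "artist": "Example Artist",
    "object_title": "Example Title",
    "accession_no": "X-001",
}
converted = apply_crosswalk(source_record, CROSSWALK)
print(converted)
```

Fields without a mapping are silently lost in this sketch, which illustrates why conversion between standards of different richness is rarely lossless.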

default values
Values that are assumed or supplied automatically (for example, by a computer system) if a value is not specified.
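A minimal sketch of how a system might supply default values. The field names and the defaults themselves are hypothetical.

```python
# Hypothetical defaults for a cataloging record; field names are illustrative.
DEFAULTS = {"language": "en", "rights": "unspecified"}


def with_defaults(record, defaults=DEFAULTS):
    """Return a copy of record in which missing fields take their default value."""
    # Later keys win, so values the cataloger supplied override the defaults.
    return {**defaults, **record}


record = with_defaults({"title": "Untitled", "language": "fr"})
print(record)
```

Here "rights" is filled in automatically because it was not specified, while the supplied "language" value is left alone.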

digital signatures
A form of electronic authentication of a digital document. Digital signatures are created and verified using public key cryptography and serve to tie the document being signed to the signer.

domain name
The address that identifies an Internet or other network site. On the Internet, domain names act as mnemonic aliases for IP addresses, a hierarchical numeric addressing system that enables Internet hosts to be uniquely identified. Domain names consist of at least two parts: the top-level domain, which specifies host addresses at a national or broad sectoral level (e.g., ".com" for the commercial sector, ".edu" for the US education sector, ".uk" for the United Kingdom), and the sub-domain, which is registered to a specific organization or individual within that domain (e.g., "getty" is registered to the Getty Trust within the .edu domain, and "bl" is registered to the British Library within the .uk domain). The hierarchical nature of the Domain Name System means that the authority for issuing sub-domain names is delegated down the hierarchy; for example, once the Getty Trust has registered the domain name "getty.edu", it is responsible for any sub-domain names such as "www.getty.edu" and "shiva.getty.edu".
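The hierarchy described above can be made visible by reading a name from right to left. The following is a minimal sketch; the function name delegation_chain is hypothetical.

```python
def delegation_chain(hostname):
    """List the domains a name belongs to, from the top-level domain downward."""
    labels = hostname.lower().rstrip(".").split(".")
    # Build "edu", then "getty.edu", then "www.getty.edu".
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


chain = delegation_chain("www.getty.edu")
print(chain)
```

Each entry in the list is administered by whoever registered the entry before it.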

DTD (Document Type Definition)
A formal specification of the structural elements and markup definitions to be used in encoding certain types of documents in SGML (which see). Instances of DTDs include EAD (which see), HTML (which see) and TEI (which see).

directed labeled graph
A type of diagram, also referred to as a node and arc diagram, that allows objects to be defined in terms of their properties and their relationships to other objects.

Dublin Core
A minimal set of metadata elements that creators or catalogers can assign to information resources, regardless of the form of those resources, which can then be used for network resource discovery, especially on the World Wide Web.

"Dumb-down" Rule
[Dublin Core terminology] A rule for the application of Interoperability Qualifiers, which stipulates that qualifiers can refine but not extend the semantics of the element to which they are applied.

EAD
Encoded Archival Description, an SGML DTD that represents a highly structured way to create digital finding aids for a grouping of archival or manuscript materials.

element
A discrete component of data or metadata.

encoding analog
A mapping between a specific metadata element in an SGML DTD (which see) and one or more equivalent elements in an alternative metadata set.

encoding ("marking up") information
A way for a creator of a digital object to structure and mark up text or other data so that it can be manipulated by a computer or a user, transmitted and searched over a network, or displayed to a user in the same way that it was viewed by the creator.

encryption
An encoding mechanism used to prevent non-authorized users from reading digital information and also for user and document authentication. Only designated users or recipients have the capability to decode encrypted materials.

federated repository
A collection of distributed databases that can be searched transparently as if they were a single database; made possible by metadata mapping or crosswalks (which see).

field mapping
See crosswalk.

file transfer protocol (FTP)
A method of transferring files between computers on the Internet.

finding aid
An archival descriptive tool such as an inventory or register. Finding aids typically take the form of hierarchical, narrative descriptions of aggregates of archival records or collections of manuscript materials.

granularity
The level of detail at which an information object or resource is viewed or described.

header metadata
Metadata embedded by the creator of a digital information resource into the header part of a file for description and management purposes.

"hidden Web"
The sum of the Web pages that are not accessible to Web crawlers, usually because they are either generated by querying a database in response to some user input or are password-protected.

hostname
An identifier for a specific machine on the Internet. The hostname identifies not only the machine, but also its subnet and domain, for example, "www.getty.edu" (see domain name).

HTML
HyperText Markup Language, an SGML-derived markup language used to create documents for World Wide Web applications. HTML has evolved to emphasize design and appearance rather than the representation of document structure and data elements.

HTTP
HyperText Transfer Protocol, the standard protocol that enables users with Web browsers to access HTML documents and external media.

hyperlink
An abbreviated reference to a "hypertext link," a method of creating nonlinear pathways between related digital documents, or from a document to related objects such as image or audio files.

hypermedia
A technique that links multimedia information, frequently in nonlinear ways. In the HTML implementation, links are embedded in text and other media through the insertion of tags that are invisible to the user. Generally users are alerted to the existence of a link by differently colored and/or underlined text and a change in the mouse cursor when positioned over the "hot word" or "hot spot." When the user points to the link and selects it, the linkage is activated and the associated information is revealed.

information object
A digital item or group of items referred to as a unit, regardless of type or format, that a computer can address or manipulate as a single object.

Internet
A global collection of computer networks that exchange information by the TCP/IP suite of networking protocols (which see). See http://www.fnc.gov/Internet_res.html.

Internet directory
A thematically organized list of descriptive links to Internet sites, often created by humans who have classified sites by their content.

Internet search engine (spider, crawler, robot)
A software program that collects information taken from the content of files available on the Internet and places it in a database that Internet users can search in a variety of ways. The search results then provide links back to the original location of the files matching the user's search.

interoperability
The ability of two different systems, particularly computer-based systems, to work together correctly, especially in the correct interpretation of data semantics.

Interoperability Qualifiers
[Dublin Core terminology] Additional metadata used either to refine the semantics of a Dublin Core metadata element's value, or to provide more information about the encoding scheme used for the value.

ISP
Internet Service Provider, an organization that provides access to the Internet, typically on a commercial basis.

legacy system
A computer system or database that has been developed and modified over a prolonged period and has become outdated and difficult and costly to maintain, but that holds information that is very important and involves processes that are deeply ingrained in an organization. Legacy systems usually end up being replaced by a new hardware/software configuration.

legal requirements metadata
Metadata documenting or tracking legal requirements associated with access to, or usage of, information resources, e.g., privacy and access or rights and reproduction requirements.

MARC
The MAchine-Readable Cataloging format, a set of standardized data structures used to describe bibliographic materials that facilitates cooperative cataloging and data exchange in bibliographic information systems.

markup language
A formal way of annotating a document or collection of digital data using embedded encoding tags to indicate the structure of the document or datafile and the contents of its data elements. This markup also provides a computer with information about how to process and display marked-up documents.

memory institution
A generic term used to describe an institution that has a responsibility to collect, care for and provide access to the human record - for example, museums, libraries, and archives.

metadata
Literally, "data about data," metadata includes data associated with either an information system or an information object for purposes of description, administration, legal requirements, technical functionality, use and usage, and preservation.

metadata mapping
A formal identification of equivalent or nearly equivalent metadata elements or groups of metadata elements within different metadata schemas, carried out in order to facilitate semantic interoperability.

meta tag
An HTML tag that enables metadata to be embedded invisibly on Web pages.
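As a rough illustration, the sketch below reads meta tags out of a page using Python's standard html.parser module. The sample page, its values, and the class name MetaTagCollector are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content


# Hypothetical page: the meta tags are not shown to a reader in a browser.
SAMPLE_PAGE = """<html><head>
<title>Sample page</title>
<meta name="DC.Title" content="Sample page">
<meta name="DC.Creator" content="Example Author">
</head><body><p>Visible text only.</p></body></html>"""

collector = MetaTagCollector()
collector.feed(SAMPLE_PAGE)
print(collector.metadata)
```

A Web crawler can harvest these values in much the same way, even though they never appear in the rendered page.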

meta tag spamming
The deliberate misuse of meta tags in order to attract traffic to a site, for example by boosting its ranking in search results.

mirror server
A computer that contains an exact copy of the data stored on another machine, usually in order to provide faster access in another part of the world.

multimedia
Digital materials, documents, or products, such as World Wide Web pages, CD-ROMs, or components of digital libraries, archival information systems, and virtual museums that use any combination of text, numeric data, still and moving images, animation, sound, and graphics.

namespace
The set of unique names used to identify objects within a well-defined domain, particularly relevant for XML applications. For example, the following element names constitute the Dublin Core Metadata Element Set namespace: DC.Creator, DC.Title, DC.Contributor, DC.Description, DC.Subject, DC.Coverage, DC.Relation, DC.Publisher, DC.Date, DC.Format, DC.Identifier, DC.Rights, DC.Language, DC.Type, DC.Source.
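In XML, a namespace is usually identified by a URI and bound to a short prefix. The sketch below assumes the namespace URI of the Dublin Core Metadata Element Set, Version 1.1, which is not given in this glossary; the record and its contents are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed here: the namespace URI of the Dublin Core element set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical record with two "title" elements, only one of them Dublin Core.
SAMPLE = f"""<record xmlns:dc="{DC_NS}">
  <dc:title>Sample record</dc:title>
  <title>Local display title</title>
</record>"""

root = ET.fromstring(SAMPLE)
# The prefix map lets the query name the namespace the element belongs to.
dc_title = root.find("dc:title", {"dc": DC_NS}).text
local_title = root.find("title").text
print(dc_title, "|", local_title)
```

Because the two elements live in different namespaces, the identical name "title" does not cause a collision.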

nesting
The way in which subelements may be contained within larger elements, resulting in multiple levels of metadata.

network bandwidth
This expression is derived from the term used to describe the size or "width" of the frequencies used to carry analog communications such as television and radio. For Internet purposes, bandwidth is generally (and incorrectly) used to refer to the rate of data transfer.

object-oriented systems
Information systems in which classes of objects comprising both data and their data structures are defined together with the attributes and the types of operations or functions associated with each class of object. Relationships between objects are also defined and objects can inherit characteristics from other objects that exist higher in an object hierarchy. The information system is then able to manipulate either object classes or individual objects or "instances."

"on the fly"
An expression used to describe the process by which information is generated or compiled, formatted and transmitted on demand rather than from a static data source, e.g., the generation of a set of retrieved data customized according to the user's preferences; or the conversion upon request of SGML-encoded material to HTML for presentation to a user who lacks an SGML viewer.

OPAC
Online Public Access Catalog, a computerized list or inventory of a library's holdings.

preservation metadata
Metadata related to the preservation management of information resources, e.g., metadata used to document, or created as a result of, preservation processes performed on information resources.

protocol
A specification - often a standard - that describes how computers will communicate with each other, for example the TCP/IP suite of communication protocols (see TCP/IP).

public key infrastructure
An infrastructure to support authentication through the use of digital signatures, based on public-key/private key pair encryption.

publicly indexable Web
The sum of the unique Web pages that are accessible to Web crawlers. In order to be accessible to Web crawlers, the Web pages must be static (i.e. not generated "on the fly" in response to user input) and not protected by a password.

RDF schema
A set of semantics within a defined namespace for use with specific applications of the Resource Description Framework (which see).

relevance ranking
The algorithmic process by which results in a result set are sorted or ranked according to their relevance.
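Real search engines use far more elaborate measures, but the principle can be sketched with a deliberately naive one: counting how often the query terms occur in each document. The function name, the scoring rule, and the sample documents are all assumptions made for illustration.

```python
def rank_by_term_frequency(query, documents):
    """Return ids of matching documents, highest term count first."""
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical document collection.
DOCS = {
    "a": "metadata standards for museums",
    "b": "metadata about metadata and more metadata",
    "c": "gardening tips",
}
ranking = rank_by_term_frequency("metadata", DOCS)
print(ranking)
```

A measure this simple is easy to manipulate by repeating keywords, which is the weakness that meta tag spamming exploits.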

Resource Description Framework (RDF)
An application of XML that enables the creation of rich structured Internet resource descriptions.

resource discovery
The process of searching for specific information objects on the Internet.

robot
See Web crawler.

schema
A set of rules for encoding information that supports specific communities of users. Also called "scheme."

schema registry
An authoritative source of names, semantics and syntaxes for one or more schemas.

search engine
A program that allows users to search a database. In the context of the World Wide Web, the term usually refers to a facility for searching a large index of Web pages generated by an automated Web crawler. See also Internet search engine.

semantic interoperability
The ability to seamlessly search for digital information across heterogeneous distributed databases through a federated search.

SGML
Standard Generalized Markup Language, an ISO (International Organization for Standardization) standard, ISO 8879:1986, first used by the publishing industry, for defining, specifying, and creating digital documents that can be delivered, displayed, linked, and manipulated in a system-independent manner.

spamming
(used in reference to meta tags) The abuse of metadata that creators include in the HTML header area of their Web pages in order to increase the number of visitors to a Web site. Keyword spamming entails repeating keywords multiple times in order to appear at the top of search engine result listings, or listing keywords that are irrelevant to the site in order to attract visitors under false pretenses. See http://www.thegrid.net/clear/spam.htm.

spider
See Web crawler.

tags
Short, formal mnemonics used to indicate data or metadata elements, especially in HTML and SGML markup (e.g., <TITLE>, <META>).

TCP/IP
Transmission Control Protocol/Internet Protocol, the suite of network protocols, standardized through the Internet Engineering Task Force (IETF) rather than ISO, that enables information systems to link to other information systems on the Internet, regardless of their computer platform. TCP and IP are two software communication standards that together allow multiple computers to exchange data reliably.

technical metadata
Metadata created for, or generated by, a computer system, relating to how the system or its content behaves or needs to be processed.

TEI
Text Encoding Initiative, an international cooperative effort to develop generic guidelines for standard encoding schemes (i.e., the TEI and TEI Lite DTDs) for scholarly text.

terabyte
Two to the power of forty bytes (1,099,511,627,776 bytes), or 1,024 gigabytes; approximately one thousand gigabytes.
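The arithmetic can be checked directly. This short sketch assumes the binary convention used in the definition, with a gigabyte taken as two to the power of thirty bytes.

```python
TERABYTE = 2 ** 40  # bytes
GIGABYTE = 2 ** 30  # bytes, binary convention (an assumption stated above)

print(TERABYTE)              # 1099511627776
print(TERABYTE // GIGABYTE)  # 1024
```

The result, 1,024 gigabytes, is why a terabyte is described as approximately, rather than exactly, one thousand gigabytes.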

testbed
An experimental prototype created with real data and used by an information community for testing a new system architecture.

URI
Uniform Resource Identifier, also referred to as "Universal Resource Identifier." A string of characters, drawn from a general set of names or addresses, that refers to a resource. URLs and URNs are types of URIs.

URL
Uniform Resource Locator, also referred to as "Universal Resource Locator." A type of universal resource identifier. A URL is an Internet address that tells a user how and where to locate a specific file on the World Wide Web. A URL includes not only the name of a file, but also the name of the host computer, the directory path to get to that file, and the protocol needed in order to use it (e.g., http://www.getty.edu/research/institute/standards/intrometadata/toc.html specifies that the hypertext transfer protocol "http" should be used to retrieve the document toc.html from the host www.getty.edu in the directory /research/institute/standards/intrometadata/).
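The parts named in this entry can be separated mechanically. The sketch below uses Python's standard urllib.parse and posixpath modules on the example URL from the definition; it only splits the string and does not contact the host.

```python
import posixpath
from urllib.parse import urlsplit

url = "http://www.getty.edu/research/institute/standards/intrometadata/toc.html"
parts = urlsplit(url)

# Separate the directory path from the file name.
directory, filename = posixpath.split(parts.path)

print(parts.scheme)    # protocol
print(parts.hostname)  # host computer
print(directory)       # directory path
print(filename)        # file name
```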

URN
Uniform Resource Name. Also referred to as "Universal Resource Name/Number." A unique, location-independent identifier of a file available on the Internet. The file remains accessible by its URN regardless of changes that might occur in its host and directory path.

use metadata
Metadata, generally automatically created by the computer, that relates to the level and type of use of an information system.

vCard
A standard for storing descriptions of individuals akin to an electronic or "virtual" business card.

Web crawler
A software program that systematically traverses the Web unattended, either for the purpose of generating a searchable index of Web content or to gather statistics.

Web host
Web "hosting" refers to the storage of a Web site or home page on a server so that it can be accessed over the World Wide Web. High-quality Web hosting services are the foundation of a successful Internet presence. The quality of a Web hosting service is defined by several considerations: the speed of the Web server's connection to the Internet, the type of hardware and software used for the server, and the types of advanced services (such as CGI scripting) offered.

WHOIS++
A standard for an Internet directory services protocol.

World Wide Web
A vast distributed wide-area client-server architecture for retrieving hypermedia documents over the Internet.

XML
A simplified subset of SGML that is designed specifically for use with the World Wide Web and that provides for more sophisticated data structuring and validation than HTML. XML is widely held to be the successor to HTML as the language of the Web.

Z39.50
ISO 23950 and ANSI/NISO Z39.50 standard information retrieval protocol, a client/server-based protocol for searching and retrieving information from remote databases.

SEE ALSO:

Digital Library Initiative at the University of Illinois at Urbana-Champaign
http://dli.grainger.uiuc.edu/glossary.htm

Glossary of Networking Terms
http://www.cis.ohio-state.edu/htbin/rfc/rfc1208.html

NetGlos - The Multilingual Glossary of Internet Terminology
http://wwli.com/translation/netglos/netglos.html

UKOLN Glossary
http://www.ukoln.ac.uk/metadata/glossary/

WWW Glossary
http://www.engr.newpaltz.edu/docs/EClass/www/www/wwwgloss.html

W3C Glossary of Hypertext Terms and Web Architecture
http://www.w3.org/Glossary

 
     

The J. Paul Getty Trust
© J. Paul Getty Trust