1:1 principle
[Dublin Core terminology] The principle whereby
related but conceptually different entities, for example a painting
and a digital image of the painting, are described by separate
metadata records.
administrative metadata
Metadata used in managing and administering information
resources, e.g., location or donor information.
algorithm
A formula for solving a problem. An algorithm is
a set of steps in a very specific order, such as a mathematical
formula or the instructions in a computer program.
authentication
A human or machine process that verifies that an
individual, computer, or information object is who or what it purports
to be.
back-end database
A database that contains and manages data for an
information system, distinct from the presentation or interface
components of that system.
CGI script
A computer program, most frequently written in
C, Perl, or a shell script, that uses the Common Gateway Interface
(CGI) standard and provides an interactive interface between a
user or an external computer application and a World Wide Web server.
CGI scripts are most commonly used to develop forms that allow
users to submit information to a Web server.
content model
A schema that defines data (including metadata)
structures, including the types of elements, subelements, and values
they can contain.
content standard
Standard authorities or sets of rules that determine
the vocabulary, syntax, or format of what is entered into a data
or metadata element, e.g., Art & Architecture Thesaurus, Library
of Congress Subject Headings, Anglo-American Cataloging Rules,
or Archives, Personal Papers, and Manuscripts.
crosswalk
A chart or table that represents the semantic mapping
of fields or data elements in one data standard to fields or data
elements in another standard that has a similar function or meaning.
Crosswalks enable heterogeneous databases to be searched simultaneously
with a single query as if they were a single database (semantic
interoperability) and to effectively convert data from one metadata
standard to another. See also metadata mapping below. Also known
as field mapping.
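As a minimal sketch of the idea, a crosswalk can be held as a lookup table from the elements of one standard to those of another. The MARC-to-Dublin Core pairings below are simplified for illustration and are not an authoritative mapping; the function name `apply_crosswalk` and the sample record are hypothetical.

```python
# Illustrative crosswalk: source field -> target element.
# The pairings are simplified assumptions, not an authoritative mapping.
MARC_TO_DC = {
    "100": "Creator",
    "245": "Title",
    "260b": "Publisher",
    "650": "Subject",
}


def apply_crosswalk(record, crosswalk):
    """Convert a record keyed by source fields into one keyed by target elements.

    Fields with no entry in the crosswalk are dropped.
    """
    converted = {}
    for field, value in record.items():
        target = crosswalk.get(field)
        if target is not None:
            converted.setdefault(target, []).append(value)
    return converted


# Hypothetical source record; "999" has no equivalent in the crosswalk.
marc_record = {"245": "An Example Title", "100": "Example, Author", "999": "local note"}
dc_record = apply_crosswalk(marc_record, MARC_TO_DC)
```

Fields without a counterpart are silently dropped here; a production conversion would more likely log them, since such losses are a common cost of moving between standards.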
default values
Values that are assumed or supplied automatically
(for example, by a computer system) if a value is not specified.
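A short sketch of how a system might supply default values follows. The field names and the defaults themselves are hypothetical, chosen only to show that explicitly entered values take precedence.

```python
# Hypothetical defaults for a cataloging form.
DEFAULTS = {"language": "en", "rights": "unspecified"}


def with_defaults(record, defaults=DEFAULTS):
    """Return a copy of record with defaults supplied for any missing fields."""
    completed = dict(defaults)
    completed.update(record)  # values given explicitly take precedence
    return completed


entry = with_defaults({"title": "Untitled", "language": "fr"})
```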
digital signatures
A form of electronic authentication of a digital
document. Digital signatures are created and verified using public
key cryptography and serve to tie the document being signed to
the signer.
domain name
The address that identifies an Internet or other
network site. On the Internet, domain names act as mnemonic aliases
for IP addresses, a hierarchical numeric addressing system that
enables Internet hosts to be uniquely identified. Domain names
consist of at least two parts: the top-level domain, which specifies
host addresses at a national or broad sectoral level (e.g. ".com" for
the commercial sector, ".edu" for the US education sector, ".uk" for
the United Kingdom), and the sub-domain which is registered to
a specific organization or individual within that domain (e.g. "getty" is
registered to the Getty Trust within the .edu domain, and "bl" is
registered to the British Library within the .uk domain). The hierarchical
nature of the Domain Name System means that the authority for issuing
sub-domain names is delegated down the hierarchy; for example,
once the Getty Trust has registered the domain name "getty.edu",
it is responsible for any sub-domain names such as "www.getty.edu", "shiva.getty.edu", etc.
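The hierarchy described above can be read off a name by splitting it at the dots and working from right to left. The following is a minimal sketch; the function name `domain_hierarchy` is hypothetical.

```python
def domain_hierarchy(hostname):
    """List the domains a name belongs to, from the top-level domain down."""
    labels = hostname.lower().rstrip(".").split(".")
    # Build progressively longer suffixes, starting with the last label.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


levels = domain_hierarchy("www.getty.edu")
```

For "www.getty.edu" this yields "edu", then "getty.edu", then "www.getty.edu", mirroring the order in which naming authority is delegated.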
DTD (Document Type Definition)
A formal specification of the structural elements
and markup definitions to be used in encoding certain types of
documents in SGML (which see). Instances of DTDs include EAD (which
see), HTML (which see) and TEI (which see).
directed labeled graph
A type of diagram, also referred to as a node and
arc diagram, that allows objects to be defined in terms of their
properties and their relationships to other objects.
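One common way to hold such a graph in a program is as a list of (node, label, node) triples, one per arc. The sketch below assumes that representation; every identifier and value in it is hypothetical.

```python
# Each arc is (source node, label, target node). All names are hypothetical.
arcs = [
    ("painting-1", "creator", "artist-1"),
    ("image-1", "depicts", "painting-1"),
    ("painting-1", "title", "An Example Painting"),
]


def properties_of(node, arcs):
    """Collect the labeled arcs leaving a node as a property -> values mapping."""
    props = {}
    for source, label, target in arcs:
        if source == node:
            props.setdefault(label, []).append(target)
    return props


painting = properties_of("painting-1", arcs)
```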
Dublin Core
A minimal set of metadata elements that creators
or catalogers can assign to information resources, regardless of
the form of those resources, which can then be used for network
resource discovery, especially on the World Wide Web.
"Dumb-down" Rule
[Dublin Core terminology] A rule for the application
of Interoperability Qualifiers, which stipulates that qualifiers
can refine but not extend the semantics of the element to which
they are applied.
EAD
Encoded Archival Description, an SGML DTD that
represents a highly structured way to create digital finding aids
for a grouping of archival or manuscript materials.
element
A discrete component of data or metadata.
encoding analog
A mapping between a specific metadata element in
an SGML DTD (which see) and its equivalent(s) in an alternative
metadata set.
encoding ("marking up") information
A way for a creator of a digital object to structure
and mark up text or other data so that it can be manipulated by
a computer or a user, transmitted and searched over a network,
or displayed to a user in the same way that it was viewed by the
creator.
encryption
An encoding mechanism used to prevent non-authorized
users from reading digital information and also for user and document
authentication. Only designated users or recipients have the capability
to decode encrypted materials.
federated repository
A collection of distributed databases that can
be searched transparently as if they were a single database; made
possible by metadata mapping or crosswalks (which see).
field mapping
See crosswalk.
file transfer protocol (FTP)
A method of transferring files between computers
on the Internet.
finding aid
An archival descriptive tool such as an inventory
or register. Finding aids typically take the form of hierarchical,
narrative descriptions of aggregates of archival records or collections
of manuscript materials.
granularity
The level of detail at which an information object
or resource is viewed or described.
header metadata
Metadata embedded by the creator of a digital information
resource into the header part of a file for description and management
purposes.
"hidden Web"
The sum of the Web pages that are not accessible
to Web crawlers, usually because they are either generated by querying
a database in response to some user input or are password-protected.
hostname
An identifier for a specific machine on the Internet.
The hostname identifies not only the machine, but also its subnet
and domain, for example www.getty.edu (see domain name).
HTML
HyperText Markup Language, an SGML-derived markup
language used to create documents for World Wide Web applications.
HTML has evolved to emphasize design and appearance rather than
the representation of document structure and data elements.
HTTP
HyperText Transfer Protocol, the standard protocol
that enables users with Web browsers to access HTML documents and
external media.
hyperlink
An abbreviated reference to a "hypertext link," a method
of creating nonlinear pathways between related digital documents
or of linking to related objects such as image or audio files.
hypermedia
A technique that links multimedia information,
frequently in nonlinear ways. In the HTML implementation, links
are embedded in text and other media through the insertion of tags
that are invisible to the user. Generally users are alerted to
the existence of a link by differently colored and/or underlined
text and a change in the mouse cursor when positioned over the "hot
word" or "hot spot." When the user points to the
link and selects it, the linkage is activated and the associated
information is revealed.
information object
A digital item or group of items referred to as
a unit, regardless of type or format, that a computer can address
or manipulate as a single object.
Internet
A global collection of computer networks that exchange
information by the TCP/IP suite of networking protocols (which
see). See http://www.fnc.gov/Internet_res.html.
Internet directory
A thematically organized list of descriptive links
to Internet sites, often created by humans who have classified
sites by their content.
Internet search engine (spider, crawler, robot)
A software program that collects information taken
from the content of files available on the Internet and places
them in a database that Internet users can search in a variety
of ways. The search results then provide links back to the original
location of the files matching the user's search.
interoperability
The ability of two different systems, particularly
computer-based systems, to work together correctly, particularly
in the correct interpretation of data semantics.
Interoperability Qualifiers
[Dublin Core terminology] Additional metadata used
either to refine the semantics of a Dublin Core metadata element's
value, or to provide more information about the encoding scheme
used for the value.
ISP
Internet Service Provider, an organization that
provides access to the Internet, typically on a commercial basis.
legacy system
A computer system or database that has been developed
and modified over a prolonged period and has become outdated and
difficult and costly to maintain, but that holds information that
is very important and involves processes that are deeply ingrained
in an organization. Legacy systems usually end up being replaced
by a new hardware/software configuration.
legal requirements metadata
Metadata documenting or tracking legal requirements
associated with access to, or usage of, information resources,
e.g., privacy and access or rights and reproduction requirements.
MARC
The MAchine-Readable Cataloging format, a set of
standardized data structures used to describe bibliographic materials
that facilitates cooperative cataloging and data exchange in bibliographic
information systems.
markup language
A formal way of annotating a document or collection
of digital data using embedded encoding tags to indicate the structure
of the document or datafile and the contents of its data elements.
This markup also provides a computer with information about how
to process and display marked-up documents.
memory institution
A generic term used to describe an institution
that has a responsibility to collect, care for and provide access
to the human record - for example, museums, libraries, and archives.
metadata
Literally, "data about data," metadata includes data associated
with either an information system or an information object for purposes
of description, administration, legal requirements, technical functionality,
use and usage, and preservation.
metadata mapping
A formal identification of equivalent or nearly
equivalent metadata elements or groups of metadata elements within
different metadata schemas, carried out in order to facilitate
semantic interoperability.
meta tag
An HTML tag that enables metadata to be embedded
invisibly on Web pages.
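To illustrate, the sketch below reads the name/content pairs out of a page's meta tags using Python's standard `html.parser` module. The sample page, its values, and the class name `MetaTagCollector` are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content


# Hypothetical page: the meta tags are not shown to the reader in a browser.
page = """<html><head>
<title>Example page</title>
<meta name="DC.Title" content="Example page">
<meta name="DC.Creator" content="Example, Author">
</head><body>Visible text only.</body></html>"""

collector = MetaTagCollector()
collector.feed(page)
```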
meta tag spamming
The deliberate misuse of meta tags in order to
attract traffic to a site, for example by boosting its ranking
in search results.
mirror server
A computer that contains an exact copy of the data
stored on another machine, usually in order to provide faster access
in another part of the world.
multimedia
Digital materials, documents, or products, such
as World Wide Web pages, CD-ROMs, or components of digital libraries,
archival information systems, and virtual museums that use any
combination of text, numeric data, still and moving images, animation,
sound, and graphics.
namespace
The set of unique names used to identify objects
within a well-defined domain, particularly relevant for XML applications.
For example, the following element names constitute the Dublin
Core Metadata Element Set namespace: DC.Creator, DC.Title, DC.Contributor,
DC.Description, DC.Subject, DC.Coverage, DC.Relation, DC.Publisher,
DC.Date, DC.Format, DC.Identifier, DC.Rights, DC.Language, DC.Type,
DC.Source.
nesting
The way in which subelements may be contained within
larger elements, resulting in multiple levels of metadata.
network bandwidth
This expression is derived from the term used to
describe the size or "width" of the frequencies used
to carry analog communications such as television and radio. For
Internet purposes, bandwidth is generally (and incorrectly) used
to refer to the rate of data transfer.
object-oriented systems
Information systems in which classes of objects
comprising both data and their data structures are defined together
with the attributes and the types of operations or functions associated
with each class of object. Relationships between objects are also
defined and objects can inherit characteristics from other objects
that exist higher in an object hierarchy. The information system
is then able to manipulate either object classes or individual
objects or "instances."
"on the fly"
An expression used to describe the process by which
information is generated or compiled, formatted and transmitted
on demand rather than from a static data source, e.g., the generation
of a set of retrieved data customized according to the user's preferences;
or the conversion upon request of SGML-encoded material to HTML
for presentation to a user who lacks an SGML viewer.
OPAC
Online Public Access Catalog, a computerized list
or inventory of a library's holdings.
preservation metadata
Metadata related to the preservation management
of information resources, e.g., metadata used to document, or created
as a result of, preservation processes performed on information
resources.
protocol
A specification - often a standard - that describes
how computers will communicate with each other, for example the
TCP/IP suite of communication protocols (see TCP/IP).
public key infrastructure
An infrastructure to support authentication through
the use of digital signatures, based on public-key/private key
pair encryption.
publicly indexable Web
The sum of the unique Web pages that are accessible
to Web crawlers. In order to be accessible to Web crawlers, the
Web pages must be static (i.e. not generated "on the fly" in
response to user input) and not protected by a password.
RDF schema
A set of semantics within a defined namespace for
use with specific applications of the Resource Description Framework
(which see).
relevance ranking
The algorithmic process by which results in a result
set are sorted or ranked according to their relevance.
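As a deliberately simple sketch, relevance can be approximated by counting how often the query terms occur in each document. Real search engines use far more elaborate scoring; the term-count measure, the function name `rank_by_relevance`, and the sample documents are all assumptions made for illustration.

```python
def rank_by_relevance(query, documents):
    """Sort documents by how often the query terms occur in them, highest first."""
    terms = query.lower().split()

    def score(text):
        words = text.lower().split()
        return sum(words.count(term) for term in terms)

    # sorted() is stable, so documents with equal scores keep their input order.
    return sorted(documents, key=score, reverse=True)


docs = [
    "archives and manuscripts",
    "metadata for museum metadata records",
    "museum opening hours",
]
ranked = rank_by_relevance("museum metadata", docs)
```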
Resource Description Framework (RDF)
An application of XML that enables the creation
of rich structured Internet resource descriptions.
resource discovery
The process of searching for specific information
objects on the Internet.
robot
See Web crawler.
schema
A set of rules for encoding information that supports
specific communities of users. Also called "scheme."
schema registry
An authoritative source of names, semantics and
syntaxes for one or more schemas.
search engine
A program that allows users to search a database.
In the context of the World Wide Web, the term usually refers to
a facility for searching a large index of Web pages generated by
an automated Web crawler. See also Internet search engine.
semantic interoperability
The ability to seamlessly search for digital information
across heterogeneous distributed databases through a federated
search.
SGML
Standard Generalized Markup Language, ISO (International
Organization for Standardization) standard ISO/IEC 8879:1986, first used
by the publishing industry, for defining, specifying, and creating
digital documents that can be delivered, displayed, linked, and
manipulated in a system-independent manner.
spamming
(used in reference to meta tags) The abuse of metadata
that creators include in the HTML header area of their Web pages
in order to increase the number of visitors to a Web site. Keyword
spamming entails repeating keywords multiple times in order to
appear at the top of search engine result listings, or listing
keywords that are irrelevant to the site in order to attract visitors
under false pretenses. See http://www.thegrid.net/clear/spam.htm.
spider
See Web crawler.
tags
Short, formal mnemonics used to indicate data or
metadata elements, especially in HTML and SGML markup (e.g., <TITLE>, <META>).
TCP/IP
Transmission Control Protocol/Internet Protocol,
the standardized suite of network protocols that enables information
systems to link to other information systems on the Internet, regardless
of their computer platform. TCP and IP are two software communication
standards used to allow multiple computers to talk to each other
in an error-free fashion.
technical metadata
Metadata created for, or generated by, a computer
system, relating to how the system or its content behaves or needs
to be processed.
TEI
Text Encoding Initiative, an international cooperative
effort to develop generic guidelines for standard encoding schemes
(i.e., the TEI and TEI Lite DTDs) for scholarly text.
terabyte
Two to the power of forty (2^40) bytes, approximately
one thousand gigabytes.
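The arithmetic behind this figure can be checked directly. The sketch below assumes the binary definition of a gigabyte (two to the power of thirty bytes); the constant names are illustrative.

```python
TERABYTE = 2 ** 40  # bytes, per the definition in this entry
GIGABYTE = 2 ** 30  # bytes, assuming the binary definition of a gigabyte

gigabytes_per_terabyte = TERABYTE // GIGABYTE
```

Under that assumption a terabyte is exactly 1,024 gigabytes, or 1,099,511,627,776 bytes, which is why "one thousand" is only an approximation.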
testbed
An experimental prototype created with real data
and used by an information community for testing a new system architecture.
URI
Universal Resource Identifier. A general set of
names or addresses consisting of a string of characters that refer
to a resource. Also called "Uniform Resource Identifier." URLs
and URNs are types of URIs.
URL
Uniform Resource Locator. Also referred to as "Universal Resource
Locator." A type of universal resource identifier. A URL is
an Internet address that tells a user how and where to locate a specific
file on the World Wide Web. A URL includes not only the name of a
file, but also the name of the host computer, the directory path
to get to that file, and the protocol needed in order to use it (e.g.,
http://www.getty.edu/research/institute/standards/intrometadata/toc.html
specifies that the hypertext transfer protocol "http" should
be used to retrieve the document toc.html from the host www.getty.edu
in the directory /research/institute/standards/intrometadata/).
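The parts of the example URL can be separated programmatically. The sketch below uses Python's standard `urllib.parse` and `posixpath` modules; the variable names are illustrative.

```python
import posixpath
from urllib.parse import urlsplit

url = "http://www.getty.edu/research/institute/standards/intrometadata/toc.html"
parts = urlsplit(url)

protocol = parts.scheme  # the protocol needed to retrieve the file
host = parts.hostname    # the name of the host computer
# Split the path into the directory and the file name.
directory, filename = posixpath.split(parts.path)
```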
URN
Uniform Resource Name. Also referred to as "Universal Resource
Name/Number." A unique, location-independent identifier of a
file available on the Internet. The file remains accessible by its
URN regardless of changes that might occur in its host and directory
path.
use metadata
Metadata, generally automatically created by the
computer, that relates to the level and type of use of an information
system.
vCard
A standard for storing descriptions of individuals
akin to an electronic or "virtual" business card.
Web crawler
A software program that systematically traverses
the Web unattended, either for the purpose of generating a searchable
index of Web content or to gather statistics.
Web host
Web "hosting" refers to the storage of a Web site or home
page on a server so that it can be accessed over the World Wide Web.
High-quality Web hosting services are the foundation of a successful
Internet presence. The quality of a Web hosting service is defined
by several considerations: the speed of the Web server's connection
to the Internet, the type of hardware and software used for the server,
and the types of advanced services (such as CGI scripting) offered.
WHOIS++
A standard for an Internet directory services protocol.
World Wide Web
A vast distributed wide-area client-server architecture
for retrieving hypermedia documents over the Internet.
XML
A simplified subset of SGML that is designed specifically
for use with the World Wide Web and that provides for more sophisticated
data structuring and validation than HTML. XML is widely held to
be the successor to HTML as the language of the Web.
Z39.50
An information retrieval protocol, standardized as ISO 23950 and
ANSI/NISO Z39.50, that uses a client/server model for searching
and retrieving information from remote databases.
SEE ALSO:
Digital Library Initiative at the University of Illinois at Urbana-Champaign
http://dli.grainger.uiuc.edu/glossary.htm
Glossary of Networking Terms
http://www.cis.ohio-state.edu/htbin/rfc/rfc1208.html
NetGlos - The Multilingual Glossary of Internet Terminology
http://wwli.com/translation/netglos/netglos.html