Glossary of Terms for Metadata, Taxonomies
and Digital Libraries
Maintenance of this glossary ceased
A glossary is a set of defined terms. Each term has a single
definition. The definition is relative to the domain,
in this case metadata, taxonomies and digital libraries. Some
terms may have multiple definitions according to a standard
dictionary. A glossary will use one of these definitions or will
create a separate definition.
- Advanced Distributed Learning Initiative. Manages the development of SCORM and the CORDRA system.
- To collect or gather together.
- "Any systematic method of obtaining evidence from tests, examinations, questionnaires, surveys and collateral sources used to draw inferences about characteristics of people, objects, or programs for a specific purpose." from the Testing and Assessment Glossary of Terms of QuestionMark: http://www.questionmark.com/us/glossary.htm. (Contributed by Eric Shepherd).
- Aviation Industry CBT Committee. The Aviation Industry CBT (Computer-Based Training) Committee (AICC) is an international association of technology-based training professionals. The AICC develops guidelines for the aviation industry in the development, delivery, and evaluation of CBT and related training technologies.
- The organizational structure of a computer system, including hardware and software. [from http://www.epa.gov/records/gloss/gloss02.htm.]
- Authority List
- A controlled list of terms, names, phrases or similar entries relative to a specific domain or scope. An authority list may or may not contain definitions or other information about each item. A glossary or dictionary contains definitions. Typically an authority list provides the standard spelling and form of terms. Authority lists have defined managers, e.g., the British Library's Name Authority List (http://www.bl.uk/services/bibliographic/authority.html). Authority lists may have structure, such as the hierarchy of the Library of Congress' Subject Headings: http://www.loc.gov/cds/lcsh.html
- A Weblog of short, frequently updated entries, some of which allow responses.
- The complete clade of blogs.
- A selection, subset or subdivision of a classification.
See also Dr. Tom’s Classification Guide
- "A database which contains descriptions of information
resources and their locations," Pipher,
Hayes & Davis (1998).
- European Committee for Standardization [Comité Européen de Normalisation]: Information Society Standardization System: Learning Technology Workshop
- Change Management
- Change management is a planned process for changing a core function or organization of an enterprise. Interestingly, the process of developing a taxonomy can result in the need for change. See Quality Assurance.
- A clade is a group that consists of a common ancestor and all of its descendants. Adopted from the biological term.
- "1 : the act or process of classifying
2 a : systematic arrangement in groups or categories
according to established criteria; specifically : TAXONOMY
b : CLASS, CATEGORY "* Classification is both
the systematic arrangement of labels and the application
of those labels. Dr. Tom’s Classification Guide
- Common Operating Environment: http://diicoe.disa.mil/coe/index.html.
See also Robin Cover's U.S. Federal CIO Council XML Working
Group Issues XML Developer's Guide: http://xml.coverpages.org/ni2002-01-16-a.html.
- A collection is an aggregation
of resources. It may—or may not—all be packaged
in one resource or content package.
It has no inherent navigational structure. It may have an
index and/or table of contents. Items
may be accessed on an individual basis. An example is a
collection of images. A collection may be housed in a library
or repository. Most collections
have a theme.
- A group with one or more common interests. The European SchoolNet has many learning communities supported by tools.
- Community of Practice (CoP)
- A community or group with a common interest. CoPs frequently use the internet to facilitate their activities. This facilitation may include forums, libraries, chat rooms, calendars and such. See the Community of Practice list http://groups.yahoo.com/group/com-prac/.
- 2 : an abstract or generic idea generalized from particular instances * See Note on Subjects and Concepts.
- Computational Linguistics
- Attempts to use computers to process natural human language.
- A specific kind of resource that
is packaged into a usable state. A document is content.
- Content Package
- An assemblage of content items and
possible support files or headers. Support files may include
a manifest and/or sequencing files
- CORDRA is a project of the Advanced Distributed Learning (ADL) initiative. The Content Object Repository Discovery and Registration/Resolution Architecture (CORDRA) is designed to be an enabling model to bridge the worlds of learning content management and delivery, and content repositories and digital libraries. CORDRA will use the Handle System.
- Mapping of a term or concept between metadata formats. See also Thesaurus. The UKOLN has provided a list of many crosswalks, although this is not exhaustive. CanCore provides a Dublin Core to LOM crosswalk.
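A crosswalk can be pictured as a lookup table between element names. The Python below is a minimal sketch under stated assumptions: the mapping entries, the function name `crosswalk` and the sample record are illustrative only and are not taken from the CanCore or UKOLN crosswalks.

```python
# Illustrative crosswalk: Dublin Core element name -> LOM-style element path.
# The entries are assumptions for demonstration, not an authoritative mapping.
DC_TO_LOM = {
    "title": "general.title",
    "description": "general.description",
    "language": "general.language",
    "format": "technical.format",
}


def crosswalk(record, mapping):
    """Translate a flat metadata record; return (translated, unmapped)."""
    translated = {}
    unmapped = {}
    for name, value in record.items():
        target = mapping.get(name)
        if target is None:
            unmapped[name] = value  # no equivalent known; keep for review
        else:
            translated[target] = value
    return translated, unmapped


dc_record = {"title": "Glossary of Terms", "language": "en", "coverage": "global"}
lom_record, leftovers = crosswalk(dc_record, DC_TO_LOM)
print(lom_record)
print(leftovers)
```

Returning the unmapped fields separately makes visible where a crosswalk is not exhaustive, instead of silently dropping data.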
- An organized collection of data.
- Data Base
- A software structure that contains one or more data
sets. A data base that holds one data set may make the
two seem to be one and the same. A data base is an application
or tool; it is not its contents.
Question: do we consider the technical metadata to refer
to the data set or the data base?
- Data Model
- A description of the structure of data elements.
- An organized collection of definitions
and uses of terms. A term may have more than one definition
and more than one use. Contrast with glossary,
thesaurus, taxonomy and index.
- Digital Library
- "[N]etworked information in the research and education
communities." "[Digital] Libraries... [are] ...technological
and social developments that are fueled by information technology,
bioinformatics, and networked information." http://www.cni.org/,
an example of a digital library within the NSF's
- "[An] alphabetical or classified list (as of names
and addresses)".* A directory is simple and flat, usually
providing a set of name - location pairs. See also Index.
- Document Type Definition for describing
an XML structure: http://xml.coverpages.org/XMLSpecDTD.html.
XML-Schemas allow richer descriptions
at the price of complexity.
- Dublin Core (DC)
- The Dublin Core Metadata: http://dublincore.org/.
A core set of 15 elements or fields that are used to describe resources. It is a widely used basic
system of metadata good for wide
area searching. Often expressed as HTML metatags.
Sometimes "qualifiers" are added to refine the
definitions of each field. When in doubt, DC is a good starting
place for metadata, as it is widely used. DC is implemented with a variety of technical formats. DC is compatible with Z39.50, as DC defines descriptive labels and Z39.50 describes technical formats and transmission protocols.
- A component in a metadata structure.
Each element has a token.
- To go beyond the original form. For instance for XML see
Tom’s Guide to IMS XML Extensions and Incorporations
- A fundamental category by which an object or concept may be described. For example, a child’s ball may be described using the facets of size, weight, shape, color, texture, material and price. (http://www.boxesandarrows.com/view/all_about_facets_controlled_vocabularies)
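The ball example can be sketched in Python: each item carries a value for several independent facets, and a search narrows the set by any combination of them. The items, facet values and the helper `select` are invented for illustration.

```python
# Each item is described by independent facets (all values are made up).
items = [
    {"name": "beach ball", "size": "large", "color": "red", "material": "vinyl"},
    {"name": "tennis ball", "size": "small", "color": "yellow", "material": "felt"},
    {"name": "playground ball", "size": "large", "color": "red", "material": "rubber"},
]


def select(items, **facets):
    """Return names of items whose facet values match every given facet."""
    return [
        item["name"]
        for item in items
        if all(item.get(facet) == value for facet, value in facets.items())
    ]


print(select(items, size="large", color="red"))
print(select(items, material="felt"))
```

Because the facets are independent, they can be combined in any order, which is what distinguishes a faceted scheme from a single fixed hierarchy.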
- The Fedora™ Project:
An Open-Source Digital Repository Management System.
- A bin or hole in a metadata structure
into which appropriate values are placed. This is not to
be confused with an element. The
simplest metadata field is a metatag.
- A list of terms with single definitions with respect
to a particular domain. "[A] collection
of ... specialized terms with their meanings".* Contrast
with dictionary, thesaurus and authority list.
- Level of graininess or size.
- Handle System
- "The Handle System is a comprehensive system for assigning, managing, and resolving persistent identifiers, known as ‘handles,’ for digital objects and other resources on the Internet. Handles can be used as Uniform Resource Names (URNs)". A handle is a unique identifier (UID) within a defined scope. The Handles System for the US Department of Defense (DOD) will be used by DTIC (http://www.dtic.mil/dtic/handles/) for the DOD's management of learning objects. This will interface with ADL's CORDRA system.
- The IEEE LTSC
- Institute of Electrical and Electronics
Engineers' Learning Technology Standards
- A pointer to a resource of some type or a list of such pointers. Typically an index is an alphabetical list of words and phrases. The terms in the index have no definitions assigned to them as opposed to a taxonomy in which terms may have meanings. A taxonomy does not point to specific references. See also Directory.
- IMS Global Learning Consortium
- "IMS Global Learning Consortium, Inc. (IMS) [is a
non-profit corporation that] is developing and promoting
open specifications for facilitating
online distributed learning activities such as locating
and using educational content, tracking learner progress,
reporting learner performance, and exchanging student records
between administrative systems." These are busy people. http://www.imsglobal.org
- Intelligent Tutoring
- Intelligent tutoring systems attempt to duplicate the one-on-one teaching process using artificial intelligence. Online IT systems accomplish this in a distance education setting.
- A standard for quality assurance. http://www.iso.org/iso/en/iso9000-14000/index.html
- ISO 11179 (JTC1 SC32)
- Standard for describing data elements used in databases
body: ISO/IEC JTC1/SC32. As an example implementation see
the Australian Institute of Health and Welfare Data Standards:
- A unit or resource of arbitrary size.
- Knowledge Management (KM)
- An organizational process for converting information into knowledge and making that knowledge accessible. It is more a state of mind and a commitment than a specific set of tools. Tools are used, however. See Brint: http://www.brint.com/km/. KM and online learning are merging in some enterprises.
- Learning Object
- The term "learning object" has a variety of definitions. I prefer a broad definition: a package of one or more resources that have educational utility. I am not going to try to specify "educational utility"; that is in the eye of the beholder, so to speak. Sometimes the definition states that the object must be specific for learning. Sometimes there is a requirement for an assessment component. The variety of definitions makes use of the term "learning object" unreliable without reference to a specific definition.
- The International Federation for
Learning, Education, and Training Systems Interoperability. "LETSI supports the long-term sustainability of eLearning initiatives by promoting and facilitating standards innovation and harmonization." "LETSI is a non-profit association of individuals and organizations who see the need for a sea change in the way technology is used in education and job training."
- Adapting to a particular language, culture or region. For example, an enterprise may "localize" its taxonomies through a thesaurus or crosswalk.
- The Learning Object Metadata standard
of the IEEE LTSC. The IEEE LTSC
working group is #12: http://ltsc.ieee.org/wg12/index.html.
- Learning Technology Standard Observatory: http://www.cen-ltso.net. Not a standard, but a place to go to see international standards of interest related to online education and training. Also has other useful information and calendar of events related to online learning.
- A packing slip for a content
package. A manifest in a content package may
contain or point to metadata, define the resources
contained in the content package and define an intended
organization of the resources.
- The study of evolution in information systems,
of Memetics - Evolutionary Models of Information Transmission,
- Metadata (bibliographic)
- Information that catalogs
or describes a resource. This may include metadata defining
the resource subject, format, location, ownership, authorship
and so forth. Some may validly consider metadata a resource.
See also Dr. Tom’s Metadata Guide, Wason, T., and Wiley, D.
- A named field. Typically metatags
are not structured, but are used as a simple name-value
pair datum. Metatags are normally associated with HTML as <meta> elements in the header. See for example the Meta Tag Tutorial at WebDeveloper.com. Some web search engines use the metatags, some do not.
- Multipurpose Internet Mail Extensions. Generally accepted designations of file format types. MIME is an example of a recommended standard that has not yet been approved but is widely used. This is an example of how the world works at "Web speed." If you wait for the formal standard before proceeding, you may be left at the back of the pack. On the other hand, if you select a recommendation that is either not approved or undergoes significant change before approval, you may have a lot of expensive revision to do. Enterprises participate in standards groups to try to discern the future.
- The National Science Foundation's National Science
Digital Library Initiative: NSF
- Open Archives Initiative: http://www.openarchives.org/
The OAI has developed a metadata harvesting protocol, OAI-PMH. This
is used in the NSDL and elsewhere.
- Open Archives Initiative Protocol for Metadata Harvesting: http://www.openarchives.org/OAI/openarchivesprotocol.html, http://www.openarchives.org/ (OAI).
- "OASIS (Organization for the Advancement of Structured Information Standards) is a not-for-profit, international consortium that drives the development, convergence, and adoption of e-business standards. The consortium produces more Web services standards than any other organization along with standards for security, e-business, and standardization efforts in the public sector and for application-specific markets."
- Open Knowledge Initiative: http://web.mit.edu/oki/
- 1 : a branch of metaphysics concerned with the nature and relations of being * See Note on Subjects and Concepts and Facet. An ontology is a concept map, e.g., source > power > electrical. An ontology is a representation of the relationships among concepts. See also taxonomy, which is an organized set of terminology.
- Perceptual Coupling
- Matching the nature or structure of information to the precognitive processing of the human perceptual system to promote the rapid recognition or meaning of information. Perceptual coupling has a variety of applications generally involving situation management.
- Refines or limits the definition of a metatag.
DC provides qualifiers.
- Quality Assurance
- Quality assurance (QA) is a process for ensuring that an enterprise’s products or services are provided according to well-defined standards, specifications or methods. The most highly regarded and formal standard for a quality control process is ISO 9000 and its relatives. The development of metadata and taxonomies can often result in changes in an enterprise's business processes. QA can be effective as a component of change management. See Change Management.
- Resource Description Framework of
the W3C: http://www.w3.org/RDF/.
Uses XML for a self-describing data structure
- Information that defines what is stored in one or more
repositories. It may contain specialized
vocabularies and taxonomies. A registry may contain metadata describing other registries, repositories or objects within repositories. A "registered" repository may
have to adhere to certain rules in order to be registered. The difference between "registry"
and "repository" is not consistent in use.
- A storage site for resources, which may
include metadata. A repository
may house a collection. Contrast
with a registry or a collection.
- Any asset that can be bounded in some manner. It
may be a data stream, but its parameters are well defined.
Data sets and data streams are resources. A tool may be considered
a resource. A source of supply or support.* Stuff.
- Security Assertion Markup Language: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=security. A standard, XML-based framework for creating and exchanging security information between online partners.
- Sharable Content Object Reference
- The study of meanings: a : the historical and psychological
study and the classification of changes in the signification
of words or forms viewed as factors in linguistic development.*
- Semantic Web
- The gleam in Tim
Berners-Lee's eye for a unified Web without metadata:
- An acronym for Science, Mathematics, Engineering
and Technology, sometimes with "Education" included at the end.
- "a detailed precise presentation of something or
of a plan or proposal for something -- usually used in plural".*
A specification is created by an enterprise, as opposed
to a standard, which is maintained
by an official body. For example the IMS
Global Learning Consortium and Intel
develop specifications. Examples are the IMS
Meta-Data Specification and the Intel
Audio Codec Specifications.
- There are two definitions of "service oriented architecture" (SOA) in circulation. Some refer to SOA as a system of enterprise services for business integration, others as a Web based architecture of services with standard interfaces that supports services that are independently developed to interoperate. Know which is being discussed. CORDRA is the latter.
- Search/Retrieve Web service: http://xml.coverpages.org/ni2004-03-04-a.html. SRW is "an XML-based protocol designed to be a low-barrier-to-entry solution for searching and other information retrieval operations across the Internet."
- "[S]omething set up and established by authority
as a rule for the measure of quantity, weight, extent, value,
or quality ".* IEEE and ISO
develop standards. An example is the IEEE
LTSC Learning Object Metadata standard (under ballot). Recommended, but not yet approved, standards such as MIME are widely used. Not yet approved standards may be designated RFC (Request for Comments).
- 3 a : a department of knowledge or learning * See Note on Subjects and Concepts.
- See Metatag.
- A self referential definition or system of logic. See
- A taxonomy is a terminology map of a topic or discipline. It is a structured vocabulary that embodies relationships among
terms. Each term is contained in a taxon. Denise A. D. Bedford, Ph.D., of the World Bank enumerates four types of taxonomies: flat, hierarchical, faceted and network (Taxonomies for Information & Knowledge Management Architectures, URL: http://www.sla.org/chapter/cdc/presentations/20030204_taxonomies.ppt).
The most common relationship is a hierarchy (tree structure), e.g., electronics > power supply.
Other forms may include cross linking and poly-hierarchical
structure. A vocabulary is a single
level taxonomy. See also Dr. Tom’s Taxonomy Guide. Some major taxonomies are:
MARC, LCSH, MeSH. For a long list of many taxonomies in many domains see: Controlled vocabularies, thesauri and classification systems available in the WWW. DC Subject (http://www.lub.lu.se/metadata/subject-help.html). The term originally referred to biological classifications:
"[O]rderly classification of plants and animals according
to their presumed natural relationships".* Compare
with thesaurus. See also ontology. A taxonomy has or implies a glossary. A glossary has only one definition per term as opposed to a dictionary that may have more than one definition per term, e.g., fast. The glossary may be an authority list. An ontology, on the other hand, is an organization of concepts. It may govern the application of terms in a taxonomy.
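A hierarchical taxonomy can be sketched in Python as nested dictionaries, with a lookup that returns the chain of terms from the root to a given term. Only the "electronics > power supply" pair comes from the entry above; the other terms and the helper `find_path` are invented for illustration.

```python
# A small hierarchical taxonomy as nested dicts; each key is a term.
# Only "electronics > power supply" comes from the text; the rest is invented.
taxonomy = {
    "electronics": {
        "power supply": {"battery": {}, "transformer": {}},
        "display": {},
    }
}


def find_path(tree, term, trail=()):
    """Return the chain of terms from the root down to `term`, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == term:
            return here
        found = find_path(children, term, here)
        if found is not None:
            return found
    return None


print(" > ".join(find_path(taxonomy, "battery")))
```

A strict tree like this cannot express cross linking or a poly-hierarchical structure; those would need a graph in which a term may have more than one parent.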
- A node in a taxonomy. A taxon
may contain a term, and reference label and links to other
- A collection of terms, their
definitions (that may be multiple) and relationships to
other terms. Contrast with taxonomy,
dictionary and glossary.
"[A] list of subject headings or descriptors usually
with a cross-reference system for use in the organization
of a collection of documents for reference and retrieval".*
- An element's name or set of characters
that are the logical equivalent of a name.
- Universal Description Discovery & Integration (http://uddi.org/pubs/uddi-v3.0.2-20041019.htm) of OASIS. "...the definition of a set of services supporting the description and discovery of (1) businesses, organizations, and other Web services providers, (2) the Web services they make available, and (3) the technical interfaces which may be used to access those services. Based on a common set of industry standards, including HTTP, XML, XML Schema, and SOAP, UDDI provides an interoperable, foundational infrastructure for a Web services-based software environment for both publicly available services and services only exposed internally within an organization."
- Unique Object Identifier. A UID provides a token designating a specific object. Each UID has only one occurrence within its defined scope of use. Uniqueness may be specified by a statistical probability of recurrence or through an assignment system, such as the Handle System, that enforces uniqueness through policy. A globally unique identifier theoretically has no duplication within the known set of all UIDs.
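The two routes to uniqueness described above can be contrasted in a short Python sketch. The class `IdRegistry`, the scope name and the example URL are hypothetical; this is not the Handle System API, only an illustration of uniqueness enforced by assignment next to uniqueness by statistical probability.

```python
import uuid


class IdRegistry:
    """Assigns identifiers within one scope and refuses duplicates."""

    def __init__(self, scope):
        self.scope = scope
        self._assigned = {}

    def register(self, local_name, target):
        uid = f"{self.scope}/{local_name}"
        if uid in self._assigned:
            raise ValueError(f"{uid} is already assigned")
        self._assigned[uid] = target
        return uid

    def resolve(self, uid):
        return self._assigned[uid]


# Statistical uniqueness: a random 122-bit value; collisions are improbable.
random_uid = str(uuid.uuid4())

# Uniqueness by assignment: the registry enforces one occurrence per scope.
registry = IdRegistry("example-scope")
uid = registry.register("glossary", "http://example.org/glossary")
print(random_uid, uid)
```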
- A list of terms, sometimes with definitions. A single
level taxonomy. A vocabulary can
be considered a very localized glossary. Or not.
- Web Services Description Language, http://www.w3.org/TR/wsdl.
- eXtensible Markup Language of the
World Wide Web Consortium:
A method for serializing structured data for use on the
Internet. XML is a W3C Recommendation, the final stage of the W3C
process, rather than a standard of a formal standards body. See also http://www.xml.org.
- A control system for XML: http://www.w3.org/XML/Schema.
XML Schema is a W3C Recommendation rather than a standard of a
formal standards body. See Dr. Tom’s Guide to XML-Schema, IMS XML bindings,
DTDs and Examples. http://www.imsglobal.org/xsd/.
- The Extensible Stylesheet Language
for creating different views of XML documents: http://www.w3.org/Style/XSL/.
- XSL Transformations: a language
for transforming XML documents
into different formats: http://www.w3.org/Style/XSL/.
This is part of the XSL specification.
- " A technical data format and transmission protocol specification. "Z39.50" refers to the International
Standard, ISO 23950: ’Information Retrieval (Z39.50):
Application Service Definition and Protocol Specification‘,
and to ANSI/NISO Z39.50. The Library of Congress is the
Maintenance Agency and Registration Authority for both standards,
which are technically identical (though with minor editorial
"The standard specifies
a client/server-based protocol for searching and retrieving
information from remote databases.
Last edit: 2010/08/01
A glossary is forever a work in progress.
Note that I have not attempted definitions of mode,
genre or theme. I welcome clear, commonly
accepted differential definitions of these—and any other—terms. Comments on the above definitions are welcome, too.
This glossary has been prepared by Thomas
D. Wason. Some definitions are taken from the Merriam-Webster
Online Dictionary at http://www.m-w.com/dictionary.htm.
If multiple definitions are available, the most appropriate
one is selected. Entries from the Merriam-Webster Online Dictionary
are designated with an asterisk (*).