STDS: Glossary Obscura
Exposing Key Concepts & Definitions

As of 1965, all data storage access strategies depended on prior knowledge of how user applications intended to use the data. Many government agencies were collecting large volumes of data but had no idea how applications might use it. The Advanced Research Projects Agency (ARPA) reasoned that the information contained in data relationships was intrinsic to the data itself and should not be diminished by a computer representation.

ARPA: Data Relationships Are Independent of Data Representations.

An application requesting data should receive the same data independent of how the data is physically represented on any specific machine. Therefore, any proposed machine-independent data structure must preserve data relationships in such a way that they can be shared by any requesting application.

Machine-Independent Data Structure
In 1965, ARPA initiated the CONCOMP Project to investigate the conversational use of computers. One research objective was to define and implement a machine-independent data structure. A definition was obvious; an implementation was not.

Machine-Independent Data Structure: Any implementation for representing data relationships that can be accessed and manipulated by applications without any reliance on how the data relationships are physically represented and preserved.

Data Structure
The primary purpose of a data structure is to provide applications access to preserved data. There are three major capabilities of a data structure:
                 1) Support for applications to manipulate abstract representations of data relationships.
                 2) Support for the system to manage physical representations of application data.
                 3) Mapping strategies for embedding application data in, and extracting it from, storage data.

For a data structure to be machine-independent, any reliance of data access on physical properties of the stored data must reside on the storage side of the mapping interface. If any data access strategy on the application side of the mapping interface requires knowledge of storage organization, the data structure is physically dependent.
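
The placement of the mapping interface can be illustrated with a minimal Python sketch. Every name in it (RowStorage, ColumnStorage, embed, extract, projects_started_before) and the sample records are hypothetical, chosen only for illustration; none of them come from STDS. The application function sees only a set of records, while each storage class keeps its physical layout to itself.

```python
# Illustrative sketch only: all names here are hypothetical, not part of STDS.

class RowStorage:
    """Storage side: keeps records physically as a sorted list of rows."""

    def __init__(self):
        self._rows = []

    def embed(self, records):
        # Map the application's set of records into the row layout.
        self._rows = sorted(records)

    def extract(self):
        # Map the row layout back into a set of records.
        return frozenset(self._rows)


class ColumnStorage:
    """Storage side: keeps the same records as two parallel columns."""

    def __init__(self):
        self._names = []
        self._years = []

    def embed(self, records):
        ordered = sorted(records)
        self._names = [name for name, _ in ordered]
        self._years = [year for _, year in ordered]

    def extract(self):
        return frozenset(zip(self._names, self._years))


def projects_started_before(storage, year):
    """Application side: works only with the set returned by extract()."""
    return frozenset(name for name, started in storage.extract() if started < year)


# Sample (name, year) records, made up for the example.
records = frozenset({("alpha", 1965), ("beta", 1968), ("gamma", 1971)})

results = []
for storage in (RowStorage(), ColumnStorage()):
    storage.embed(records)
    results.append(projects_started_before(storage, 1970))

# Both physical layouts yield the same answer to the application.
print(results[0] == results[1])
```

Because projects_started_before never inspects rows or columns, swapping one storage class for the other does not change its result. If the function instead indexed into a column list directly, the sketch would become physically dependent in the sense described above.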

Set-Theoretic Data Structure
By 1968, a Set-Theoretic Data Structure (STDS) satisfying the conditions of a machine-independent data structure had been implemented at the University of Michigan. The architecture of STDS uses structured sets to represent application data, to represent storage data, and to support I/O mappings. Structured sets are sets with two membership conditions, as defined under the axioms of Extended Set Theory (XST). STDS capabilities are:
                 1) Set operations for manipulating application data as sets.
                 2) Set operations for manipulating storage data as sets.
                 3) All data I/O performed using set operations.
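
As a rough illustration of a set with two membership conditions, the Python sketch below models a structured set as a set of (element, scope) pairs, so that membership is asked about an element together with its scope. This pairing is an assumption about how XST membership can be modeled, and the helper names (from_labeled_array, is_member, elements_with_scope) are hypothetical; the sketch is not the STDS implementation or its API.

```python
# Illustrative sketch only: a structured set modeled as a frozenset of
# (element, scope) pairs. All helper names are hypothetical.

def from_labeled_array(labeled):
    """Turn a labeled array (here a dict of label -> value) into a structured set."""
    return frozenset((value, label) for label, value in labeled.items())


def is_member(element, scope, structured):
    """Membership needs both conditions: the element and the scope it appears under."""
    return (element, scope) in structured


def elements_with_scope(structured, scope):
    """Collect every element that appears under the given scope."""
    return frozenset(e for e, s in structured if s == scope)


# The same labeled data written in two different physical orders.
a = from_labeled_array({"name": "STDS", "year": 1968})
b = from_labeled_array({"year": 1968, "name": "STDS"})

# As structured sets they are equal, because order is not part of the set.
print(a == b)

# Ordinary set operations apply directly to structured sets.
shared = a & from_labeled_array({"year": 1968, "site": "Michigan"})
print(elements_with_scope(shared, "year"))
```

In this model the element "STDS" is a member under the scope "name" but not under the scope "year", which is the practical difference between a structured set and a classical set of bare values.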

The initial STDS implementation only supported data represented as labeled arrays, but it demonstrated the feasibility of a machine-independent data structure.

STDS Libraries
XML documents and semi-structured data can be represented by structured sets, and are now supported by STDS libraries. STDS libraries are available to anyone interested in using set-accessing I/O capabilities to access Big Data.

Copyright 2017   INTEGRATED INFORMATION SYSTEMS   Last modified on 08/07/2017