ACM Computing Surveys
28A(4), December 1996,
http://www.acm.org/surveys/1996/RamamrithamApplication/. Copyright ©
1996 by the Association for Computing Machinery, Inc. See the permissions statement below.
Application-Oriented Database Support
Krithi Ramamritham
Department of Computer Science
University of Massachusetts
Amherst MA 01003
krithi@cs.umass.edu
http://www-ccs.cs.umass.edu/krithi/home.html
Almost all applications today use databases in one form or another.
Many databases are deployed in applications where there is a mismatch,
both in functionality and in performance, between what is needed and
what off-the-shelf database systems support. Notable examples include
information retrieval, workflow, multimedia, telephone switching (more
generally, real-time applications), earth observation systems, and
genome mapping applications:
- In information retrieval applications, databases have been used to
store the metadata, in particular the index structures. Traditional
database indexing schemes are not suitable for managing these index
structures. Also, the correctness properties required in the presence
of updates are not provided by the built-in transactional support of
traditional databases.
- In workflow applications, data consistency is important, but so is
conformance with the steps of the process that the workflow defines.
The data-centric approach taken by databases makes them
inappropriate, as is, for workflow applications.
- Multimedia databases involve data items of continuous types. Their
access patterns are very different from those of the applications for
which LRU buffer management policies are suited. The data items are
also large, and when they are updated it is inefficient to log the
contents of the whole object for recovery purposes.
- Earth observation systems and several scientific projects such as
genome mapping require the storage of petabytes of data, with
terabytes being added every day. To store, retrieve, and manage such
massive amounts of digital data, hierarchical storage systems must be
used to achieve a better price-performance ratio. Traditional schemes
for query optimization are not suitable for such systems.
- Embedded applications often involve data items with validity
intervals attached to them. Such data must be accessed correctly
while it is valid, but it has no persistence requirements.
- While throughput and response times are the metrics of interest for
traditional applications, in embedded and multimedia applications
real-time responsiveness is the goal. Quite often, complex Quality of
Service (QoS) guarantees are sought.
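The validity intervals mentioned for embedded applications can be
illustrated with a minimal sketch. It assumes a purely in-memory store
and an explicit clock value passed in by the caller; the names
TimedValue, VolatileStore, sampled_at, and valid_for are hypothetical
and do not come from any particular system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TimedValue:
    """A reading that is meaningful only during its validity interval."""
    value: float
    sampled_at: float   # time at which the value was sampled (seconds)
    valid_for: float    # length of the validity interval (seconds)

    def is_valid(self, now: float) -> bool:
        # Valid from the sampling instant up to, but not including, expiry.
        return self.sampled_at <= now < self.sampled_at + self.valid_for


class VolatileStore:
    """In-memory store: nothing is logged or written to stable storage."""

    def __init__(self) -> None:
        self._items: Dict[str, TimedValue] = {}

    def put(self, key: str, item: TimedValue) -> None:
        self._items[key] = item

    def get(self, key: str, now: float) -> Optional[float]:
        item = self._items.get(key)
        if item is None or not item.is_valid(now):
            return None  # expired data is treated as absent
        return item.value


store = VolatileStore()
store.put("temp", TimedValue(value=21.5, sampled_at=100.0, valid_for=2.0))
```

With this sketch, store.get("temp", now=101.0) returns 21.5, while
store.get("temp", now=102.0) returns None because the interval has
expired. No recovery machinery is involved, which reflects the absence
of persistence requirements for such data.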
The mismatch, in both performance and functionality, between the needs
of applications and the facilities offered by existing database
technology is significant. It is usually resolved either by building
the required database support from scratch, again and again, for each
application, or by deciding to live with the deficiencies. This way of
resolving the mismatch is both wasteful and error-prone, and the
problem cannot be solved solely through incremental approaches such as
tuning; it is therefore important and remains far from solved. Two
important issues need to be addressed.
The first and foremost is the need to move ideas developed by the
research community into practice. For example, a lot of work has
recently gone into the development of advanced concurrency control,
recovery, and, more generally, transaction processing techniques to
meet the demands of applications with long-running activities.
Unfortunately, few of the new ideas have migrated into commercial
offerings, and in fact very few prototypes have been built.
The second issue is how database systems should be structured to meet
the needs of these applications. From the application requirements
reviewed earlier, we observe that almost every component of the
database system (modeling capabilities, indexing schemes, transaction
processing techniques, buffer management mechanisms, and query
optimization schemes) needs some type of enhancement to meet the
demands of new applications. This being the case, a toolbox approach
may be suitable. In this approach, commercial systems would offer a
package of libraries containing components that provide the
traditional properties as well as new ones tuned to the needs of
different applications. An application developer can then specify the
requirements, and the needed support can be tailored by combining
these database components. Identifying the building blocks and a way
to integrate them to meet the specified functionality and performance
requirements are obvious steps in this direction.
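One way to picture the toolbox approach is as a catalogue of
interchangeable components from which a configuration is assembled
according to stated requirements. The sketch below is only an
illustration under that assumption: the slot names, variant names, and
requirement labels are all hypothetical, and the selection rule is
deliberately naive.

```python
from typing import Dict, Set

# Hypothetical catalogue: component slot -> {variant -> needs it serves}.
CATALOGUE: Dict[str, Dict[str, Set[str]]] = {
    "buffer_manager": {
        "lru": {"traditional"},
        "stream_prefetch": {"continuous_media"},
    },
    "recovery": {
        "whole_object_logging": {"traditional"},
        "delta_logging": {"large_objects"},
    },
    "transactions": {
        "strict_acid": {"traditional"},
        "long_running": {"long_activities"},
    },
}


def assemble(requirements: Set[str]) -> Dict[str, str]:
    """Pick one variant per slot, preferring one tuned to a requirement."""
    chosen: Dict[str, str] = {}
    for slot, variants in CATALOGUE.items():
        # Variants that serve at least one stated requirement.
        tuned = [name for name, needs in variants.items()
                 if needs & requirements]
        # Otherwise fall back to the variant with traditional properties.
        fallback = [name for name, needs in variants.items()
                    if "traditional" in needs]
        chosen[slot] = (tuned or fallback)[0]
    return chosen


multimedia_config = assemble({"continuous_media", "large_objects"})
```

Here multimedia_config selects stream_prefetch and delta_logging for
the two slots with a matching tuned variant, and keeps strict_acid for
transactions, since no long-running activities were requested. A real
toolbox would also have to check that the chosen components can be
integrated with each other, which this sketch does not attempt.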
Permission to make digital
or hard copies of part or all of this work for personal or classroom
use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies bear
this notice and the full citation on the first page. Copyrights for
components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, to
republish, to post on servers, or to redistribute to lists, requires
prior specific permission and/or a fee. Request permissions from
Publications Dept, ACM Inc., fax +1 (212) 869-0481, or
permissions@acm.org.