ACM Computing Surveys 28A(4), December 1996, http://www.acm.org/surveys/1996/RamamrithamApplication/. Copyright © 1996 by the Association for Computing Machinery, Inc. See the permissions statement below.


Application-Oriented Database Support


Krithi Ramamritham

Department of Computer Science
University of Massachusetts
Amherst MA 01003
krithi@cs.umass.edu
http://www-ccs.cs.umass.edu/krithi/home.html

Almost all applications today utilize databases in one form or another. Many databases are deployed in applications where there is a mismatch -- both in functionality and in performance -- between what is needed and what is supported in off-the-shelf database systems. Notable examples include information retrieval, workflow, multimedia, telephone switching (more generally, real-time applications), earth observation systems, and genome mapping applications:

  1. In information retrieval applications, databases have been used to store the metadata, in particular, the index structures. Traditional database indexing schemes are not suitable for managing these index structures. Also, the correctness properties required in the presence of updates are not provided by any of the built-in transactional mechanisms of traditional databases.
  2. In workflow applications, data consistency is important, but so is the conformance with the steps involved in the process defined by the workflow. The data-centric approach taken by databases makes them inappropriate, as is, for workflow applications.
  3. Multimedia databases involve data items of continuous types. Their usage patterns are very different from those of applications for which LRU buffer management policies are suited. The data items are also large, and when they are updated, it is inefficient to log the contents of the whole object for recovery purposes.
  4. Earth observation systems and several scientific projects like genome mapping require the storage of petabytes of data, with terabytes being added every day. To store, retrieve, and manage such massive amounts of digital data, hierarchical storage systems must be used to achieve a better price-performance ratio. Traditional schemes for query optimization are not suitable for such systems.
  5. Embedded applications often involve data items that have validity intervals attached to them. Such data must be accessed correctly while it is valid, but it need not be persistent.
  6. While throughput and response times are the metrics of interest for traditional applications, in embedded and multimedia applications, real-time responsiveness is the goal. Quite often, complex Quality of Service (QoS) guarantees are sought.
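
Item 5 above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions, not a design taken from this article: the names TimedValue and VolatileStore, the use of seconds as the time unit, and the choice to return None for a stale item are all hypothetical. It shows a data item that carries its own validity interval and a store that keeps nothing beyond the latest in-memory value.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TimedValue:
    """A value that is meaningful only during [timestamp, timestamp + validity)."""
    value: float
    timestamp: float  # time at which the value was sampled (seconds, assumed unit)
    validity: float   # length of the validity interval (seconds, assumed unit)

    def is_valid(self, now: float) -> bool:
        return self.timestamp <= now < self.timestamp + self.validity


class VolatileStore:
    """Hypothetical in-memory store: latest value per key, no logging, no persistence."""

    def __init__(self) -> None:
        self._items: Dict[str, TimedValue] = {}

    def write(self, key: str, value: float, now: float, validity: float) -> None:
        # Overwrite in place; an expired value is never needed again.
        self._items[key] = TimedValue(value, now, validity)

    def read(self, key: str, now: float) -> Optional[float]:
        # Return the value only while its validity interval covers `now`.
        item = self._items.get(key)
        if item is None or not item.is_valid(now):
            return None
        return item.value
```

A caller would write a reading with, say, a two-second validity interval and treat a None result from read as a signal to wait for a fresh sample rather than act on stale data.
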
The mismatch (both in performance and functionality) between the needs of applications and the facilities offered by existing database technology is significant. This mismatch is usually resolved either by building the required database support from scratch, again and again for each application, or by deciding to live with the deficiencies. Both routes are wasteful and error-prone, and the mismatch cannot be eliminated solely through incremental approaches such as tuning. The problem is therefore important and far from solved. Two important issues need to be addressed.

The first and foremost is the need to migrate ideas developed by the research community into practice. For example, a lot of work has recently gone into the development of advanced concurrency control, recovery, and, more generally, transaction processing techniques to meet the demands of applications with long-running activities. Unfortunately, not many of the new ideas have migrated into commercial offerings, and in fact very few prototypes have been built.

The second issue is how database systems should be structured to meet the needs of these applications. From the application requirements reviewed earlier, we observe that almost every component of the database system (modeling capabilities, indexing schemes, transaction processing techniques, buffer management mechanisms, and query optimization schemes) needs some type of enhancement to meet the demands of new applications. This being the case, a tool-box type of approach may be suitable. In this approach, commercial systems will offer a package of libraries containing components that offer the traditional properties as well as some new ones that are tuned to the needs of different applications. An application developer can then specify the requirements, and the needed support can be tailored by combining these different database components. Identifying the building blocks and finding a way to integrate them to meet the specified functionality and performance requirements are obvious steps in this direction.
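
As a rough illustration of the tool-box idea, the Python sketch below assembles one component, a buffer pool, from a library of interchangeable replacement policies according to an application's stated requirements. Everything in it is an assumption made for illustration: the names BufferPool, POLICIES, and build_buffer_pool, the requirement keys "access_pattern" and "buffer_pages", and the choice of MRU as the alternative to LRU for repeated sequential scans. The article itself does not prescribe any of these.

```python
from collections import OrderedDict
from typing import Callable, Dict, Hashable

# Pages are kept ordered from least to most recently used.
Pages = "OrderedDict[Hashable, bytes]"


def lru_victim(pages: Pages) -> Hashable:
    return next(iter(pages))        # evict the least recently used page


def mru_victim(pages: Pages) -> Hashable:
    return next(reversed(pages))    # evict the most recently used page


# Hypothetical component library: policy name -> victim-selection function.
POLICIES: Dict[str, Callable[[Pages], Hashable]] = {
    "lru": lru_victim,
    "mru": mru_victim,
}


class BufferPool:
    """Fixed-capacity page cache with a pluggable replacement policy."""

    def __init__(self, capacity: int, pick_victim: Callable[[Pages], Hashable]) -> None:
        self.capacity = capacity
        self._pick_victim = pick_victim
        self._pages: "OrderedDict[Hashable, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, page_id: Hashable, load: Callable[[Hashable], bytes]) -> bytes:
        if page_id in self._pages:
            self._pages.move_to_end(page_id)   # mark as most recently used
            self.hits += 1
            return self._pages[page_id]
        self.misses += 1
        if len(self._pages) >= self.capacity:
            del self._pages[self._pick_victim(self._pages)]
        self._pages[page_id] = load(page_id)
        return self._pages[page_id]


def build_buffer_pool(requirements: Dict[str, object]) -> BufferPool:
    """Pick components from the library based on declared requirements (assumed keys)."""
    sequential = requirements.get("access_pattern") == "sequential_scan"
    policy = "mru" if sequential else "lru"
    return BufferPool(int(requirements.get("buffer_pages", 4)), POLICIES[policy])
```

With a three-page pool and a four-page object scanned repeatedly from start to end, the LRU configuration evicts each page just before it is needed again and never records a hit, while the MRU configuration does record hits. This is one small instance of tuning a single building block to an application's usage pattern; a full tool-box would apply the same selection to indexing, logging, and concurrency control components.
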


Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or permissions@acm.org.
