
Micro-biological Environmental Data Analysis

(Chesapeake Bay area)

Object of Investigation

A bacteriological database was collected over several years by the EPA and given to the Department of Biology at George Mason University, VA, for analysis. The EPA presumed that the data might reveal the causes of the high bacteria concentrations and bacteria production rates observed during certain periods. The database contained measurements of biochemical oxygen demand, chlorophyll concentration, particulate organic carbon and several other characteristics.

A pilot project was initiated by Professor Robert Jonas (GMU) and Dr. N. N. Lyashenko to analyze part of the Chesapeake Bay data set with Dr. Lyashenko’s Knowledge Extraction Technology (KET).

Primary Goals


To assess the potential predictability of Bacterial Abundance and Production variables.


To evaluate the quality of the database in relation to the previous problem.


To identify an adequate processing strategy among possible KET strategies.


To classify the database as essentially dynamic or static with respect to the predictability of Bacterial Abundance and Production.


To estimate the potential for optimizing the data collection process.


To formulate recommendations for more substantial research in the future.


Before Dr. Jonas and Dr. Lyashenko undertook the project, many researchers had analyzed the data set using conventional statistical methods. The results were controversial, and three types of difficulty were identified: the data were noisy and contained many missing values; the database was mixed, i.e., it consisted of both numerical and qualitative parameters; and the underlying numerical dependencies were certainly non-linear and dynamic.
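The first two difficulties (missing values and a mix of numerical and qualitative parameters) are commonly handled before any modeling step. The following is a minimal sketch of one conventional approach, not the KET procedure, which is not described here: missing numbers are replaced by the column median and qualitative values are one-hot encoded, with a missing label treated as its own category. The function name, column names and sample values are hypothetical.

```python
from statistics import median


def prepare_mixed_rows(rows, numeric_cols, qualitative_cols):
    """Impute missing numeric values with the column median and one-hot
    encode qualitative columns (a missing label becomes its own category)."""
    # Column medians are computed from observed (non-None) values only.
    medians = {
        c: median(r[c] for r in rows if r[c] is not None)
        for c in numeric_cols
    }
    # Sorted category lists give a stable output column order.
    categories = {
        c: sorted({r[c] if r[c] is not None else "missing" for r in rows})
        for c in qualitative_cols
    }
    prepared = []
    for r in rows:
        out = []
        for c in numeric_cols:
            out.append(r[c] if r[c] is not None else medians[c])
        for c in qualitative_cols:
            label = r[c] if r[c] is not None else "missing"
            out.extend(1.0 if label == cat else 0.0 for cat in categories[c])
        prepared.append(out)
    return prepared


# Hypothetical sample rows; the values are made up for illustration only.
rows = [
    {"chlorophyll": 4.0, "bod": 2.0, "season": "spring"},
    {"chlorophyll": None, "bod": 3.0, "season": "summer"},
    {"chlorophyll": 8.0, "bod": None, "season": None},
]
prepared = prepare_mixed_rows(rows, ["chlorophyll", "bod"], ["season"])
```

Median imputation is only one option; on noisy data with many gaps, the choice of imputation can itself affect the conclusions, which may be part of why the earlier results were controversial.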

The Result

The Bacterial Abundance and Production variables were predicted. In the static model, the first variable was predicted with 99.7% accuracy and the second with 81% accuracy. The major results were as follows:


Nonlinear predictors were essentially the most accurate.


Chesapeake Bay Ecological System was essentially dynamic.


The dynamics of the system were essentially nonlinear.


Most unstable processes occurred in May.


Nearly all unstable areas were located in the Western part of the river.


As a result of the KET analysis, a considerable reduction and relocation of measurement activities in the area were recommended, without loss of prediction accuracy. The predictors made it possible to create a dynamic model that can be used to evaluate different cleaning procedures in the future.
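The distinction between a static and a dynamic model can be illustrated by how the training rows are built. The sketch below assumes the usual reading of the terms: a static predictor sees only the measurements taken at the same time as the target, while a dynamic predictor also sees earlier values of the target. This is an illustration under that assumption, not the KET implementation; the function names and numbers are hypothetical.

```python
def static_rows(drivers, target):
    """Static model: predict target[t] from drivers[t] alone."""
    return [(list(drivers[t]), target[t]) for t in range(len(target))]


def dynamic_rows(drivers, target, lags):
    """Dynamic model: predict target[t] from drivers[t] plus the
    previous `lags` values of the target itself."""
    rows = []
    for t in range(lags, len(target)):
        history = target[t - lags:t]  # target[t-lags] ... target[t-1]
        rows.append((list(drivers[t]) + list(history), target[t]))
    return rows


# Hypothetical numbers, made up for illustration only.
drivers = [[1.0], [2.0], [3.0], [4.0]]
target = [10.0, 12.0, 15.0, 19.0]

static = static_rows(drivers, target)
dynamic = dynamic_rows(drivers, target, lags=2)
```

If a predictor trained on the dynamic rows is clearly more accurate than one trained on the static rows, that is evidence that the system has memory, which is the sense in which a data set would be called essentially dynamic.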

How it was Done

An Information Analysis Module from the KET Tool Kit was used to identify informative variables, from which predictors were constructed. Analytical descriptors were used to obtain the final results.
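The internals of the Information Analysis Module are not described here. As a stand-in, the sketch below ranks candidate variables by their mutual information with the target, a standard way to measure how informative a variable is. It assumes the variables have already been discretized into labels; the function names, variable names and sample values are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_variables(columns, target):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {
        name: mutual_information(values, target)
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical discretized measurements, made up for illustration only.
columns = {
    "chlorophyll_level": ["low", "low", "high", "high"],
    "station_side": ["east", "west", "east", "west"],
}
target = ["low", "low", "high", "high"]  # e.g. bacterial abundance level
ranking = rank_variables(columns, target)
```

In this toy example the first variable determines the target completely (1 bit of shared information) and the second carries none, so only the first would be kept for building a predictor.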


KET analysis directed the construction of an optimal data collection map identifying the prime locations and schedules for measuring pollutant concentrations. KET Logic Descriptors could identify the type, character and quantity of unstable environmental conditions.


  New Feature !!

A new KET module was recently introduced to support interaction with the "Mathematica 8" system.


  News !!

KET, LLC joined BioMed Content Group, Inc. in an initiative to use AI agents to facilitate the work of physicians and educators.

Copyright 2002-2206, Knowledge Extraction Tools, LLC. All rights reserved