Copula Enterprise ETL Appliance


Copula ETL Appliance

Experience has shown that in complex real-world production environments, major ETL tools achieve processing speeds of at most 10,000 records per minute, which is usually not enough to meet the demands of today's financial or telecommunications industries. The main reasons for this poor performance are as follows:

  • Major ETL tools are built on a relational (SQL) engine that utilizes a comparison algorithm for sorting and a B-tree algorithm for indexing, which is perfectly suitable for transactional processing of one record at a time, but is a huge disadvantage and bottleneck for massive simultaneous processing of millions of records in the demanding Business Reporting, Data Warehousing, Business Intelligence, or Data Mining applications.
  • Major ETL tools are record based, designed to handle just one record at a time.
  • Major ETL tools rely heavily on the so-called lookup concept, borrowed from ancient desktop database GUI tools, which dramatically slows down the processing of two or more datasets.
  • Major ETL tools lack a business rules engine and at the same time underestimate the knowledge and experience of their users by offering GUI interfaces for implementation of business rules. Such an approach only serves to limit creativity and encourage inferior intuitive solutions.
  • Major ETL tools' endless generalization and preference for an easy-to-use drag-and-drop interface result in the high cost of four data copies (internal moves) for each input link and four data copies for each output link per instantiated component, even when the metadata happen to be identical and the data does not need to be transformed. Technically, this has the same effect as moving data through four SQL tables for each side of a link, or eight times per link.
  • Major ETL tools force processing in dependent sequences, where any possible error can bring down the entire ETL process, making personal attendance and monitoring unavoidable.
  • Major ETL tools ignore the basic laws of entropy. For example, mapping from an m-dimensional space to an n-dimensional space significantly degrades the performance of the entire system. One major ETL vendor even classifies its ETL components as either active or passive, where active components are those that map from an m-dimensional to an n-dimensional space.
  • Major ETL tools are so inefficient and expensive (tool, environment, learning curve, implementation, maintenance) that the cost of ETL processing can be estimated to be as high as $1 per record.
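To put the figures quoted above in perspective, the short sketch below works out what a throughput of 10,000 records per minute and a cost of $1 per record would imply for a single load. The workload size of 100 million records and the function name `load_estimate` are hypothetical, chosen only for illustration; the two rates are the upper-bound figures stated in the text.

```python
# Figures quoted in the text above; the workload size is a hypothetical example.
RECORDS_PER_MINUTE = 10_000     # quoted upper bound on throughput
COST_PER_RECORD_USD = 1.00      # quoted upper estimate of cost per record


def load_estimate(records: int) -> tuple[float, float]:
    """Return (hours, cost in USD) for loading `records` at the quoted rates."""
    hours = records / RECORDS_PER_MINUTE / 60
    cost = records * COST_PER_RECORD_USD
    return hours, cost


hours, cost = load_estimate(100_000_000)  # hypothetical 100-million-record load
print(f"{hours:.1f} hours (~{hours / 24:.1f} days), ${cost:,.0f}")
# prints: 166.7 hours (~6.9 days), $100,000,000
```

At these rates, a load of that size would not fit into a nightly batch window, which is the practical concern behind the list above.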

Data Integrity Institute Inc.'s Copula Enterprise ETL Appliance is the most advanced solution for the Extract, Transform and Load (ETL) of massive data vaults, achieving processing speeds of a billion records per hour. While no additional ETL tools are required, our solution can also be used to improve the performance of major ETL tools such as IBM Ascential DataStage or Informatica PowerCenter:

  • Copula's ETL engine utilizes one of the first feasible non-comparison sort algorithms. This approach is especially suitable for demanding Business Reporting, Data Warehousing, Business Intelligence, or Data Mining applications where sorting, indexing and processing millions of records simultaneously is crucial.
  • Since Copula's ETL engine is table based, it is designed to simultaneously handle all affected records.
  • Copula's ETL engine does not utilize the so-called lookup concept, borrowed from ancient desktop database GUI tools, which dramatically slows down the processing of two or more datasets.
  • Copula's ETL engine has its own business rules engine that dramatically simplifies ETL development and unleashes creativity during the implementation of business rules.
  • Internally, Copula's ETL engine uses only one instance of data at all times and never copies or moves data during actual processing.
  • Copula's ETL engine processes all records at table level entirely asynchronously. This means that when an error occurs, the record in question is saved for additional management while all other records continue. Therefore, ETL processing can be run completely unattended.
  • Copula's ETL engine was designed to take into account the laws of entropy. The underlying metadata handling and table-level processing enforce an m to m (direct, same) dimensional mapping, which is a conditio sine qua non for the superconductivity of massive data through ETL processing without impedance loss, heating and performance degradation.
  • ETL processing using the Copula Solution can be estimated to cost about $0.01 per record.
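For readers unfamiliar with the term, the sketch below shows what a non-comparison sort looks like in general. It is a textbook least-significant-digit radix sort over non-negative integer keys, offered only as an illustration of the category; the text does not say which algorithm Copula's engine uses, and the function name `radix_sort` and its parameters are assumptions made for this example.

```python
def radix_sort(keys: list[int], bits_per_pass: int = 8) -> list[int]:
    """Sort non-negative integers with an LSD radix sort.

    Keys are never compared with each other while sorting; each pass
    distributes them into buckets by one "digit" of bits_per_pass bits.
    """
    if not keys:
        return []
    if min(keys) < 0:
        raise ValueError("this sketch handles non-negative keys only")

    radix = 1 << bits_per_pass
    mask = radix - 1
    largest = max(keys)  # used only to bound the number of passes
    out = list(keys)
    shift = 0
    while shift == 0 or (largest >> shift) > 0:
        buckets = [[] for _ in range(radix)]
        for k in out:
            # Appending preserves input order, so each pass is stable.
            buckets[(k >> shift) & mask].append(k)
        out = [k for bucket in buckets for k in bucket]
        shift += bits_per_pass
    return out


print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# prints: [2, 24, 45, 66, 75, 90, 170, 802]
```

A sort of this kind runs in time proportional to the number of keys multiplied by the number of digit passes, whereas any comparison-based sort needs on the order of n log n comparisons in the worst case. That difference is the general reason non-comparison sorts are attractive for very large tables with fixed-width keys.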

Contact Data Integrity Institute Inc.

For further information on how Data Integrity Institute Inc. can help you to implement, save or maintain an ETL project with the Copula Enterprise ETL Appliance, please send a detailed inquiry to: info@DataIntegrityInstitute.com


©2004 Data Integrity Institute Inc.