
The Necessity of Data Integration
As distributed computing environments expand, corporate IT systems are becoming increasingly complex and extensive.
Corporations run many heterogeneous DBMSs, operating systems, servers, storage systems, and applications.
The more complex the systems become, the greater the burden on the corporation, because data spread across multiple
systems must be connected and integrated to keep decision making and customer service processes consistent.
Data spread across multiple OLTP systems should be integrated into an OLAP system to provide a single view, and information should be shared freely
among the HR, accounting, and sales information systems.
The need for high availability also drives the demand for data integration. As the Internet becomes ubiquitous,
corporations are expected to provide 24x7 service, available 365 days a year.
24x7 non-stop service means the system must never go down, whether it is being upgraded or replaced, in the event of a
disaster, or while data from the operational system is being integrated into a data warehouse (DW) for analysis.
Data should also be integrated so that changes made at the head office are propagated to every remote branch office,
and changes made at the branch offices are applied to the head office server in real time.
The Limits of Conventional Integration Technology
Various technologies have been researched and developed to meet the need for data integration, and ETL (Extract, Transform, Load) has been
used as the most powerful tool available. However, it has not fully satisfied the demands of corporations. It has limitations in two
aspects:
First, it lacks real-time response. Data integration with ETL runs on a daily, weekly, or monthly basis. Because of these
periodic batch tasks, changes in the source system cannot be applied promptly to the target system.
Corporations, however, increasingly demand real-time data processing. They want to base analysis and decisions not on information
from yesterday or last week, but on the latest data as it is created and revised. The ‘latest data’ may be
a few hours old, or only a second old.
Second, 24x7 service becomes difficult. With ETL, the system has to be taken down at midnight or during another
time window with few connections so that the data can be extracted. By its nature, ETL cannot track
only the changes; the whole source data set has to be extracted. IT systems, however, should be available 24 hours a day, and the time required for
batch tasks should be minimized or eliminated. The service should also recover as quickly as possible from power failures, disasters, and
system or application errors.
These limitations of ETL, together with the new requirements of corporations, led to the development of CDC (Change Data Capture) technology.
‘CDC’, the Alternative for Real-time Integration
CDC has recently been drawing increasing attention as a supplement to ETL. CDC technology captures changed data in the source system
and transfers it in real time to various operational and analysis systems. Positioned at the starting point of the data integration process,
it compensates for the limitations of ETL and satisfies corporations' unmet demands for data integration.
< Demands for Real-Time Enterprise >
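The difference between batch extraction and change capture can be illustrated with a minimal in-memory sketch in Python. It does not show how any particular product implements CDC. The names SourceTable, Change, full_extract, and apply_changes are hypothetical, and the append-only change log is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Change:
    op: str                     # "insert", "update", or "delete"
    key: int
    row: Optional[dict] = None  # None for deletes


class SourceTable:
    """Hypothetical source that records every change in an append-only log."""

    def __init__(self):
        self.rows = {}
        self.log = []

    def upsert(self, key, row):
        op = "update" if key in self.rows else "insert"
        self.rows[key] = dict(row)
        self.log.append(Change(op, key, dict(row)))

    def delete(self, key):
        del self.rows[key]
        self.log.append(Change("delete", key))


def full_extract(source):
    """ETL-style batch: copy every row, changed or not."""
    target = {key: dict(row) for key, row in source.rows.items()}
    return target, len(target)


def apply_changes(source, target, position):
    """CDC-style: apply only the log entries recorded after `position`."""
    for change in source.log[position:]:
        if change.op == "delete":
            target.pop(change.key, None)
        else:
            target[change.key] = dict(change.row)
    return len(source.log)  # new position to resume from next time


source = SourceTable()
for i in range(1000):
    source.upsert(i, {"branch": i % 10, "amount": i})

# Initial load: the whole table is moved once.
target, moved = full_extract(source)
position = len(source.log)

# Two later changes in the source system.
source.upsert(5, {"branch": 5, "amount": 999})
source.delete(7)

# Only the pending log entries are transferred, not the whole table.
pending = len(source.log) - position
position = apply_changes(source, target, position)
```

In this sketch the initial load moves all 1000 rows, while the later synchronization transfers only the two pending log entries and still leaves the target identical to the source. A batch job would have to copy the full table again to reach the same state.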