Overview:
SQData’s Big Data Streaming feature provides near-real-time change data capture (CDC) and replication of mainframe operational data (IMS, VSAM, or DB2) directly into Hadoop or Kafka.
Big Data Streaming takes the complexity out of legacy mainframe data by auto-generating JSON/Avro messages for Hadoop and/or Kafka without any hand-built mapping. You can deploy mainframe data streams quickly without being an expert in mainframe data formats.
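To make the idea concrete, the sketch below shows the kind of JSON change message a CDC stream might emit for a record described by a COBOL copybook. The field names, PIC clauses, and envelope layout are illustrative assumptions, not SQData's actual output format.

```python
import json

# Hypothetical example: fields from a COBOL copybook rendered as a JSON
# change message. The envelope layout is illustrative only.
change_message = {
    "change_op": "U",                 # I = insert, U = update, D = delete
    "object_name": "CUSTOMER",        # source segment/table name
    "timestamp": "2024-01-15T10:32:07Z",
    "after_image": {
        "CUST-ID": 10045,             # PIC 9(5)          -> JSON number
        "CUST-NAME": "ACME CORP",     # PIC X(20)         -> JSON string
        "BALANCE": 1234.56,           # PIC S9(7)V99 COMP-3 -> JSON number
    },
}

print(json.dumps(change_message, indent=2))
```

A downstream consumer can parse the message with any standard JSON library; no knowledge of the original mainframe record layout is required.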
Value:
- Keeps data lakes stocked with fresh information
- Enables trending and time-series analytics
- Reduces dependence on costly ETL
- Lets analytics run against the most current data
- Delivers immediate results through a rapid deployment model
- Simplifies complex mainframe data into a common format
Key Features:
- Point-to-point replication → eliminates the need for intermediate components
- Near-real-time streaming of IMS, DB2 and VSAM changes via high-performance data capture
- JSON and Avro schema generated from source copybooks or relational DDL
- Automatically converts mainframe data to JSON/Avro → no mapping required
- Handles high transaction rates with a scalable architecture and utility-like operation
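As a rough sketch of what schema generation from a source copybook could produce, the code below maps a few hypothetical PIC clauses to Avro primitive types and emits an Avro record schema. The field names and the type-mapping heuristic are assumptions for illustration, not SQData's actual generator.

```python
import json

# Illustrative copybook fields (name, PIC clause) - hypothetical values.
copybook_fields = [
    ("CUST-ID", "PIC 9(5)"),
    ("CUST-NAME", "PIC X(20)"),
    ("BALANCE", "PIC S9(7)V99 COMP-3"),
]

def avro_type(pic):
    """Map a simple PIC clause to an Avro primitive type (rough heuristic)."""
    if "V" in pic or "COMP-3" in pic:
        return "double"   # implied decimal / packed decimal
    if pic.startswith("PIC 9") or pic.startswith("PIC S9"):
        return "long"     # zoned numeric
    return "string"       # alphanumeric

schema = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": name.replace("-", "_"), "type": avro_type(pic)}
        for name, pic in copybook_fields
    ],
}

print(json.dumps(schema, indent=2))
```

A real generator would also handle OCCURS clauses, REDEFINES, and precise decimal types (Avro `decimal` logical type); this sketch only shows the shape of the output.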
Hadoop HDFS:
- File rotation based on time and/or number of records
- Parallel apply threads for maximum throughput
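The rotation behavior above can be sketched as follows. The thresholds, class name, and rotation logic here are illustrative assumptions, not SQData configuration parameters: a new file is started whenever either the record count or the elapsed time limit is reached, whichever comes first.

```python
import time

class RotatingWriter:
    """Sketch of time- and count-based file rotation for an HDFS-style sink.

    Thresholds and naming are illustrative, not SQData parameters.
    """

    def __init__(self, max_records=10000, max_seconds=300):
        self.max_records = max_records
        self.max_seconds = max_seconds
        self.records = 0
        self.opened_at = time.monotonic()
        self.rotations = 0

    def write(self, record):
        self.records += 1
        if (self.records >= self.max_records
                or time.monotonic() - self.opened_at >= self.max_seconds):
            self.rotate()

    def rotate(self):
        # A real sink would close the current HDFS file and open a new one.
        self.rotations += 1
        self.records = 0
        self.opened_at = time.monotonic()

writer = RotatingWriter(max_records=3, max_seconds=60)
for i in range(7):
    writer.write({"seq": i})
print(writer.rotations)  # prints 2: two full batches of 3 records
```

Bounding files by both time and record count keeps file sizes predictable under bursty traffic while still flushing promptly during quiet periods.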
Kafka:
- Dynamic topic selection at runtime
- Message keys can be set to support topic partitioning
- Supports streaming from distributed databases including Oracle, DB2 LUW, SQL Server
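Setting message keys matters because Kafka routes equal keys to the same partition, which preserves per-key ordering for consumers. Kafka's default partitioner hashes the key bytes with murmur2; the sketch below uses a plain MD5-based hash only to illustrate the principle, and the key value is a made-up example.

```python
import hashlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition (illustrative hash, not Kafka's murmur2)."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# All change messages carrying the same key land on the same partition,
# so updates for one record are consumed in order.
p1 = partition_for(b"ACCT-10045", 12)
p2 = partition_for(b"ACCT-10045", 12)
print(p1 == p2)  # True
```

Choosing the source record's key (for example, the primary key of the changed row) as the message key is the usual way to get ordered, partition-parallel consumption of a CDC stream.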