Real-Time Streaming to Big Data

Overview

SQData’s Big Data Streaming feature provides near-real-time change data capture (CDC) and replication of mainframe operational data (IMS, VSAM or DB2) directly into Hadoop or Kafka.

Big Data Streaming takes the complexity out of older mainframe data by auto-generating JSON/Avro messages for Hadoop and/or Kafka without any mapping. You can quickly deploy mainframe data streams without being an expert in mainframe data formats.

Value

  • Keeps Data Lakes stocked with fresh information
  • Enables trending and time-series analytics
  • Reduces dependence on costly ETL
  • Runs analytics against the most current data
  • Delivers immediate results through a rapid deployment model
  • Simplifies complex mainframe data into a common format

Key Features

  • Point-to-point replication → eliminates the need for intermediate components
  • Near-real-time streaming of IMS, DB2 and VSAM changes via high-performance data capture
  • JSON and Avro schemas generated from source copybooks or relational DDL
  • Automatically converts mainframe data to JSON/Avro → no mapping required
  • Handles high transaction rates with a scalable architecture and utility-like operation
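
To illustrate what converting mainframe data to JSON involves, here is a minimal Python sketch. It is not SQData's implementation: the field layout, field names and the helper functions unpack_comp3 and record_to_json are hypothetical stand-ins for what a copybook would describe, and the record is assumed to be EBCDIC (code page 037) with one packed-decimal (COMP-3) field.

```python
import json

# Hypothetical layout standing in for a copybook: (name, offset, length, kind)
LAYOUT = [
    ("CUST_ID", 0, 6, "text"),
    ("CUST_NAME", 6, 10, "text"),
    ("BALANCE", 16, 4, "packed"),  # COMP-3 with 2 implied decimal places
]


def unpack_comp3(raw: bytes, scale: int = 2) -> float:
    """Decode packed decimal: two digits per byte, final nibble is the sign."""
    digits = []
    for b in raw[:-1]:
        digits.append(b >> 4)
        digits.append(b & 0x0F)
    digits.append(raw[-1] >> 4)
    sign = -1 if (raw[-1] & 0x0F) == 0x0D else 1
    value = int("".join(str(d) for d in digits))
    return sign * value / 10 ** scale


def record_to_json(record: bytes) -> str:
    """Slice a fixed-width record by the layout and emit a JSON message."""
    out = {}
    for name, offset, length, kind in LAYOUT:
        raw = record[offset:offset + length]
        if kind == "packed":
            out[name] = unpack_comp3(raw)
        else:
            # EBCDIC text field: decode and strip trailing pad spaces
            out[name] = raw.decode("cp037").rstrip()
    return json.dumps(out)


if __name__ == "__main__":
    sample = (
        "000042".encode("cp037")
        + "JANE DOE  ".encode("cp037")
        + bytes([0x01, 0x23, 0x45, 0x6C])  # +1234.56 in COMP-3
    )
    print(record_to_json(sample))
```

A production converter would derive the layout from the copybook itself and would likely keep decimal values exact rather than using floats; the sketch only shows the shape of the transformation.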

Hadoop HDFS

✔ File rotation based on time and/or number of records

✔ Parallel apply threads for maximum throughput
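
Rotation by time and/or record count can be pictured with a small Python sketch. The RotationPolicy class and its max_records and max_seconds parameters are hypothetical names chosen for illustration, not SQData configuration options; the sketch only decides when to rotate and leaves the actual HDFS file handling out.

```python
import time


class RotationPolicy:
    """Decide when the current output file should be closed and a new one opened."""

    def __init__(self, max_records=None, max_seconds=None, clock=time.monotonic):
        self.max_records = max_records  # None disables the record-count limit
        self.max_seconds = max_seconds  # None disables the time limit
        self.clock = clock
        self.reset()

    def reset(self):
        """Call after opening a new file."""
        self.count = 0
        self.opened_at = self.clock()

    def record_written(self) -> bool:
        """Count one record; return True when either limit has been reached."""
        self.count += 1
        too_many = self.max_records is not None and self.count >= self.max_records
        too_old = (
            self.max_seconds is not None
            and self.clock() - self.opened_at >= self.max_seconds
        )
        return too_many or too_old
```

A writer loop would call record_written() after each record and, when it returns True, close the file, open the next one and call reset().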

Kafka

✔ Dynamic topic selection at runtime

✔ Message keys can be set to support topic partitioning

✔ Supports streaming from distributed databases including Oracle, DB2 LUW and SQL Server
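
The role of message keys in topic partitioning can be shown with a short Python sketch. The partition_for function is hypothetical, and zlib.crc32 is used only as a stand-in hash (Kafka's Java client uses murmur2 for keyed messages). The point is that messages with the same key always land in the same partition, so changes to one source record stay in order.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index with a stable hash (stand-in)."""
    return zlib.crc32(key) % num_partitions


if __name__ == "__main__":
    # Hypothetical keys built from a source record's primary key
    for key in (b"CUST-000042", b"CUST-000043", b"CUST-000042"):
        print(key.decode(), "-> partition", partition_for(key, 6))
```

Using the source table's primary key or the IMS segment key as the message key is one common choice, assuming per-record ordering is what downstream consumers need.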