Offloading RDS historical data to Redshift for further analytics

OrboGraph has provided technology solutions to automate paper processing, detect fraudulent transactions, and mitigate risk for more than 25 years. OrboGraph assists over 4,000 financial institutions and corporations in automating the process of depositing paper-originated negotiable items (checks, money orders, preauthorized drafts, etc.) and increasing check fraud detection capabilities for deposit and on-us fraud.

 

The Challenge

OrboGraph’s AWS workload spans several services. Among them are an operational MSSQL RDS database used by OrboGraph’s application and a Redshift instance used as a data warehouse (DWH) for BI purposes.

OrboGraph wanted a robust, seamless mechanism built on AWS services to offload historical data from the operational RDS database to Redshift and to enable data modeling on top of it for future consumption by an in-house application dashboard.

 

The Solution

  1. AWS Step Functions orchestrates the whole process, providing visibility, step dependencies, and a control mechanism.
  2. An AWS Glue job serves as the data processing engine and supports an incremental load approach.
  3. DynamoDB serves as the control table that holds the data pipeline’s metadata (tables, watermarks, bookmarks).
  4. Redshift stored procedures model the newly loaded data.
  5. A recurring daily task triggered by EventBridge kicks off the whole process.
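
The incremental approach in steps 2 and 3 can be illustrated with a minimal Python sketch of watermark-based extraction. This is not OrboGraph’s actual implementation: the `ControlItem` shape, the table name `dbo.Transactions`, the column `ModifiedAt`, and the helper names are all hypothetical, and an in-memory list stands in for the RDS source so the logic can run without any AWS resources.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)
class ControlItem:
    """One control-table entry (hypothetical shape, not the real schema)."""
    table_name: str
    watermark_column: str
    last_watermark: datetime


def build_incremental_query(item, upper_bound):
    """Return a parameterized query for rows in (last_watermark, upper_bound]."""
    sql = (
        f"SELECT * FROM {item.table_name} "
        f"WHERE {item.watermark_column} > ? AND {item.watermark_column} <= ?"
    )
    return sql, (item.last_watermark, upper_bound)


def select_increment(rows, item, upper_bound):
    """In-memory stand-in for running the query against the source database."""
    col = item.watermark_column
    return [r for r in rows if item.last_watermark < r[col] <= upper_bound]


def advance_watermark(item, upper_bound):
    """Return the control entry to persist once the load has succeeded."""
    return replace(item, last_watermark=upper_bound)


# Example run with made-up data (assumed table and column names).
item = ControlItem("dbo.Transactions", "ModifiedAt", datetime(2024, 1, 1))
upper = datetime(2024, 1, 2)
source_rows = [
    {"id": 1, "ModifiedAt": datetime(2024, 1, 1)},         # already loaded
    {"id": 2, "ModifiedAt": datetime(2024, 1, 1, 12, 0)},  # new
    {"id": 3, "ModifiedAt": datetime(2024, 1, 2)},         # new (inclusive bound)
    {"id": 4, "ModifiedAt": datetime(2024, 1, 2, 0, 1)},   # left for the next run
]
increment = select_increment(source_rows, item, upper)
item = advance_watermark(item, upper)
```

Fixing the upper bound before reading, and advancing the watermark only after the load succeeds, means a failed run can simply be retried over the same window without skipping or duplicating rows.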

 

The Result

  • An automated platform for ingesting historical data from RDS into Redshift, powered by Step Functions.
  • A modeled dataset in Redshift that in-house application dashboards can consume.