
Overview

The corsa-data-pipeline skill teaches your AI coding assistant how to build production-grade data pipelines that sync your existing data into Corsa. It’s designed for exchanges, payment processors, and fintechs that already have users, transactions, and accounts in their own systems and need to push everything into Corsa for compliance monitoring.

What It Helps With

  • Mapping your existing data model (users, transactions, accounts) to Corsa entities
  • Building historical backfill pipelines for millions of records
  • Setting up real-time event-driven sync after the initial backfill
  • Managing rate limits (500 req/60s) with throttled queues
  • Maintaining ID mappings between your system and Corsa
  • Handling retries, resumability, and error recovery

Quick Start

After installing the skill, ask your AI assistant questions like:
  • “Build a pipeline to sync all our users into Corsa as clients”
  • “Backfill our historical transactions into Corsa deposits and withdrawals”
  • “Set up real-time sync when a new payment is completed”
  • “Map our exchange order model to Corsa trades”
  • “How do I handle rate limits during a large backfill?”

What the Skill Knows

Data Model Mapping

Complete mapping from common fintech entity types to Corsa entities:
| Your System | Corsa Entity |
| --- | --- |
| Users / Customers (individuals) | IndividualClient |
| Users / Customers (businesses) | CorporateClient |
| Corporate officers, UBOs | IndividualMember / CorporateMember |
| Bank accounts | BankAccount + client association |
| Crypto wallets | BlockchainWallet + client association |
| Fiat deposits / incoming payments | Deposit operation |
| Fiat withdrawals / outgoing payments | Withdrawal operation |
| Crypto receives | Deposit with txHash and blockchainNetworkId |
| Crypto sends | Withdrawal with txHash and blockchainNetworkId |
| Trades / swaps / conversions | Trade operation with fill transactions |
| Compliance alerts | Alert (batch up to 50) |
| Investigation cases | Case |
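To make the table above concrete, here is a minimal sketch of a lookup that picks the Corsa entity for an internal record. The record shape (`kind`, `is_business`) and the internal kind names are hypothetical placeholders for your own data model; only the Corsa entity names come from the table.

```python
from typing import Any

# Hypothetical internal record kinds mapped to the Corsa entities in the table.
SIMPLE_MAPPING = {
    "bank_account": "BankAccount",
    "crypto_wallet": "BlockchainWallet",
    "fiat_deposit": "Deposit",
    "fiat_withdrawal": "Withdrawal",
    "crypto_receive": "Deposit",      # also needs txHash and blockchainNetworkId
    "crypto_send": "Withdrawal",      # also needs txHash and blockchainNetworkId
    "trade": "Trade",
    "alert": "Alert",
    "case": "Case",
}


def corsa_entity_for(record: dict[str, Any]) -> str:
    """Return the Corsa entity name for one internal record."""
    kind = record["kind"]
    if kind == "user":
        # Individuals and businesses map to different client entities.
        return "CorporateClient" if record.get("is_business") else "IndividualClient"
    if kind in SIMPLE_MAPPING:
        return SIMPLE_MAPPING[kind]
    raise ValueError(f"no Corsa mapping for record kind {kind!r}")
```

Raising on an unknown kind, rather than guessing, keeps unmapped data from being silently skipped during a backfill.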

Ingestion Dependency Order

The skill enforces a strict entity dependency order: clients first, then members, accounts, wallets, sessions, operations, alerts, and finally cases. It also knows that initiatedBy on operations expects a Corsa-generated client UUID, not your internal user ID.
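The ordering rule can be sketched as a stable sort over a fixed rank list, plus a lookup that swaps an internal user ID for the Corsa UUID before an operation is sent. The `entity` field, the lowercase entity labels, and the `id_map` dictionary are assumptions made for illustration, not part of the Corsa API.

```python
from typing import Any

# Dependency order as described above; the labels are hypothetical.
INGESTION_ORDER = [
    "client", "member", "account", "wallet",
    "session", "operation", "alert", "case",
]
_RANK = {name: index for index, name in enumerate(INGESTION_ORDER)}


def sort_for_ingestion(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Order records so every entity is ingested after its dependencies.

    sorted() is stable, so records of the same entity keep their input order.
    """
    return sorted(records, key=lambda record: _RANK[record["entity"]])


def resolve_initiated_by(internal_user_id: str, id_map: dict[str, str]) -> str:
    """Return the Corsa-generated UUID to use as initiatedBy.

    Fails loudly if the client has not been ingested yet, instead of
    sending the internal ID by mistake.
    """
    try:
        return id_map[internal_user_id]
    except KeyError:
        raise LookupError(
            f"client {internal_user_id!r} has no Corsa ID yet; ingest clients first"
        ) from None
```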

Historical Backfill

  • Throttled queue pattern that respects 500 req/60s
  • Resumable cursor-based pagination with progress checkpoints
  • Time estimates at one request per record: 100K records take ~3.5 hours, 1M records take ~35 hours
  • Alert batch endpoint for faster alert ingestion (50 per call)
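As a rough illustration of the points above, the sketch below shows a sliding-window throttle for the 500 req/60s limit, a back-of-envelope duration estimate, and a helper that chunks alerts into batches of 50. This is not the skill's own template: the class and function names are hypothetical, and the estimate assumes one request per record with no retries.

```python
import time
from collections import deque
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


class SlidingWindowThrottle:
    """Block until a request can be sent without exceeding the limit."""

    def __init__(
        self,
        max_requests: int = 500,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent: deque = deque()  # send times inside the current window

    def acquire(self) -> None:
        while True:
            now = self._clock()
            # Forget requests that have left the window.
            while self._sent and now - self._sent[0] >= self.window_seconds:
                self._sent.popleft()
            if len(self._sent) < self.max_requests:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest request expires.
            self._sleep(self.window_seconds - (now - self._sent[0]))


def estimate_hours(
    records: int,
    requests_per_record: int = 1,
    max_requests: int = 500,
    window_seconds: float = 60.0,
) -> float:
    """Lower bound on backfill duration at the full rate limit."""
    return records * requests_per_record * window_seconds / max_requests / 3600


def batches(items: Sequence[T], size: int = 50) -> Iterator[Sequence[T]]:
    """Yield consecutive chunks, e.g. for the 50-per-call alert batch endpoint."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Under these assumptions, `estimate_hours(100_000)` comes to roughly 3.3 hours, which is a floor for the ~3.5 hour figure quoted above. Call `throttle.acquire()` immediately before each request, and persist the pagination cursor after each page so that a restart resumes from the last checkpoint.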

Real-Time Sync

  • Event-driven handler pattern for domain events (user created, transaction completed)
  • Retry with exponential backoff for 429s and 5xx errors
  • Bidirectional sync via Corsa webhooks to receive events back
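The retry behaviour described above might look like the following minimal sketch. The `send` callable is a hypothetical stand-in for whatever function performs the Corsa request and returns `(status_code, body)`; the attempt count, delays, and jitter range are illustrative defaults, not values taken from the skill.

```python
import random
import time
from typing import Any, Callable, Tuple


def is_retryable(status: int) -> bool:
    """429 (rate limited) and 5xx (server error) are worth retrying."""
    return status == 429 or 500 <= status <= 599


def call_with_backoff(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> Tuple[int, Any]:
    """Call send() until it returns a non-retryable status.

    The delay doubles on each attempt (1s, 2s, 4s, ...), capped at max_delay,
    and is scaled by a random factor in [0.5, 1.0) so that parallel workers
    do not retry in lockstep.
    """
    status = None
    for attempt in range(max_attempts):
        status, body = send()
        if not is_retryable(status):
            return status, body
        if attempt < max_attempts - 1:
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * (0.5 + jitter() / 2))
    raise RuntimeError(f"gave up after {max_attempts} attempts (last status {status})")
```

Other 4xx responses are returned to the caller without retrying, since resending an invalid payload will not make it valid.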

ID Mapping

SQL schema and repository pattern for maintaining internal_id to corsa_id mappings, with fallback to referenceId lookups.
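One possible shape for such a mapping store is sketched below using SQLite from the Python standard library. The table name, column names, and the `lookup_by_reference_id` callable are assumptions for illustration; the callable stands in for a lookup against Corsa by referenceId and is not a real Corsa client method.

```python
import sqlite3
from typing import Callable, Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS corsa_id_map (
    entity_type TEXT NOT NULL,
    internal_id TEXT NOT NULL,
    corsa_id    TEXT NOT NULL,
    PRIMARY KEY (entity_type, internal_id)
)
"""


class IdMapRepository:
    """Stores internal_id to corsa_id mappings, keyed by entity type."""

    def __init__(
        self,
        conn: sqlite3.Connection,
        lookup_by_reference_id: Optional[Callable[[str, str], Optional[str]]] = None,
    ) -> None:
        self._conn = conn
        self._fallback = lookup_by_reference_id  # hypothetical remote lookup
        self._conn.execute(SCHEMA)

    def save(self, entity_type: str, internal_id: str, corsa_id: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO corsa_id_map VALUES (?, ?, ?)",
                (entity_type, internal_id, corsa_id),
            )

    def get(self, entity_type: str, internal_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT corsa_id FROM corsa_id_map "
            "WHERE entity_type = ? AND internal_id = ?",
            (entity_type, internal_id),
        ).fetchone()
        if row is not None:
            return row[0]
        if self._fallback is None:
            return None
        # Local miss: fall back to a referenceId lookup and cache the result.
        corsa_id = self._fallback(entity_type, internal_id)
        if corsa_id is not None:
            self.save(entity_type, internal_id, corsa_id)
        return corsa_id
```

Caching the fallback result means a mapping lost locally, for example after a crash between the API call and the local write, is repaired on the next read.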

Production Gotchas It Catches

| Mistake | What the skill does |
| --- | --- |
| Using internal user ID as initiatedBy | Reminds that operations expect a Corsa-generated client UUID |
| Forgetting to link accounts/wallets to clients | Includes the associateWithClients call after creation |
| Expecting batch endpoints for all entities | Clarifies that only alerts have a batch endpoint |
| Not normalizing referenceId | Warns that upsert uses exact string match |
| Not handling rate limits during backfill | Provides a throttled queue with token bucket and retry logic |
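Because upsert matches referenceId as an exact string, it helps to build the value through a single function everywhere. The sketch below is one possible normalization; the specific rules (string conversion, whitespace trimming, Unicode NFC, an optional prefix and lowercasing) are assumptions, and you should pick rules that suit your own ID format.

```python
import unicodedata
from typing import Any


def normalize_reference_id(raw: Any, prefix: str = "", lowercase: bool = False) -> str:
    """Build a canonical referenceId so repeated syncs hit the same record.

    Converts to str (so 42 and "42" agree), trims surrounding whitespace,
    and applies Unicode NFC so visually identical strings compare equal.
    Lowercasing is opt-in because some ID schemes are case-sensitive.
    """
    value = unicodedata.normalize("NFC", str(raw)).strip()
    if not value:
        raise ValueError("referenceId must not be empty")
    if lowercase:
        value = value.lower()
    return f"{prefix}{value}"
```

For example, `normalize_reference_id(" 42 ", prefix="user:")` and `normalize_reference_id(42, prefix="user:")` both produce `"user:42"`, so a re-run updates the existing record instead of creating a duplicate.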

Source

GitHub Repository

View the full skill source, including production code templates for throttled queues, backfill pipelines, and entity mapping.