Case Study: TransUnion

TB-Scale Search Product

Built a search product that queries multiple terabytes of data with sub-7-second latency, using DuckDB and Trino as query engines and Google Cloud Dataproc for heavy workloads.

DuckDB, Trino, Google Cloud Dataproc, Kubernetes, Java, Spring Boot
The Challenge

Users needed reliable search on multi-terabyte proprietary datasets with strict latency targets and region-sensitive data constraints.

System Architecture

[Figure: TB-Scale Search Product system architecture diagram]

Architecture & Approach

Hybrid query stack combining DuckDB for local analytical paths, Trino for federated distributed queries, and Dataproc for heavy workloads, orchestrated by a routing and scaling layer.

Profiled query classes, matched each class to the best-suited execution engine, and introduced autoscaling policies that act on each query's estimated scan volume before execution begins.
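The routing idea above can be sketched as a small decision function. This is a minimal illustration, not the production logic: the class name `EngineRouter`, the `federated` flag, and both byte thresholds are assumptions made for the example, since the case study does not state the actual cut-offs or query classes.

```java
/** Engines named in the case study. */
enum Engine { DUCKDB, TRINO, DATAPROC }

/** Hypothetical router: picks an engine from a pre-execution scan estimate. */
final class EngineRouter {
    // Assumed thresholds for illustration only; real values would come from profiling.
    static final long LOCAL_LIMIT_BYTES = 10L * 1024 * 1024 * 1024;          // 10 GiB
    static final long DISTRIBUTED_LIMIT_BYTES = 1024L * 1024 * 1024 * 1024;  // 1 TiB

    /**
     * @param estimatedScanBytes bytes the query is expected to scan (estimated before execution)
     * @param federated          true if the query spans more than one data source
     */
    static Engine route(long estimatedScanBytes, boolean federated) {
        if (estimatedScanBytes < 0) {
            throw new IllegalArgumentException("estimatedScanBytes must be non-negative");
        }
        // Very large scans go to the batch cluster regardless of source layout.
        if (estimatedScanBytes > DISTRIBUTED_LIMIT_BYTES) {
            return Engine.DATAPROC;
        }
        // Federated or mid-sized queries go to the distributed SQL engine.
        if (federated || estimatedScanBytes > LOCAL_LIMIT_BYTES) {
            return Engine.TRINO;
        }
        // Small, single-source scans stay on the local analytical path.
        return Engine.DUCKDB;
    }

    public static void main(String[] args) {
        System.out.println(route(2L << 30, false));   // 2 GiB, single source
        System.out.println(route(50L << 30, false));  // 50 GiB, single source
        System.out.println(route(2L << 40, true));    // 2 TiB, federated
    }
}
```

Keeping the decision in one pure function makes the thresholds easy to tune from profiling data and easy to unit-test in isolation.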

My Role & Contributions

Built core query-routing logic, implemented dynamic pod scaling strategy, and developed geofencing behavior for user and data locality constraints.
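A geofencing check of this kind can be sketched as a lookup performed before any engine is invoked. The sketch below is an assumption-laden illustration: `GeofencePolicy`, the dataset identifiers, and the region codes are hypothetical names, and the real rules for user and data locality are not described in the case study.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical residency policy: which user regions may query which dataset. */
final class GeofencePolicy {
    // Assumed shape: dataset id -> set of user regions allowed to query it.
    private final Map<String, Set<String>> allowedRegionsByDataset;

    GeofencePolicy(Map<String, Set<String>> allowedRegionsByDataset) {
        this.allowedRegionsByDataset = Map.copyOf(allowedRegionsByDataset);
    }

    /** Deny by default: unknown datasets and missing inputs are never allowed. */
    boolean isAllowed(String userRegion, String datasetId) {
        if (userRegion == null || datasetId == null) {
            return false;
        }
        Set<String> allowed = allowedRegionsByDataset.get(datasetId);
        return allowed != null && allowed.contains(userRegion);
    }

    /** Called at routing time so a violating query never reaches an engine. */
    void check(String userRegion, String datasetId) {
        if (!isAllowed(userRegion, datasetId)) {
            throw new SecurityException(
                "Region " + userRegion + " may not query dataset " + datasetId);
        }
    }

    public static void main(String[] args) {
        GeofencePolicy policy = new GeofencePolicy(Map.of(
            "eu-dataset", Set.of("eu"),
            "shared-dataset", Set.of("eu", "us")));
        System.out.println(policy.isAllowed("eu", "eu-dataset"));
        System.out.println(policy.isAllowed("us", "eu-dataset"));
    }
}
```

The deny-by-default behavior is a design choice for the sketch: a dataset with no registered rule is treated as restricted rather than open.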

Key Technical Decisions

  • Used engine specialization (DuckDB vs Trino vs Dataproc) instead of a single universal engine for better latency consistency.
  • Introduced predictive pod scaling on estimated data volume to reduce cold-start impact on large searches.
  • Implemented geofencing rules at routing time to enforce data residency constraints before execution.
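As a rough illustration of the predictive scaling decision in the list above, the sketch below converts an estimated scan volume into a target pod count before the query runs. Everything numeric here is an assumption: `BYTES_PER_POD`, `MIN_PODS`, and `MAX_PODS` are placeholder values, and `PodScaler` is a hypothetical name. Applying the target to a Kubernetes deployment is deliberately left out.

```java
/** Hypothetical pre-execution scaler: estimated scan volume -> target pod count. */
final class PodScaler {
    // Assumed capacity and bounds for illustration only.
    static final long BYTES_PER_POD = 64L * 1024 * 1024 * 1024; // 64 GiB handled per pod
    static final int MIN_PODS = 1;
    static final int MAX_PODS = 50;

    static int targetPods(long estimatedScanBytes) {
        if (estimatedScanBytes < 0) {
            throw new IllegalArgumentException("estimatedScanBytes must be non-negative");
        }
        // Ceiling division written to avoid overflow on very large estimates.
        long needed = estimatedScanBytes / BYTES_PER_POD
                + (estimatedScanBytes % BYTES_PER_POD == 0 ? 0 : 1);
        // Clamp so idle periods keep a floor and huge queries cannot exhaust the cluster.
        return (int) Math.max(MIN_PODS, Math.min(MAX_PODS, needed));
    }

    public static void main(String[] args) {
        long oneTiB = 1024L * 1024 * 1024 * 1024;
        System.out.println(targetPods(oneTiB)); // 1 TiB / 64 GiB per pod = 16
    }
}
```

Because the target is computed from the estimate rather than from observed load, pods can be requested before the large scan starts, which is where the cold-start saving would come from.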

Results & Impact

Search Latency: <7s
Data Volume: TB-Scale
Pod Scaling Model: Dynamic

  • Delivered under 7-second search latency for core user journeys.
  • Reduced infrastructure waste during low-traffic periods with adaptive scaling.
  • Enabled compliant region-aware search over proprietary datasets.

The system met enterprise performance expectations at scale while preserving compliance constraints and cost-aware operation.

Lessons Learned

At very large data volumes, intelligent workload routing and elasticity policies drive performance more than micro-optimizations in a single query engine.

Ayush Jaipuriar

AI Agent Engineer & Senior Full-Stack Developer

jaipuriar.ayush@gmail.com


© 2026 Ayush Jaipuriar. All rights reserved.
