TB-Scale Search Product
Built a search product that queries multiple terabytes of data with sub-7-second latency, using DuckDB, Trino, and Google Cloud Dataproc as complementary query engines.

The Challenge
Users needed reliable search on multi-terabyte proprietary datasets with strict latency targets and region-sensitive data constraints.
Architecture & Approach
A hybrid query stack combines DuckDB for local analytical paths, Trino for federated distributed queries, and Dataproc for heavy batch workloads, all orchestrated by a routing and scaling layer.
Profiled query classes, matched each class to the best-suited execution engine, and introduced autoscaling policies driven by scan-volume estimates computed before execution.
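
To make the routing idea concrete, here is a minimal sketch in Python. The thresholds, the pre-computed scan estimate, and the `is_federated` flag are illustrative assumptions, not the production values.

```python
# A minimal routing sketch: pick an engine from a pre-execution scan
# estimate. Cutoffs below are illustrative, not the production values.
from enum import Enum


class Engine(Enum):
    DUCKDB = "duckdb"      # local analytical paths, small scans
    TRINO = "trino"        # federated distributed queries
    DATAPROC = "dataproc"  # heavy batch workloads

LOCAL_SCAN_LIMIT_BYTES = 4 * 1024**3        # assumed single-node budget
FEDERATED_SCAN_LIMIT_BYTES = 512 * 1024**3  # beyond this, go to Dataproc


def route_query(estimated_scan_bytes: int, is_federated: bool) -> Engine:
    """Pick an execution engine from the estimated scan volume.

    The estimate is produced before execution (e.g. from table
    statistics), so a query never starts on the wrong engine.
    """
    if estimated_scan_bytes <= LOCAL_SCAN_LIMIT_BYTES and not is_federated:
        return Engine.DUCKDB
    if estimated_scan_bytes <= FEDERATED_SCAN_LIMIT_BYTES:
        return Engine.TRINO
    return Engine.DATAPROC


print(route_query(2 * 1024**3, is_federated=False))  # Engine.DUCKDB
print(route_query(1024**4, is_federated=True))       # Engine.DATAPROC
```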
My Role & Contributions
Built the core query-routing logic, implemented the dynamic pod-scaling strategy, and developed geofencing behavior that enforces user and data locality constraints.
Key Technical Decisions
- Used engine specialization (DuckDB vs Trino vs Dataproc) rather than a single universal engine, which gave more consistent latency across query classes (see the routing sketch above).
- Introduced predictive pod scaling driven by estimated data volume to reduce cold-start impact on large searches (see the first sketch after this list).
- Implemented geofencing rules at routing time so that data residency constraints are enforced before any query executes (see the second sketch after this list).
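
A hedged sketch of the predictive scaling rule follows; `BYTES_PER_POD`, the warm-pool floor, and the cluster cap are assumed figures chosen for illustration.

```python
# Predictive pod scaling: replicas are sized from the estimated scan
# volume *before* execution, so large searches avoid full cold-start
# cost. All constants below are assumed values.
import math

BYTES_PER_POD = 64 * 1024**3  # assumed per-pod scan capacity within SLA
MIN_PODS, MAX_PODS = 2, 48    # assumed warm-pool floor and cluster cap


def target_replicas(estimated_scan_bytes: int) -> int:
    """Return the pod count to pre-warm for an incoming search."""
    needed = math.ceil(estimated_scan_bytes / BYTES_PER_POD)
    return max(MIN_PODS, min(needed, MAX_PODS))


# A 1 TiB scan pre-warms 16 pods instead of scaling reactively mid-query.
print(target_replicas(1024**4))  # 16
```

And a simplified version of the routing-time geofencing check; the residency table, dataset names, and regions are hypothetical stand-ins.

```python
# Geofencing at routing time: the residency check runs before any
# engine is selected, so no engine ever scans data outside its
# permitted region. Table contents are hypothetical.
DATASET_RESIDENCY = {
    "eu_customers": {"eu-west1", "eu-west4"},
    "us_orders": {"us-central1"},
}


def allowed_regions(dataset: str, user_region: str) -> set[str]:
    """Return execution regions that satisfy residency for this request."""
    permitted = DATASET_RESIDENCY.get(dataset, set())
    if user_region not in permitted:
        raise PermissionError(
            f"user in {user_region} may not query {dataset}"
        )
    return permitted


print(allowed_regions("eu_customers", "eu-west1"))  # permitted regions
```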
Results & Impact
- Search latency: <7s
- Data volume: TB-scale
- Scaling: dynamic pod scaling model
- Delivered under 7-second search latency for core user journeys.
- Reduced infrastructure waste during low-traffic periods through adaptive scaling.
- Enabled compliant region-aware search over proprietary datasets.
The system met enterprise performance expectations at scale while honoring compliance constraints and keeping operating costs in check.
Lessons Learned
At very large data volumes, intelligent workload routing and elasticity policies drive performance more than micro-optimizations in a single query engine.