Enhancing Query Performance Through Intelligent Data Co-location
Rishav Sagar & Tejas Shah, Amazon Web Services

In OpenSearch, a typical workload involves log analytics and metrics data, where, for the majority of search queries, only a subset of the data is relevant. However, the current segment management system doesn't account for these query patterns when organising data, so the relevant information ends up scattered across multiple segments, leading to suboptimal query performance. This session presents a novel segment creation and merging strategy for OpenSearch that leverages anticipated query patterns to intelligently co-locate relevant and related data within the same physical segments.

Key topics we'll explore:

* Reduced segment scanning through intelligent data co-location
* Tenant-aware segment grouping to improve vector search accuracy and efficiency
* Enhanced storage optimization opportunities by storing more relevant data on hot storage
* Potential for pre-computed aggregations for frequently executed queries

Whether you're managing large-scale log analytics systems or working with time-series metrics or semantic search, you'll learn how this new segment management approach can enhance your OpenSearch performance and efficiency.
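
To make the idea of reduced segment scanning concrete, here is a toy Python sketch. It is not OpenSearch or Lucene code and does not represent the speakers' strategy. It models a segment as a plain list of documents and assumes a hypothetical `tenant` field as the co-location key. All function names are made up for illustration. It compares how many segments a single-tenant query would have to visit when documents are packed in arrival order versus grouped by that key.

```python
from collections import defaultdict


def segments_in_arrival_order(docs, segment_size):
    """Pack documents into fixed-size segments in the order they arrived."""
    return [docs[i:i + segment_size] for i in range(0, len(docs), segment_size)]


def segments_grouped_by_key(docs, segment_size, key):
    """Group documents by `key` first, then pack each group into segments."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[key]].append(doc)
    segments = []
    for _, group in sorted(groups.items()):
        segments.extend(segments_in_arrival_order(group, segment_size))
    return segments


def segments_touched(segments, key, value):
    """Count segments containing at least one document matching key == value."""
    return sum(1 for seg in segments if any(d[key] == value for d in seg))


# Hypothetical data: 16 documents spread round-robin over 4 tenants.
docs = [{"id": i, "tenant": f"t{i % 4}"} for i in range(16)]

baseline = segments_in_arrival_order(docs, 4)
grouped = segments_grouped_by_key(docs, 4, "tenant")

print(segments_touched(baseline, "tenant", "t0"))  # 4: every segment holds a t0 doc
print(segments_touched(grouped, "tenant", "t0"))   # 1: all t0 docs share one segment
```

In this toy setup both layouts produce the same number of segments, and only the grouping changes how many of them a tenant-scoped query needs to visit. A real system would also have to weigh merge cost, skewed group sizes, and queries that span several groups.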
