Orchestrating Multimodal RAG With Agentic Workflows
Bobby Mohammed & Surya Kari, Amazon Web Services

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for grounding large language models (LLMs) in external knowledge, enabling more accurate and contextually relevant responses. However, traditional RAG pipelines often operate in a static, predefined manner, which limits their ability to adapt to complex queries and dynamically explore information spaces.

This talk introduces “Multimodal Agentic RAG”, an approach that integrates autonomous agents into the RAG framework to enhance its reasoning and retrieval capabilities. We will delve into the architecture and implementation of multimodal RAG with agentic workflows, showcasing its potential to address the limitations of traditional RAG. The architecture is built on OpenSearch’s multimodal capabilities, DeepSeek-R1’s reasoning capabilities, and custom-tuned embedding models. We will also discuss the challenges and future directions of Agentic RAG, including robust planning mechanisms, efficient knowledge-source management, and effective agent coordination.
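The agentic retrieval loop the abstract describes can be sketched in a few lines. Everything below is illustrative: the toy retriever stands in for OpenSearch’s multimodal k-NN search, the reflection step stands in for a call to a reasoning model such as DeepSeek-R1, and all names, documents, and scores are hypothetical.

```python
# Illustrative sketch of an agentic RAG loop. The talk's actual stack uses
# OpenSearch for multimodal retrieval and DeepSeek-R1 for reasoning; both are
# stubbed here with toy functions so the control flow is self-contained.

from dataclasses import dataclass

@dataclass
class Doc:
    modality: str   # "text" or "image"
    content: str    # raw text, or an image caption for image documents
    score: float    # hypothetical retrieval score

# Hypothetical stand-in for a multimodal vector index (e.g. OpenSearch k-NN).
CORPUS = [
    Doc("text", "RAG grounds LLM answers in retrieved documents.", 0.9),
    Doc("image", "Diagram: agent loop of plan, retrieve, reflect.", 0.8),
    Doc("text", "Agents can reformulate queries when evidence is weak.", 0.7),
]

def retrieve(query: str, k: int = 2) -> list[Doc]:
    """Toy retriever: rank documents by naive term overlap with the query."""
    def overlap(d: Doc) -> int:
        return len(set(query.lower().split()) & set(d.content.lower().split()))
    return sorted(CORPUS, key=overlap, reverse=True)[:k]

def agentic_rag(query: str, min_evidence: int = 2, max_rounds: int = 3) -> dict:
    """Agent loop: retrieve, reflect on coverage, reformulate if needed."""
    evidence: list[Doc] = []
    q = query
    rounds = 0
    for _ in range(max_rounds):
        rounds += 1
        for doc in retrieve(q):
            if doc not in evidence:
                evidence.append(doc)
        # Reflection step: a real system would ask the reasoning model
        # whether the evidence suffices; here we use a simple count.
        if len(evidence) >= min_evidence:
            break
        q = query + " agent retrieval"   # naive query reformulation
    return {"query": query, "rounds": rounds, "evidence": evidence}

result = agentic_rag("How do agents improve RAG retrieval?")
print(result["rounds"], len(result["evidence"]))
```

The key design point, relative to a static pipeline, is that retrieval sits inside a loop the agent controls: the reflection step decides whether to stop, retrieve more, or rewrite the query, which is what lets the system adapt to complex queries.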
