Langchain and Datastax Astra DB Power esynergy’s Retrieve-Augment Architecture for Sales Copilot

Prasad Prabhakaran

Head of AI practice

esynergy

At esynergy, we have leveraged Langchain and AstraDB to develop Sales Copilot – an AI-powered conversational assistant for sales teams to access customer data and generate informed responses. In this post, we’ll dive into the technical architecture and how Langchain enabled rapid prototyping of a retrieval-augmented chatbot.

Unveiling the Sales Copilot

Sales Copilot ingests customer data from SharePoint, encodes it into vectors, indexes in AstraDB for low-latency search, retrieves relevant chunks for a query, and passes them to Claude2 to generate a response.

This retrieval-augmented setup lets the model generate contextual responses by conditioning on the retrieved data. Langchain’s modular components made it straightforward to implement.

The application connects to an organization’s SharePoint instance and ingests customer profiles, communication records, and case studies. During preprocessing, Langchain splits the files into manageable chunks for subsequent processing.

Following this initial step, the chunks are transformed into dense vector representations using a text embedding model served through AWS Bedrock. This lets the application represent information semantically, so that retrieval can match on meaning rather than exact keywords.

At the heart of the application lies the “retrieve-augment” architecture. When a user submits a query through the Streamlit frontend, a customized retrieval component built by esynergy takes over. It uses techniques such as query augmentation and MMR (Maximal Marginal Relevance) reranking to retrieve the most relevant data points from the vector database, AstraDB.

The retrieved passages then serve as the foundation for response generation. Anthropic’s Claude2 large language model, accessed through AWS Bedrock, generates responses tailored to the user’s query and grounded in the retrieved customer data.

Langchain in Action

Data Ingestion

Langchain’s document loaders extracted SharePoint data into common text formats. The `RecursiveCharacterTextSplitter` chunked documents into smaller passages for processing.
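
To illustrate the idea behind recursive splitting, here is a minimal pure-Python sketch. It is not Langchain’s implementation, and the function name `split_recursive`, the separator list, and the sample text are assumptions made up for this example. The sketch tries coarse separators first (paragraphs) and falls back to finer ones (lines, sentences, words) only when a piece is still too long.

```python
from typing import List, Sequence

# Separators tried in order, from coarse (paragraphs) to fine (words).
# This ordering is an assumption chosen for the illustration.
SEPARATORS = ("\n\n", "\n", ". ", " ")


def split_recursive(text: str, max_chars: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into chunks of at most max_chars characters."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separator left: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        if not piece.strip():
            continue  # skip empty pieces produced by repeated separators
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(split_recursive(piece, max_chars, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


# Made-up sample text, for illustration only.
sample = (
    "Acme Ltd renewed in March.\n\n"
    "Kickoff is planned for May. Delivery follows in July.\n\n"
    "Contact: J. Doe"
)
chunks = split_recursive(sample, max_chars=40)
```

Note that a separator is dropped wherever a chunk boundary falls, so joining the chunks does not reproduce the input exactly. A production splitter would typically also add overlap between neighbouring chunks.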

Vector Encoding

A Titan embedding model on AWS Bedrock encoded each chunk into a dense vector capturing its semantic meaning. The vectors were then indexed in AstraDB for efficient similarity search.
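
As a rough picture of what similarity search over embeddings means, the sketch below ranks a handful of toy vectors by cosine similarity to a query vector. The three-dimensional vectors, the chunk ids, and the `top_k` helper are all hypothetical; real embeddings have hundreds or thousands of dimensions, and a vector database uses an index rather than the linear scan shown here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          index: Dict[str, Sequence[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Brute-force search: score every stored vector, return the k best."""
    scored = [(chunk_id, cosine_similarity(query_vec, vec))
              for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical chunk ids with toy 3-dimensional "embeddings".
toy_index = {
    "case-study-retail": [0.9, 0.1, 0.0],
    "project-timeline": [0.1, 0.9, 0.1],
    "email-thread": [0.0, 0.2, 0.9],
}
results = top_k([0.8, 0.2, 0.0], toy_index, k=2)
```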

Flexible Retrieval

We built custom retrieval components on top of Langchain’s retriever abstraction. Query paraphrasing, duplicate removal, and MMR reranking helped return diverse, relevant passages for each query.
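
For readers unfamiliar with MMR, the following sketch shows the general technique in plain Python. It is not the retrieval component described in this post. At each step it selects the candidate that best trades off relevance to the query against similarity to passages already selected. The `mmr` function, the `lambda_mult` parameter name, and the toy vectors are assumptions for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))


def mmr(query_vec: Sequence[float],
        doc_vecs: List[Sequence[float]],
        k: int,
        lambda_mult: float = 0.5) -> List[int]:
    """Return indices of up to k documents chosen by Maximal Marginal Relevance.

    lambda_mult = 1.0 ranks purely by relevance; lower values favour diversity.
    """
    remaining = list(range(len(doc_vecs)))
    selected: List[int] = []
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query_vec, doc_vecs[i])
            # Highest similarity to anything already picked (0.0 for the first pick).
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy vectors: documents 0 and 1 are near-duplicates, document 2 covers a
# different aspect of the query.
query = [1.0, 1.0, 0.0]
docs = [
    [1.0, 0.25, 0.0],
    [1.0, 0.22, 0.0],
    [0.2, 1.0, 0.0],
]
diverse = mmr(query, docs, k=2, lambda_mult=0.5)
relevance_only = mmr(query, docs, k=2, lambda_mult=1.0)
```

With these toy vectors, the balanced setting skips the near-duplicate in favour of the passage that covers a different aspect of the query, while the relevance-only setting returns both near-duplicates.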

Integrations with LangSmith and LangServe allowed us to experiment rapidly with different strategies.

Response Generation

The Claude2 model, called through the Bedrock API, generated informed responses conditioned on the retrieved passages. Prompt engineering further improved response coherence.
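
The sketch below shows one way the retrieved passages could be packed into a prompt. The instruction wording, the `<context>` wrapper, and the `build_prompt` helper are assumptions, not the prompt used in Sales Copilot. The only fixed convention is the `Human:` / `Assistant:` turn markers that Claude 2’s text-completion format expects.

```python
from typing import List


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble a retrieval-augmented prompt in Claude 2's Human/Assistant format."""
    # Number the passages so the model (and a reader) can refer back to them.
    context = "\n\n".join(
        f"[{i}] {passage.strip()}" for i, passage in enumerate(passages, start=1)
    )
    return (
        "\n\nHuman: Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
        "\n\nAssistant:"
    )


# Placeholder inputs, for illustration only.
prompt = build_prompt(
    "Which projects are mentioned?",
    ["Passage one. ", "Passage two."],
)
```

The resulting string would then be sent to the model through whichever client the application uses. That call is omitted here because it depends on account and region configuration.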

Deployment

LangServe, which is built on the FastAPI web framework, exposed the pipeline as an API. Streamlit provided a quick way to build and deploy the frontend.

Outcomes

Using Langchain accelerated the development of a production-grade conversational assistant. Benefits included:

Faster prototyping by building on modular components
Swapping retrieval strategies with LangSmith experiments
Easy integration of Claude2 for natural dialogue
Changing components without impacting others
Quickly validating capabilities before customization
Reduced effort on data preprocessing and ingestion

A Glimpse into the Benefits

The Sales Copilot application offers sales representatives several advantages, including:

Rapid Retrieval of Case Studies: Effortlessly access relevant case studies, projects, and customer success stories to bolster product credibility during conversations.
Swift Access to Project Details: Obtain specifics like timelines, milestones, and deliverables promptly, eliminating the need to scour through multiple systems.
Effortless Case Study Generation: Leverage the system’s ability to auto-generate draft case studies, saving valuable time and streamlining the content creation process.
Enhanced Case Study Quality: Seek suggestions on improving existing case studies, allowing for the creation of even more compelling narratives.
Streamlined Communication Drafting: Generate customized emails, chat messages, and social media posts tailored to individual customers, fostering stronger relationships.
Data-Driven Report Generation: Gain valuable insights by requesting customized reports or comparisons, empowering informed decision-making.
Comprehensive Analytics: Leverage the system’s ability to provide insightful analytics on various aspects, including campaigns, lead conversion rates, and customer sentiment.

By acting as an AI assistant, the Sales Copilot application equips sales representatives with the knowledge and tools necessary to elevate their interactions with customers. Data-driven responses not only minimize repetitive tasks but also contribute to fostering more meaningful and productive conversations.

We’re excited to see what more we can build by making further use of Langchain tools like LangServe and LangSmith. The framework has opened up new possibilities for us to create enterprise AI solutions.

Authors

Edwin Jose

Lead Associate- AI, esynergy and Quantum AI researcher, Western Michigan University