Tag: AI

  • Five Generative AI Trends Product Teams Must Watch

    Five Generative AI Trends Product Teams Must Watch

    Generative artificial intelligence is advancing so quickly that Nvidia—the maker of the H-series GPUs inside most training clusters—briefly overtook Amazon’s market cap this spring. New models, plug-ins, and AI-native start-ups appear every week, making it hard to separate hype from signal. The five forces below are the ones most likely to reshape product roadmaps over the next 12–18 months.

    1. Data Turns Into Revenue

    “Data is the new oil” finally has a dollar sign. In February, Reddit reportedly licensed 18 years of forum conversations to an unnamed AI company for USD 60 million per year. X (Twitter) and Stack Overflow are said to be exploring similar arrangements, and publishers such as The Financial Times have already inked deals with OpenAI. If your company owns a large, well-labeled corpus—support tickets, sensor logs, maintenance manuals—you may be sitting on a salable asset.
    Beyond cash, licensing can fund better infrastructure and raise your brand’s profile inside the AI ecosystem, but it also creates legal and reputational risk around privacy and consent.
    PM tip: Inventory proprietary datasets, rank them by sensitivity, and involve counsel early. Decide whether the bigger upside is outside revenue or internal fine-tuning that sharpens your own product.

    2. The Rise of Open-Source Models

    Closed weights from OpenAI still dominate mindshare, yet open alternatives are sprinting forward. Meta’s Llama 3, Mistral’s Mixtral-8x7B, and Databricks’ DBRX hit strong benchmarks with weights anyone can download and inspect. Freedom to run the model anywhere means you control latency, can fine-tune without sending data to a third party, and avoid per-token API fees.
    Open licenses also spark a plug-and-play ecosystem: retrieval frameworks, vector databases, and guardrail libraries often ship adapters for Llama or Mixtral first.
    PM tip: Budget a benchmarking sprint. Even if an open model trails GPT-4 by a few points, cost savings and privacy control can be decisive in regulated markets.

    3. Small Language Models (SLMs) Move to the Edge

    LLMs wowed the public, but their compute appetite makes them expensive and sometimes slow. Small Language Models—roughly 7 billion parameters or fewer—run on phones and even micro-servers. Microsoft’s Phi-3-mini handles homework locally; Google’s T5-small translates offline; Apple’s Ajax prototypes hint at on-device personal assistants.
    Why it matters: on-device inference slashes latency to milliseconds, works without a network, and keeps sensitive data on hardware you control. Cloud bills drop because generation shifts from usage fees to a one-time silicon cost.
    PM tip: Pinpoint user journeys where instant response, offline mode, or strict privacy are mandatory—field service, hospitals, commuter apps. An SLM may hit quality targets while cutting 80 percent of your cloud spend.

    4. Securing the Prompt Supply Chain

    The more people rely on generative UX, the more attackers probe it. Jailbreaks can elicit disallowed content, prompt injections hide malicious instructions in PDFs, and data-poisoning campaigns seed falsehoods for crawlers to ingest. Start-ups such as Lakera and PromptGuard now sell “LLM firewalls,” while NIST’s red-team playbook and the OWASP Top 10 for LLMs are becoming default checklists.
    PM tip: Version prompts in Git, sandbox untrusted inputs, and schedule regular red-team drills. Treat generative systems like any other executable code exposed to the public internet.

    5. Regulation and Investment Chess

    Governments are shifting from hearings to rules. The EU’s AI Act introduces graded risk tiers and mandatory transparency; the U.S. Executive Order on Safe, Secure, and Trustworthy AI calls for watermarking and safety evaluations; China’s Generative AI measures require security reviews before public release. Regulators are also scrutinizing equity deals—Microsoft-OpenAI, Amazon/Google-Anthropic—that let incumbents lock in model access without outright acquisitions. Forced divestitures or tighter disclosure rules could arrive with little notice.
    PM tip: Track the provenance of every model you ship, log user interactions for audits, and design a modular stack so a supplier swap won’t break the customer experience.

    The Bottom Line

    Data licensing, open models, edge inference, security hardening, and policy oversight are converging fast. Product managers who watch these trends now can choose the right model, negotiate favorable licenses, and build safeguards before regulators—or attackers—demand them. Staying informed keeps teams focused on shipping durable customer value instead of scrambling after the next flash-in-the-pan release.


Nataraj is a Senior Product Manager at Microsoft Azure and the author of Startup Project, featuring insights on building the next generation of enterprise technology products & businesses.


Listen to the latest insights from leaders building next-generation products on Spotify, Apple, Substack and YouTube.

• 100 Days of AI Day 10: How to use AI to do Design Thinking using GPT-4 & Semantic Kernel

100 Days of AI Day 10: How to use AI to do Design Thinking using GPT-4 & Semantic Kernel

    About 100 Days of AI:

I am Nataraj, and I decided to spend 100 days, starting Jan 1st 2024, learning about AI. With 100 Days of AI, my goal is to learn more about AI, specifically LLMs, and share ideas, experiments, opinions, trends & learnings through my blog posts. You can follow along the journey here.

In this post we will look at how to use AI to do design thinking for a given business problem. For the sake of this example, we define design thinking as the series of steps shown below. You can also extend this idea by adding more steps and writing logic for them.

    Design Thinking

To set context, let's take the example of a coffee shop that has recently received some customer feedback, and use AI-driven design thinking to come up with ways to improve the business.

We will use OpenAI's GPT-4 model along with Microsoft's Semantic Kernel to do design thinking. Along the way we will also explore the concept of Plugins in the Kernel, which makes it easy to reuse Semantic Functions.

    So let’s get into it.

    Step 1 – Setup the Kernel:

The first step is to load the OpenAI secret key from the local .env file, create a new Kernel instance, and then add the OpenAIChatCompletion service to the Kernel.

    Setup Semantic Kernel
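For reference, here is a minimal sketch of that setup; it mirrors the initialization code from the Day 8 post further down this page and assumes the secret key is stored in the .env file under the name OPENAI_API_KEY:

import os
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from dotenv import load_dotenv, find_dotenv

# Load the OpenAI secret key from the local .env file
_ = load_dotenv(find_dotenv())
api_key = os.environ['OPENAI_API_KEY']

# Create a new Kernel instance and register GPT-4 as its completion service
kernel = sk.Kernel()
kernel.add_text_completion_service("openai", OpenAIChatCompletion("gpt-4", api_key))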

Step 2 – Add the user feedback & SWOT analysis of the Coffee Shop business:

    # SWOT questions
strength_questions = ["What unique recipes or ingredients does the coffee shop use?","What are the skills and experience of the staff?","Does the coffee shop have a strong reputation in the local area?","Are there any unique features of the shop or its location that attract customers?"]
    weakness_questions = ["What are the operational challenges of the coffee shop? (e.g., slow service, high staff turnover, not having wifi)","Are there financial constraints that limit growth or improvements?","Are there any gaps in the product offering?","Are there customer complaints or negative reviews that need to be addressed?"]
    opportunities_questions = ["Is there potential for new products or services (e.g., delivery, food along with coffee)?","Are there under-served customer segments or market areas?","Can new technologies or systems enhance the business operations?","Are there partnerships or local events that can be leveraged for marketing?"]
    threats_questions = ["Who are the major competitors and what are they offering?","Are there potential negative impacts due to changes in the local area (e.g., construction, closure of nearby businesses)?","Are there economic or industry trends that could impact the business negatively (e.g., increased ingredient costs)?","Is there any risk due to changes in regulations or legislation (e.g., health and safety, employment)?"]
    
    # SWOT answers
    strengths = [ "Unique coffee recipe that wins top awards","Owner trained in Sicily","Strong local reputation","Prime location on university campus" ]
    weaknesses = [ "High staff turnover","Floods in the area damaged the seating areas that are in need of repair","Absence of popular mocha latte from menu","Negative reviews from younger demographic for lack of hip ingredients" ]
    opportunities = [ "Untapped work from anywhere potential as they dont have wifi","Growing local tech startup community","Unexplored online presence and order capabilities","Upcoming annual food fair" ]
    threats = [ "Competition from big coffee chains nearby","There's nearby street construction that will impact foot traffic","Rising cost of coffee beans will increase the cost of coffee","No immediate local regulatory changes but it's election season" ]
    
    # Customer comments some positive some negative
    customer_comments = """
    Customer 1: The seats look really raggedy.
    Customer 2: The americano is the best on this earth.
    Customer 3: I've noticed that there's a new server every time I visit, and they're clueless.
    Customer 4: Why aren't there any snacks?
Customer 5: I love the coffee blend they use and can't get it anywhere else.
    Customer 6: The dark roast they have is exceptional.
    Customer 7: why is there no wifi?
    Customer 8: Why is the regular coffee so expensive?
    Customer 9: There's no way to do online ordering.
    Customer 10: Why is the seating so uncomfortable and dirty?
    """

    Step 3 – Creating Plugins for Design Thinking:

What is a Plugin? Semantic Kernel has a feature called Plugins that lets you define Semantic Functions and their inputs and reuse them repeatedly. A plugin is made up of two files: a .json file (config info for the LLM & input params) and a .txt file (the custom prompt). For the design thinking use case we are going to create 4 plugins, described in the list below; a sketch of one plugin's files follows the list. You can find the code for all 4 plugins here.

• Empathize: Takes the customer feedback, isolates the sentiments, and gives a concise summary of each sentiment the customer expressed.
• Define: Takes the output from the Empathize step and categorizes the analysis in a markdown table, defining the problems & their possible sources.
• Ideate: Takes the output from the step above and generates ideas in a markdown table with two columns, Low-hanging fruit & Higher-hanging fruit.
Result from Ideate Plugin
• PrototypeWithPaper: Takes the ideas generated in the previous step and gives a low-resolution prototype so the solution can be tested.
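To make the two-file layout concrete, here is a hedged sketch of what the Empathize plugin's files might look like (the folder follows the plugins-sk directory used in Step 4; the config values and prompt wording are illustrative, not the exact linked code):

# plugins-sk/DesignThinking/Empathize/config.json
{
    "schema": 1,
    "type": "completion",
    "description": "Isolates sentiments in customer feedback and summarizes each one",
    "completion": {
        "max_tokens": 500,
        "temperature": 0.1,
        "top_p": 0.5
    },
    "input": {
        "parameters": [
            {
                "name": "input",
                "description": "Raw customer feedback to analyze",
                "defaultValue": ""
            }
        ]
    }
}

# plugins-sk/DesignThinking/Empathize/skprompt.txt
{{$input}}

For each customer comment above, identify the sentiment expressed and summarize it concisely as a (sentiment, summary) pair.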

    Step 4 – Bring it all together:

In the previous steps I shared the code for the four plugins, explained what they do in the context of design thinking, and showed the output they generate. But we have not actually called those plugins from our code. Let's do that now, as shown below.

import asyncio

# Access the design thinking plugin from the plugins-sk directory
pluginsDirectory = "./plugins-sk"
pluginDT = kernel.import_semantic_skill_from_directory(pluginsDirectory, "DesignThinking")

# Chain all four plugins in a single call; each step's output feeds the next
async def run_designthinking_async():
    my_result = await kernel.run_async(pluginDT["Empathize"], pluginDT["Define"], pluginDT["Ideate"], pluginDT["PrototypeWithPaper"], input_str=customer_comments)
    display(my_result)

asyncio.run(run_designthinking_async())
    

You have already seen the output that all 4 steps generate in the previous step. Note how simple the Kernel makes chaining one plugin after another in a single call.

To conclude, here is what we did: we wrote custom prompts, turned them into plugins, and put them in a folder called plugins-sk. We then used the Kernel to call them with the SWOT analysis & customer feedback for the coffee shop. By swapping in the SWOT analysis and customer feedback for a different business problem, you can run the same design thinking flow and come up with an MVP solution to your problem.

Even though at its core this is just 4 custom prompts, it highlights how the Kernel makes developing complex goals with AI easy & manageable with plugins.

    That’s it for Day 10 of 100 Days of AI.

    I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI. If you are in tech you might be interested in joining my community of tech professionals here.

• 100 Days of AI Day 9: How to create a SWOT analysis for your business idea using AI?

100 Days of AI Day 9: How to create a SWOT analysis for your business idea using AI?

    About 100 Days of AI:

I am Nataraj, and I decided to spend 100 days, starting Jan 1st 2024, learning about AI. With 100 Days of AI, my goal is to learn more about AI, specifically LLMs, and share ideas, experiments, opinions, trends & learnings through my blog posts. You can follow along the journey here.

In this post we will look at how to create a SWOT analysis for any business. If you are not familiar with SWOT analysis, here is a quick intro.

    Introduction to SWOT:

SWOT stands for Strengths, Weaknesses, Opportunities & Threats. It's a simple way of evaluating any business and getting ideas about how to improve it. Once you have done a SWOT analysis of a business, you can choose to compound on the strengths and create more differentiation from your competitors. You can find the weaknesses and create an action plan to fix them. You can find new areas to expand into using the opportunities as a starting point. It is essentially one of the many mental models used by business owners.

    Here is an example of SWOT analysis for a pizza business.

Strengths
  • Unique garlic pizza recipe that wins top awards
  • Owner trained in Sicily at some of the best pizzerias
  • Strong local reputation
  • Prime location on university campus

Weaknesses
  • High staff turnover
  • Floods in the area damaged the seating areas that are in need of repair
  • Absence of popular calzones from menu
  • Negative reviews from younger demographic for lack of hip ingredients

Opportunities
  • Untapped catering potential
  • Growing local tech startup community
  • Unexplored online presence and order capabilities
  • Upcoming annual food fair

Threats
  • Rising competition from cheaper pizza businesses nearby
  • There's nearby street construction that will impact foot traffic
  • Rising cost of cheese
  • No immediate local regulatory changes but it's election season

    How to Generate a SWOT?

To generate the SWOT above, we essentially answer the questions in the following template.

    1. Strengths
      • What unique recipes or ingredients does the pizza shop use?
      • What are the skills and experience of the staff?
      • Does the pizza shop have a strong reputation in the local area?
      • Are there any unique features of the shop or its location that attract customers?
    2. Weaknesses
      • What are the operational challenges of the pizza shop? (e.g., slow service, high staff turnover)
      • Are there financial constraints that limit growth or improvements?
      • Are there any gaps in the product offering?
      • Are there customer complaints or negative reviews that need to be addressed?
    3. Opportunities
      • Is there potential for new products or services (e.g., catering, delivery)?
      • Are there under-served customer segments or market areas?
      • Can new technologies or systems enhance the business operations?
      • Are there partnerships or local events that can be leveraged for marketing?
    4. Threats
      • Who are the major competitors and what are they offering?
      • Are there potential negative impacts due to changes in the local area (e.g., construction, closure of nearby businesses)?
      • Are there economic or industry trends that could impact the business negatively (e.g., increased ingredient costs)?
  • Is there any risk due to changes in regulations or legislation (e.g., health and safety, employment)?

Our goal is to use OpenAI & Semantic Kernel to generate a SWOT analysis for any given business. Why use Semantic Kernel? Part of the goal of this post is to explore more of Semantic Kernel's functionality. We could achieve the same end goal using Langchain; if you prefer Langchain over Semantic Kernel, feel free to use it.

    Step 1 – Initialize Semantic Kernel With Open AI Chat Completion:

For this step you will need an OpenAI secret key. Note that Semantic Kernel can work with other LLMs and their corresponding chat completion APIs; see the documentation to find out what it supports.

Getting Semantic Kernel ready
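The initialization is the same as in the Day 8 post below; in short (assuming api_key has been loaded from your .env file):

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

# Create the kernel and register GPT-4 as its completion service
kernel = sk.Kernel()
kernel.add_text_completion_service("openai", OpenAIChatCompletion("gpt-4", api_key))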

    Step 2 – Create a Semantic Function that does SWOT analysis

Semantic functions are a way to leverage custom prompts in the Kernel world (more about them here). We will create a semantic function whose custom prompt contains a SWOT analysis of a pizza business, along with an instruction to convert that analysis to a given domain that is passed as an input to the prompt. Here's how it looks.

    Semantic Function for SWOT Analysis
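Here is a hedged sketch of such a semantic function; the prompt wording and config values are illustrative, and {{$input}}/{{$domain}} are the Kernel's template placeholders for the two inputs:

swot_prompt = """
{{$input}}

Convert the SWOT analysis above, which describes a pizza business, to the following business domain:

{{$domain}}
"""

# Register the prompt with the kernel as a reusable semantic function
swot_function = kernel.create_semantic_function(prompt_template=swot_prompt,
                                                description="Adapts a SWOT analysis to a new domain.",
                                                max_tokens=1000,
                                                temperature=0.1,
                                                top_p=0.5)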

    Step 3 – Call the Semantic Function to Generate a SWOT analysis:

To call a semantic function that is registered with the Kernel, we need to create a context and pass it in. The context includes the new domain we want the SWOT analysis applied to; in this case I am using a newsletter. Since everyone is starting a newsletter these days, let's try to get a SWOT analysis template for starting one. Here's the code for step 3.

    Calling the semantic function
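A minimal sketch of that call, assuming pizza_swot holds the pizza-shop SWOT text baked into the template above (the variable names are illustrative):

import asyncio

# Build a context carrying the SWOT text and the target domain
my_context = kernel.create_new_context()
my_context['input'] = pizza_swot
my_context['domain'] = "newsletter"

async def run_swot_async():
    result = await kernel.run_async(swot_function, input_context=my_context)
    print(result)

asyncio.run(run_swot_async())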

Here's the output:

    Output

Not bad. The generated output gives a great SWOT template for deciding whether or not you should start a newsletter.

You can expand this experiment further and generate a 2×2 matrix like the pizza example I shared above.

AI PRODUCT IDEA ALERT: A website where users can enter their idea and get output for all the business mental models that exist, including SWOT.

    That’s it for Day 9 of 100 Days of AI.

    I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI. If you are in tech you might be interested in joining my community of tech professionals here.

• Day 8: How to Build on GPT-4 Using MSFT's Semantic Kernel

Day 8: How to Build on GPT-4 Using MSFT's Semantic Kernel

    About 100 Days of AI:

With the rise of ChatGPT, AI has captured everyone's attention. Unlike other tech bubbles, the AI wave will be enduring. Whether you're a developer, product manager, marketer, or in any knowledge role, investing time in learning AI will be worth it for your career. 100 Days of AI is my goal to learn more about AI, specifically LLMs, and share ideas, experiments, opinions, trends & learnings through my blog posts. You can follow along the journey here.

    Introduction to Semantic Kernel:

Semantic Kernel is an open-source SDK from Microsoft that helps developers create AI applications including chatbots, RAGs, Copilots & agents. It is similar to what Langchain does; we can probably call it Microsoft's answer to Langchain.

It is designed to make existing software extensible and easy to expose to AI features. It is also designed in anticipation that applications will want to update their AI models to the latest and greatest versions over time.

    Semantic Kernel

Although the space is evolving very rapidly, here are some definitions to keep in mind as we explore Semantic Kernel.

    • Chat Bot: Simple chat with the user.
• RAGs: A simple chat bot, but grounded in real-time & private data.
• Copilots: Meant to assist us side-by-side in accomplishing tasks by recommending & suggesting.
• Agents: Respond to stimuli with limited human intervention. Agents execute tasks like sending emails or booking tickets on behalf of the user.

    How Does Semantic Kernel Work?

To explain how Semantic Kernel works, let's take the example of converting a piece of text into a 140-character tweet, but doing it with Semantic Kernel. We have done a similar summarization in previous posts here.

I will be using the Python library of Semantic Kernel, but since Semantic Kernel is created by Microsoft, you can also do this in C#. Check Microsoft's public documentation on how to do that.

    Step 1: Initiate Semantic Kernel

Below we initiate the Semantic Kernel and set up its text completion service, telling it to use OpenAI's GPT-4 model as the LLM for text completion.

import os
import asyncio
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion
from IPython.display import display
from dotenv import load_dotenv, find_dotenv

# Read the OpenAI secret key from the local .env file
_ = load_dotenv(find_dotenv())
api_key = os.environ['OPENAI_API_KEY']

# Create the kernel and register GPT-4 as its text completion service
kernel = sk.Kernel()
kernel.add_text_completion_service("openai", OpenAIChatCompletion("gpt-4", api_key))
print("Kernel Initiated")

    Step 2: Understanding a Semantic Function:

In the Semantic Kernel world there is a concept of a semantic function, which is different from a native function. Native functions are the regular functions we write in any programming language. Semantic functions are encapsulations of repeatable LLM prompts that can be orchestrated by the Kernel. You will get a better idea of what a semantic function is in the next step, where we will write one.

    Native & Semantic Functions

    Step 3: Create a Semantic Function

Here we create a prompt sk_prompt that summarizes the content in less than 140 characters (that's our goal for this exercise). We then pass the prompt to the kernel to create a semantic function, which returns the object summary_function; it represents the semantic function we created and can be repeatedly accessed via the kernel. Note that when we create a semantic function we use a custom prompt and also give LLM config information like max_tokens, temperature, etc. Now go back to the previous image of native vs semantic functions and it will make more sense.

    sk_prompt = """
    {{$input}}
    
    Summarize the content above in less than 140 characters.
    """
    summary_function = kernel.create_semantic_function(prompt_template = sk_prompt,
                                                        description="Summarizes the input to length of an old tweet.",
                                                        max_tokens=200,
                                                        temperature=0.1,
                                                        top_p=0.5)       
    print("A semantic function for summarization has been registered.")

    Step 4: Using the Semantic Function to Summarize text into a tweet of 140 characters.

Now we create the text that we want to summarize in the variable sk_input, call the semantic function via the kernel, and then display the result.

    sk_input = """
    Let me illustrate an example. Many weekends, I drive a few minutes from my house to a local pizza store to buy 
    a slice of Hawaiian pizza from the gentleman that owns this pizza store. And his pizza is great, but he always 
    has a lot of cold pizzas sitting around, and every weekend some different flavor of pizza is out of stock. 
    But when I watch him operate his store, I get excited, because by selling pizza, he is generating data. 
    And this is data that he can take advantage of if he had access to AI.
    
    AI systems are good at spotting patterns when given access to the right data, and perhaps an AI system could spot 
    if Mediterranean pizzas sell really well on a Friday night, maybe it could suggest to him to make more of it on a 
    Friday afternoon. Now you might say to me, "Hey, Andrew, this is a small pizza store. What's the big deal?" And I 
    say, to the gentleman that owns this pizza store, something that could help him improve his revenues by a few 
    thousand dollars a year, that will be a huge deal to him.
    """
    
# Use asyncio to run the semantic function
async def run_summary_async():
    summary_result = await kernel.run_async(summary_function, input_str=sk_input)
    display(summary_result)

asyncio.run(run_summary_async())

    Here’s the output I got:

    AI can analyze sales data to help a small pizza store owner optimize his stock, potentially increasing his annual revenue.

Semantic Kernel has more capabilities, like using semantic functions & native functions together, and is designed for creating powerful AI applications. I will write more about them in future posts.

    That’s it for Day 8 of 100 Days of AI.

    I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI. If you are in tech you might be interested in joining my community of tech professionals here.

• Day 7: How to Build ChatGPT for your Data using Langchain?

Day 7: How to Build ChatGPT for your Data using Langchain?

    I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

I think one of the most common use cases everyone wanted after ChatGPT broke into the public imagination was a ChatGPT-like experience on top of their own data.

In this example, I will use Langchain (which raised $10M at a $100M valuation) to access OpenAI's API.

As a Product Manager at Azure Files, the most important information I would like to chat with is the publicly available Azure Files documentation, which I downloaded as a PDF for the purpose of this exercise. If you are following along, download whatever information you want to build your chatbot on in the form of one or more PDFs. You can use other formats too, but I will be sticking to PDFs in this example.

The process by which we will build this chatbot is often referred to as retrieval-augmented generation (RAG).

The following image explains the different steps involved in creating the chatbot that will help me do my job better and faster.

    Building a RAG using Langchain

So let's write the code to create our chatbot. I am using Langchain along with OpenAI, so you will need an OpenAI secret key and an IDE of your choice to follow along. I am using VS Code with a virtual Python environment.

Step 1 – Load the PDFs: The first step is to load data from the data folder using the document loaders available in Langchain. Langchain provides data loaders for 90+ sources, so you can load not just PDFs but almost anything you want.

    Loading PDFs
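A minimal sketch of this step, assuming the documentation PDF lives in the data folder (the file name is illustrative):

from langchain.document_loaders import PyPDFLoader

# Load every page of the PDF as a separate document
loader = PyPDFLoader("data/azure-files-docs.pdf")
docs = loader.load()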

Step 2 – Splitting: Split the data into smaller chunks, with a chunk size of 1500 and a chunk overlap of 150, meaning each consecutive chunk shares 150 characters with the previous one so the context doesn't get split abruptly. There are different ways to split your data, and each has different pros & cons. Check out Langchain's splitters to get more ideas on which splitter to use.

    Splitting Data into Chunks
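A sketch of the splitting step with the sizes described above, using Langchain's recursive character splitter (one reasonable choice among the splitters linked above):

from langchain.text_splitter import RecursiveCharacterTextSplitter

# Chunk size 1500 with 150 characters of overlap between consecutive chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=150)
splits = text_splitter.split_documents(docs)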

    Step 3 – Store: In this step we convert each split into an embedding vector. An embedding vector is a mathematical representation of text.

    Embedding text into vectors

An embedding vector of a text block captures the context/meaning of the text; texts with similar meaning will have similar vectors. To convert the splits into their embedding vectors we use OpenAIEmbeddings. A special type of database, known as a vector database, is required to store these embeddings. In our example we will use Chroma, as it can run in memory. There are other options you can use, like Pinecone, Weaviate & more. After storing the embeddings for my splits in the Chroma vector db, I also persist it to reuse in the next steps.

    Embeddings in Vector Store
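A sketch of the store step; the persist_directory path is illustrative:

from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Embed each split and store it in a persistent Chroma vector database
embedding = OpenAIEmbeddings()
vectordb = Chroma.from_documents(
    documents=splits,
    embedding=embedding,
    persist_directory="docs/chroma/",
)
vectordb.persist()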

    Step 5 & 6 – Retrieval & Generate Output: We have stored our embedding in chroma db. In our chat bot experience when the user asks a question. We send that question to the vector db and retrieve data splits that might have the answer. Retrieving the right data splits is a huge and evolving subject. There are lot of ways you can retrieve based on your application needs. I am using a common method of retrieval called Maximal Marginal Relevance (MMR). You can learn more techniques like basic semantic similarity, LLM aided retrieval & more. I will write a separate post talking about MMR and others in a separate post. For this post consider retrieval as we are getting top 3 data chunks that could have the context/answer for the question that user asked the chat bot. Once we retrieve that relevant chunks and pass it to Open AI LLM as ask it to generate an answer by using prompt. See my previous post about writing good prompts.

    Retrieval & Generate output

The result I got is very accurate. Not only did the LLM correctly identify that there is no feature called polling, it also found a contextually relevant feature called change detection, which is similar to what polling refers to in a lot of products.

    Result from LLM

If you need the full code for this and the images are not helping, reach out to me on Twitter. That's it for Day 7 of 100 Days of AI.

If you understand RAG and the concepts associated with it, like embeddings, vector databases, and retrieval techniques, you can generate a lot of ideas for interesting chatbots. Feel free to reach out and share those ideas with me.

AI PRODUCT IDEA ALERT 1: Every organization will want chat-with-your-data applications, and will also want their employees to create custom chat-with-your-data applications for their own needs without writing any code. Microsoft and other companies are launching products and features to enable large organizations to do this via Azure OpenAI, but I think there will be startups competing in this space as well.

AI PRODUCT IDEA ALERT 2: ChatGPT for doctors, trained not on the whole internet but on curated data from textbooks, recent research, best practices picked by a board, etc. I could see a highly curated LLM tuned for doctors.

AI PRODUCT IDEA ALERT 3: Similar to idea 2, there will be LLMs fine-tuned for education use cases, where all the information has to be accurate. I think there are other verticals that need curated data sets instead of the whole internet.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI, or bookmark this page.

• Day 6: What are the different retrieval techniques & why are they useful?

Day 6: What are the different retrieval techniques & why are they useful?

    I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

    What is Retrieval in the context of building RAGs?

In RAG (Retrieval Augmented Generation) applications, retrieval refers to the process of extracting the most relevant data chunks/splits from a vector database based on the question received from the user. If your retrieval technique is not good, it affects the quality of the information you can give the user as a reply. The retrieved data chunks are sent to the LLM as context to generate the final answer that is returned to the user.

Different types of retrieval techniques:

1. Basic Semantic Similarity: In this technique you retrieve the data chunks from the vector db that are semantically closest to the question asked by the user. For example, if the user asks "Tell me about all-white mushrooms with large fruiting bodies", simple semantic similarity will get you an answer but will not surface the information that such mushrooms might also be poisonous. Some edge cases you would see using semantic similarity:
  1. If we load duplicate files during data loading and our answer exists in both, the result will include both copies. We need relevant & distinct results from the vector embeddings when we ask a question.
  2. If we ask a question whose answers should all come from doc-2, we may still get answers from doc-1 as well, because this is a semantic search and we haven't explicitly controlled which doc to look into and which to skip.
2. Maximal Marginal Relevance (MMR): In retrieval we may not want just similarity but also some diversity. In the mushroom example above, pure similarity would never surface the information about poisonousness; MMR introduces that divergence into our retrievals. Here's how the MMR algorithm picks the relevant data chunks (a code sketch follows the image below):
  1. Query the vector store
  2. Choose the fetch_K most similar responses
  3. Among those responses, choose the K most diverse
Maximal Marginal Relevance (MMR) Retrieval
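Here is a hedged sketch of MMR retrieval in Langchain, assuming a Chroma vector store vectordb built as in the Day 7 post above (the question and the k/fetch_k values are illustrative):

# MMR: fetch the fetch_k most similar chunks, then keep the k most diverse
question = "Tell me about all-white mushrooms with large fruiting bodies"
docs_mmr = vectordb.max_marginal_relevance_search(question, k=3, fetch_k=5)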

3. LLM-Aided Retrieval: We can also use an LLM to aid retrieval by splitting the question into a metadata filter and a search term. Once the LLM splits the query, we pass the filter to the vector db as a metadata filter, which most vector DBs support.

    LLM Aided Retrieval
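One way to implement this in Langchain is its SelfQueryRetriever, which uses an LLM to turn the user's question into a search term plus a metadata filter. A hedged sketch, assuming the same vectordb and illustrative metadata fields:

from langchain.llms import OpenAI
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import AttributeInfo

# Describe the metadata fields the LLM is allowed to filter on (illustrative)
metadata_field_info = [
    AttributeInfo(
        name="source",
        description="The document the chunk came from, e.g. doc-1 or doc-2",
        type="string",
    ),
]

# The LLM splits the question into a search term + metadata filter,
# and the filter is passed down to the vector db
retriever = SelfQueryRetriever.from_llm(
    OpenAI(temperature=0),
    vectordb,
    "Notes about mushrooms",  # short description of the corpus
    metadata_field_info,
)
docs = retriever.get_relevant_documents(question)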

Note that there are retrieval techniques that do not use vector databases at all, such as SVM and TF-IDF.

Retrieval is where a lot of innovation is currently happening, and it is changing rapidly. I will be using these retrieval techniques in an upcoming post to build a chat-with-your-data application. Keep an eye out for it.

    That’s it for Day 6 of 100 Days of AI.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI, or bookmark this page.