100 Days of AI, Day 18: How to develop a RAG using Microsoft Semantic Kernel?

About 100 Days of AI:

Hey everyone! I’m Nataraj, and just like you, I’ve been fascinated with the recent progress of artificial intelligence. Realizing that I needed to stay abreast of all the developments happening, I decided to embark on a personal journey of learning, and thus 100 Days of AI was born! With this series, I will be learning about LLMs and sharing ideas, experiments, opinions, trends and learnings through my blog posts. You can follow along the journey on HackerNoon here or my personal website here. In today’s article, we’ll be looking at how to develop a RAG using Microsoft’s Semantic Kernel.

Retrieval augmented generation (RAG) is one of the most common applications being built with LLMs. We have previously explored how to develop a RAG using LangChain. In this post we will create a RAG using Microsoft’s Semantic Kernel.

To follow along with this post you will need an OpenAI API key.

Step 1: Initialize Semantic Kernel

The first step is to initialize Semantic Kernel and tell the kernel that we want to use OpenAI’s chat completion model and OpenAI’s embedding model, which we will later use to create embeddings. We will also tell the kernel to use a memory store, which will be Chroma DB in our case. Note that we are also instructing the kernel that this memory store needs to be persistent.

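# Import paths below match pre-1.0 releases of the Semantic Kernel Python SDK; adjust them to your installed version.
import asyncio
import os

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion, OpenAITextEmbedding
from semantic_kernel.connectors.memory.chroma import ChromaMemoryStore

# Assumes the OpenAI API key is exported as the OPENAI_API_KEY environment variable
api_key = os.environ["OPENAI_API_KEY"]

kernel = sk.Kernel()

kernel.add_text_completion_service("openai", OpenAIChatCompletion("gpt-4", api_key))

kernel.add_text_embedding_generation_service("openai-embedding", OpenAITextEmbedding("text-embedding-ada-002", api_key))

# Chroma DB, persisted to the local directory 'mymemories2'
kernel.register_memory_store(memory_store=ChromaMemoryStore(persist_directory='mymemories2'))

print("Made two new services attached to the kernel and made a Chroma memory store that's persistent.")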
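To make the word "persistent" concrete, here is a minimal sketch of the idea using only the Python standard library. It is not how Chroma stores data internally; `TinyPersistentStore` and `demo_persistence` are hypothetical names invented for this illustration. The point is only that records written through one store object can be read back by a fresh object pointed at the same directory.

```python
import json
import tempfile
from pathlib import Path


class TinyPersistentStore:
    """Hypothetical stand-in: keeps records in a JSON file inside a directory."""

    def __init__(self, persist_directory: str) -> None:
        self._path = Path(persist_directory) / "records.json"
        self._path.parent.mkdir(parents=True, exist_ok=True)
        # Reload whatever an earlier store object left on disk, if anything.
        if self._path.exists():
            self._records = json.loads(self._path.read_text(encoding="utf-8"))
        else:
            self._records = {}

    def save(self, record_id: str, text: str) -> None:
        # Update the in-memory copy, then rewrite the file so the data survives.
        self._records[record_id] = text
        self._path.write_text(json.dumps(self._records), encoding="utf-8")

    def get(self, record_id: str):
        return self._records.get(record_id)


def demo_persistence():
    """Write with one store object, then read back with a brand-new one."""
    with tempfile.TemporaryDirectory() as folder:
        TinyPersistentStore(folder).save("strength-0", "Unique garlic pizza recipe")
        reopened = TinyPersistentStore(folder)
        return reopened.get("strength-0")


print(demo_persistence())
```

In the same spirit, pointing the memory store at a directory such as 'mymemories2' means the embeddings do not have to be regenerated every time the program starts.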

Step 2: Create Embeddings

In this example we are creating a RAG that can answer questions about a SWOT analysis that we created for a pizza business. To do that, we take each item of the SWOT analysis, generate an embedding for it, and store the embeddings in a collection called “SWOT” in the persistent data store that we created in the last step.

strength_questions = ["What unique recipes or ingredients does the pizza shop use?","What are the skills and experience of the staff?","Does the pizza shop have a strong reputation in the local area?","Are there any unique features of the shop or its location that attract customers?"]
weakness_questions = ["What are the operational challenges of the pizza shop? (e.g., slow service, high staff turnover)","Are there financial constraints that limit growth or improvements?","Are there any gaps in the product offering?","Are there customer complaints or negative reviews that need to be addressed?"]
opportunities_questions = ["Is there potential for new products or services (e.g., catering, delivery)?","Are there under-served customer segments or market areas?","Can new technologies or systems enhance the business operations?","Are there partnerships or local events that can be leveraged for marketing?"]
threats_questions = ["Who are the major competitors and what are they offering?","Are there potential negative impacts due to changes in the local area (e.g., construction, closure of nearby businesses)?","Are there economic or industry trends that could impact the business negatively (e.g., increased ingredient costs)?","Is there any risk due to changes in regulations or legislation (e.g., health and safety, employment)?"]

strengths = [ "Unique garlic pizza recipe that wins top awards","Owner trained in Sicily at some of the best pizzerias","Strong local reputation","Prime location on university campus" ]
weaknesses = [ "High staff turnover","Floods in the area damaged the seating areas that are in need of repair","Absence of popular calzones from menu","Negative reviews from younger demographic for lack of hip ingredients" ]
opportunities = [ "Untapped catering potential","Growing local tech startup community","Unexplored online presence and order capabilities","Upcoming annual food fair" ]
threats = [ "Competition from cheaper pizza businesses nearby","There's nearby street construction that will impact foot traffic","Rising cost of cheese will increase the cost of pizzas","No immediate local regulatory changes but it's election season" ]

print("✅ SWOT analysis for the pizza shop is resident in native memory")

memoryCollectionName = "SWOT"

# let's put these in memory / the vector store
async def run_storeinmemory_async():
    for i in range(len(strengths)):
        await kernel.memory.save_information_async(memoryCollectionName, id=f"strength-{i}", text=f"Internal business strength (S in SWOT) that makes customers happy and satisfied Q&A: Q: {strength_questions[i]} A: {strengths[i]}")
    for i in range(len(weaknesses)):
        await kernel.memory.save_information_async(memoryCollectionName, id=f"weakness-{i}", text=f"Internal business weakness (W in SWOT) that makes customers unhappy and dissatisfied Q&A: Q: {weakness_questions[i]} A: {weaknesses[i]}")
    for i in range(len(opportunities)):
        await kernel.memory.save_information_async(memoryCollectionName, id=f"opportunity-{i}", text=f"External opportunity (O in SWOT) for the business to gain entirely new customers Q&A: Q: {opportunities_questions[i]} A: {opportunities[i]}")
    for i in range(len(threats)):
        await kernel.memory.save_information_async(memoryCollectionName, id=f"threat-{i}", text=f"External threat (T in SWOT) to the business that impacts its survival Q&A: Q: {threats_questions[i]} A: {threats[i]}")

asyncio.run(run_storeinmemory_async())

print("😶‍🌫️ Embeddings for SWOT have been generated and stored in vector db")
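To build some intuition for what happens to the saved text, the sketch below imitates a vector store with the standard library only. It is an assumption-laden toy, not Semantic Kernel's or Chroma's implementation: `embed` uses plain word counts instead of a real embedding model, and `ToyMemoryStore` is a hypothetical class whose `save` and `search` methods only loosely mirror the calls used in this post. The idea it shows is that each text is turned into a vector when stored, and a later query is compared against those vectors by cosine similarity.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense floats)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemoryStore:
    """Hypothetical in-memory store: collection name -> {id: (text, vector)}."""

    def __init__(self) -> None:
        self._collections = {}

    def save(self, collection: str, record_id: str, text: str) -> None:
        # The vector is computed once, at save time, and kept next to the text.
        self._collections.setdefault(collection, {})[record_id] = (text, embed(text))

    def search(self, collection, query, limit=1, min_relevance_score=0.0):
        query_vec = embed(query)
        scored = [
            (cosine(query_vec, vec), record_id, text)
            for record_id, (text, vec) in self._collections.get(collection, {}).items()
        ]
        # Drop weak matches, then return the best `limit` results, highest score first.
        kept = [item for item in scored if item[0] >= min_relevance_score]
        kept.sort(key=lambda item: item[0], reverse=True)
        return kept[:limit]


store = ToyMemoryStore()
store.save("SWOT", "opportunity-0", "Untapped catering potential")
store.save("SWOT", "threat-2", "Rising cost of cheese will increase the cost of pizzas")

results = store.search("SWOT", "Is there catering potential?", limit=1, min_relevance_score=0.1)
for score, record_id, text in results:
    print(f"{record_id}: {text} (relevance {score:.2f})")
```

A real embedding model also captures meaning, so it can match a question about "making more money" to an entry about catering even when the two share no words, which word counts cannot do.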

Step 3: Ask your question

Now that the embeddings of our data are stored in the Chroma vector store, we can ask a question related to the pizza business and retrieve the stored entries that are most relevant to it.

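# ask questions on the SWOT analysis
potential_question = "What are the easiest ways to make more money?"

async def run_askquestions_async():
    # fetch up to 5 stored memories with a relevance score of at least 0.5
    memories = await kernel.memory.search_async(memoryCollectionName, potential_question, limit=5, min_relevance_score=0.5)
    print(f"### ❓ Potential question: {potential_question}")
    related_memory = None
    # enumerate replaces the module-level counter, which raised UnboundLocalError when modified inside the function
    for counter, memory in enumerate(memories, start=1):
        if counter == 1:
            # keep the text of the first result for later use
            related_memory = memory.text
        print(f"  > 🧲 Similarity result {counter}:\n  >> ID: {memory.id}\n  Text: {memory.text}  Relevance: {memory.relevance}\n")
    return related_memory

related_memory = asyncio.run(run_askquestions_async())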

This is a grossly simplified version of how a RAG can be created using Semantic Kernel. The most popular framework for building with LLMs right now is LangChain, and we have previously seen how to build a RAG using LangChain. Although LangChain is more popular, as more and more companies build tools there will be more sophisticated options available, and I have found that Semantic Kernel has a few special features that make it stand out.

That’s it for Day 18 of 100 Days of AI.

I write a newsletter called Above Average where I talk about the second-order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.

Follow me on Twitter and LinkedIn for the latest updates on 100 Days of AI. If you are in tech you might be interested in joining my community of tech professionals here.