About 100 Days of AI:
With the rise of ChatGPT, AI has captured everyone's attention. Unlike other tech bubbles, I believe the AI wave will be enduring. Whether you're a developer, product manager, marketer, or in any other knowledge role, investing time in learning AI will be worth it for your career. 100 Days of AI is my goal to learn more about AI, specifically LLMs, and share ideas, experiments, opinions, trends & learnings through my blog posts. You can follow along the journey here.
Introduction to Semantic Kernel:
Semantic Kernel is an open-source SDK from Microsoft that helps developers create AI applications, including chatbots, RAG applications, copilots & agents. It is similar to what LangChain does; we could call it Microsoft's answer to LangChain.
It is designed to make existing software extensible and easy to expose to AI features. It also anticipates that applications will want to swap in the latest and greatest AI models over time.
Although the space is evolving very rapidly, here are some definitions to keep in mind as we explore Semantic Kernel.
- Chatbot: A simple chat interface with the user.
- RAG: A simple chatbot, but grounded in real-time & private data.
- Copilot: Assists us side-by-side to accomplish tasks by recommending & suggesting.
- Agent: Responds to stimuli with limited human intervention. Agents execute tasks like sending emails or booking tickets on behalf of the user.
How Does Semantic Kernel Work?
To explain how Semantic Kernel works, let's take a piece of text and convert it into a 140-character tweet — but this time using Semantic Kernel. We did a similar summarization in previous posts here.
I will be using the Python library for Semantic Kernel, but since Semantic Kernel is created by Microsoft, you can also do this in C#. Check Microsoft's public documentation on how to do that.
Step 1: Initiate Semantic Kernel
Below we initiate the Semantic Kernel and set up its text completion service, telling it to use OpenAI's gpt-4 model as the LLM for text completion.
import os
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureChatCompletion, OpenAIChatCompletion
from IPython.display import display, Markdown
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file
api_key = os.environ['OPENAI_API_KEY']

kernel = sk.Kernel()
# register gpt-4 as the completion service the kernel will use
kernel.add_chat_service("gpt-4", OpenAIChatCompletion("gpt-4", api_key))
Step 2: Understanding a Semantic Function:
In the Semantic Kernel world, we have the concept of a semantic function, which is different from a native function. A native function is a regular function that we write in any programming language. Semantic functions are encapsulations of repeatable LLM prompts that can be orchestrated by the kernel. You will get a better idea of what a semantic function is in the next step, where we will write one.
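To make the contrast concrete, here is a rough sketch in plain Python (not the SDK itself): a native function is ordinary deterministic code, while a semantic function is essentially a reusable prompt template whose `{{$input}}` slot the kernel fills in before calling the LLM. The `render_prompt` helper below is a hypothetical stand-in for the kernel's template substitution, just for illustration.

```python
# Native function: ordinary deterministic code in your language of choice.
def word_count(text: str) -> int:
    return len(text.split())

# Semantic function (conceptually): a reusable prompt template.
# Semantic Kernel fills the {{$input}} variable and sends the result to the LLM.
SUMMARIZE_TEMPLATE = """{{$input}}

Summarize the content above in less than 140 characters."""

def render_prompt(template: str, user_input: str) -> str:
    # Hypothetical stand-in for the kernel's template substitution.
    return template.replace("{{$input}}", user_input)

prompt = render_prompt(SUMMARIZE_TEMPLATE, "AI helps small businesses.")
```

The key difference: the native function's behavior is fixed by its code, while the semantic function's behavior comes from the prompt plus whatever model the kernel is configured with.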
Step 3: Create a Semantic Function
Here we create a prompt sk_prompt that summarizes the content in less than 140 characters (that's our goal with this exercise). We then pass the prompt to the kernel to create a semantic function, which returns the object summary_function. That object represents the semantic function we created and can be repeatedly accessed via the kernel. Note that when creating a semantic function, we use a custom prompt and also supply LLM config such as max_tokens and temperature. Recall the native vs. semantic function distinction from Step 2, and this will make more sense.
sk_prompt = """{{$input}}

Summarize the content above in less than 140 characters.
"""
summary_function = kernel.create_semantic_function(prompt_template=sk_prompt,
    description="Summarizes the input to the length of an old tweet.",
    max_tokens=200, temperature=0.1)
print("A semantic function for summarization has been registered.")
Step 4: Using the Semantic Function to Summarize text into a tweet of 140 characters.
Now we create the text that we want to summarize in the variable sk_input, call the semantic function via the kernel, and display the result.
sk_input = """
Let me illustrate an example. Many weekends, I drive a few minutes from my house to a local pizza store to buy
a slice of Hawaiian pizza from the gentleman that owns this pizza store. And his pizza is great, but he always
has a lot of cold pizzas sitting around, and every weekend some different flavor of pizza is out of stock.
But when I watch him operate his store, I get excited, because by selling pizza, he is generating data.
And this is data that he can take advantage of if he had access to AI.
AI systems are good at spotting patterns when given access to the right data, and perhaps an AI system could spot
if Mediterranean pizzas sell really well on a Friday night, maybe it could suggest to him to make more of it on a
Friday afternoon. Now you might say to me, "Hey, Andrew, this is a small pizza store. What's the big deal?" And I
say, to the gentleman that owns this pizza store, something that could help him improve his revenues by a few
thousand dollars a year, that will be a huge deal to him.
"""
# using async to run the semantic function
async def run_summary_async():
    summary_result = await kernel.run_async(summary_function, input_str=sk_input)
    display(Markdown(str(summary_result)))

await run_summary_async()  # works in a notebook; otherwise wrap in asyncio.run()
Here’s the output I got:
AI can analyze sales data to help a small pizza store owner optimize his stock, potentially increasing his annual revenue.
Semantic Kernel has more capabilities, like using semantic functions & native functions together, and is designed for creating powerful AI applications. I will write more about them in future posts.
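To preview what combining the two looks like, here is a minimal sketch in plain Python rather than the SDK: a native function cleans the input, and a "semantic" function builds the prompt and calls the model. The `mock_llm` below is a hypothetical placeholder for the real model call the kernel would make, assuming the model follows the 140-character instruction.

```python
def strip_whitespace(text: str) -> str:
    # Native function: deterministic text cleanup before the LLM sees it.
    return " ".join(text.split())

def mock_llm(prompt: str) -> str:
    # Hypothetical stand-in for the model call; pretend the model obeys
    # the instruction and returns the content truncated to 140 characters.
    body = prompt.split("\n")[0]
    return body[:140]

def summarize(text: str) -> str:
    # "Semantic" function sketch: wrap the input in the prompt template.
    prompt = f"{text}\nSummarize the content above in less than 140 characters."
    return mock_llm(prompt)

# Chain native -> semantic, the way the kernel pipelines functions.
tweet = summarize(strip_whitespace("  AI   spots patterns   in sales data.  "))
```

In the real SDK, the native function would be registered with the kernel (as part of a plugin/skill) and orchestrated alongside semantic functions in one pipeline; the sketch only shows the shape of that composition.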
That’s it for Day 8 of 100 Days of AI.
I write a newsletter called Above Average where I talk about the second order insights behind everything that is happening in big tech. If you are in tech and don’t want to be average, subscribe to it.