ChatGPT meets Azure Advisor
In this blog post, I will share some of my learnings from this week’s hackathon, where I spent a couple of days building an Azure Advisor ChatGPT plugin using Semantic Kernel. You can find all of the source code and the implementation details in this GitHub repo.
I will summarize my thoughts and observations on the following topics:
- Creating ChatGPT plugins
- Using Semantic Kernel
- Learnings from using Azure Advisor API
To test my plugin, I used the Chat Copilot demo application, which turned out to be very handy since it is free and can easily be modified and run locally. At the time of writing, I would otherwise need to wait for the OpenAI folks to grant me access to the plugin functionality in ChatGPT or pay around 20 bucks for the Plus version.
A high-resolution version of the demo can be found here. Please also be aware of these current limitations!
Here is what the high-level architecture looks like:
Creating ChatGPT plugins
The overall experience has been great! I am not going to explain the mechanics since they are all well documented here. Instead, let’s talk about the things I ran into:
- Descriptions of the plugin (`ai-plugin.json`) and of the available operations and parameters in the OpenAPI specification (`swagger.json`) matter a lot and will most probably require some fine-tuning. ChatGPT uses them to decide whether a plugin should be considered for the user prompt at hand and what parameters should be passed in the query string and/or the body of the request. In my case, I struggled with the actual user prompt not being passed into one of the operations (only some keywords were passed) because the description of the request body in the OpenAPI specification was not precise enough. Here is a good summary of the best practices for writing good descriptions.
- Size matters: depending on the model, there are different prompt size limits you need to be aware of (GPT-4 limits, GPT-3.5 limits). Your plugin is likely to return some data to allow ChatGPT to continue a user conversation; if it returns too much data, there might be errors or inaccurate responses. Also, your plugin itself might use one of the models to shape a response. In the Azure Advisor Plugin, I am saving data to the Kernel Memory, which automatically uses Azure OpenAI (`text-embedding-ada-002`) to create embeddings. This method shows how you can use the `TextChunker` helper class to split the text into smaller chunks and avoid hitting the size limit. Another example is running a semantic search query on the cached embeddings and letting Azure OpenAI (`gpt-35-turbo`) create a completion before it is sent back to ChatGPT. This gives me more control over the returned data as well as more flexibility when it comes to querying the data and summarizing the result (e.g., rewriting the prompt, doing multi-shot prompting, etc.). You should be in control of the prompt sizes and know the limits to ensure your plugin delivers the best results.
- Fine-tuning: as the last point in this section, I’d like to emphasize the importance of fine-tuning your plugin. It may take some time to figure out the best descriptions so that the plugin and its operations are invoked correctly most of the time. There is still a possibility that your plugin will not be invoked based on a user prompt. Trying out many prompts helps increase confidence in the metadata you provide to ChatGPT. Prompt size control is no less important: it has an impact on the quality of the completions and summaries and can also be a cost driver.
The list above is not an exhaustive one, as we have not even considered rate limiting, security, availability, performance, etc. These topics are general concerns when it comes to API development.
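The chunking approach mentioned above can be sketched roughly as follows. This is an illustrative sketch, not the plugin's actual code: the `TextChunker` method names come from the `Microsoft.SemanticKernel.Text` namespace as of the time of writing (and may differ across Semantic Kernel versions), while `GetRecommendationDetails` and the `memory` instance are hypothetical placeholders:

```csharp
using Microsoft.SemanticKernel.Text;

// Split a long recommendation text into smaller pieces before
// creating embeddings, to stay below the model's token limit.
string longText = GetRecommendationDetails(); // hypothetical helper

// First, split the text into lines of at most ~100 tokens each...
List<string> lines = TextChunker.SplitPlainTextLines(longText, 100);

// ...then group those lines into paragraphs of at most ~512 tokens.
List<string> paragraphs = TextChunker.SplitPlainTextParagraphs(lines, 512);

foreach (string chunk in paragraphs)
{
    // Each chunk is embedded and stored separately, so no single
    // embedding request exceeds the model's size limit.
    await memory.SaveInformationAsync("advisor", chunk, Guid.NewGuid().ToString());
}
```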
Using Semantic Kernel
Semantic Kernel is a powerful SDK that makes it easier to create AI-infused applications by orchestrating multiple AI plugins, providing connectors for different AI models and (vector) databases, and implementing design patterns such as prompt chaining, recursive reasoning, summarization, contextual memory, long-term memory, embeddings, semantic indexing, etc. Please refer to the official documentation for more details.
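For context, wiring up a kernel with an Azure OpenAI connector looked roughly like this at the time of writing. The builder method names have changed across Semantic Kernel releases, so treat this as an illustrative sketch rather than the exact API; all endpoint and key values are placeholders:

```csharp
using Microsoft.SemanticKernel;

// Build a kernel backed by Azure OpenAI chat completions.
// Swapping to the OpenAI connector would only change this builder call.
var kernel = new KernelBuilder()
    .WithAzureChatCompletionService(
        deploymentName: "gpt-35-turbo",
        endpoint: "https://my-openai.openai.azure.com/",
        apiKey: "<api-key>")
    .Build();

// Semantic functions, skills/plugins and memory stores are then
// registered on this kernel instance and orchestrated from there.
```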
In the Azure Advisor ChatGPT plugin, I only used a handful of Semantic Kernel capabilities, but here are my thoughts:
- Connectors: those are really great! I am using the Azure OpenAI service to create embeddings and completions as part of the QueryRecommendations operation, but switching to OpenAI would only require a small change in the configuration. For simplicity, I am using the `VolatileMemoryStore` embeddings store, but I could have used one of the other stores with the available connectors. This would only require me to include the right NuGet package and provide a configuration. These abstractions remind me a bit of Dapr. Maybe in the future we’ll see some kind of symbiosis, but even now these abstractions make it easier to leverage all those different models and external stores.
- Templatizing semantic functions: it is entirely possible and easy to define a semantic function in code, but templates simplify this even further. I used this approach to implement the MemoryQuery semantic function. It queries the Kernel Memory using the `recall` function of the built-in `TextMemorySkill` and asks Azure OpenAI to generate a completion based on the retrieved data and the user prompt. While this is quite straightforward, I lacked the ability to debug such templatized semantic functions that invoke other skills (or plugins; Semantic Kernel is not very consistent on the naming at the moment). To understand what was going on and to see the results of the `recall` function, I actually ended up rebuilding everything in code until I understood what the issue was, and reverted back to using the template thereafter.
- Semantic Kernel usage: I used a handful of Semantic Kernel features to develop this ChatGPT plugin, but it is much more powerful than that. Chat Copilot is a great example of a more complex application that uses such core concepts of Semantic Kernel as the Planner. This shows that Semantic Kernel can be used to build a variety of different application types. It also encourages building modular applications consisting of multiple skills or plugins that work in conjunction and are orchestrated by the Kernel.
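A templatized semantic function that uses `TextMemorySkill.recall` is essentially just a prompt file. Here is a minimal sketch: the `{{recall $input}}` / `{{$input}}` placeholders use the Semantic Kernel template syntax, while the wording of the prompt itself is made up for illustration (the actual MemoryQuery prompt lives in the repo):

```
Use only the following facts retrieved from memory to answer the question.
If the facts are not sufficient, say that you don't know.

Facts: {{recall $input}}

Question: {{$input}}

Answer:
```

When the function is invoked, the kernel first executes `recall` against the memory store to fill in the facts, then sends the rendered prompt to the completion model.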
Using Azure Advisor API
This is a brief summary of my learnings from using the Azure Advisor APIs:
- Recommendations: I was happy to see that recommendations can be queried using the `ArmClient` as part of the preview NuGet package `Azure.ResourceManager.Advisor`. Soon I realized that some of the properties (e.g., `properties.learnMoreLink`, `properties.potentialBenefits`, `properties.remediation`, and others) that should be part of the response weren’t there. I used Postman to confirm that. Some of this information is available in the portal, though. Sometimes there are discrepancies between the data returned by the API and the state shown in the Azure portal (maybe caching issues?). These issues resulted in a reduced amount of data for user queries. An alternative approach could be to use Azure Resource Graph queries for Azure Advisor.
- Scores: this API is not yet available in the `ArmClient`, so I worked with the `HttpClient` directly. The only issue here is that the API response contains a bunch of unknown categories with GUIDs for names in addition to the expected ones (Cost, Security, Performance, Operational Excellence, High Availability, and Advisor).
- Cost Savings: to get this information I used the `ArmClient` to submit this Azure Resource Graph query. It worked quite well. The only issue is that loading the tenants takes quite a lot of time. This may be a design issue, as I decided not to include the `tenantId` as a parameter of this plugin operation in favor of improving the user experience (less information to type/include in the prompt).
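As a sketch of calling the scores endpoint directly with `HttpClient`: the resource path and `api-version` below are assumptions based on the public ARM REST conventions and should be double-checked against the current Advisor REST reference, and the subscription ID is a placeholder:

```csharp
using Azure.Core;
using Azure.Identity;

// Acquire an ARM token and call the advisorScore endpoint directly,
// since the scores API is not surfaced by ArmClient yet.
var credential = new DefaultAzureCredential();
AccessToken token = await credential.GetTokenAsync(
    new TokenRequestContext(new[] { "https://management.azure.com/.default" }));

using var http = new HttpClient();
http.DefaultRequestHeaders.Authorization =
    new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", token.Token);

string subscriptionId = "<subscription-id>";
// NOTE: the api-version is an assumption; check the Advisor REST reference.
string url = $"https://management.azure.com/subscriptions/{subscriptionId}" +
             "/providers/Microsoft.Advisor/advisorScore?api-version=2023-01-01";

string json = await http.GetStringAsync(url);
// The response contains the expected categories (Cost, Security, ...)
// plus the GUID-named entries mentioned above, which need filtering.
```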
Conclusion
Hacking with Semantic Kernel and Azure OpenAI was a lot of fun and helped me realize which areas deserve additional attention. Going forward I plan on mitigating the identified limitations and trying out this plugin directly with ChatGPT (vs. Chat Copilot).