
C# and the AI tooling renaissance: Practical libraries you should try


Ever feel like every week there’s a new AI library promising to save you hours of work? You’re not alone. I spend a fair bit of time trying things out so you don’t have to. In this post I summarise the most useful, practical C# tooling I’ve used or researched recently and show tiny, runnable examples so you can try them quickly.


Introduction / context

You can reasonably build modern AI features in C# today without stepping outside the .NET ecosystem. There are two common trajectories:

  1. Call hosted models in the cloud (Azure OpenAI or OpenAI) and orchestrate them from your application.
  2. Run models locally (ONNX Runtime, TorchSharp) so data never leaves your infrastructure.

Both are valid. Which you choose depends on latency, privacy, and cost.

I’ll be pragmatic and honest: I’ll point out where a library is mature and where it still feels experimental. Try the small examples below and adapt them as you see fit. I’ve used several of these on and off in projects for a while now.

Quick checklist: what I picked and why

  - Microsoft.SemanticKernel: prompt orchestration and agent-style flows inside .NET
  - Microsoft.Extensions.AI and Azure.AI.OpenAI: Azure-first model access and embeddings
  - OpenAI .NET SDKs: direct OpenAI access with idiomatic async C#
  - LangChain.NET and AutoGen.Net: higher-level chains and agent patterns (still maturing)
  - ONNX Runtime and TorchSharp: local inference when privacy or cost rules out the cloud
  - Vector stores (Redis, Qdrant, Pinecone): fast similarity search for RAG and semantic search

Packages and how they fit together

Below I summarise each option with a quick example and a short note on where it fits.

Microsoft.SemanticKernel

What it is: a Microsoft-built SDK for composing prompts, chaining prompts with code, and building agent-like flows in .NET. It’s great when you want to call models but keep logic and tooling inside your application.

Why use it: opinionated patterns for building Copilot-style assistants and good integration with Azure and local models.

NuGet:

dotnet add package Microsoft.SemanticKernel

Illustrative usage (very small):

// Build a kernel with an OpenAI chat completion service and run a prompt (illustrative)
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(
    modelId: "gpt-4o-mini",
    apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY"));
var kernel = builder.Build();

var result = await kernel.InvokePromptAsync("Write a short summary of Dependency Injection in C#");
Console.WriteLine(result);

Notes: Semantic Kernel is richer than this snippet shows; it provides memory, functions and plan-based orchestration.
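
To make the "functions" part concrete, here is a small sketch of registering a native plugin and invoking it directly. This assumes Semantic Kernel 1.x; the TimePlugin class and its method are my own illustration, not part of the SDK.

using System.ComponentModel;
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();
builder.Plugins.AddFromType<TimePlugin>();   // register the plugin under its class name
var kernel = builder.Build();

// Invoke the native function directly; the same function can also be exposed to a model for tool calling
var now = await kernel.InvokeAsync("TimePlugin", "Now");
Console.WriteLine(now);

// Hypothetical plugin: public methods marked [KernelFunction] become callable kernel functions
public class TimePlugin
{
    [KernelFunction, Description("Returns the current UTC time")]
    public string Now() => DateTime.UtcNow.ToString("R");
}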

Microsoft.Extensions.AI and Azure.AI.OpenAI (Azure-first)

What they are: Microsoft.Extensions.AI is a set of .NET abstractions for AI services; Azure.AI.OpenAI is the official Azure client for OpenAI-style models on Azure.

Why use them: if you host models with Azure (OpenAI or Azure AI Model Catalog), these packages give you a safe, .NET-idiomatic way to interact with models and embeddings.

NuGet:

dotnet add package Azure.AI.OpenAI
dotnet add package Microsoft.Extensions.AI

Embeddings example (Azure OpenAI via the Azure SDK):

using Azure;
using Azure.AI.OpenAI;

var endpoint = new Uri("https://your-azure-openai-endpoint");
var client = new OpenAIClient(endpoint, new AzureKeyCredential(Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")));

var input = "The quick brown fox jumps over the lazy dog";
var embeddingsResponse = await client.GetEmbeddingsAsync("text-embedding-3-small", new EmbeddingsOptions(input));
var vector = embeddingsResponse.Value.Data[0].Embedding;

Console.WriteLine($"Vector length: {vector.Count}");

Check docs: Azure SDK method names and types evolve; consult the official Azure.AI.OpenAI docs for exact signatures and model IDs.

OpenAI .NET SDKs (official & community)

What they are: client libraries for calling OpenAI’s APIs from .NET. There’s an official library maintained for .NET and a few community alternatives.

Why use them: if you want direct OpenAI usage (not Azure) with idiomatic async C#.

NuGet (example):

dotnet add package OpenAI

Tiny example (HTTP-style flow):

// A lightweight, package-agnostic approach using HttpClient to call the REST API
using System.Net.Http;
using System.Net.Http.Headers;
using System.Net.Http.Json;

var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", Environment.GetEnvironmentVariable("OPENAI_API_KEY"));

// gpt-4o-mini is a chat model, so call the chat completions endpoint with a messages array
var payload = new
{
    model = "gpt-4o-mini",
    messages = new[] { new { role = "user", content = "Explain SOLID in plain English" } },
    max_tokens = 300
};
var res = await http.PostAsJsonAsync("https://api.openai.com/v1/chat/completions", payload);
var json = await res.Content.ReadAsStringAsync();
Console.WriteLine(json);

This is the most portable approach and helps you understand the underlying API when things go sideways.
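
If you prefer the official package over raw HTTP, the chat surface looks roughly like this. This assumes the 2.x OpenAI library; check the package README for the current types, as the surface has changed between versions.

using OpenAI.Chat;

// ChatClient wraps the chat completions endpoint for a single model
var chat = new ChatClient("gpt-4o-mini", Environment.GetEnvironmentVariable("OPENAI_API_KEY"));

// A plain string is sent as a user message
var completion = await chat.CompleteChatAsync("Explain SOLID in plain English");
Console.WriteLine(completion.Value.Content[0].Text);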

LangChain.NET and AutoGen.Net / Orchestrators

What they are: .NET ports / implementations of orchestration frameworks (LangChain patterns, AutoGen). They provide higher-order constructs like chains, tool-using agents, and memory management.

Why use them: if you want pre-built patterns for retrieval chains, QA assistants, or multi-step agent workflows.

Status note: some of these projects are early-stage; check maturity before committing to them in production.
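
To see what these frameworks formalise, remember that a "chain" is just composed async steps passing context along. The sketch below is plain C# with my own delegate names, not any framework's API; the orchestrators add memory, retries, tool selection and tracing on top of this shape.

// retrieve -> build prompt -> complete, expressed as plain delegates
Func<string, Task<string[]>> retrieve = async query =>
{
    // e.g. embed the query and fetch the nearest documents from a vector store
    await Task.Yield();
    return new[] { "doc snippet 1", "doc snippet 2" };
};

Func<string, string[], string> buildPrompt = (query, docs) =>
    $"Context:\n{string.Join("\n---\n", docs)}\n\nQuestion: {query}";

Func<string, Task<string>> complete = async prompt =>
{
    // call whichever LLM client you have chosen
    await Task.Yield();
    return "(model answer)";
};

var question = "How do I configure retries?";
var answer = await complete(buildPrompt(question, await retrieve(question)));
Console.WriteLine(answer);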

ONNX Runtime and TorchSharp: run models locally

What they are: runtimes for running ML models locally in .NET. Microsoft.ML.OnnxRuntime is stable and great for optimised inference. TorchSharp exposes libtorch for .NET and is useful for running certain models (Llama family support via community tooling).

NuGet:

dotnet add package Microsoft.ML.OnnxRuntime
dotnet add package TorchSharp

ONNX Runtime example (inference skeleton):

using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

using var session = new InferenceSession("path/to/model.onnx");

// Create a zero-filled input tensor matching the model's expected input name and shape, then run
var inputs = new List<NamedOnnxValue> {
    NamedOnnxValue.CreateFromTensor("input", new DenseTensor<float>(new[] { 1, 3, 224, 224 }))
};

using var results = session.Run(inputs);

// Read the first output back as a tensor of floats
var output = results.First().AsTensor<float>();
Console.WriteLine($"Output length: {output.Length}");

Notes: converting large LLMs to ONNX and getting good performance takes work; for many workloads (embeddings, smaller specialised models) ONNX is an excellent choice.
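
On the TorchSharp side, the entry point is a tensor API that mirrors PyTorch. A tiny sketch of what that feels like (illustrative only; real LLM inference needs weights, a tokeniser and a sampling loop on top):

using TorchSharp;
using static TorchSharp.torch;

// Create two random tensors and multiply them, much as you would in PyTorch
var x = rand(2, 3);
var w = rand(3, 4);
var y = x.matmul(w);

Console.WriteLine($"Result shape: {string.Join("x", y.shape)}"); // 2x4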

Vector stores: Redis (NRedisStack / Redis OM), Qdrant, Pinecone

What they are: specialised stores for high-dimensional vectors with similarity search. Redis offers modules for vector search; Qdrant and Pinecone are popular vector-first DBs with .NET clients.

Why use them: fast similarity search for RAG, semantic search and recommendations.

Quick install examples:

dotnet add package NRedisStack
dotnet add package Qdrant.Client
dotnet add package Pinecone.Client

Example flow (conceptual):

  1. Create embedding for a document with Azure.AI.OpenAI or OpenAI SDK
  2. Upsert vector into your vector DB (Redis / Qdrant / Pinecone)
  3. Query nearest neighbours for a user query and use retrieved documents as prompt context

Small conceptual snippet for upsert/query (pseudo-C#):

var embedding = await GetEmbeddingAsync(text);
await vectorDb.UpsertAsync(id, embedding, metadata);
var neighbours = await vectorDb.SearchAsync(queryEmbedding, topK:5);

Each vector store has its own API; check the client docs for exact method names and types.
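
If you just want to prototype the flow before picking a store, brute-force cosine similarity over an in-memory list is enough for small corpora. This is my own throwaway helper, not any client's API; a real vector store adds indexing (HNSW and friends), metadata filtering and persistence.

// Toy corpus: (id, embedding) pairs produced by whatever embedding model you use
var documents = new List<(string Id, float[] Vector)>
{
    ("doc-1", new[] { 0.1f, 0.9f, 0.0f }),
    ("doc-2", new[] { 0.8f, 0.1f, 0.1f }),
};

var query = new[] { 0.2f, 0.8f, 0.0f };

// Brute-force nearest neighbours by cosine similarity
var neighbours = documents
    .Select(d => (d.Id, Score: CosineSimilarity(d.Vector, query)))
    .OrderByDescending(d => d.Score)
    .Take(2);

foreach (var (id, score) in neighbours)
    Console.WriteLine($"{id}: {score:F3}");

static float CosineSimilarity(float[] a, float[] b)
{
    float dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}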

Practical snippets: a quick RAG example

Below is a compact end-to-end sketch of a retrieval-augmented generation flow using Azure OpenAI for embeddings, Redis (NRedisStack) for vectors and an LLM for completion.

// 1. Create embedding for a document
var openAiClient = new OpenAIClient(new Uri(azureEndpoint), new AzureKeyCredential(azureKey));
var embedResp = await openAiClient.GetEmbeddingsAsync("text-embedding-3-small", new EmbeddingsOptions("Important text to store"));
var docVector = embedResp.Value.Data[0].Embedding;

// 2. Upsert into Redis (pseudocode; follow client docs)
// await redisClient.VectorUpsertAsync("docs_index", docId, docVector, metadata);

// 3. For a query: create query embedding and search
var qEmb = (await openAiClient.GetEmbeddingsAsync("text-embedding-3-small", new EmbeddingsOptions("How do I do X?"))).Value.Data[0].Embedding;
// var hits = await redisClient.VectorSearchAsync("docs_index", qEmb, topK:5);

// 4. Build a prompt using top hits and call the LLM for completion
// var context = string.Join("\n---\n", hits.Select(h => h.Metadata.text));
// var completion = await llmClient.CreateCompletionAsync(new { model = "gpt-4o-mini", prompt = $"Use the following context:\n{context}\nAnswer: {userQuery}" });

This is intentionally compact; the real work is in metadata, chunking strategy, prompt engineering and handling model costs and latency.
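
Chunking in particular is worth getting right before tuning anything else. A minimal fixed-size chunker with overlap makes a reasonable baseline; the helper and the file path below are my own illustration, and production pipelines usually split on sentence or paragraph boundaries and respect the embedding model's token limit rather than character counts.

// Embed each chunk separately and store it with metadata (source file, offset, etc.)
foreach (var chunk in Chunk(File.ReadAllText("docs/article.md")))
{
    Console.WriteLine($"chunk of {chunk.Length} chars");
}

static IEnumerable<string> Chunk(string text, int chunkSize = 1000, int overlap = 200)
{
    if (overlap >= chunkSize)
        throw new ArgumentException("overlap must be smaller than chunkSize");

    for (var start = 0; start < text.Length; start += chunkSize - overlap)
    {
        yield return text.Substring(start, Math.Min(chunkSize, text.Length - start));
    }
}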

Where this fits in a real project

In a typical product the flow above sits behind two pieces: an ingestion job that chunks and embeds your documents into the vector store, and a query path that embeds the user's question, retrieves context and calls the model. The libraries above cover every step of that pipeline from .NET.

Edge cases to watch for:

  - Chunking and metadata: poor chunk boundaries or missing metadata hurt retrieval quality more than model choice does.
  - Costs and latency: every query adds an embedding call plus a completion; measure and budget for both.
  - Privacy and data residency: check where your model provider and vector store actually process and keep data.
  - SDK churn: several of these packages evolve quickly; pin versions and re-check signatures when you upgrade.

Comparison: cost, latency, privacy and maturity

Below is a compact table to help you pick a stack based on the constraints that matter to you.

| Stack | Cost | Latency | Privacy | Maturity |
| --- | --- | --- | --- | --- |
| Cloud (Azure OpenAI / OpenAI) | Medium–High: per-call charges; pay-as-you-go | Low–Medium: depends on region and model size | Low–Medium: provider processes data; enterprise isolation possible | High: stable SDKs and SLAs |
| Cloud + Semantic Kernel / Microsoft.Extensions.AI | Medium–High: underlying model costs remain | Low–Medium: small orchestration overhead | Low–Medium: same provider caveats; better integration on Azure | High: Microsoft-supported patterns |
| Managed Vector DBs (Pinecone, Qdrant SaaS) | Medium: storage and query billing | Low: optimised for similarity search | Medium: vendor hosts data; check residency policies | Medium–High: Pinecone mature; Qdrant rapidly maturing |
| Self-hosted Vector DBs (Redis, Qdrant self-hosted) | Low–Medium: infra and ops costs | Low: fast when deployed near your app | High: you control residency and access | High for Redis; Qdrant adoption growing |
| Local models (ONNX Runtime, TorchSharp) | Low–Medium: no per-call fees; hardware costs apply | Very low with GPU; slower on CPU | Very high: data stays on your infra | Medium: ONNX stable; local LLMs need engineering |
| Orchestration / Chains (LangChain.NET, AutoGen.Net) | Varies: depends on model and storage | Varies: orchestration adds overhead | Varies: depends on hosting | Early–Medium: .NET ports maturing |

Quick recommendations:

  - Moving fast and comfortable with the cloud: Azure OpenAI (or OpenAI directly) plus Semantic Kernel is the most mature path.
  - Privacy-sensitive workloads: run models locally with ONNX Runtime or TorchSharp and pair them with a self-hosted vector store such as Redis or Qdrant.
  - RAG and semantic search: pick a vector store to match your ops appetite; managed (Pinecone, Qdrant SaaS) for convenience, self-hosted for control.
  - Orchestration frameworks (LangChain.NET, AutoGen.Net): handy for prototypes, but check maturity before relying on them in production.

Concluding Remark

There’s no single correct stack. If you want to move quickly and are comfortable with the cloud, start with Azure + Semantic Kernel. If you need to run local models for privacy, look into ONNX and TorchSharp and combine them with a vector store.


