DEV Community

# llm

Posts

🛠 Local LLM Ops 2025: A Developer's Guide to Running Pocket-Sized Neural Networks

2 min read
The Hidden Switchboard Behind vLLM Attention

10 min read
Let’s talk about: Goose!

15 min read
The Orphan Axiom Problem in Ontology-Based RAG

6 min read
⚙️ One Tool, Many Brains: Building a Multi-Model DevOps Architect

7 min read
A Guide to HITL, HOTL, and HOOTL Workflows

3 min read
How to Implement Observability for AI Agents with LangGraph, OpenAI Agents, and Crew AI

6 min read
Optimal Chunking for Ontology RAG: Empirical Analysis & Orphan Axiom Problem

12 min read
How to Build Multi-Provider Failover Strategies with Bifrost for Ultra‑Reliable AI Applications

8 min read
Semantic Caching with Bifrost: Reduce LLM Costs and Latency by Up to 70%

7 min read
Dec 19, 2025 | The Tongyi Weekly: Your weekly dose of cutting-edge AI from Tongyi Lab

5 min read
📌 Most models use Grouped Query Attention. That doesn’t mean yours should.📌

1 min read
How to Evaluate Your RAG System: A Complete Guide to Metrics, Methods, and Best Practices

18 min read
Mooncake Memory Deep Dive: KVCache, Token Cost, DRAM Usage, and Saturation Analysis

5 min read
How to Use Synthetic Data to Evaluate LLM Prompts: A Step-by-Step Guide

8 min read
I Didn’t Build a Chatbot — I Built an AI That Runs the System

2 min read
A/B Testing Prompts: A Complete Guide to Optimizing LLM Performance

7 min read
Local Lock Down Lobe Chat Setup

4 min read
OWL-Aware Chunking Strategies: A Comprehensive Performance Analysis

12 min read
Why 83% of RAG Hallucination Detection Tools Fail in Production

3 min read
The Transformer Architecture: A Deep Dive into How LLMs Actually Work

25 min read
Beyond Sharper Images: How LLM-Guided Super-Resolution Transforms Geo-Spatial Analysis

7 min read
Routing, Load Balancing, and Failover in LLM Systems

3 min read
I Built an ETL Pipeline That Actually Thinks and Cut Token Costs by 52% (And Here's What I Learned)

17 min read
How to Make Your AI Strictly Follow Rules: Building a Robust Rule System

3 min read