Navigate the Cloud Universe
StackLens is your intelligent companion for DevOps, SecOps, ML, and AI Engineering. Search hand-picked resources across the technical ecosystem.
AI Engineering Resources
Anthropic's Claude 3.5 Sonnet sets new industry benchmarks
The latest model from Anthropic outperforms competitors in coding, reasoning, and visual processing tasks.
OpenAI launches SearchGPT, a prototype of new AI search features
OpenAI is testing a new AI-powered search engine that aims to give users fast and timely answers with clear and relevant sources.
AWS Announces New Generative AI Capabilities for Amazon Bedrock
Amazon Bedrock now includes new models and features to help customers build and scale generative AI applications more easily.
A 9-point eval gain vanished when we deduped train against test
TL;DR: We fine-tuned an 8B model for an enterprise ticket-routing task and saw accuracy jump from 71%...
Winograd convolutions cost us 2 mAP and we didn't notice for a month
TL;DR: We turned on Winograd convolution to shave latency off a pedestrian detector running on a...
nvidia-smi Reports 97% Utilization While the GPU Sits Idle
TL;DR: A GPU shows 97% utilization in nvidia-smi, but training throughput is a...
Quantization formats compared: GGUF vs GPTQ vs AWQ vs NF4
A practical comparison of the four major LLM weight quantization formats: which one to use for CPU, GPU serving, and fine-tuning, with current version numbers and deployment guidance.
I Processed 2.4 Billion Tokens Across 52 AI Models for $0.52. Here's the Full Breakdown.
I run a production multi-agent AI system on a single M1 Mac in Jamaica. 6 autonomous agents. 26 cron...
What DevOps Taught Me About AI Governance
Platform engineers already have the governance instincts AI adoption needs. The gap isn't knowledge. It's the organizational decision to apply them.
OpenAI Already Told Us the Kubernetes Scaling Story, Most People Just Did Not Read It Closely
Series links: Part 1: Everything You Know About Scaling Web Apps Breaks When You Serve an LLM. Part...