skyllm: Cheap On-Demand Cloud LLMs for When Local Isn't Enough
I’ve written previously about running LLMs locally on my RTX 3060. Local is great when the model fits, but my 12GB ceiling rules out a lot of interesting models. The one that fi...
What I Learned Making a Local LLM Do Real Work
In my previous post, I described building an AI agent for Harvest time tracking using Pydantic AI — driven partly by security concerns with the skill-based approach. The agent w...
From Skill to Agent: When a Text File Isn't Enough
A coworker of mine built a Go CLI for the Harvest time-tracking API. It’s a solid tool, and I wanted to make it even easier to use from Claude Code. So I wrote a skill — essenti...
Running a Local Coding Agent with Qwen3-Coder-Next
Open-source coding models have gotten seriously good. Qwen3-Coder-Next is an 80B-parameter Mixture-of-Experts model that activates only 3B parameters per token, and according to...
