Latest Posts (6 found)
wh. Yesterday

SFT, RL, and On-Policy Distillation Through a Distributional Lens

I have been thinking about post-training methods in terms of distributions. A language model is a distribution over sequences. When we post-train it to teach it a task, we are reshaping this distribution. Post-training methods differ in how they reshape it, what they treat as the target, and how directly they define that target. This is neither a very precise statement nor is it meant to be fully rigorous.

wh. 3 weeks ago

Coding Models Are Doing Too Much

Code for this post is available here. AI-assisted coding has become the norm, and with tools like Cursor, GitHub Copilot, Claude Code, and Codex, we are increasingly letting models touch our code. If you have used any of these tools in the past year, you have probably experienced something like this: you ask the model to fix a simple bug (perhaps a single off-by-one error, or a wrong operator). The model fixes the bug, but half the function has been rewritten.

wh. 6 months ago

Evaluating Long Context (Reasoning) Ability

Pass@1 scores on the 128k subset of LongCodeEdit. Reasoning models and long agent trajectories are eating up valuable space in the context window. In response, models are being released with ever-increasing context windows; the latest, Grok 4 Fast, has a 2 million token window. Unfortunately, as anyone who has worked with these models knows, the number of tokens a model can accept as input is not the same as the number of tokens it can reason over.

wh. 9 months ago

Flow Matching in 5 Minutes

In this post, I will try to build an intuitive understanding of Flow Matching, a framework used to train many state-of-the-art generative image models. In generative modelling, we start with two probability distributions: (1) an easily sampled source distribution $p_{\text{source}}$ (e.g. a Gaussian) and (2) a target distribution $p_{\text{target}}$ containing data points (e.g. images). Our goal is to transform a point sampled from $p_{\text{source}}$ into a point that could plausibly have been sampled from $p_{\text{target}}$.
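The setup above can be made concrete with a tiny sketch. This is my own illustration (not code from the post), assuming the common linear-interpolation path between a source sample and a data point, where the regression target for the velocity field is simply their difference:

```python
import numpy as np

rng = np.random.default_rng(0)

# x0 ~ p_source: an easily sampled Gaussian point.
x0 = rng.standard_normal(2)
# x1 ~ p_target: a stand-in "data point" (hypothetical values).
x1 = np.array([3.0, -1.0])

t = 0.5  # a time along the path from source (t=0) to target (t=1)

# Linear interpolation path between the two samples.
x_t = (1.0 - t) * x0 + t * x1
# Under this path, the conditional velocity target is constant: x1 - x0.
v_target = x1 - x0

print(x_t, v_target)
```

A model trained to predict `v_target` from `(x_t, t)` can then transport source samples toward the target distribution by integrating the learned velocity field.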

wh. 1 year ago

The State of Generative Models

In the face of disruptive technologies, moats created by closed source are temporary. Even OpenAI’s closed source approach can’t prevent others from catching up. So we anchor our value in our team — our colleagues grow through this process, accumulate know-how, and form an organization and culture capable of innovation. That’s our moat. - Liang Wenfeng, CEO of DeepSeek. 2024 has been a great year for AI. In both text and image generation, we have seen tremendous step-function-like improvements in model capabilities across the board.

wh. 1 year ago

Taking PyTorch for Granted

A while back I challenged myself to implement micrograd in Rust using only the standard library. Along the way, I thought it would be fun to attempt to implement a fully functioning Tensor library on top of micrograd. I assumed my familiarity with PyTorch would make this easier, but doing it without the higher-level abstractions of Python turned out to be much harder than expected. In this post, I hope to share some of what I learned throughout this process, which forced me to think deeply about how PyTorch actually works under the hood.
