How "Thinking" Models Actually Work
Lilian Weng's Why We Think is a survey of test-time compute and chain-of-thought reasoning. Here's what I pulled out of it.
I log things I've learned and things I find interesting here. For long-form content, check out my newsletter.
I think we are still too loose with the phrase “world model”.
Quantisation is really a precision-allocation problem.
A study of 52 developers found that using AI to learn a new Python library led to worse comprehension scores, with no speed improvement. Here's what actually works.