LLMs

Here I put all my thoughts, my research, and things I've learned and read about LLMs.

Resources

LLMs so far are described either as reasoning models or as models that are good for creative tasks and agentic planning.

Reasoning model with 671 billion parameters, released under the MIT License.

Another factor of 10 might be the difference between an undergraduate and a PhD skill level.

improvements to model architecture (tweaks to the Transformer that underlies today's models)
engineering improvements: finding a way to run the model more efficiently on the underlying hardware
CM = "compute multiplier". Frontier AI companies are able to find those compute multipliers
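To make the "compute multiplier" idea concrete for myself, here is a minimal sketch. It assumes that multipliers from independent improvements simply multiply together, which is my simplification and not something the notes above establish. The function name `effective_compute` and all the numbers are hypothetical, chosen only for illustration.

```python
from math import prod
from typing import Iterable


def effective_compute(raw_compute: float, multipliers: Iterable[float]) -> float:
    """Scale raw compute by the product of all compute multipliers.

    Assumption: multipliers combine multiplicatively and independently.
    """
    return raw_compute * prod(multipliers)


# Hypothetical numbers, for illustration only.
raw = 1.0          # one unit of baseline training compute
cms = [2.0, 1.5]   # e.g. an architecture tweak and a hardware-efficiency gain

print(effective_compute(raw, cms))  # 3.0
```

Under this assumption, a 2x and a 1.5x multiplier together behave like having 3x the raw compute, so the same result could be reached with a third of the hardware budget.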

From 2020-2023 the main thing being scaled was pretrained models
In 2024 the idea of using reinforcement learning (RL) to train models to generate chains of thought became the new focus of scaling
The new paradigm involves starting with the ordinary type of pretrained model, and then, as a second stage, using RL to add the reasoning skills
As of 2025 we are at a unique "crossover point" where there is a powerful new paradigm that is early on the scaling curve and therefore can make big gains quickly.

The "developer loop" might change substantially. I.e., today if you're doing a large task you might do something like:

Whereas in the future/soon/now it might be more like:

Work with PM (and maybe the AI) to figure out what they want
Work with AI and your technical stakeholders to translate that PM plan into a technical plan
Have AI implement a large fraction of the technical plan via chaos coding
Do some cleanup on the PR

Median 5%, average 14.4%, from AI CEOs' and researchers' responses.