This links section is inspired by the ones from my favourite bloggers such as gwern, guzey, or nintil. It is a semi-up-to-date list of my most interesting reads from the last few months.
June 2025
- Thiel’s infamous essay from 2009:
  - “I stand against confiscatory taxes, totalitarian collectives, and the ideology of the inevitability of the death of every individual.”
  - “The fate of our world may depend on the effort of a single person who builds or propagates the machinery of freedom that makes the world safe for capitalism.”
  - https://www.cato-unbound.org/2009/04/13/peter-thiel/education-libertarian/
February 2025
October 2023
- Phi-1.5 Model: A Case of Comparing Apples to Oranges?
- Flash-Decoding for long-context inference
- RingAttention
  - https://arxiv.org/abs/2310.01889
  - The urge to go full Tri Dao et al. and port that thing from JAX to a CUDA/Triton kernel…
  - This would not only let RingAttention scale the sequence length with the number of devices used during training, but potentially also achieve a higher Model FLOPs Utilization than FlashAttention-2 by computing the full transformer block in a blockwise manner in one kernel (see the sketch after this list)
  - You could fine-tune a CodeLLaMA 7B to a 4-million-token context window with just 32x A100s to literally fit every code repository in the context…
  - It’s time to be a definite techno-optimist
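Since the pitch above boils down to "compute attention block by block so the full attention matrix is never materialised", here is a minimal single-device JAX sketch of blockwise attention with an online softmax. RingAttention then distributes exactly this computation by sharding the sequence and rotating key/value blocks around the device ring; that collective part is omitted here, and all shapes, names, and the block size are illustrative rather than taken from the paper's code.

```python
# Minimal single-device sketch of blockwise attention with a streaming
# ("online") softmax. RingAttention additionally shards the sequence across
# devices and rotates key/value blocks between them; that part is omitted.
import jax
import jax.numpy as jnp

def blockwise_attention(q, k, v, block_size=128):
    """softmax(q k^T / sqrt(d)) v computed one key/value block at a time.

    q, k, v: [seq_len, head_dim]. The full [seq_len, seq_len] attention
    matrix is never materialised; only a [seq_len, block_size] slab exists
    at any moment.
    """
    seq_len, head_dim = q.shape
    scale = 1.0 / jnp.sqrt(head_dim)

    # Running statistics of the online softmax: max logit, normaliser,
    # and unnormalised weighted sum of values, per query position.
    m = jnp.full((seq_len,), -jnp.inf)
    l = jnp.zeros((seq_len,))
    o = jnp.zeros((seq_len, head_dim))

    for start in range(0, seq_len, block_size):
        k_blk = k[start:start + block_size]
        v_blk = v[start:start + block_size]

        s = (q @ k_blk.T) * scale                # [seq_len, block] scores
        m_new = jnp.maximum(m, s.max(axis=-1))   # updated running max
        p = jnp.exp(s - m_new[:, None])          # weights for this block
        correction = jnp.exp(m - m_new)          # rescale old statistics
        l = l * correction + p.sum(axis=-1)
        o = o * correction[:, None] + p @ v_blk
        m = m_new

    return o / l[:, None]

# Sanity check against the naive quadratic implementation.
key = jax.random.PRNGKey(0)
q, k, v = (jax.random.normal(subkey, (512, 64)) for subkey in jax.random.split(key, 3))
reference = jax.nn.softmax(q @ k.T / jnp.sqrt(64.0)) @ v
assert jnp.allclose(blockwise_attention(q, k, v), reference, atol=1e-4)
```

The running max/normaliser trick keeps per-block memory at O(seq_len · block_size); shard the key/value blocks across N devices and the same loop lets the total sequence length grow with N, which is the scaling claim above.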
June 2023
- Large Language Models can Simulate Everything
- Large Language Models as Tool Makers
- Blockwise Parallel Transformer for Long Context Large Models
May 2023
April 2023
- Scaffolded LLMs are not just cool toys but actually the substrate of a new type of general-purpose natural language computer
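To make the "general-purpose natural language computer" framing concrete, below is a toy scaffold loop in which the LLM plays the processor, the accumulated context plays working memory, and the surrounding Python is the program that dispatches tool calls; `call_llm` and the single `calc` tool are hypothetical placeholders, not any particular library's API.

```python
from typing import Callable

def run_scaffold(call_llm: Callable[[str], str], task: str, max_steps: int = 8) -> str:
    """Toy "natural language computer": the model is the CPU, `memory` is the RAM,
    and this loop is the program. `call_llm` is a hypothetical prompt -> text function."""
    memory = [f"TASK: {task}"]                                            # working memory / context
    tools = {"calc": lambda expr: str(eval(expr, {"__builtins__": {}}))}  # toy arithmetic tool

    for _ in range(max_steps):
        prompt = "\n".join(memory) + "\nNext action (TOOL:<name> <arg> or FINAL:<answer>):"
        action = call_llm(prompt).strip()                                 # one "instruction cycle"
        if action.startswith("FINAL:"):
            return action.removeprefix("FINAL:").strip()
        if action.startswith("TOOL:"):
            name, _, arg = action.removeprefix("TOOL:").strip().partition(" ")
            result = tools.get(name, lambda _a: "unknown tool")(arg)
            memory.append(f"OBSERVATION: {result}")                       # write the result back
        else:
            memory.append(f"THOUGHT: {action}")                           # keep intermediate reasoning
    return "no answer within step budget"

# Usage with a scripted stand-in for the model:
scripted = iter(["TOOL:calc 21*2", "FINAL: 42"])
print(run_scaffold(lambda _prompt: next(scripted), "What is 21*2?"))  # -> 42
```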
March 2023