Weekly Extract

The LLM week, compressed.

A focused weekly brief of the AI model, research, safety, and product updates worth reading. Built from LLMgram's canonical AI Signal pipeline, ranked for source quality, event relevance, and usefulness to builders. Click any item to open its full AI Signal card without leaving LLMgram.

10 signals selected
7d ranking window
May 25, 2026 · 23:45 UTC generated
fresh source AI Signal data
May 25, 2026 · 22:02 UTC source refreshed
Top 10 This Week
01
arXiv cs.CL · model · May 25, 2026

Memorization Dynamics of Fill-in-the-Middle Pretraining

arXiv:2605.22981v1. Abstract: Fill-in-the-middle (FIM) is a pretraining objective widely used to equip causal language models with infilling ability, yet its effect on verbatim memorization remains underexplored. We study…

02
Simon Willison · model · May 19, 2026

llm-gemini 0.32

Release: llm-gemini 0.32. New model gemini-3.5-flash for Gemini 3.5 Flash. See also my notes on Gemini 3.5 Flash, and the pelican I drew using this upgrade to the plugin. Tags: llm, gemini

03
Google AI · model · May 19, 2026

Gemini 3.5: frontier intelligence with action

At Google I/O we released Gemini 3.5, our latest series of models combining frontier intelligence with action.

04
Gary Marcus · model · May 22, 2026

I have to eat crow on this, in light of further information. whatever OpenAI spent on Erdos using a new model, apparently you can get GPT 5.5 to do something similar; @emollick’s presumably estimates more or less apply there (even if not…

05
OpenAI · model · May 20, 2026

How Ramp engineers accelerate code review with Codex

How Ramp engineers use Codex with GPT-5.5 to review code and ship improvements, allowing them to get substantive feedback in minutes instead of hours.

06
Qwen (X) · model · May 25, 2026

✅Implicit caching is now live on Qwen3.7-Max — kicks in automatically, no setup needed. ⚡️Faster + cheaper out of the box. Need higher, more deterministic hit rates? Try explicit caching instead. 🙌 🔗Best practices 🔗 : alibabacloud.com/help…

07
arXiv cs.AI · model · May 21, 2026

Evaluating the Utility of Personal Health Records in Personalized Health AI

arXiv:2605.18937v1. Abstract: Patient-managed Personal Health Records (PHRs) promise to empower patients to better understand their health, but information in the record is complex, potentially hindering insights. In thi…

08
arXiv cs.LG · model · May 20, 2026

Compositional Literary Primitives in Instruction-Tuned LLMs: Cross-Architectural SAE Features for Self, Style, and Affect

arXiv:2605.18808v1. Abstract: We characterize a compositional architecture of literary primitives in two instruction-tuned large language models (Llama 3.1 8B-Instruct and Gemma 2 9B-IT) via sparse autoencoders on mid-dep…

09
Ethan Mollick · model · May 19, 2026

Also had some early access to Gemini 3.5 Flash. Very fast for a flash model and very capable, though not as powerful as a full frontier model. I added it to the gallery of procedurally generated one-shot towns (it made one error that it co…

10
Bindu Reddy (X) · model · May 23, 2026

Best Model For The Use Case: Front-end coding - Opus 4.7; Back-end coding - GPT 5.5 xHigh; Visual understanding - Flash 3.5; Cheap - DeepSeek Flash; Video - Seedance 2.0; Image - GPT Image-2.0; Voice - Flash Live; Writing - Gemini 3.1 Pro; Real Time…