Weekly Extract

The LLM week, compressed.

A focused weekly brief of the AI model, research, safety, and product updates worth reading. Items are drawn from LLMgram's canonical AI Signal pipeline and ranked for source quality, event relevance, and usefulness to builders.

10 signals selected
7d ranking window
May 17, 2026 · 23:15 UTC generated
fresh source · AI Signal data
May 17, 2026 · 22:02 UTC source refreshed
Top 10 This Week
01
OpenAI · model · May 15, 2026

Databricks brings GPT-5.5 to enterprise agent workflows

Databricks uses GPT-5.5 for enterprise agent workflows after the model set a new state of the art on the OfficeQA Pro benchmark.

02
LessWrong · model · May 16, 2026

Trying to use NLAs to find out how Qwen 2.5 7B does multiplication

Neural language autoencoders were just introduced by Anthropic. In a fascinating paper, they showed that you can take the residual stream activations of a language model and then train two instantiations of that same model (an encoder and…

03
Simon Willison · model · May 14, 2026

datasette-ip-rate-limit 0.1a0

The datasette.io site was being hammered by poorly-behaved crawlers, so I had Codex (GPT-5.5 xhigh) build a configurable rate limiting plugin to block IPs that were hammering specific areas of the sit…

04
X/@aaron_epstein · other · May 12, 2026

New model just released that beats sonnet 4.6, gemini 3 flash, and gpt 5.4 mini on OCR, vision, and STT tasks @interfaze_ai

05
arXiv cs.CL · model · May 14, 2026

HEBATRON: A Hebrew-Specialized Open-Weight Mixture-of-Experts Language Model

arXiv:2605.11255v1. Abstract: We present Hebatron, a Hebrew-specialized open-weight large language model built on the NVIDIA Nemotron-3 sparse Mixture-of-Experts architecture. Training employs a three-phase easy-to-hard c…

06
Bindu Reddy (X) · model · May 14, 2026

Gemini 3.2 Flash - Capitalizing on DeepMind's clever distillation techniques... Rumors are that benchmarks show it's hitting 92% of GPT 5.5's performance on coding and reasoning tasks while being 15-20x cheaper on inference costs. The late…

07
arXiv cs.LG · model · May 12, 2026

Weight Pruning Amplifies Bias: A Multi-Method Study of Compressed LLMs for Edge AI

arXiv:2605.08137v1. Abstract: Weight pruning is widely advocated for deploying Large Language Models on resource-constrained IoT and edge devices, yet its impact on model fairness remains poorly understood. We conduct a c…

08
Cursor (X) · model · May 12, 2026

Fast mode for Claude Opus 4.7 is now available in Cursor! It's 2.5x the speed at 6x the cost. For most tasks, we recommend using the standard speed.

09
LessWrong · model · May 16, 2026

A Year Late, Claude Finally Beats Pokémon

Credit: ClaudePlaysPokemon Elevator Shanty by Kurukkoo. Disclaimer: like some previous posts in this series, this was not primarily written by me, but by a friend. I did substantial editing, however. ClaudePlaysPokemon feat. Opus 4.7 has fi…

10
OpenAI · model · May 12, 2026

How NVIDIA engineers and researchers build with Codex

Teams use Codex with GPT-5.5 to ship production systems and turn research ideas into runnable experiments.