2026-06-30

The Information reported earlier this month that OpenAI engineers told colleagues they had developed several new optimization techniques that could cut model inference costs by more than half. When applied to ChatGPT usage by unauthenticated visitors, the optimizations temporarily reduced the number of NVIDIA GPUs required to only a few hundred.