[mypyc] Use cached ASCII characters in `CPyStr_GetItem` by VaggelisD · Pull Request #21035 · python/mypy

VaggelisD · 2026-03-18T09:19:05Z

For characters with code points below 256, use `PyUnicode_FromOrdinal()`, which returns CPython's cached single-character Latin-1 string objects instead of allocating a new `PyUnicode` object on every `str[i]` access. This avoids allocation and deallocation overhead in character-scanning hot loops.

Characters >= 256 (BMP, supplementary) keep the original PyUnicode_New allocation path unchanged.
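The caching behaviour this relies on can be observed from plain Python. The sketch below is illustrative only and is not part of the patch: it assumes CPython, where the builtin `chr()` goes through `PyUnicode_FromOrdinal()`, and the helper name `is_cached_char` is made up for this example. Object identity here is a CPython implementation detail, not a language guarantee.

```python
import platform


def is_cached_char(code_point: int) -> bool:
    """Return True if two independent chr() calls yield the very same object.

    Hypothetical helper for illustration; identity of single-character
    strings is an implementation detail of CPython.
    """
    first = chr(code_point)
    second = chr(code_point)
    return first is second


if __name__ == "__main__":
    print("implementation:", platform.python_implementation())
    # Code points below 256 are expected to come from CPython's Latin-1 cache.
    for cp in (ord("a"), 0x7F, 0xE9, 0xFF):
        print(f"U+{cp:04X} cached: {is_cached_char(cp)}")
    # Code points at or above 256 are typically freshly allocated each time.
    for cp in (0x100, 0x4E2D, 0x1F600):
        print(f"U+{cp:04X} cached: {is_cached_char(cp)}")
```

On CPython the first group should report cached objects, which is the property the fast path in `CPyStr_GetItem` takes advantage of.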

I ran the following micro-benchmark: scan a 50k-character string with `s[i]` in a loop, repeating the scan 5000 times:

| String type | Before (ms/iter) | After (ms/iter) | Speedup |
| --- | --- | --- | --- |
| ASCII (0–127) | 0.651 | 0.166 | 3.9x (-75%) |
| Latin-1 (128–255) | 0.752 | 0.162 | 4.6x (-78%) |
| BMP (256–65535) | 0.901 | 0.809 | no change |
| Supplementary (>65535) | 0.842 | 0.743 | no change |
| Mixed (25% each) | 0.817 | 0.542 | 1.5x (-34%) |
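The benchmark script itself is not included in this PR. A rough sketch of what such a scan could look like follows. All names here (`make_string`, `scan`, `bench`) are hypothetical, the code-point ranges are assumptions chosen to match the table rows, and the "mixed" case is built as four contiguous quarters, which may differ from the original setup. To measure `CPyStr_GetItem` the module would have to be compiled with mypyc; run under the plain interpreter it only times CPython's own indexing.

```python
import time

# Assumed code-point ranges per category; BMP stops before the surrogate block.
RANGES = {
    "ascii": (0x00, 0x80),
    "latin1": (0x80, 0x100),
    "bmp": (0x100, 0xD800),
    "supplementary": (0x10000, 0x10400),
}


def make_string(kind: str, length: int = 50_000) -> str:
    """Build a test string of the given category (hypothetical helper)."""
    if kind == "mixed":
        quarter = length // 4
        return "".join(make_string(k, quarter) for k in RANGES)
    lo, hi = RANGES[kind]
    return "".join(chr(lo + i % (hi - lo)) for i in range(length))


def scan(s: str) -> int:
    """Index every position with s[i]; this is the loop being measured."""
    total = 0
    for i in range(len(s)):
        c = s[i]
        total += len(c)
    return total


def bench(kind: str, repeats: int = 100) -> float:
    """Return the average time of one full scan, in milliseconds."""
    s = make_string(kind)
    start = time.perf_counter()
    for _ in range(repeats):
        scan(s)
    return (time.perf_counter() - start) * 1000 / repeats


if __name__ == "__main__":
    for kind in (*RANGES, "mixed"):
        print(f"{kind:>14}: {bench(kind):.3f} ms/iter")
```

Absolute numbers will vary by machine and Python version, so only the before/after ratio on the same setup is meaningful.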

This was coauthored with @tobymao

ilevkivskyi

LG, thanks!

ilevkivskyi approved these changes Mar 18, 2026

ilevkivskyi merged commit 6bcd02e into python:master Mar 18, 2026
17 checks passed

ilevkivskyi merged 1 commit into python:master from VaggelisD:str-getitem-cache
