Bug report
Bug description
The C implementation of pickle.Unpickler, when reading from a file-like
object that provides readinto(), hands that method a temporary memoryview
created over an internal buffer:
    PyObject *buf_obj = PyMemoryView_FromMemory(buf, n, PyBUF_WRITE);
    ...
    PyObject *read_size_obj = _Pickle_FastCall(self->readinto, buf_obj);
(Modules/_pickle.c, _Unpickler_ReadIntoFromFile.)
buf points into a short-lived buffer (e.g. the bytes object allocated in
load_counted_binbytes, which may also be reallocated by _PyBytes_Resize or
freed when unpickling ends). The memoryview is never released or invalidated
after readinto() returns. A readinto() implementation that keeps a
reference to the view can therefore use it to read or write the buffer after
it has been freed, which is a use-after-free at the C level.
This only requires a pure-Python file-like object -- no ctypes. A pure-Python
program should not be able to make the interpreter read or write freed memory.
Reproducer
    import pickle, struct, gc

    stashed = []

    class EvilFile:
        def __init__(self):
            # PROTO 5, then BINBYTES8 announcing a 200,000-byte payload
            self._h = b"\x80\x05" + b"\x8e" + struct.pack("<Q", 200_000)
            self._p = 0

        def read(self, n=-1):
            d = self._h[self._p:] if (n is None or n < 0) else self._h[self._p:self._p+n]
            self._p += len(d)
            return d

        def readline(self):
            return self.read(-1)

        def readinto(self, view):
            stashed.append(view)             # keep the view past readinto()
            view[:] = b"A" * len(view)
            return len(view)

    up = pickle.Unpickler(EvilFile())
    try:
        up.load()                            # stream ends after the payload
    except EOFError:
        pass
    del up; gc.collect()                     # free the backing buffer
    _ = [bytes(200_000) for _ in range(8)]   # churn the allocator
    stashed[0][0]                            # <-- use-after-free read
On a --with-address-sanitizer --with-pydebug build this reports a clean
heap-use-after-free (READ in unpack_single, the buffer freed via
Pdata_dealloc and originally allocated in load_counted_binbytes).
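The reproducer above deliberately touches freed memory, so it is only meaningful under a sanitizer. As a gentler illustration, the sketch below checks whether the view handed to readinto() is still usable after load() returns, while the unpickled result (and therefore the backing bytes object) is kept alive. It is not the regression test from the patch. StashingFile, probe, and PAYLOAD_SIZE are hypothetical names, and the sketch assumes the payload is small enough to be read into a single allocation that is never resized.

```python
import pickle
import struct

PAYLOAD_SIZE = 64  # assumption: small enough for one un-resized allocation


class StashingFile:
    """Hypothetical file-like object that keeps every view given to readinto()."""

    def __init__(self):
        # PROTO 5, BINBYTES8 with an 8-byte little-endian length,
        # the payload itself, then the STOP opcode.
        self._data = (b"\x80\x05" + b"\x8e" + struct.pack("<Q", PAYLOAD_SIZE)
                      + b"A" * PAYLOAD_SIZE + b".")
        self._pos = 0
        self.stashed = []

    def read(self, n=-1):
        end = len(self._data) if n is None or n < 0 else self._pos + n
        chunk = self._data[self._pos:end]
        self._pos += len(chunk)
        return chunk

    def readline(self):
        return self.read(-1)

    def readinto(self, view):
        self.stashed.append(view)       # keep the view past readinto()
        chunk = self.read(len(view))
        view[:len(chunk)] = chunk
        return len(chunk)


def probe():
    """Report what state the stashed view is in once load() has returned."""
    f = StashingFile()
    result = pickle.Unpickler(f).load()  # holding 'result' keeps the buffer alive
    assert result == b"A" * PAYLOAD_SIZE
    if not f.stashed:
        return "readinto-not-used"       # e.g. a pure-Python Unpickler
    try:
        f.stashed[0][0]
    except ValueError:
        return "released"                # view was invalidated after readinto()
    return "still-live"                  # view outlived the readinto() call


print(probe())
```

On a build where the view is never released, the expectation is "still-live"; a build that releases the view after readinto() returns should report "released".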
Root cause and fix direction
The view is a non-owning window over a raw pointer that the unpickler does not
keep alive. Other CPython sites that drive a user readinto() (e.g.
_io.RawIOBase.read()) hand out an owning object (a bytearray) whose buffer
protocol prevents it from being freed while still exported. The pickle path
should release the temporary memoryview as soon as readinto() returns, so that
any later use of a surviving reference raises "ValueError: operation forbidden
on released memoryview object" instead of dereferencing freed memory.
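Both behaviours mentioned in this paragraph can be observed from pure Python without touching pickle at all. The sketch below uses hypothetical helper names (resize_while_exported, access_after_release) and only illustrates the semantics of the buffer protocol and memoryview.release(); it says nothing about how the actual patch is written.

```python
def resize_while_exported():
    """An owning bytearray refuses to resize while a view is exported."""
    owner = bytearray(8)
    view = memoryview(owner)
    try:
        owner.extend(b"x")      # would have to reallocate the buffer
    except BufferError:
        blocked = True
    else:
        blocked = False
    view.release()              # drop the export
    owner.extend(b"x")          # resizing is allowed again
    return blocked, len(owner)


def access_after_release():
    """A released memoryview raises instead of touching its old buffer."""
    view = memoryview(bytearray(b"A" * 8))
    view.release()
    try:
        view[0]
    except ValueError as exc:
        return str(exc)
    return None


print(resize_while_exported())
print(access_after_release())
```

The first helper shows why an owning object is safe to hand to user code; the second shows the failure mode a stashed view would hit if the unpickler released it after readinto() returned.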
I have a patch with a regression test and will open a PR.
(For context: this was originally raised privately with the Python Security
Response Team, who advised opening a public issue.)
CPython versions tested on
3.16.0a0 (main, commit 5755d0f).
Operating systems tested on
macOS (arm64), --with-address-sanitizer --with-pydebug build.
Linked PRs