The Python pickle format serializes Python objects to a byte stream, as a sequence of operations to run on the Pickle Virtual Machine.
The format is mostly implementation-defined; there is no formal specification.
Pickle data types are closely coupled to the Python object model.
Python singletons and most builtin types are serialised using dedicated Pickle opcodes.
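As a rough illustration of the opcode stream, the sketch below disassembles a few small pickles with the stdlib `pickletools` module. The helper name `opcode_names` and the choice of protocol 2 are assumptions made for this example; other protocols emit different opcodes for the same values.

```python
import pickle
import pickletools


def opcode_names(obj, protocol=2):
    """Return the opcode names in the pickle of obj (hypothetical helper)."""
    data = pickle.dumps(obj, protocol=protocol)
    # genops yields (opcode_info, argument, byte_position) for each operation.
    return [op.name for op, arg, pos in pickletools.genops(data)]


print(opcode_names(None))  # ['PROTO', 'NONE', 'STOP']
print(opcode_names(True))  # ['PROTO', 'NEWTRUE', 'STOP']
print(opcode_names(7))     # ['PROTO', 'BININT1', 'STOP']
print(opcode_names([]))    # includes EMPTY_LIST, plus a memo opcode
```

Each value maps to its own opcode, with no constructor name stored in the stream.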
Other builtin types, and all classes, are serialised by encoding the name of a constructor callable.
They are deserialised by importing that constructor, and calling it.
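The constructor-by-name encoding can be observed in a disassembly. The sketch below pickles a `datetime.date` at protocol 2 (an arbitrary choice for this example) and collects the arguments of the `GLOBAL` opcodes, which carry the module and attribute name to import; the variable names are illustrative only.

```python
import datetime
import pickle
import pickletools

data = pickle.dumps(datetime.date(2020, 1, 2), protocol=2)
ops = list(pickletools.genops(data))

# GLOBAL carries "module name" as its argument; REDUCE later calls the
# imported callable with an argument tuple taken from the stack.
constructors = [arg for op, arg, pos in ops if op.name == "GLOBAL"]
names = [op.name for op, arg, pos in ops]

print(constructors)       # includes 'datetime date'
print("REDUCE" in names)  # True

# Unpickling imports datetime.date and calls it to rebuild the object.
assert pickle.loads(data) == datetime.date(2020, 1, 2)
```

Protocol 4 and later use `STACK_GLOBAL` instead, which takes the two names from the stack rather than from an inline argument.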
So unpickling an arbitrary pickle with Python's stdlib pickle module
can cause arbitrary code execution.
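The sketch below shows the mechanism with a harmless callable, then one possible mitigation: overriding `pickle.Unpickler.find_class`, the approach described in the stdlib documentation. `Payload`, `NoGlobalsUnpickler` and `restricted_loads` are hypothetical names for this example, and rejecting every global is a deliberately strict policy chosen for illustration, not a complete defence.

```python
import io
import pickle


class Payload:
    """Hypothetical object whose pickle names an arbitrary callable."""

    def __reduce__(self):
        # On unpickling, the Pickle VM imports builtins.len and calls len("abc").
        return (len, ("abc",))


data = pickle.dumps(Payload())
result = pickle.loads(data)  # runs len("abc"); no Payload object is rebuilt
print(result)                # 3


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuse every import requested by the pickle stream."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Unpickle data that uses only dedicated opcodes, with no imports."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


print(restricted_loads(pickle.dumps([1, 2, "x"])))  # [1, 2, 'x']
```

Replacing `len` with a callable such as `os.system` is all an attacker needs, which is why untrusted pickles should not be loaded with the default unpickler.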
The pickle format has evolved with Python: later protocols add opcodes and types. Later Python releases can pickle to or unpickle from any earlier protocol.
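To see the protocols side by side, the sketch below pickles one value with every protocol the running interpreter supports and checks that each result round-trips. The sample `value` and the helper `first_opcode` are assumptions for this example.

```python
import pickle
import pickletools

value = {"a": [1, 2, 3]}


def first_opcode(data):
    """Return (name, argument) of the first opcode in a pickle."""
    op, arg, pos = next(pickletools.genops(data))
    return op.name, arg


for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(value, protocol=protocol)
    # Every protocol the interpreter can write, it can also read back.
    assert pickle.loads(data) == value
    print(protocol, len(data), first_opcode(data))
```

Protocols 0 and 1 predate the `PROTO` opcode, so their streams start directly with data opcodes; from protocol 2 onward the stream begins with `PROTO` followed by the protocol number.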
This page hosts a formal specification of the Python pickle serialization format using Kaitai Struct. This specification can be automatically translated into a variety of programming languages to get a parsing library.