跳到主要内容

Descriptor Guide

https://docs.python.org/3/howto/descriptor.html

Primer

Managed attributes

A popular use for descriptors is managing access to instance data. The descriptor is assigned to a public attribute in the class dictionary while the actual data is stored as a private attribute in the instance dictionary. The descriptor’s __get__() and __set__() methods are triggered when the public attribute is accessed

Customized names

class LoggedAccess:

def __set_name__(self, owner, name):
self.public_name = name
self.private_name = '_' + name

Closing thoughts

Descriptors are used throughout the language. It is how functions turn into bound methods. Common tools like classmethod(), staticmethod(), property(), and functools.cached_property() are all implemented as descriptors

Technical Tutorial

Definition and introduction

In general, a descriptor is an attribute value that has one of the methods in the descriptor protocol. Those methods are __get__(), __set__(), and __delete__(). If any of those methods are defined for an attribute, it is said to be a descriptor

Descriptor protocol

descr.__get__(self, obj, type=None)

descr.__set__(self, obj, value)

descr.__delete__(self, obj)

If an object defines __set__() or __delete__(), it is considered a data descriptor. Descriptors that only define __get__() are called non-data descriptors (they are often used for methods but other uses are possible)

If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence. If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence

Invocation from an instance

Instance lookup scans through a chain of namespaces giving data descriptors the highest priority, followed by instance variables, then non-data descriptors, then class variables, and lastly __getattr__() if it is provided

The logic for a dotted lookup is in object.__getattribute__(). Here is a pure Python equivalent:

def find_name_in_mro(cls, name, default):
"Emulate _PyType_Lookup() in Objects/typeobject.c"
for base in cls.__mro__:
if name in vars(base):
return vars(base)[name]
return default

def object_getattribute(obj, name):
"Emulate PyObject_GenericGetAttr() in Objects/object.c"
null = object()
objtype = type(obj)
cls_var = find_name_in_mro(objtype, name, null)
descr_get = getattr(type(cls_var), '__get__', null)
if descr_get is not null:
if (hasattr(type(cls_var), '__set__')
or hasattr(type(cls_var), '__delete__')):
return descr_get(cls_var, obj, objtype) # data descriptor
if hasattr(obj, '__dict__') and name in vars(obj):
return vars(obj)[name] # instance variable
if descr_get is not null:
return descr_get(cls_var, obj, objtype) # non-data descriptor
if cls_var is not null:
return cls_var # class variable
raise AttributeError(name)

Note, there is no __getattr__() hook in the __getattribute__() code. That is why calling __getattribute__() directly or with super().__getattribute__ will bypass __getattr__() entirely.

Instead, it is the dot operator and the getattr() function that are responsible for invoking __getattr__() whenever __getattribute__() raises an AttributeError. Their logic is encapsulated in a helper function:

def getattr_hook(obj, name):
"Emulate slot_tp_getattr_hook() in Objects/typeobject.c"
try:
return obj.__getattribute__(name)
except AttributeError:
if not hasattr(type(obj), '__getattr__'):
raise
return type(obj).__getattr__(obj, name) # __getattr__

Summary of invocation logic

The important points to remember are:

  • Descriptors are invoked by the __getattribute__() method.
  • Classes inherit this machinery from object, type, or super().
  • Overriding __getattribute__() prevents automatic descriptor calls because all the descriptor logic is in that method.
  • object.__getattribute__() and type.__getattribute__() make different calls to __get__(). The first includes the instance and may include the class. The second puts in None for the instance and always includes the class.
  • Data descriptors always override instance dictionaries.
  • Non-data descriptors may be overridden by instance dictionaries.

Automatic name notification

When a new class is created, the type metaclass scans the dictionary of the new class. If any of the entries are descriptors and if they define __set_name__(), that method is called with two arguments. The owner is the class where the descriptor is used, and the name is the class variable the descriptor was assigned to

ORM example

class Field:

def __set_name__(self, owner, name):
self.fetch = f'SELECT {name} FROM {owner.table} WHERE {owner.key}=?;'
self.store = f'UPDATE {owner.table} SET {name}=? WHERE {owner.key}=?;'

def __get__(self, obj, objtype=None):
return conn.execute(self.fetch, [obj.key]).fetchone()[0]

def __set__(self, obj, value):
conn.execute(self.store, [value, obj.key])
conn.commit()

Pure Python Equivalents

Properties, bound methods, static methods, class methods, and __slots__ are all based on the descriptor protocol

Properties

property(fget=None, fset=None, fdel=None, doc=None) -> property

Functions and methods

Functions stored in class dictionaries get turned into methods when invoked. Methods only differ from regular functions in that the object instance is prepended to the other arguments. By convention, the instance is called self but could be called this or any other variable name

Methods can be created manually with types.MethodType which is roughly equivalent to:

class MethodType:
"Emulate PyMethod_Type in Objects/classobject.c"

def __init__(self, func, obj):
self.__func__ = func
self.__self__ = obj

def __call__(self, *args, **kwargs):
func = self.__func__
obj = self.__self__
return func(obj, *args, **kwargs)

def __getattribute__(self, name):
"Emulate method_getset() in Objects/classobject.c"
if name == '__doc__':
return self.__func__.__doc__
return object.__getattribute__(self, name)

def __getattr__(self, name):
"Emulate method_getattro() in Objects/classobject.c"
return getattr(self.__func__, name)

def __get__(self, obj, objtype=None):
"Emulate method_descr_get() in Objects/classobject.c"
return self

To support automatic creation of methods, functions include the __get__() method for binding methods during attribute access. This means that functions are non-data descriptors that return bound methods during dotted lookup from an instance. Here’s how it works:

class Function:
...

def __get__(self, obj, objtype=None):
"Simulate func_descr_get() in Objects/funcobject.c"
if obj is None:
return self
return MethodType(self, obj)

Static methods

Using the non-data descriptor protocol, a pure Python version of staticmethod() would look like this

import functools

class StaticMethod:
"Emulate PyStaticMethod_Type() in Objects/funcobject.c"

def __init__(self, f):
self.f = f
functools.update_wrapper(self, f)

def __get__(self, obj, objtype=None):
return self.f

def __call__(self, *args, **kwds):
return self.f(*args, **kwds)

Class methods

import functools

class ClassMethod:
"Emulate PyClassMethod_Type() in Objects/funcobject.c"

def __init__(self, f):
self.f = f
functools.update_wrapper(self, f)

def __get__(self, obj, cls=None):
if cls is None:
cls = type(obj)
return MethodType(self.f, cls)

Member objects and slots

When a class defines __slots__, it replaces instance dictionaries with a fixed-length array of slot values