An Introduction to Python Descriptors

Published 2022-11-27, updated 2025-03-21. Category: Python.

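Overview

A Python descriptor is an object attribute with "binding behavior": its access control (get, set, and delete) can be overridden by the methods of the descriptor protocol. Any object that implements at least one of __get__, __set__, or __delete__ is called a descriptor.

This article is largely a translation of "Python Descriptors: An Introduction" by Davide Mastromatteo.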

Defining a Descriptor

A descriptor is a Python object that implements methods of the descriptor protocol, which is defined as follows:

__get__(self, obj, type=None) -> object
__set__(self, obj, value) -> None
__delete__(self, obj) -> None
__set_name__(self, owner, name)

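If a descriptor implements only __get__(), it is called a non-data descriptor. If it implements __set__() or __delete__(), it is called a data descriptor. The two differ in behavior as well as in name: a data descriptor takes higher precedence during attribute lookup.

Example: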

# descriptors.py
class Verbose_attribute():
    def __get__(self, obj, type=None) -> object:
        print("accessing the attribute to get the value")
        return 42
    def __set__(self, obj, value) -> None:
        print("accessing the attribute to set the value")
        raise AttributeError("Cannot change the value")

class Foo():
    attribute1 = Verbose_attribute()

my_foo_object = Foo()
x = my_foo_object.attribute1
print(x)

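In the example above, Verbose_attribute implements the descriptor protocol and is instantiated as an attribute of Foo, so it can be regarded as a descriptor.

As a descriptor, it performs its own binding behavior whenever it is accessed with dot notation. In the example above, accessing the Verbose_attribute descriptor prints a log message to the terminal.

Running the example produces the following output: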

$ python descriptors.py
accessing the attribute to get the value
42
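
The example only reads the attribute. As a small sketch of the write path, the following reuses the same classes and shows that an assignment is routed through __set__ and rejected. The try/except wrapper and the `error` variable are additions for illustration, not part of the original listing.

```python
class Verbose_attribute():
    def __get__(self, obj, type=None) -> object:
        print("accessing the attribute to get the value")
        return 42

    def __set__(self, obj, value) -> None:
        print("accessing the attribute to set the value")
        raise AttributeError("Cannot change the value")

class Foo():
    attribute1 = Verbose_attribute()

my_foo_object = Foo()
error = None
try:
    # Assignment is handled by Verbose_attribute.__set__,
    # not by writing into the instance __dict__.
    my_foo_object.attribute1 = 7
except AttributeError as exc:
    error = exc
    print(f"assignment rejected: {exc}")

# Nothing was stored on the instance, and reads still go through __get__.
print(my_foo_object.__dict__)
print(my_foo_object.attribute1)
```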

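How Descriptors Work Internally

Readers familiar with object-oriented Python may think the example above could be written just as well with properties. That is true, and in fact properties in Python are themselves descriptors.

Descriptors in Properties

If you would rather not write a descriptor explicitly, the earlier descriptors.py example can be reproduced with property: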

# property_decorator.py
class Foo():
    @property
    def attribute1(self) -> object:
        print("accessing the attribute to get the value")
        return 42

    @attribute1.setter
    def attribute1(self, value) -> None:
        print("accessing the attribute to set the value")
        raise AttributeError("Cannot change the value")

my_foo_object = Foo()
x = my_foo_object.attribute1
print(x)

The example above defines the property with decorators, but decorators are only syntactic sugar. The same code can be rewritten as follows:

# property_function.py
class Foo():
    def getter(self) -> object:
        print("accessing the attribute to get the value")
        return 42

    def setter(self, value) -> None:
        print("accessing the attribute to set the value")
        raise AttributeError("Cannot change the value")

    attribute1 = property(getter, setter)

my_foo_object = Foo()
x = my_foo_object.attribute1
print(x)

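property() returns a property object that implements the descriptor protocol. It uses its fget, fset, and fdel arguments as the concrete implementations of the protocol's three methods.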
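
To make that mapping concrete, here is a minimal pure-Python sketch of a property-like descriptor. The class name `Property`, the error messages, and the `_get`/`_set`/`_del` helpers are illustrative assumptions; the built-in property is implemented in C and offers more (for example the getter/setter/deleter decorators).

```python
class Property:
    """Simplified stand-in for the built-in property (illustration only)."""

    def __init__(self, fget=None, fset=None, fdel=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class rather than an instance
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)


class Foo:
    def __init__(self):
        self._value = 42

    def _get(self):
        return self._value

    def _set(self, value):
        self._value = value

    def _del(self):
        self._value = None

    # Each callable is wired to the matching protocol method.
    attribute1 = Property(_get, _set, _del)


foo = Foo()
print(foo.attribute1)   # read goes through __get__ -> fget
foo.attribute1 = 7      # write goes through __set__ -> fset
print(foo.attribute1)
del foo.attribute1      # delete goes through __delete__ -> fdel
print(foo.attribute1)
```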

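Descriptors in Methods and Functions

Functions, as well as the classmethod and staticmethod types, all implement __get__(), so they can be regarded as non-data descriptors. The classmethod descriptor can be emulated in pure Python as follows: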

class ClassMethod(object):
    "Emulate PyClassMethod_Type() in Objects/funcobject.c"
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, klass=None):
        if klass is None:
            klass = type(obj)
        def newfunc(*args):
            return self.f(klass, *args)
        return newfunc

When a class method is called on an object, obj.method(*args) is translated into method(type(obj), *args).

Static methods are even simpler:

class StaticMethod(object):
    "Emulate PyStaticMethod_Type() in Objects/funcobject.c"
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, objtype=None):
        return self.f

When a static method is called on an object, obj.method(*args) is translated into method(*args).
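
The binding described in this section can be observed directly by fetching the raw objects from the class __dict__ and calling __get__ by hand. This is a small sketch; the class `Foo` and its three methods are made up for the demonstration.

```python
class Foo:
    def method(self, x):
        return (self, x)

    @classmethod
    def cmethod(cls, x):
        return (cls, x)

    @staticmethod
    def smethod(x):
        return x


foo = Foo()

# The raw objects stored in the class __dict__, before any binding happens.
raw_function = Foo.__dict__["method"]
raw_classmethod = Foo.__dict__["cmethod"]
raw_staticmethod = Foo.__dict__["smethod"]

# Calling __get__ manually reproduces what dotted attribute access does.
bound = raw_function.__get__(foo, Foo)           # binds the instance
class_bound = raw_classmethod.__get__(foo, Foo)  # binds the class
plain = raw_staticmethod.__get__(foo, Foo)       # binds nothing

print(bound(1) == foo.method(1))
print(class_bound(2) == (Foo, 2))
print(plain(3))

# None of the three types defines __set__, so all are non-data descriptors.
print([hasattr(type(raw), "__set__")
       for raw in (raw_function, raw_classmethod, raw_staticmethod)])
```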

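How Are Attributes Accessed Through the Lookup Chain?

In Python, most objects have a built-in __dict__ attribute, a dictionary that holds the attributes defined on that object.
The class an object belongs to is itself an object, so it also has a __dict__ attribute, which holds the attributes and methods of the class.

What actually happens under the hood when we access an attribute in Python? How does the interpreter know what you want? These questions are answered by the concept of the lookup chain: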

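First, if the attribute you are looking for is a data descriptor, its __get__ method is called and the result is used as the attribute value.
If that fails, the attribute name is looked up as a key in the object's own __dict__.
If that fails and the attribute is a non-data descriptor, its __get__ method is called and the result is used as the attribute value.
If that fails, the attribute name is looked up as a key in the __dict__ of the object's class.
If that fails, the attribute name is looked up as a key in the __dict__ of the parent class.
If that fails, the previous step is repeated until every parent class has been visited.
If everything fails, an AttributeError exception is raised.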
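
The precedence rules above can be checked with a short experiment. This is a sketch with hypothetical class names (`DataDescriptor`, `NonDataDescriptor`, `Base`, `Foo`); it writes directly into the instance __dict__ so that the same name exists both there and on the class.

```python
class DataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDescriptor:
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"


class Base:
    inherited = "from Base.__dict__"


class Foo(Base):
    data = DataDescriptor()
    non_data = NonDataDescriptor()


foo = Foo()
# Write straight into the instance __dict__, bypassing __set__.
foo.__dict__["data"] = "from instance __dict__"
foo.__dict__["non_data"] = "from instance __dict__"

print(foo.data)       # the data descriptor beats the instance __dict__
print(foo.non_data)   # the instance __dict__ beats the non-data descriptor
print(foo.inherited)  # found in a parent class __dict__

try:
    foo.missing
except AttributeError as exc:
    # Nothing matched anywhere in the chain.
    print(f"AttributeError: {exc}")
```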

How to Use Python Descriptors Properly

All you need to do is implement the descriptor protocol, whose most important methods are __get__ and __set__:

__get__(self, obj, type=None) -> object
__set__(self, obj, value) -> None

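Keep the following in mind:

self is the descriptor instance.
obj is the instance of the object the descriptor is attached to.
type is the type of the object the descriptor is attached to.

__set__() does not need a type argument, because it can only be invoked through an instance, whereas __get__() can be invoked through either an instance or the class.

Also note that a descriptor is instantiated only once: all instances of a class share the same descriptor instance defined on that class. For example: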
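
The difference between instance access and class access can be seen in a short sketch. The names `Traced`, `on_instance`, and `on_class` are made up for illustration, and the parameter called `type` in the signatures above is spelled `objtype` here to avoid shadowing the built-in. Returning the descriptor itself when obj is None is a common convention, not a requirement.

```python
class Traced:
    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: there is no instance to read from.
            return self
        return f"instance of {objtype.__name__}"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class Foo:
    attr = Traced()


on_instance = Foo().attr  # calls __get__(foo, Foo)
on_class = Foo.attr       # calls __get__(None, Foo)
print(on_instance)
print(on_class is Foo.__dict__["attr"])

# Assigning on the class does not call __set__; it simply rebinds
# the class attribute and replaces the descriptor.
Foo.attr = "replaced"
print(Foo.__dict__["attr"])
```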

# descriptors2.py
class OneDigitNumericValue():
    def __init__(self):
        self.value = 0
    def __get__(self, obj, type=None) -> object:
        return self.value
    def __set__(self, obj, value) -> None:
        if value > 9 or value < 0 or int(value) != value:
            raise AttributeError("The value is invalid")
        self.value = value

class Foo():
    number = OneDigitNumericValue()

my_foo_object = Foo()
my_second_foo_object = Foo()

my_foo_object.number = 3
print(my_foo_object.number)
print(my_second_foo_object.number)

my_third_foo_object = Foo()
print(my_third_foo_object.number)

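Running the example shows that every Foo instance reports the same value for number. The number descriptor is in fact just a class-level attribute.

How can this be fixed? Could OneDigitNumericValue keep a dictionary that maps every Foo object to its value? That would hold a strong reference to each Foo object and prevent the garbage collector from reclaiming the memory those objects occupy.

The right approach is to store the value in the __dict__ of the Foo object itself, because every object has its own separate __dict__. For example: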

# descriptors4.py
class OneDigitNumericValue():
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, type=None) -> object:
        return obj.__dict__.get(self.name, 0)

    def __set__(self, obj, value) -> None:
        obj.__dict__[self.name] = value

class Foo():
    number = OneDigitNumericValue()

my_foo_object = Foo()
my_second_foo_object = Foo()

my_foo_object.number = 3
print(my_foo_object.number)
print(my_second_foo_object.number)

my_third_foo_object = Foo()
print(my_third_foo_object.number)

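The __set_name__() method is called automatically when the owning class is created (Python 3.6 and later), and its name parameter is set to the attribute name the descriptor was assigned to.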
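
The descriptors4.py listing drops the range check from the earlier version. As a sketch, the two ideas can be combined: per-instance storage keyed by the name received in __set_name__, plus the one-digit validation. The second attribute `other` and the `obj is None` guard are additions for illustration.

```python
class OneDigitNumericValue:
    def __set_name__(self, owner, name):
        # Called once per attribute while the owning class is being created.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # class access, e.g. Foo.number
        return obj.__dict__.get(self.name, 0)

    def __set__(self, obj, value):
        if value > 9 or value < 0 or int(value) != value:
            raise AttributeError("The value is invalid")
        # Stored on the instance, so each Foo object keeps its own value.
        obj.__dict__[self.name] = value


class Foo:
    number = OneDigitNumericValue()
    other = OneDigitNumericValue()


first, second = Foo(), Foo()
first.number = 3
first.other = 5

print(Foo.number.name, Foo.other.name)  # names captured by __set_name__
print(first.number, first.other, second.number)
print(first.__dict__)

try:
    second.number = 12
except AttributeError as exc:
    print(f"rejected: {exc}")
```

Because OneDigitNumericValue is a data descriptor, storing the value under the same key in the instance __dict__ does not shadow it: reads still go through __get__.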
