diff --git a/pep-0205.txt b/pep-0205.txt index 42a9c494031..18acad0a3a1 100644 --- a/pep-0205.txt +++ b/pep-0205.txt @@ -5,444 +5,461 @@ Last-Modified: $Date$ Author: Fred L. Drake, Jr. Status: Final Type: Standards Track +Content-Type: text/x-rst Created: Python-Version: 2.1 Post-History: 11-Jan-2001 Motivation - - There are two basic applications for weak references which have - been noted by Python programmers: object caches and reduction of - pain from circular references. - - Caches (weak dictionaries) - - There is a need to allow objects to be maintained that represent - external state, mapping a single instance to the external - reality, where allowing multiple instances to be mapped to the - same external resource would create unnecessary difficulty - maintaining synchronization among instances. In these cases, - a common idiom is to support a cache of instances; a factory - function is used to return either a new or existing instance. - - The difficulty in this approach is that one of two things must - be tolerated: either the cache grows without bound, or there - needs to be explicit management of the cache elsewhere in the - application. The later can be very tedious and leads to more - code than is really necessary to solve the problem at hand, - and the former can be unacceptable for long-running processes - or even relatively short processes with substantial memory - requirements. - - - External objects that need to be represented by a single - instance, no matter how many internal users there are. This - can be useful for representing files that need to be written - back to disk in whole rather than locked & modified for - every use. - - - Objects that are expensive to create, but may be needed by - multiple internal consumers. Similar to the first case, but - not necessarily bound to external resources, and possibly - not an issue for shared state. Weak references are only - useful in this case if there is some flavor of "soft" - references or if there is a high likelihood that users of - individual objects will overlap in lifespan. - - Circular references - - - DOMs require a huge amount of circular (to parent & document - nodes) references, but these could be eliminated using a weak - dictionary mapping from each node to its parent. This - might be especially useful in the context of something like - xml.dom.pulldom, allowing the .unlink() operation to become - a no-op. - - This proposal is divided into the following sections: - - - Proposed Solution - - Implementation Strategy - - Possible Applications - - Previous Weak Reference Work in Python - - Weak References in Java - - The full text of one early proposal is included as an appendix - since it does not appear to be available on the net. +========== + +There are two basic applications for weak references which have +been noted by Python programmers: object caches and reduction of +pain from circular references. + +Caches (weak dictionaries) +-------------------------- + +There is a need to allow objects to be maintained that represent +external state, mapping a single instance to the external +reality, where allowing multiple instances to be mapped to the +same external resource would create unnecessary difficulty +maintaining synchronization among instances. In these cases, +a common idiom is to support a cache of instances; a factory +function is used to return either a new or existing instance. + +The difficulty in this approach is that one of two things must +be tolerated: either the cache grows without bound, or there +needs to be explicit management of the cache elsewhere in the +application. The later can be very tedious and leads to more +code than is really necessary to solve the problem at hand, +and the former can be unacceptable for long-running processes +or even relatively short processes with substantial memory +requirements. + +- External objects that need to be represented by a single + instance, no matter how many internal users there are. This + can be useful for representing files that need to be written + back to disk in whole rather than locked & modified for + every use. + +- Objects that are expensive to create, but may be needed by + multiple internal consumers. Similar to the first case, but + not necessarily bound to external resources, and possibly + not an issue for shared state. Weak references are only + useful in this case if there is some flavor of "soft" + references or if there is a high likelihood that users of + individual objects will overlap in lifespan. + +Circular references +------------------- + +- DOMs require a huge amount of circular (to parent & document + nodes) references, but these could be eliminated using a weak + dictionary mapping from each node to its parent. This + might be especially useful in the context of something like + ``xml.dom.pulldom``, allowing the ``.unlink()`` operation to become + a no-op. + +This proposal is divided into the following sections: + +- Proposed Solution +- Implementation Strategy +- Possible Applications +- Previous Weak Reference Work in Python +- Weak References in Java + +The full text of one early proposal is included as an appendix +since it does not appear to be available on the net. Aspects of the Solution Space - - There are two distinct aspects to the weak references problem: - - - Invalidation of weak references - - Presentation of weak references to Python code - - Invalidation: - - Past approaches to weak reference invalidation have often hinged - on storing a strong reference and being able to examine all the - instances of weak reference objects, and invalidating them when - the reference count of their referent goes to one (indicating that - the reference stored by the weak reference is the last remaining - reference). This has the advantage that the memory management - machinery in Python need not change, and that any type can be - weakly referenced. - - The disadvantage of this approach to invalidation is that it - assumes that the management of the weak references is called - sufficiently frequently that weakly-referenced objects are noticed - within a reasonably short time frame; since this means a scan over - some data structure to invalidate references, an operation which - is O(N) on the number of weakly referenced objects, this is not - effectively amortized for any single object which is weakly - referenced. This also assumes that the application is calling - into code which handles weakly-referenced objects with some - frequency, which makes weak-references less attractive for library - code. - - An alternate approach to invalidation is that the de-allocation - code to be aware of the possibility of weak references and make a - specific call into the weak-reference management code to all - invalidation whenever an object is deallocated. This requires a - change in the tp_dealloc handler for weakly-referencable objects; - an additional call is needed at the "top" of the handler for - objects which support weak-referencing, and an efficient way to - map from an object to a chain of weak references for that object - is needed as well. - - Presentation: - - Two ways that weak references are presented to the Python layer - have been as explicit reference objects upon which some operation - is required in order to retrieve a usable reference to the - underlying object, and proxy objects which masquerade as the - original objects as much as possible. - - Reference objects are easy to work with when some additional layer - of object managemenet is being added in Python; references can be - checked for liveness explicitly, without having to invoke - operations on the referents and catching some special exception - raised when an invalid weak reference is used. - - However, a number of users favor the proxy approach simply because - the weak reference looks so much like the original object. +============================= + +There are two distinct aspects to the weak references problem: + +- Invalidation of weak references +- Presentation of weak references to Python code + +Invalidation +------------ + +Past approaches to weak reference invalidation have often hinged +on storing a strong reference and being able to examine all the +instances of weak reference objects, and invalidating them when +the reference count of their referent goes to one (indicating that +the reference stored by the weak reference is the last remaining +reference). This has the advantage that the memory management +machinery in Python need not change, and that any type can be +weakly referenced. + +The disadvantage of this approach to invalidation is that it +assumes that the management of the weak references is called +sufficiently frequently that weakly-referenced objects are noticed +within a reasonably short time frame; since this means a scan over +some data structure to invalidate references, an operation which +is O(N) on the number of weakly referenced objects, this is not +effectively amortized for any single object which is weakly +referenced. This also assumes that the application is calling +into code which handles weakly-referenced objects with some +frequency, which makes weak-references less attractive for library +code. + +An alternate approach to invalidation is that the de-allocation +code to be aware of the possibility of weak references and make a +specific call into the weak-reference management code to all +invalidation whenever an object is deallocated. This requires a +change in the tp_dealloc handler for weakly-referencable objects; +an additional call is needed at the "top" of the handler for +objects which support weak-referencing, and an efficient way to +map from an object to a chain of weak references for that object +is needed as well. + +Presentation +------------ + +Two ways that weak references are presented to the Python layer +have been as explicit reference objects upon which some operation +is required in order to retrieve a usable reference to the +underlying object, and proxy objects which masquerade as the +original objects as much as possible. + +Reference objects are easy to work with when some additional layer +of object managemenet is being added in Python; references can be +checked for liveness explicitly, without having to invoke +operations on the referents and catching some special exception +raised when an invalid weak reference is used. + +However, a number of users favor the proxy approach simply because +the weak reference looks so much like the original object. Proposed Solution - - Weak references should be able to point to any Python object that - may have substantial memory size (directly or indirectly), or hold - references to external resources (database connections, open - files, etc.). - - A new module, weakref, will contain new functions used to create - weak references. weakref.ref() will create a "weak reference - object" and optionally attach a callback which will be called when - the object is about to be finalized. weakref.mapping() will - create a "weak dictionary". A third function, weakref.proxy(), - will create a proxy object that behaves somewhat like the original - object. - - A weak reference object will allow access to the referenced object - if it hasn't been collected and to determine if the object still - exists in memory. Retrieving the referent is done by calling the - reference object. If the referent is no longer alive, this will - return None instead. - - A weak dictionary maps arbitrary keys to values, but does not own - a reference to the values. When the values are finalized, the - (key, value) pairs for which it is a value are removed from all - the mappings containing such pairs. Like dictionaries, weak - dictionaries are not hashable. - - Proxy objects are weak references that attempt to behave like the - object they proxy, as much as they can. Regardless of the - underlying type, proxies are not hashable since their ability to - act as a weak reference relies on a fundamental mutability that - will cause failures when used as dictionary keys -- even if the - proper hash value is computed before the referent dies, the - resulting proxy cannot be used as a dictionary key since it cannot - be compared once the referent has expired, and comparability is - necessary for dictionary keys. Operations on proxy objects after - the referent dies cause weakref.ReferenceError to be raised in - most cases. "is" comparisons, type(), and id() will continue to - work, but always refer to the proxy and not the referent. - - The callbacks registered with weak references must accept a single - parameter, which will be the weak reference or proxy object - itself. The object cannot be accessed or resurrected in the - callback. +================= + +Weak references should be able to point to any Python object that +may have substantial memory size (directly or indirectly), or hold +references to external resources (database connections, open +files, etc.). + +A new module, weakref, will contain new functions used to create +weak references. ``weakref.ref()`` will create a "weak reference +object" and optionally attach a callback which will be called when +the object is about to be finalized. ``weakref.mapping()`` will +create a "weak dictionary". A third function, ``weakref.proxy()``, +will create a proxy object that behaves somewhat like the original +object. + +A weak reference object will allow access to the referenced object +if it hasn't been collected and to determine if the object still +exists in memory. Retrieving the referent is done by calling the +reference object. If the referent is no longer alive, this will +return None instead. + +A weak dictionary maps arbitrary keys to values, but does not own +a reference to the values. When the values are finalized, the +(key, value) pairs for which it is a value are removed from all +the mappings containing such pairs. Like dictionaries, weak +dictionaries are not hashable. + +Proxy objects are weak references that attempt to behave like the +object they proxy, as much as they can. Regardless of the +underlying type, proxies are not hashable since their ability to +act as a weak reference relies on a fundamental mutability that +will cause failures when used as dictionary keys -- even if the +proper hash value is computed before the referent dies, the +resulting proxy cannot be used as a dictionary key since it cannot +be compared once the referent has expired, and comparability is +necessary for dictionary keys. Operations on proxy objects after +the referent dies cause weakref.ReferenceError to be raised in +most cases. "is" comparisons, ``type()``, and ``id()`` will continue to +work, but always refer to the proxy and not the referent. + +The callbacks registered with weak references must accept a single +parameter, which will be the weak reference or proxy object +itself. The object cannot be accessed or resurrected in the +callback. Implementation Strategy - - The implementation of weak references will include a list of - reference containers that must be cleared for each weakly- - referencable object. If the reference is from a weak dictionary, - the dictionary entry is cleared first. Then, any associated - callback is called with the object passed as a parameter. Once - all callbacks have been called, the object is finalized and - deallocated. - - Many built-in types will participate in the weak-reference - management, and any extension type can elect to do so. The type - structure will contain an additional field which provides an - offset into the instance structure which contains a list of weak - reference structures. If the value of the field is <= 0, the - object does not participate. In this case, weakref.ref(), - .__setitem__() and .setdefault(), and item assignment will - raise TypeError. If the value of the field is > 0, a new weak - reference can be generated and added to the list. - - This approach is taken to allow arbitrary extension types to - participate, without taking a memory hit for numbers or other - small types. - - Standard types which support weak references include instances, - functions, and bound & unbound methods. With the addition of - class types ("new-style classes") in Python 2.2, types grew - support for weak references. Instances of class types are weakly - referencable if they have a base type which is weakly referencable, - the class not specify __slots__, or a slot is named __weakref__. - Generators also support weak references. +======================= + +The implementation of weak references will include a list of +reference containers that must be cleared for each weakly- +referencable object. If the reference is from a weak dictionary, +the dictionary entry is cleared first. Then, any associated +callback is called with the object passed as a parameter. Once +all callbacks have been called, the object is finalized and +deallocated. + +Many built-in types will participate in the weak-reference +management, and any extension type can elect to do so. The type +structure will contain an additional field which provides an +offset into the instance structure which contains a list of weak +reference structures. If the value of the field is <= 0, the +object does not participate. In this case, ``weakref.ref()``, +``.__setitem__()`` and ``.setdefault()``, and item assignment will +raise ``TypeError``. If the value of the field is > 0, a new weak +reference can be generated and added to the list. + +This approach is taken to allow arbitrary extension types to +participate, without taking a memory hit for numbers or other +small types. + +Standard types which support weak references include instances, +functions, and bound & unbound methods. With the addition of +class types ("new-style classes") in Python 2.2, types grew +support for weak references. Instances of class types are weakly +referencable if they have a base type which is weakly referencable, +the class not specify ``__slots__``, or a slot is named ``__weakref__``. +Generators also support weak references. Possible Applications +===================== - PyGTK+ bindings? +PyGTK+ bindings? - Tkinter -- could avoid circular references by using weak - references from widgets to their parents. Objects won't be - discarded any sooner in the typical case, but there won't be so - much dependence on the programmer calling .destroy() before - releasing a reference. This would mostly benefit long-running - applications. +Tkinter -- could avoid circular references by using weak +references from widgets to their parents. Objects won't be +discarded any sooner in the typical case, but there won't be so +much dependence on the programmer calling ``.destroy()`` before +releasing a reference. This would mostly benefit long-running +applications. - DOM trees. +DOM trees. Previous Weak Reference Work in Python +====================================== - Dianne Hackborn has proposed something called "virtual references". - 'vref' objects are very similar to java.lang.ref.WeakReference - objects, except there is no equivalent to the invalidation - queues. Implementing a "weak dictionary" would be just as - difficult as using only weak references (without the invalidation - queue) in Java. Information on this has disappeared from the Web, - but is included below as an Appendix. +Dianne Hackborn has proposed something called "virtual references". +'vref' objects are very similar to java.lang.ref.WeakReference +objects, except there is no equivalent to the invalidation +queues. Implementing a "weak dictionary" would be just as +difficult as using only weak references (without the invalidation +queue) in Java. Information on this has disappeared from the Web, +but is included below as an Appendix. - Marc-André Lemburg's mx.Proxy package: +Marc-André Lemburg's mx.Proxy package: - http://www.lemburg.com/files/python/mxProxy.html + http://www.lemburg.com/files/python/mxProxy.html - The weakdict module by Dieter Maurer is implemented in C and - Python. It appears that the Web pages have not been updated since - Python 1.5.2a, so I'm not yet sure if the implementation is - compatible with Python 2.0. +The weakdict module by Dieter Maurer is implemented in C and +Python. It appears that the Web pages have not been updated since +Python 1.5.2a, so I'm not yet sure if the implementation is +compatible with Python 2.0. - http://www.handshake.de/~dieter/weakdict.html + http://www.handshake.de/~dieter/weakdict.html - PyWeakReference by Alex Shindich: +PyWeakReference by Alex Shindich: - http://sourceforge.net/projects/pyweakreference/ + http://sourceforge.net/projects/pyweakreference/ - Eric Tiedemann has a weak dictionary implementation: +Eric Tiedemann has a weak dictionary implementation: - http://www.hyperreal.org/~est/python/weak/ + http://www.hyperreal.org/~est/python/weak/ Weak References in Java - - http://java.sun.com/j2se/1.3/docs/api/java/lang/ref/package-summary.html - - Java provides three forms of weak references, and one interesting - helper class. The three forms are called "weak", "soft", and - "phantom" references. The relevant classes are defined in the - java.lang.ref package. - - For each of the reference types, there is an option to add the - reference to a queue when it is invalidated by the memory - allocator. The primary purpose of this facility seems to be that - it allows larger structures to be composed to incorporate - weak-reference semantics without having to impose substantial - additional locking requirements. For instance, it would not be - difficult to use this facility to create a "weak" hash table which - removes keys and referents when a reference is no longer used - elsewhere. Using weak references for the objects without some - sort of notification queue for invalidations leads to much more - tedious implementation of the various operations required on hash - tables. This can be a performance bottleneck if deallocations of - the stored objects are infrequent. - - Java's "weak" references are most like Dianne Hackborn's old vref - proposal: a reference object refers to a single Python object, - but does not own a reference to that object. When that object is - deallocated, the reference object is invalidated. Users of the - reference object can easily determine that the reference has been - invalidated, or a NullObjectDereferenceError can be raised when - an attempt is made to use the referred-to object. - - The "soft" references are similar, but are not invalidated as soon - as all other references to the referred-to object have been - released. The "soft" reference does own a reference, but allows - the memory allocator to free the referent if the memory is needed - elsewhere. It is not clear whether this means soft references are - released before the malloc() implementation calls sbrk() or its - equivalent, or if soft references are only cleared when malloc() - returns NULL. - - "Phantom" references are a little different; unlike weak and soft - references, the referent is not cleared when the reference is - added to its queue. When all phantom references for an object - are dequeued, the object is cleared. This can be used to keep an - object alive until some additional cleanup is performed which - needs to happen before the objects .finalize() method is called. - - Unlike the other two reference types, "phantom" references must be - associated with an invalidation queue. +======================= + +http://java.sun.com/j2se/1.3/docs/api/java/lang/ref/package-summary.html + +Java provides three forms of weak references, and one interesting +helper class. The three forms are called "weak", "soft", and +"phantom" references. The relevant classes are defined in the +java.lang.ref package. + +For each of the reference types, there is an option to add the +reference to a queue when it is invalidated by the memory +allocator. The primary purpose of this facility seems to be that +it allows larger structures to be composed to incorporate +weak-reference semantics without having to impose substantial +additional locking requirements. For instance, it would not be +difficult to use this facility to create a "weak" hash table which +removes keys and referents when a reference is no longer used +elsewhere. Using weak references for the objects without some +sort of notification queue for invalidations leads to much more +tedious implementation of the various operations required on hash +tables. This can be a performance bottleneck if deallocations of +the stored objects are infrequent. + +Java's "weak" references are most like Dianne Hackborn's old vref +proposal: a reference object refers to a single Python object, +but does not own a reference to that object. When that object is +deallocated, the reference object is invalidated. Users of the +reference object can easily determine that the reference has been +invalidated, or a NullObjectDereferenceError can be raised when +an attempt is made to use the referred-to object. + +The "soft" references are similar, but are not invalidated as soon +as all other references to the referred-to object have been +released. The "soft" reference does own a reference, but allows +the memory allocator to free the referent if the memory is needed +elsewhere. It is not clear whether this means soft references are +released before the ``malloc()`` implementation calls ``sbrk()`` or its +equivalent, or if soft references are only cleared when ``malloc()`` +returns ``NULL``. + +"Phantom" references are a little different; unlike weak and soft +references, the referent is not cleared when the reference is +added to its queue. When all phantom references for an object +are dequeued, the object is cleared. This can be used to keep an +object alive until some additional cleanup is performed which +needs to happen before the objects ``.finalize()`` method is called. + +Unlike the other two reference types, "phantom" references must be +associated with an invalidation queue. Appendix -- Dianne Hackborn's vref proposal (1995) - - [This has been indented and paragraphs reflowed, but there have be - no content changes. --Fred] - - Proposal: Virtual References - - In an attempt to partly address the recurring discussion - concerning reference counting vs. garbage collection, I would like - to propose an extension to Python which should help in the - creation of "well structured" cyclic graphs. In particular, it - should allow at least trees with parent back-pointers and - doubly-linked lists to be created without worry about cycles. - - The basic mechanism I'd like to propose is that of a "virtual - reference," or a "vref" from here on out. A vref is essentially a - handle on an object that does not increment the object's reference - count. This means that holding a vref on an object will not keep - the object from being destroyed. This would allow the Python - programmer, for example, to create the aforementioned tree - structure tree structure, which is automatically destroyed when it - is no longer in use -- by making all of the parent back-references - into vrefs, they no longer create reference cycles which keep the - tree from being destroyed. - - In order to implement this mechanism, the Python core must ensure - that no -real- pointers are ever left referencing objects that no - longer exist. The implementation I would like to propose involves - two basic additions to the current Python system: - - 1. A new "vref" type, through which the Python programmer creates - and manipulates virtual references. Internally, it is - basically a C-level Python object with a pointer to the Python - object it is a reference to. Unlike all other Python code, - however, it does not change the reference count of this object. - In addition, it includes two pointers to implement a - doubly-linked list, which is used below. - - 2. The addition of a new field to the basic Python object - [PyObject_Head in object.h], which is either NULL, or points to - the head of a list of all vref objects that reference it. When - a vref object attaches itself to another object, it adds itself - to this linked list. Then, if an object with any vrefs on it - is deallocated, it may walk this list and ensure that all of - the vrefs on it point to some safe value, e.g. Nothing. - - - This implementation should hopefully have a minimal impact on the - current Python core -- when no vrefs exist, it should only add one - pointer to all objects, and a check for a NULL pointer every time - an object is deallocated. - - Back at the Python language level, I have considered two possible - semantics for the vref object -- - - ==> Pointer semantics: - - In this model, a vref behaves essentially like a Python-level - pointer; the Python program must explicitly dereference the vref - to manipulate the actual object it references. - - An example vref module using this model could include the - function "new"; When used as 'MyVref = vref.new(MyObject)', it - returns a new vref object such that MyVref.object == - MyObject. MyVref.object would then change to Nothing if - MyObject is ever deallocated. - - For a concrete example, we may introduce some new C-style syntax: - - & -- unary operator, creates a vref on an object, same as vref.new(). - * -- unary operator, dereference a vref, same as VrefObject.object. - - We can then define: - - 1. type(&MyObject) == vref.VrefType - 2. *(&MyObject) == MyObject - 3. (*(&MyObject)).attr == MyObject.attr - 4. &&MyObject == Nothing - 5. *MyObject -> exception - - Rule #4 is subtle, but comes about because we have made a vref - to (a vref with no real references). Thus the outer vref is - cleared to Nothing when the inner one inevitably disappears. - - ==> Proxy semantics: - - In this model, the Python programmer manipulates vref objects - just as if she were manipulating the object it is a reference - of. This is accomplished by implementing the vref so that all - operations on it are redirected to its referenced object. With - this model, the dereference operator (*) no longer makes sense; - instead, we have only the reference operator (&), and define: - - 1. type(&MyObject) == type(MyObject) - 2. &MyObject == MyObject - 3. (&MyObject).attr == MyObject.attr - 4. &&MyObject == MyObject - - Again, rule #4 is important -- here, the outer vref is in fact a - reference to the original object, and -not- the inner vref. - This is because all operations applied to a vref actually apply - to its object, so that creating a vref of a vref actually - results in creating a vref of the latter's object. - - The first, pointer semantics, has the advantage that it would be - very easy to implement; the vref type is extremely simple, - requiring at minimum a single attribute, object, and a function to - create a reference. - - However, I really like the proxy semantics. Not only does it put - less of a burden on the Python programmer, but it allows you to do - nice things like use a vref anywhere you would use the actual - object. Unfortunately, it would probably an extreme pain, if not - practically impossible, to implement in the current Python - implementation. I do have some thoughts, though, on how to do - this, if it seems interesting; one possibility is to introduce new - type-checking functions which handle the vref. This would - hopefully older C modules which don't expect vrefs to simply - return a type error, until they can be fixed. - - Finally, there are some other additional capabilities that this - system could provide. One that seems particularly interesting to - me involves allowing the Python programmer to add "destructor" - function to a vref -- this Python function would be called - immediately prior to the referenced object being deallocated, - allowing a Python program to invisibly attach itself to another - object and watch for it to disappear. This seems neat, though I - haven't actually come up with any practical uses for it, yet... :) - - -- Dianne +================================================== + +[This has been indented and paragraphs reflowed, but there have be +no content changes. --Fred] + +Proposal: Virtual References +---------------------------- + +In an attempt to partly address the recurring discussion +concerning reference counting vs. garbage collection, I would like +to propose an extension to Python which should help in the +creation of "well structured" cyclic graphs. In particular, it +should allow at least trees with parent back-pointers and +doubly-linked lists to be created without worry about cycles. + +The basic mechanism I'd like to propose is that of a "virtual +reference," or a "vref" from here on out. A vref is essentially a +handle on an object that does not increment the object's reference +count. This means that holding a vref on an object will not keep +the object from being destroyed. This would allow the Python +programmer, for example, to create the aforementioned tree +structure tree structure, which is automatically destroyed when it +is no longer in use -- by making all of the parent back-references +into vrefs, they no longer create reference cycles which keep the +tree from being destroyed. + +In order to implement this mechanism, the Python core must ensure +that no -real- pointers are ever left referencing objects that no +longer exist. The implementation I would like to propose involves +two basic additions to the current Python system: + +1. A new "vref" type, through which the Python programmer creates + and manipulates virtual references. Internally, it is + basically a C-level Python object with a pointer to the Python + object it is a reference to. Unlike all other Python code, + however, it does not change the reference count of this object. + In addition, it includes two pointers to implement a + doubly-linked list, which is used below. + +2. The addition of a new field to the basic Python object + [``PyObject_Head`` in object.h], which is either ``NULL``, or points to + the head of a list of all vref objects that reference it. When + a vref object attaches itself to another object, it adds itself + to this linked list. Then, if an object with any vrefs on it + is deallocated, it may walk this list and ensure that all of + the vrefs on it point to some safe value, e.g. Nothing. + + +This implementation should hopefully have a minimal impact on the +current Python core -- when no vrefs exist, it should only add one +pointer to all objects, and a check for a ``NULL`` pointer every time +an object is deallocated. + +Back at the Python language level, I have considered two possible +semantics for the vref object -- + +Pointer semantics +----------------- + +In this model, a vref behaves essentially like a Python-level +pointer; the Python program must explicitly dereference the vref +to manipulate the actual object it references. + +An example vref module using this model could include the +function "new"; When used as 'MyVref = vref.new(MyObject)', it +returns a new vref object such that ``MyVref.object == MyObject``. +``MyVref.object`` would then change to Nothing if +``MyObject`` is ever deallocated. + +For a concrete example, we may introduce some new C-style syntax: + +* ``&`` -- unary operator, creates a vref on an object, same as ``vref.new()``. +* ``*`` -- unary operator, dereference a vref, same as ``VrefObject.object``. + +We can then define:: + + 1. type(&MyObject) == vref.VrefType + 2. *(&MyObject) == MyObject + 3. (*(&MyObject)).attr == MyObject.attr + 4. &&MyObject == Nothing + 5. *MyObject -> exception + +Rule #4 is subtle, but comes about because we have made a vref +to (a vref with no real references). Thus the outer vref is +cleared to Nothing when the inner one inevitably disappears. + +Proxy semantics +---------------- + +In this model, the Python programmer manipulates vref objects +just as if she were manipulating the object it is a reference +of. This is accomplished by implementing the vref so that all +operations on it are redirected to its referenced object. With +this model, the dereference operator (*) no longer makes sense; +instead, we have only the reference operator (&), and define:: + + 1. type(&MyObject) == type(MyObject) + 2. &MyObject == MyObject + 3. (&MyObject).attr == MyObject.attr + 4. &&MyObject == MyObject + +Again, rule #4 is important -- here, the outer vref is in fact a +reference to the original object, and -not- the inner vref. +This is because all operations applied to a vref actually apply +to its object, so that creating a vref of a vref actually +results in creating a vref of the latter's object. + +The first, pointer semantics, has the advantage that it would be +very easy to implement; the vref type is extremely simple, +requiring at minimum a single attribute, object, and a function to +create a reference. + +However, I really like the proxy semantics. Not only does it put +less of a burden on the Python programmer, but it allows you to do +nice things like use a vref anywhere you would use the actual +object. Unfortunately, it would probably an extreme pain, if not +practically impossible, to implement in the current Python +implementation. I do have some thoughts, though, on how to do +this, if it seems interesting; one possibility is to introduce new +type-checking functions which handle the vref. This would +hopefully older C modules which don't expect vrefs to simply +return a type error, until they can be fixed. + +Finally, there are some other additional capabilities that this +system could provide. One that seems particularly interesting to +me involves allowing the Python programmer to add "destructor" +function to a vref -- this Python function would be called +immediately prior to the referenced object being deallocated, +allowing a Python program to invisibly attach itself to another +object and watch for it to disappear. This seems neat, though I +haven't actually come up with any practical uses for it, yet... :) + +-- Dianne Copyright +========= - This document has been placed in the public domain. +This document has been placed in the public domain. - -Local Variables: -mode: indented-text -indent-tabs-mode: nil -sentence-end-double-space: t -fill-column: 70 -coding: utf-8 -End: +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + coding: utf-8 + End: