Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loadAt #323

Open
wants to merge 12 commits into
base: master
Choose a base branch
from
Open

loadAt #323

wants to merge 12 commits into from

Conversation

navytux
Copy link
Contributor

@navytux navytux commented Jul 27, 2020

loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

  1. returns data=None and serial of the removal, when loaded object was found to
    be deleted. loadBefore is returning only data=None in such case. This loadAt
    property allows to fix DemoStorage data corruption when whiteouts in overlay
    part were not previously correctly taken into account.

    DemoStorage does not take whiteouts into account -> leading to data corruption #318

  2. for regular data records, does not require storages to return next_serial,
    in addition to (data, serial). loadBefore requirement to return both
    serial and next_serial is constraining storages unnecessarily, and,
    while for FileStorage it is free to implement, for other storages it is
    not - for example for NEO and RelStorage, finding out next_serial, after
    looking up oid@at data record, costs one more SQL query:

    https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
    https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

    https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
    https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

    next_serial is not only about execution overhead - it is semantically
    redundant to be there and can be removed from load return. The reason
    I say that next_serial can be removed is that in ZODB/py the only place,
    that I could find, where next_serial is used on client side is in client
    cache (e.g. in NEO client cache), and that cache can be remade to
    work without using that next_serial at all. In simple words whenever
    after

      loadAt(oid, at)  ->  (data, serial)
    

    query, the cache can remember data for oid in [serial, at] range.

    Next, when invalidation message from server is received, cache entries,
    that had at == client_head, are extended (at -> new_head) for oids that
    are not present in invalidation message, while for oids that are present
    in invalidation message no such extension is done. This allows to
    maintain cache in correct state, invalidate it when there is a need to
    invalidate, and not to throw away cache entries that should remain live.
    This of course requires ZODB server to include both modified and
    just-created objects into invalidation messages

    ( Include both modified and just created objects into invalidations ZEO#160 ,
    interface: Require invalidations to be called with full set of objects and not to skip transactions #319 ).

    Switching to loadAt should thus allow storages like NEO and, maybe,
    RelStorage, to do 2x less SQL queries on every object access.

    DemoStorage does not take whiteouts into account -> leading to data corruption #318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError                          object does not exist at all 
   None                                 object was removed
   (data, serial, next_serial)          regular data record

used by loadBefore.

This patch:

  • introduces new interface.
  • introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
    or, if the storage does not implement loadAt interface, tries to mimic
    loadAt semantic via storage.loadBefore to possible extent + emits
    corresponding warning.
  • converts MVCCAdapter to use loadAt instead of loadBefore.
  • changes DemoStorage to use loadAt, and this way fixes above-mentioned
    data corruption issue; adds corresponding test; converts
    DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
  • adds loadAt implementation to FileStorage and MappingStorage.
  • adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch

@navytux
Copy link
Contributor Author

navytux commented Jul 27, 2020

( fixed travis wrt ZOPE_INTERFACE_STRICT_IRO=1 )

@navytux
Copy link
Contributor Author

navytux commented Jul 27, 2020

Fixed CI on Windows by not forgetting to close test database before its removal. Python38 failure is due to "Failed to connect to github.com port 443" and so should be transient.

interdiff
--- a/src/ZODB/tests/testDemoStorage.py
+++ b/src/ZODB/tests/testDemoStorage.py
@@ -357,6 +357,10 @@ def getObjAt(at):
         self.assertEqual(getObjAt(atUnlink), 2)
         self.assertRaises(ZODB.POSException.POSKeyError, getObjAt, atGC)
 
+        # end
+        db.close()
+        zdemo.close() # closes zbase and zoverlay as well
+
 
 def test_suite():
     suite = unittest.TestSuite((

@navytux
Copy link
Contributor Author

navytux commented Jul 31, 2020

( rebased on top of master since #320 was merged )

NexediGitlab pushed a commit to Nexedi/zodbtools that referenced this pull request Mar 16, 2021
When zodbdump input says to copy an object, we first load that object.
However if object does not exist loadBefore raises POSKeyError, and when
object at copied-from revision was deleted loadBefore returns None.

-> Handle that explicitly to provide failure details to the user, so
that instead of cryptic

    === RUN   TestLoad/δstart=0285cbac75555580
    Traceback (most recent call last):
      File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module>
        main()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
        return command_module.main(argv)
      File "<decorator-gen-6>", line 2, in main
      File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
        return f(*argv, **kw)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main
        zodbrestore(stor, asbinstream(sys.stdin), _)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore
        zodbcommit(stor, at, txn)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 122, in zodbcommit
        _()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 91, in _
        data, _, _ = stor.loadBefore(obj.oid, p64(u64(obj.copy_from)+1))
    TypeError: 'NoneType' object is not iterable
        xtesting.go:483: /tmp/demo009767458/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1

it fails with something more understandable:

    === RUN   TestLoad/δstart=0285cbac75555580
    Traceback (most recent call last):
      File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 133, in <module>
        main()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodb.py", line 129, in main
        return command_module.main(argv)
      File "<decorator-gen-6>", line 2, in main
      File "/home/kirr/src/tools/go/pygolang/golang/__init__.py", line 103, in _
        return f(*argv, **kw)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main
        zodbrestore(stor, asbinstream(sys.stdin), _)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore
        zodbcommit(stor, at, txn)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 129, in zodbcommit
        _()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 97, in _
        (stor.getName(), ashex(obj.oid), ashex(obj.copy_from)))
    ValueError: /tmp/demo358030847/δ0285cbac75555580/δ.fs: object 0000000000000003: copy from @0285cbac70a3d733: no data
        xtesting.go:483: /tmp/demo358030847/δ0285cbac75555580/δ.fs: zpyrestore: exit status 1

For the implementation it would be easier to use loadAt
(zopefoundation/ZODB#323), but we don't have
that yet.

/reviewed-by @jerome
/reviewed-on https://lab.nexedi.com/nexedi/zodbtools/merge_requests/20
loadAt is new optional storage interface that is intended to replace loadBefore
with more clean and uniform semantic. Compared to loadBefore, loadAt:

1) returns data=None and serial of the removal, when loaded object was found to
   be deleted. loadBefore is returning only data=None in such case. This loadAt
   property allows to fix DemoStorage data corruption when whiteouts in overlay
   part were not previously correctly taken into account.

   zopefoundation#318

2) for regular data records, does not require storages to return next_serial,
   in addition to (data, serial). loadBefore requirement to return both
   serial and next_serial is constraining storages unnecessarily, and,
   while for FileStorage it is free to implement, for other storages it is
   not - for example for NEO and RelStorage, finding out next_serial, after
   looking up oid@at data record, costs one more SQL query:

   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L484-508
   https://lab.nexedi.com/nexedi/neoppod/blob/fb746e6b/neo/storage/database/mysqldb.py#L477-482

   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/storage/load.py#L259-L264
   https://github.com/zodb/relstorage/blob/3.1.1-1-ge7628f9/src/relstorage/adapters/mover.py#L177-L199

   next_serial is not only about execution overhead - it is semantically
   redundant to be there and can be removed from load return. The reason
   I say that next_serial can be removed is that in ZODB/py the only place,
   that I could find, where next_serial is used on client side is in client
   cache (e.g. in NEO client cache), and that cache can be remade to
   work without using that next_serial at all. In simple words whenever
   after

     loadAt(oid, at)  ->  (data, serial)

   query, the cache can remember data for oid in [serial, at] range.

   Next, when invalidation message from server is received, cache entries,
   that had at == client_head, are extended (at -> new_head) for oids that
   are not present in invalidation message, while for oids that are present
   in invalidation message no such extension is done. This allows to
   maintain cache in correct state, invalidate it when there is a need to
   invalidate, and not to throw away cache entries that should remain live.
   This of course requires ZODB server to include both modified and
   just-created objects into invalidation messages

     ( zopefoundation/ZEO#160 ,
       zopefoundation#319 ).

   Switching to loadAt should thus allow storages like NEO and, maybe,
   RelStorage, to do 2x less SQL queries on every object access.

   zopefoundation#318 (comment)

In other words loadAt unifies return signature to always be

   (data, serial)

instead of

   POSKeyError				object does not exist at all
   None					object was removed
   (data, serial, next_serial)		regular data record

used by loadBefore.

This patch:

- introduces new interface.
- introduces ZODB.utils.loadAt helper, that uses either storage.loadAt,
  or, if the storage does not implement loadAt interface, tries to mimic
  loadAt semantic via storage.loadBefore to possible extent + emits
  corresponding warning.
- converts MVCCAdapter to use loadAt instead of loadBefore.
- changes DemoStorage to use loadAt, and this way fixes above-mentioned
  data corruption issue; adds corresponding test; converts
  DemoStorage.loadBefore to be a wrapper around DemoStorage.loadAt.
- adds loadAt implementation to FileStorage and MappingStorage.
- adapts other tests/code correspondingly.

/cc @jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch
@navytux
Copy link
Contributor Author

navytux commented Mar 16, 2021

I decoupled this patch from #322 - that ReadConflictError -> POSKeyError switch requires more involvement for which I don't currently have time. On the other hand existing behaviour could be still preserved with careful programming around loadAt usage in MVCCAdapter.load. All ZODB tests pass now - without complete switch to POSKeyError - including checkPackNowWhileWriting and friends.

Once again loadAt is needed to fix DemoStorage data corruption (#318), and opens the door to do 2x less SQL queries on every load for NEO and RelStorage (see #318 (comment)). I also need loadAt as the basis for further work on DemoStorage enhancements.

@jimfulton, @jamadden, @vpelletier, @jmuchemb, @arnaud-fontaine, @gidzit, @klawlf82, @hannosch, please take a look.

Thanks beforehand,
Kirill

@navytux navytux requested review from jamadden and jimfulton March 18, 2021 07:32
navytux added a commit to navytux/ZODB that referenced this pull request Mar 26, 2021
So that it is generally possible to zodbrestore[1,2] zodbdumped[3]
database/transactions into DemoStorage. Because without this patch, when dump
contains deletion records, e.g.

    txn 0285cbacc06d3a4c " "
    user ""
    description ""
    extension ""
    obj 0000000000000007 delete

it fails like this:

    Traceback (most recent call last):
      ...
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 94, in main
        zodbrestore(stor, asbinstream(sys.stdin), _)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbrestore.py", line 43, in zodbrestore
        zodbcommit(stor, at, txn)
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 131, in zodbcommit
        _()
      File "/home/kirr/src/wendelin/z/zodbtools/zodbtools/zodbcommit.py", line 118, in _
        stor.deleteObject(obj.oid, current_serial(obj.oid), txn)
    AttributeError: 'DemoStorage' object has no attribute 'deleteObject'
        demo_test.go:105: demo:(file:///tmp/demo470143633/δ0285cbac258bf266/base.fs)/(file:///tmp/demo470143633/δ0285cbac258bf266/δ.fs): zpyrestore: exit status 1

To be able to implement DemoStorage.deleteObject, we have to fix
FileStorage.deleteObject first: currently FileStorage.deleteObject raises
POSKeyError if an object is not present in the database, but for situation
where

- demo=base+δ,
- object is present in base, and
- object is not present in δ

it creates a problem because there is no way to add whiteout record for the
object into δ.

This change should be generally good; let me quote
https://lab.nexedi.com/kirr/neo/commit/543041a3 for why:

---- 8< ----
Even though currently it is not possible to create such data record via
FileStorage(py).deleteObject (because it raises POSKeyError if there is
no previous object revision), being able to use such data records is
semantically useful in overlayed DemoStorage settings, where δ part
marks an object that exists only in base with delete record whiteout.

It is also generally meaningful to be able to create "delete" record
even if object was not previously existing: "deleteObject" is actually
similar to "store" (and so should be better named as "storeDelete"). If
one wants to store deletion, there should not be a reason to reject it,
because deleteObject already has seatbelt in the form of oldserial, and
if the user calls deleteObject(oid, oldserial=z64), he/she is already
telling that "I know that this object does not exist in this storage
(oldserial=z64), but still please create a deletion record for it". Once
again this is useful in overlayed DemoStorage settings described above.

For the reference, such whiteout deletion records pass ZODB/scripts/fstest just fine.
---- 8< ----

DemoStorage.deleteObject implementation is straightforward, but builds on
loadAt[4] as precondition.

/cc @jimfulton, @jamadden, @perrinjerome

[1] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbrestore.py
[2] https://lab.nexedi.com/nexedi/zodbtools/merge_requests/19
[3] https://lab.nexedi.com/nexedi/zodbtools/blob/129afa67/zodbtools/zodbdump.py
[4] zopefoundation#323
@navytux
Copy link
Contributor Author

navytux commented Mar 26, 2021

loadAt is also needed for DemoStorage.deleteObject (#341).

@navytux
Copy link
Contributor Author

navytux commented Mar 26, 2021

For the reference - see also https://lab.nexedi.com/kirr/neo/commit/4fb6bd0a?expanded=1 which proides DemoStorage-analogue for ZODB/go with Load organization similar to loadAt-based DemoStorage as implemented in this pull-request.

@navytux
Copy link
Contributor Author

navytux commented Apr 8, 2021

/cc @d-maurer

@navytux
Copy link
Contributor Author

navytux commented May 3, 2021

Anyone?

@navytux navytux requested review from d-maurer and removed request for jimfulton May 3, 2021 15:20
Copy link
Contributor

@d-maurer d-maurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ZODB logic relating to historical data (including MVCC) was largely centered around before. You have changed this to at - requiring wide spread modifications. I would much prefer to keep the before centered approach and implement an optimized version of loadBefore (for backward compatibility under a new name or with a new keyword parameter) instead of introducing a new loadAt.

I suggest to keep loadBefore (without deprecation), define loadBefore_opt with streamlined return value and implement it by default in BaseStorage via loadBefore and change calls to loadBefore to calls to loadBefore_opt in the ZODB code base where the third return value of loadBefore is not necessary.

Comment on lines 270 to 271
def loadBefore(self, oid, tid):
"""Return most recent revision of oid before tid committed."""
return None
# do not provide loadAt/loadBefore here in BaseStorage - if child forgets
# to override it - storage will always return "no data" instead of failing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer raise NotImplementedError over not defining the methods.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

navytux added 3 commits May 4, 2021 16:35
@d-maurer suggests[1]:

    The ZODB logic relating to historical data (including MVCC) was largely
    centered around before. You have changed this to at - requiring wide spread
    modifications. I would much prefer to keep the before centered approach...

    (zopefoundation#323 (review))

So let's change "at"-based logic to "before"-based logic and rename the new
method from loadAt to loadBeforeEx.
@d-maurer suggests to keep loadBefore without deprecation
(zopefoundation#323 (review)).

-> Don't emit warnings about deprecating loadBefore.

-> Keep the deprecation text in loadBefore interface, since loadBeforeEx
   should practically provide wider functionality without putting
   unnecessary constraint on storage implementations. In other words
   loadBefore deprecation is still there, but less aggressively
   advertised with the idea to make transition for outside-of-ZODB code
   to loadBeforeEx more smooth and with a bit more steps (we might want
   to reinstate the deprecation warnings at a later time).
@d-maurer
Copy link
Contributor

d-maurer commented May 11, 2021 via email

@d-maurer
Copy link
Contributor

@navytux wrote:

... and that cache can be remade to
    work without using that next_serial at all. In simple words whenever
    after
    ```
      loadAt(oid, at)  ->  (data, serial)
    ```
    
    
    query, the cache can remember data for oid in [serial, at] range.
    Next, when invalidation message from server is received, cache entries,
    that had at == client_head, are extended (at -> new_head) for oids that
    are not present in invalidation message,

This will not give the same cache performance as before: formerly, the cache had precise information about the currency of the object revision and for non current revisions about its validity range. Now an object revision is considered current only if it had been loaded for tid >= cache_head(at load time) (and was not invalidated since that time); however, loads for tid < cache_head can retrieve current revisions as well which is no longer recognized. For revisions loaded with tid < cache_head, the validity range is only approximated which may result in additional loads in the future.

I do not know whether the performance loss is significant in practice because most loads are likely to be for tid >= cache_head. Nevertheless, I see the potential loss as an argument against the deprecation of loadBefore; storages which can determine the exact validity range efficiently should be able to provide this information also in the future.

@navytux
Copy link
Contributor Author

navytux commented May 21, 2021

@d-maurer thanks for feedback, and first of all I apologize for the delay with replying on my side.

While, probably, it was my fault not to present things clearly, let me explain
why returning only (data, serial) without next_serial should not decrease
cache performance for current - non-historical - connections:

Imagine the situation when the database is under constant change. Transactions
are committed on server and propagate via invalidations to client.
Client.last_tid is potentially a bit behind on-server last_tid. When an
on-client thread opens a Connection, that connection gets assigned an at
that denotes the Connection's snapshot of the database state. Connection.at,
is potentially a bit behind Client.last_tid, because during the time when
that client transaction is processed, client could receive some further
invalidations from the server. At any time, there might be several "current"
on-client Connections that potentially all have different Connection.at.

In order to be efficient with putting entries into the cache, the client
maintains ΔTail that keeps history of all ZODB invalidations from
Cache.head till the "at" of its oldest "current" Connection. Then, when an
object is loaded for a connection with Connection.at < Cache.head, the
client looks into ΔTail in (Connection.at, Cache.head]
range to find out what is the first next serial that changed the object. If
there is an entry - that entry's serial is used as next_serial when putting
data into the cache after the call to load. If there is no such an entry - it
means that the data, that were loaded via at < Cache.head, is still current
even at Cache.head state, and so the data can be put into the cache with
next_serial=None.

    ΔTail.tail                            ΔTail.head

       ↓                                      ↓
────────────────────────────────────────────────────────────────────
                ↑            ↑                ↑                ↑

              conn₁        conn₂          Cache.head        on-server
                                              =             last_tid
                                          Client.last_tid

This organization makes sure that for "current" connections - that were not
opened in historical mode - the cache will be updated in exactly the same way
as it would be updated if next_serial is returned from load.

Yes, for historical connections - that are opened via DB.open(at/before=...)
and are maintained in DB.historical_pool instead of DB.pool for "current"
Connections - the update of the cache will be less efficient compared to the
situation when next_serial is provided by load: the next_serial of the
cache entry would be set as Connection.at instead of real next_serial, thus
potentially forcing another load to happen if another historical connection is
opened with different at. However, in my experience, historical connections
are seldomly used, and when used, at for only one particular point in time is
used to investigate a particular database state. It is thus the performance of
"current" access that matters the most.

For the reference: support for historic connections was added in late 2007 in 5f2b2d70a18a.
For the reference 2: I was talking a bit about ΔTail in #290 (comment).


I have not yet seen "deletion records" -- seems to be a newer concept

Deletion records were added in late 2008 in d2d83b88131b (see also http://web.archive.org/web/20120426053905/http://wiki.zope.org/ZODB/ExternalGC).

I had not expected many external (outside ZODB) users of loadBefore.
...
Maybe, we can break the interface with a version bump and an incompatibility note in the documentation?

You have understood my proposal - no need to describe it again.
It will break the interface with potential consequences for
users. Let's hear what those (or others) have to say.

I really don't understand why we need to break things to fix other things,
especially when compatibility-preserving solution is possible.

Making a major version change with compatibility breakage also does not work
well in my experience. If the new versions, besides breaking compatibility,
provides other benefits - that might somewhat work, but in our case the benefit
of fixed DemoStorage wrt deletion records is imho low for an average ZODB user,
while the downsides of compatibility breakage are high. This approach also
reminds me about Python2/Python3 situation where many companies are still
using Python2 because a) porting to Python3 has non-small cost, and b) this
porting does not usually bring any significant benefit. And all that more than
10 years after Python3 was released.

For the reference, where I work ZODB4
is still being used, because, even if it was not intended to be a compatibility breakage, the switch to pickle protocol=3 broke things in one major place for us. This, hopefully, will be addressed, but I would like to stress that changes in behaviour expose cost on users, especially changes that are intentionally backward-incompatible.

Personally I do not need, nor want, to make an experiment to see what happens if we break things.
I know beforehand that people will scream or silently get annoyed and pin ZODB to an older release.
I know the outcome of such experiment will be "this should not be the way to evolve a software".

Once again I really don't understand why we need to break things to fix other things,
especially when compatibility-preserving solution is possible.

Kirill

@navytux
Copy link
Contributor Author

navytux commented May 21, 2021

I have not yet seen "deletion records" -- seems to be a newer concept

Deletion records were added in late 2008 in d2d83b88131b (see also http://web.archive.org/web/20120426053905/http://wiki.zope.org/ZODB/ExternalGC).

For the reference: deletion records can also appear without any external garbage-collection - as a result of regular undo of object creation.

@navytux
Copy link
Contributor Author

navytux commented May 21, 2021

For the reference: deletion records can also appear without any external garbage-collection - as a result of regular undo of object creation.

They were known and handled at least from 2002 - see 6c7d0c429b19 and:

for j, rec in enumerate(trans):
if rec.data is None:
fullclass = "undo or abort of object creation"
size = ""

@d-maurer
Copy link
Contributor

d-maurer commented May 22, 2021 via email

@leycec leycec mentioned this pull request May 29, 2021
Dieter Maurer notes that loadBefore cannot be deprecated yet because ZEO
essentially depends on the `end_tid` information returned by loadBefore
to update its cache:

zopefoundation#323 (comment)

And to remove this dependency it would require to rework ZODB caching layer:

zopefoundation#323 (comment)

So we cannot deprecate loadBefore until this rework is implemented first.

-> Remove general loadBefore deprecation, and emit loadBefore vs
loadBeforeEx warning only when actually hitting a "deletion" record,
because only that case is known to lead to data corruption.
@navytux
Copy link
Contributor Author

navytux commented Jun 6, 2021

@d-maurer, thanks for feedback.

I indeed explained how things could be changed to work without next_serial
information, and you are right that currently there is nothing like that
implemented in ZEO: ZEO indeed currently uses end_tid to update its cache.

You are, again, right, that the caching layer, that I explained, is indeed
generic, and could be implemented as generic IStorage wrapper inside ZODB itself.

I believe (with you) that it is possible to get cache
management efficient even without "end_tid" information.

Thanks.

Unlike you, I fear that it requires significant (architectural) changes.
And until those changes have been done, loadBefore should not
get deprecated: FileStorage is still a major base storage and can
deliver end_tid information without any performance loss.

You are probably right here. Until we don't have that efficient caching scheme
implemented, loadBefore should remain in non-deprecated state.

I cannot allow myself to break things, but I have amended my change to
undeprecate loadBefore:

navytux@5ae48fe1d933

Hopefully this could help us to find some common ground.

Kirill

P.S. And DemoStorage is not only for demonstrations. DemoStorage provides generic storage overlaying, similarly to e.g. overlayfs. Besides demonstrations, such overlaying is used e.g. in tests - to create fresh ZODB instance for each test quickly - and could be used in other similar situations.

@navytux
Copy link
Contributor Author

navytux commented Jun 6, 2021

@d-maurer, thanks for approval. @jamadden, do you want to provide any input on the changes?

@icemac
Copy link
Member

icemac commented Oct 28, 2021

This PR is already approved, is there still something to do before merging it?

@navytux
Copy link
Contributor Author

navytux commented Oct 29, 2021

@icemac, thanks for asking.

@jamadden, 6 months ago, at #318 (comment), you said you "hope to find time to examine these PRs in depth "soon."

5 months ago, at #323 (comment) after @d-maurer's approval, I've asked whether you want to provide any input on this patch.

Are you still interested in reviewing it?

We can wait if you still plan to have a look.
Alternatively, if things changed on your side, it does not make sense to wait any longer.

So could you please explicitly state what's in your mind regarding this pull request.

Thanks beforehand for feedback,
Kirill

to resolve trivial conflict on CHANGES.rst

* origin/master: (22 commits)
  Fix TypeError for fsoids (zopefoundation#351)
  Fix deprecation warnings occurring on Python 3.10.
  fix more PY3 incompatibilities in `fsstats`
  fix Python 3 incompatibility for `fsstats`
  add `fsdump/fsstats` test
  fsdump/fsstats improvements
  - add coverage combine step
  - first cut moving tests from Travis CI to GitHub Actions
  - ignore virtualenv artifacts [ci skip]
  tests: Run race-related tests with high frequency of switches between threads
  tests: Add test for load vs external invalidation race
  tests: Add test for open vs invalidation race
  fixup! doc/requirements: Require pygments < 2.6 on py2
  doc/requirements: Require pygments < 2.6 on py2
  fixup! buildout: Fix Sphinx install on Python2
  buildout: Fix Sphinx install on Python2
  Update README.rst
  Security fix documentation dependencies (zopefoundation#342)
  changes: Correct link to UnboundLocalError fsoids.py fix
  fsrefs: Optimize IO  (take 2) (zopefoundation#340)
  ...
@navytux
Copy link
Contributor Author

navytux commented Oct 29, 2021

( PR refreshed by syncing to master to resolve trivial conflict on CHANGES.rst )

@jamadden
Copy link
Member

I appreciate the reminder, and I apologize for the no-show; there are various personal and professional reasons I haven't found the time to review this PR in depth. I will do my best to look at it within the next week; feel free to move ahead without me if that doesn't happen.

I have some vague memories about some concerns I might have had:

  • I'm not thrilled about the proliferation of Yet Another Load Method. In my (not very considered yet, and certainly forgetful) opinion, if this is the way forward, it should be the only load method (maybe that needs more work to require storages to properly support MVCC? Or the wrapper that implements it for them might need to work with storages that implement a different interface?). Yeah, it breaks backwards compatibility, but just call it 6.0 and be done with it. The overall simplification in all storage code is probably worth it. We're still supporting Python 2.7 with this, so there should be few migration blockers.
  • It wasn't clear to me the impact this would have on RelStorage, since its connections are already internally consistent and have no way to load data from the future; an arbitrary "at" request is impossible to fulfill.

@icemac
Copy link
Member

icemac commented Nov 10, 2021

I created some merge conflicts by merging #357, sorry.

@jamadden
Copy link
Member

jamadden commented Nov 10, 2021

OK, I think I'm allowed to relay this now: The company that I co-founded and which sponsored all my open source work for the last decade or so…will, umm...not be doing that…or any development work. It's been all-hands-on-deck to focus on things that could prevent that, but at this point it seems unlikely.

I still care, deeply, about these projects. They're important to me on a personal level, and I think they are important to the wider community. But my time, at this point, is no longer going to be subsidized to work on them. I sincerely hope that changes! But in the meantime, I will have to focus on finishing the most important work I had outstanding to the community (currently I think that means gevent and greenlet), and then I have to find some way to pay the bills.

I am deeply sorry for any turmoil this has caused. I didn't want that.

@jimfulton
Copy link
Member

jimfulton commented Nov 11, 2021 via email

navytux added a commit to navytux/zc.zlibstorage that referenced this pull request Nov 11, 2021
zc.zlibstorage wraps base storage and includes generic __getattr__
to forward all unknown attribute access to base:

https://github.com/zopefoundation/zc.zlibstorage/blob/6d5a3c75/src/zc/zlibstorage/__init__.py#L52-L53

But if base implements loadBeforeEx (zopefoundation/ZODB#323)
the following scenario is then possible:

ZODB sees that zlibstorage provides loadBeforeEx and loads data via
loadBeforeEx instead of loadBefore, but the data are loaded
not decompressed, and ZODB complains about "broken pickle".

-> let's fix this by explicitly wrapping loadBeforeEx if base storage
provides it.
Resolve many conflicts after zopefoundation#357.

* master:
  Let the year float.
  Configuring for pure-python
  Specify a PyPy2 version.
  Lint the code.
  Configuring for pure-python
Sorry, but some of flake8 complaints are really stupid...
navytux added a commit to navytux/relstorage that referenced this pull request Nov 11, 2021
loadBeforeEx is like loadBefore, but simpler, provides better
information for object delete records and can be more efficiently
implemented by many storages: zopefoundation/ZODB#323

On RelStorage loadBefore is currently implemented via 3 SQL queries:

  1) check whether object record exists at all
  2) retrieve object state
  3) retrieve serial of next object revision

Compared to that loadBeforeEx is implemented via only one SQL query "2"
from the above - "retrieve object state". It is exactly the same query
that loadBefore uses and after the patch loadBefore actually invokes
loadBeforeEx for step 2.

This change was outlined in

zopefoundation/ZODB#318 (comment) and
zopefoundation/ZODB#318 (comment)

and as explained in the first link this patch is also semantically coupled with

zodb#484

This patch passes tests with both ZODB5 and with ZODB5+zopefoundation/ZODB#323:

- when ran with ZODB5 it verifies that loadBefore implementation does
  not become broken.

- when ran with ZODB5+zopefoundation/ZODB#323 it
  verifies that loadBeforeEx implementation is correct.

For tests to pass with
ZODB5+zopefoundation/ZODB#323 we also need
zopefoundation/zc.zlibstorage#11 because without
that fix zc.zlibstorage does not decompress data on loadBeforeEx.
@navytux
Copy link
Contributor Author

navytux commented Nov 11, 2021

@jamadden thanks for feedback. And I'm very sorry to hear that things are unfortunate on your side.
Let's hope this situation could improve for the better.

In the meantime please feel to skip what I will post below for the record.


  • loadBeforeEx intention was to replace loadBefore and deprecate it. It turned out we cannot deprecate loadBefore until ZEO client cache is reworked not to use next_serial. It is clear how to do this rework, but it was not done. For this reason loadBefore stays in undeprecated state for now:

    loadAt #323 (comment)
    loadAt #323 (comment) (first part)
    loadAt #323 (comment) (first part)
    loadAt #323 (comment)

  • I try not to break backward compatibility. One of the motivation for loadAt/loadBeforeEx is to fix DemoStorage does not take whiteouts into account -> leading to data corruption #318. I find it reasonable not to break things to fix other things, especially when backward-compatible solution is possible. The other reason is that besides ZODB and main storages (FileStorage, ZEO, NEO and RelStorage), there are many third-party places which depend on loadBefore being provided and particular semantic. It is not practical to discover and go through all those places and change them all. Also personally I find major breaking changes to be disasters, so unless such a change brings in some huge benefit, breaking backward compatibility is not worth it. In other words breaking backward-compatibility has its cost and this cost should be evaluated against expected gain. All this reasons were explained in detail in

    loadAt #323 (comment) (first part)
    loadAt #323 (review) (first part)
    loadAt #323 (comment) (third part)

  • the impact for RelStorage should be minimal. RelStorage already provides loadBefore, and loadBeforeEx can be implemented similarly. This implementation will have the same properties as loadBefore regarding your concern. I might be misunderstanding something but there is something like "stale handling" in RelStorage which loadBefore utilizes to detect time when SQL connection view needs to be refreshed. Once again even if RelStorage provides MVCC natively, loadBefore is already there and loadBeforeEx can be added the same way. Taking recent news into account I went ahead and sketched it myself:

    Provide loadBeforeEx zodb/relstorage#485


@icemac, no problem. I resolved those conflicts.

Kirill

navytux added a commit to navytux/zc.zlibstorage that referenced this pull request Nov 16, 2021
zc.zlibstorage wraps base storage and includes generic __getattr__
to forward all unknown attribute access to base:

https://github.com/zopefoundation/zc.zlibstorage/blob/6d5a3c75/src/zc/zlibstorage/__init__.py#L52-L53

But if base implements loadBeforeEx (zopefoundation/ZODB#323)
the following scenario is then possible:

ZODB sees that zlibstorage provides loadBeforeEx and loads data via
loadBeforeEx instead of loadBefore, but the data are loaded
not decompressed, and ZODB complains about "broken pickle".

-> let's fix this by explicitly wrapping loadBeforeEx if base storage
provides it.
@navytux
Copy link
Contributor Author

navytux commented Nov 18, 2021

A bit of argument to maybe reconsider loadAt -> loadBeforeEx renaming back: #341 (comment).

* master:
  - vb [ci skip]
  - force recognizing ellipsis
  - try to fix changed traceback representation
  - select more specific pypy versions
  - prepare release 5.7.0
  - revert 5b8e2dc
@navytux
Copy link
Contributor Author

navytux commented Mar 17, 2022

( synced to master to resolve conflict on CHANGES.rst )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants