Faster dict #103
base: master
Conversation
This improves the speed of the previous dict implementation by reducing the number of atomic loads in the read path (at most 1 when the dict is read-only - think globals) as well as the number of allocations needed in the write path. Overall, performance is improved by about 30%. Some of the major changes are as follows:

* The internal table layout was changed from `[]*dictEntry` to `[]dictEntry`, removing a memory indirection and hopefully speeding up slot probing in `insertAbsentEntry` and `lookupEntry`.
* Many iteration operations which previously might have needed to grab a relatively expensive lock can now proceed without locking when the dict is in read-only mode.
* `sizeof(Dict)` increased somewhat, as a few variables (`used` and `fill`) were moved from the `dictTable` to `Dict` itself. The addition of `write` and `misses` values to `Dict` makes the overall memory usage of `Dict` generally larger. This is offset by the type change of `dictTable` and the reduction of additional pointers there. An empty dict weighs in at 304 bytes compared to 176 previously. At 4 elements, both the old and new implementations use 304 bytes of memory. From that point on, the new implementation actually uses less memory.
Looks like I messed with some calls during the merge...

@rezaghanbari I'm looking on
I like the change `[]*dictEntry` -> `[]dictEntry`, and I bet this change has the most positive impact on performance due to better cache locality and a lower allocation count.

I don't like the copy-on-write part. With the write path under a lock, the read path and iteration could easily be done lock-free without copy-on-write:

- Never overwrite the `hash` and `key` of an already written `dictEntry`. Delete should be accomplished by writing `deletedValue` into `value` instead of `deletedKey` into `key`.
- Writing should be done in three `atomic.Store`s: `hash`, then `key`, then `value`.
- Reading an entry should be done as follows:

```go
key := atomic.Load(&entry.key)
if key == nil {
	return emptyValue
}
for {
	val := atomic.Load(&entry.value)
	if val == &deletedValue {
		return empty
	} else if val != nil {
		return val
	}
	runtime.Gosched()
}
```

The hash value should also be read after checking `key`.

Remark: `hash == 0` could be used instead of `key == nil` to mark a never-written entry. This order is preferred for `GetEntry`, because it will more often touch only `hash` without needing to check `key`. Then, if a key's hash happens to equal zero, it should be replaced with some constant.

Probably I'm mistaken, and this way `GetEntry` will be considerably slower because of more atomic operations. But this design should at least be tried and compared with.

There is another "issue" with dict: Python 3.7 adopts an order-preserving dict by storing entries in an array and storing indices in the hash part. This is the way `array` works in PHP7, `Hash` is implemented in Ruby 2.5, `Map` in JavaScript (I suppose), and `dict` in PyPy (IIRC). While "order preserving" has not been declared part of the language specification yet, it is quite a pretty property.

So, if a change of dict implementation is desirable, then it is better to implement the "order preserving" variant right now.
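The three-store write and spinning read described above can be made concrete with Go's typed atomics (Go 1.19+). This is a hedged sketch of the proposed protocol, not the grumpy runtime's actual types: a single tombstone pointer marks deletion so `hash` and `key` are never overwritten once published.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// deletedValue is a sentinel tombstone; comparing pointers identifies it.
var deletedValue = new(int)

// entry publishes its fields with atomics so readers need no lock.
type entry struct {
	hash  atomic.Uint32
	key   atomic.Pointer[string]
	value atomic.Pointer[int]
}

// store follows the proposed write order: hash, then key, then value.
func (e *entry) store(h uint32, k string, v int) {
	e.hash.Store(h)
	e.key.Store(&k)
	e.value.Store(&v)
}

// del tombstones the value; hash and key stay intact for later probes.
func (e *entry) del() { e.value.Store(deletedValue) }

// load follows the proposed read order: key first, then spin until the
// writer has published a value (or a tombstone).
func (e *entry) load() (int, bool) {
	if e.key.Load() == nil {
		return 0, false // never written
	}
	for {
		v := e.value.Load()
		if v == deletedValue {
			return 0, false // deleted
		} else if v != nil {
			return *v, true
		}
		runtime.Gosched() // writer stored key but not value yet
	}
}

func main() {
	var e entry
	e.store(42, "answer", 7)
	v, ok := e.load()
	fmt.Println(v, ok)
	e.del()
	_, ok = e.load()
	fmt.Println(ok)
}
```

Note that pointer comparison against `deletedValue` is what distinguishes a deleted slot from a live one, so the tombstone must never escape to callers.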
I agree with @funny-falcon about 3.7 order preservation. If a behavior of 3.x is 2.7-compatible, I will almost always be in favor of it. Python 3 is a goal. It is OK to have many PRs refactoring dict, for benchmark comparison... The actual benchmarks look like this PR is better than the existing implementation. @funny-falcon, do you think we should wait for challenge implementations before merging, or merge and then compare with the challenges?
Most dicts are "readonly" or at least written rarely. Do benchmarks with real code. This is by far the closest you can get safely to py27's performance without requiring additional external synchronization that isn't needed in normal CPython.

I implemented a fully lock-free map, but it was slower. The implementation I followed was the one described by Cliff Click here: https://youtu.be/HJ-719EGIts. Go doesn't directly expose "light" enough atomic loads - at least that was the case at the time I wrote the initial patch.

Regardless, measure with real benchmarks; they tell a good story.
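For reference, the kind of `testing.B` microbenchmark mentioned in this thread can also be driven from a plain program via `testing.Benchmark`. This is a generic illustration only - it uses a built-in map as a stand-in for the dict, and real measurements should come from the grumpy benchmark suite.

```go
package main

import (
	"fmt"
	"testing"
)

// BenchmarkReadMostly models a read-heavy access pattern, like
// repeated lookups of module globals.
func BenchmarkReadMostly(b *testing.B) {
	m := map[string]int{"a": 1, "b": 2, "c": 3}
	b.ResetTimer()
	sum := 0
	for i := 0; i < b.N; i++ {
		sum += m["a"]
	}
	_ = sum // keep the loop from being optimized away
}

func main() {
	// testing.Benchmark runs the function with an auto-tuned b.N.
	r := testing.Benchmark(BenchmarkReadMostly)
	fmt.Println(r.N > 0, r.NsPerOp() >= 0)
}
```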
Doesn't matter. I'll try to find time this week to make an alternative variant, but I don't claim I will complete it.
Yes. But some are not.
Yes, I've said already that I could be mistaken, and "lock-free" could be remarkably slower.
I've spent a day on it today... my first attempt was noticeably slower. Will try again next week.
I've done an ordered dictionary at #111. I didn't fix the comments yet; that is why it is marked as WIP.
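The "order preserving" design discussed in this thread (CPython 3.7-style) can be sketched as a dense entries array holding insertion order, with the hash part storing only indices into it. This is a hedged illustration with made-up names; a built-in map stands in for the index's hash table, and it omits deletion and compaction.

```go
package main

import "fmt"

// odEntry lives in a dense array that records insertion order.
type odEntry struct {
	key   string
	value int
}

type orderedDict struct {
	entries []odEntry      // insertion order
	index   map[string]int // key -> position in entries (the "hash part")
}

func newOrderedDict() *orderedDict {
	return &orderedDict{index: make(map[string]int)}
}

// set overwrites in place, so a re-assigned key keeps its original position.
func (d *orderedDict) set(k string, v int) {
	if i, ok := d.index[k]; ok {
		d.entries[i].value = v
		return
	}
	d.index[k] = len(d.entries)
	d.entries = append(d.entries, odEntry{k, v})
}

func (d *orderedDict) get(k string) (int, bool) {
	i, ok := d.index[k]
	if !ok {
		return 0, false
	}
	return d.entries[i].value, true
}

// keys returns the keys in insertion order.
func (d *orderedDict) keys() []string {
	out := make([]string, len(d.entries))
	for i, e := range d.entries {
		out[i] = e.key
	}
	return out
}

func main() {
	d := newOrderedDict()
	d.set("b", 2)
	d.set("a", 1)
	d.set("b", 3) // overwrite: "b" stays first
	fmt.Println(d.keys()) // prints [b a]
}
```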
#111 is ready for review
Today I was thinking about TimSort and wondering if a Dict can adapt itself to usage behavior. What about bookkeeping reads vs. writes and changing from copy-on-write (good for reads) into something else when the dict is perceived to be write-heavy?
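The bookkeeping part of that idea could be as small as two counters plus a heuristic. This is a speculative sketch (names and the 25% threshold are arbitrary assumptions, not anything from the PR) of how a dict might detect that it has become write-heavy and should switch strategies.

```go
package main

import "fmt"

// adaptiveStats tracks operation counts so a dict could decide
// when copy-on-write stops paying off.
type adaptiveStats struct {
	reads, writes uint64
}

func (s *adaptiveStats) noteRead()  { s.reads++ }
func (s *adaptiveStats) noteWrite() { s.writes++ }

// writeHeavy reports true once writes exceed a quarter of all
// operations, after enough samples to judge (both thresholds are
// arbitrary for this illustration).
func (s *adaptiveStats) writeHeavy() bool {
	total := s.reads + s.writes
	return total >= 100 && s.writes*4 > total
}

func main() {
	var s adaptiveStats
	for i := 0; i < 90; i++ {
		s.noteRead()
	}
	for i := 0; i < 40; i++ {
		s.noteWrite()
	}
	// 40 writes out of 130 ops (~31%) crosses the 25% threshold.
	fmt.Println(s.writeHeavy())
}
```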
I still don't believe Copy-On-Write gives a noticeable performance improvement.
I had used https://github.com/grumpyhome/grumpy/tree/master/grumpy-runtime-src/benchmarks for the primary benchmarks. The testing.B benchmarks added to the Go code (and referenced in the initial PR) were created as an early indicator of how real programs might behave. The proof is in real code - microbenchmarks allow isolating and measuring opportunities. Integrating more of https://github.com/python/performance wouldn't be a bad idea; I was using it for measuring the difference in implementation performance. All that being said, getting other "real" sizable code chunks to measure is always useful.
You mean this new test case?

```go
bigDict := newTestDict(
	"abc", "def",
	"ghi", "jkl",
	"mno", "pqr",
	"stu", "vwx",
	"yzA", "BCD",
	"EFG", "HIJ",
	"KLM", "OPQ",
	"RST", "UVW",
	"XYZ", "123",
	"456", "789",
	"10!", "@#$",
	"%^&", "*()")
cases := []invokeTestCase{
	{args: wrapArgs(NewDict()), want: NewList().ToObject()},
	{args: wrapArgs(newTestDict("foo", None, 42, None)), want: newTestList(42, "foo").ToObject()},
	{args: wrapArgs(bigDict), want: newTestList("abc", "yzA", "KLM", "XYZ", "10!", "456", "stu", "%^&", "mno", "RST", "ghi", "EFG").ToObject()},
}
```

If so, it should be added to the test suite anyway...
Agree. Would be nice.
Rebased PR google#259 over #85
Kudos @nairb774 for the original code and @trotterdylan for original guidance and comments.