Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible shuffle improvement? #6

Open
siiky opened this issue Mar 26, 2023 · 3 comments
Open

Possible shuffle improvement? #6

siiky opened this issue Mar 26, 2023 · 3 comments

Comments

@siiky
Copy link

siiky commented Mar 26, 2023

Kind of related to #5: I believe it may be possible to improve performance of the shuffle operation in gochan-select*.

Here's my proposed alternative:

(define (shuffle l)
  (if (null? l)
      '()
      (let ((x (car l))
            (l (shuffle (cdr l))))
        (if (zero? (pseudo-random-integer 2))
            (cons x l)
            (append l `(,x))))))

Thinking about it (though not too much, admitedly), I believe using this rather than (map ... (sort ... (map ... l))) would improve that operation from 2 traversals+sorting of the original list (which should be roughly 3 traversals total?1) to roughly 2 traversals (assuming 50% chance of (pseudo-random-integer 2) returning either 0 or 12).

I have no idea how I could benchmark this, however. You're certainly more familiar with the codebase, so if you think this is worth exploring I would appreciate some pointers/ideas. :)

One question I have is whether it would be a problem for shuffle not to be tail-recursive.

And another question is whether the result would be random enough to work well as a load-balancer.

Footnotes

  1. Depends on sorting algorithm! And assuming that map is implemented as a tail-recursive procedure with a reverse at the end, as usual, the total number of traversals is actually 5+.

  2. Worst case scenario, putting the first half of the elements to the left and then the second half all to the right, total number of traversals would be roughly 3.

@kristianlm
Copy link
Owner

HI @siiky and sorry for this embarrasingly long response time. I really appreciate when people use software I've written and I really need to get better at showing that ...

Anyhow, I had the same idea when I was implementing this. The problem with this way of sorting is that it isn't fair: the first element in a 10-element list has a 50% chance of becoming the first element in the resulting list, but it should be 10%. The last element's chance of becoming the first is, and I'm no statistician, but is that 0.5^10?

(list-tabulate 10 (lambda (x) (car (shuffle '(0 1 2 3 4 5 6 7 8 9)))))
#;=> (0 0 1 1 0 4 1 1 3 0)

I don't know if this would actually be a problem in practice. But I'd rather take the performance hit than take chances. Also, I think the list of channels involved are usually bound to around 3-5 items. Or does your experience contradict that?

It would be interesting to benchmark gochan, though. I suspect it'd perform painfully slow.

@siiky
Copy link
Author

siiky commented Jul 28, 2023

Makes sense, I understand fairness is important.

Also, I think the list of channels involved are usually bound to around 3-5 items. Or does your experience contradict that?

Yeah, I agree, I think it should be the most common case to select on only a few channels at a time. Unless someone is using it in some crazy way with macros that generate tons of channels.

Unfortunately I ended dropping the gochan experiment because of tight deadlines and all. I was hitting deadlocks no clue why, and was forced to move on.


BTW (shameless plug) some weeks ago I started working on something you might be interested in, since you seem to be interested in message-passing. It's an egg wrapper of the Erl_Interface library, the C library used to read/write Erlang's External Terms (binary) Format: https://git.sr.ht/~siiky/experiments/tree/main/item/erlang-interface.scm

It's not done yet, though, some reader functions segfault, usual C stuff :)

@kristianlm
Copy link
Owner

Do you reckon the deadlocks were caused by "user error", or some bug in gochan? If it's the latter, I'd like to investigate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants