
Bit-packed GF2 #583

Draft · wants to merge 20 commits into base: release/0.4.x

Conversation


@amirebrahimi amirebrahimi commented Dec 12, 2024

Initially, I started with an approach that defined bit-packing at ArrayMeta, so it could be used in Array and FieldArray. Then, I would handle implicit bit packing and unpacking as necessary depending on the operation. This soon turned into a game of whack-a-mole where I needed to override a lot of NumPy functions on top of ndarray methods.

This version is much simpler, with a smaller scope and limited functionality: basically arithmetic operations and matrix-vector/matrix-matrix multiplication. I figure we can start with this and then expand the scope as needed.

Rather than introduce fields higher up in the class/meta hierarchy I've opted for overriding specific functionality in a custom GF2BP class. There are simple conversions for going between GF2 and GF2BP via .astype, which implies that you will receive a new array.

The NumPy v2 and Numba v0.60 bump was necessary so that I could make use of np.bitwise_count. This allows a roughly 4x speed-up in matrix-matrix multiplication.

I started to work on tests and tried to introduce GF2BP under the FIELDS fixture construction, but there won't be an easy way to xfail the tests I know would fail. Ideally, I'd like to piggyback on the tests in test_arithmetic.py and test_linalg.py, but only in comparison to GF2 after unpacking. So, let me know if you have any good ideas for this without having to duplicate tests.

@amirebrahimi amirebrahimi changed the base branch from main to release/0.4.x December 12, 2024 02:15
@mhostetter
Owner

Thanks for working on / thinking through this. Here are some initial thoughts.

I think a good API would be to have GF2 and GF2BP conversions happen through np.packbits() and np.unpackbits(). Doing this would avoid the need for the .astype() method. For example:

    x = galois.GF2([1, 0, 1, 1])
    print(repr(x))
    # GF([1, 0, 1, 1], order=2)
    x_bp = np.packbits(x)
    print(repr(x_bp))
    # GF([176], order=2, bitpacked=True)

Whereas, creating a GF2BP assumes the input is already bit-packed. For example:

    x_bp = galois.GF2BP([176])
    print(repr(x_bp))
    # GF([176], order=2, bitpacked=True)
    x = np.unpackbits(x_bp)
    print(repr(x))
    # GF([1, 0, 1, 1, 0, 0, 0, 0], order=2)

It's an open question whether to use only np.uint8, as np.packbits() does, or support larger dtypes like np.uint64. For the moment, let's assume the former.

If we use np.packbits(), I don't think it's necessary to keep track of the unpacked shape, do you? Also, in the code snippet below, I don't believe the output unpacked shape equals the first input's unpacked shape. This is because of NumPy's broadcasting rules.

    def __call__(self, ufunc, method, inputs, kwargs, meta):
        output = super().__call__(ufunc, method, inputs, kwargs, meta)
        output._unpacked_shape = inputs[0]._unpacked_shape
        return output

Looking at the ufuncs for GF2, most of them use bitwise arithmetic (XOR, AND, etc).

    class UFuncMixin_2_1(UFuncMixin):
        """
        A mixin class that provides explicit calculation arithmetic for GF(2).
        """

        def __init_subclass__(cls) -> None:
            super().__init_subclass__()
            cls._add = add_ufunc(cls, override=np.bitwise_xor)
            cls._negative = negative_ufunc(cls, override=np.positive)
            cls._subtract = subtract_ufunc(cls, override=np.bitwise_xor)
            cls._multiply = multiply_ufunc(cls, override=np.bitwise_and)
            cls._reciprocal = reciprocal(cls)
            cls._divide = divide(cls)
            cls._power = power(cls)
            cls._log = log(cls)
            cls._sqrt = sqrt(cls)

We can probably make all of the ufuncs for GF2 (reciprocal, divide, etc.) also work for bit-packed GF2 arrays. If that's the case, maybe a more elegant solution is not to have a separate GF2BP class, but to have a bitpacked property on GF2 itself.

I'm uncomfortable bumping the minimum NumPy version to 2.0. It's too minimally adopted, and too much infrastructure still relies on v1. A compromise could be having v1 and v2 implementations of matmul for GF2BP arrays: if the installed NumPy version is 2.0 or greater, we can use np.bitwise_count().
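For illustration, a version-dependent popcount-based GF(2) matrix product might look like the sketch below. This is not the PR's implementation: gf2_matmul_packed and its lookup-table fallback are hypothetical names, and it assumes the right operand is supplied as its packed transpose, so each output bit is the parity of the popcount of an AND of two packed vectors.

```python
import numpy as np

def gf2_matmul_packed(A, B_cols):
    """GF(2) matrix product over bit-packed operands (illustrative sketch).

    A:      (m, k8) uint8, rows of the left matrix packed along the last axis.
    B_cols: (n, k8) uint8, COLUMNS of the right matrix packed the same way
            (i.e. the packed transpose).
    Returns an (m, n) uint8 array of 0/1 entries (unpacked, for clarity).
    """
    # AND the packed operands pairwise, then reduce each product with a
    # population count; the GF(2) dot product is that count mod 2.
    prod = A[:, None, :] & B_cols[None, :, :]  # shape (m, n, k8)
    if hasattr(np, "bitwise_count"):           # NumPy >= 2.0
        counts = np.bitwise_count(prod)
    else:                                      # NumPy 1.x fallback: byte LUT
        table = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)
        counts = table[prod]
    return (counts.sum(axis=-1) & 1).astype(np.uint8)
```

The packed zero-padding on the last axis is harmless here, since padded bits AND to zero and contribute nothing to the popcount.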

@amirebrahimi
Author

amirebrahimi commented Dec 12, 2024

Thanks, @mhostetter. I'll work on incorporating this feedback. One thing I wanted to clarify, since there seems to be a misunderstanding: GF2BP does the bit-packing for you, so it doesn't require your matrices to already be bit-packed. With that information, does that change any of your feedback?

@mhostetter
Owner

One thing I wanted to clarify, since there seems to be a misunderstanding: GF2BP does the bit-packing for you, so it doesn't require your matrices to already be bit-packed. With that information, does that change any of your feedback?

That is my understanding of your current implementation, but it's also something I'm suggesting we change. To me, having an internal data conversion seems dubious. What is the motivation for doing the conversions internally? In my proposal, if a user wanted to start with an unpacked array, they would call x = galois.GF2(unpacked_array) and then x_bp = np.packbits(x), which returns a GF2BP. If they want to start with a bit-packed array, they would call x_bp = galois.GF2BP(packed_array).

My issue with the internal (mandatory) data conversion is that it is not idempotent. For instance, calling x_bp = galois.GF2BP(x_bp) repeatedly could change the value of x_bp. This isn't true of any of the other Galois field arrays. Here's an example:

    x = np.packbits([0, 0, 0, 1])
    print(x)
    # [16]
    x = np.packbits(x)
    print(x)
    # [128]
    x = np.packbits(x)
    print(x)
    # [128]

@mhostetter
Owner

Also, I'll try to think about an easy way to plug into the test infrastructure.

@amirebrahimi
Author

To your question about whether it's necessary to track the unpacked shape: I think so, because otherwise when you unpack you won't be able to supply the count argument and will wind up with extra padded zeros.
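The padding behavior is easy to demonstrate: np.packbits pads the last axis out to a whole number of bytes, and without count, np.unpackbits returns those padded zeros too.

```python
import numpy as np

x = np.array([1, 0, 1], dtype=np.uint8)
packed = np.packbits(x)                  # 3 bits padded out to one full byte
print(np.unpackbits(packed))             # [1 0 1 0 0 0 0 0] -- five padded zeros
print(np.unpackbits(packed, count=3))    # [1 0 1] -- original shape recovered
```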

@mhostetter
Owner

To your question about whether it's necessary to track the unpacked shape: I think so, because otherwise when you unpack you won't be able to supply the count argument and will wind up with extra padded zeros.

Yes, I agree with that. I was thinking we could populate a property like ._count (or whatever name makes sense) when the array is created inside the override to np.packbits().

So the user has x = galois.GF2.Random((100, 6)) and then calls x_bp = np.packbits(x). Inside the intercepted call to packbits(), we can call x_bp = np.packbits(x), then x_bp = x_bp.view(galois.GF2BP), and then set x_bp._count = -2.

Later when the user wants to unpack it, calling y = np.unpackbits(x_bp), we can use the internal ._count property if count=None. This way the unpacked array has the correct original shape. So rather than tracking the shape of the unpacked array, we just need to know how many zeros were appended on the last axis when the array was packed.

I can help you implement the interception of packbits() and unpackbits(), if you'd like. It might be a bit tricky.
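The bookkeeping being proposed could be sketched roughly as below. The real library would intercept np.packbits() and np.unpackbits() through NumPy's function-override machinery rather than use free-standing wrappers, and GF2BP here is a bare stand-in, not the PR's class; this only illustrates recording a count at pack time and consuming it at unpack time (tracking the axis element count, one of the two options discussed).

```python
import numpy as np

class GF2BP(np.ndarray):
    """Minimal stand-in for a bit-packed GF(2) array (illustrative only)."""
    _count = None  # number of valid bits on the last axis

def packbits(x, axis=-1):
    """Pack a 0/1 array and remember the original bit count on `axis`."""
    arr = np.asarray(x, dtype=np.uint8)
    packed = np.packbits(arr, axis=axis).view(GF2BP)
    packed._count = arr.shape[axis]  # alternatively, store -pad_len as above
    return packed

def unpackbits(x_bp, axis=-1, count=None):
    """Unpack, defaulting to the count recorded at pack time."""
    if count is None:
        count = x_bp._count
    return np.unpackbits(x_bp.view(np.ndarray), axis=axis, count=count)
```

With this, unpackbits(packbits(x)) round-trips to the original shape without the caller passing count explicitly.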

@amirebrahimi
Author

Yes, I agree with that. I was thinking we could populate a property like ._count (or whatever name makes sense) when the array is created inside the override to np.packbits().

I've implemented packbits and unpackbits.

So the user has x = galois.GF2.Random((100, 6)) and then calls x_bp = np.packbits(x). Inside the intercepted call to packbits(), we can call x_bp = np.packbits(x), then x_bp = x_bp.view(galois.GF2BP), and then set x_bp._count = -2.

Later when the user wants to unpack it, calling y = np.unpackbits(x_bp), we can use the internal ._count property if count=None. This way the unpacked array has the correct original shape. So rather than tracking the shape of the unpacked array, we just need to know how many zeros were appended on the last axis when the array was packed.

I'm now tracking the axis element count because it seemed more straightforward. Is there a specific reason why you wanted to track the appended zero count?

@amirebrahimi
Author

amirebrahimi commented Dec 14, 2024

If we use np.packbits(), I don't think it's necessary to keep track of the unpacked shape, do you? Also, in the code snippet below, I don't believe the output unpacked shape equals the first input's unpacked shape. This is because of NumPy's broadcasting rules.

    def __call__(self, ufunc, method, inputs, kwargs, meta):
        output = super().__call__(ufunc, method, inputs, kwargs, meta)
        output._unpacked_shape = inputs[0]._unpacked_shape
        return output

Coming back to this. I was trying to find a case of broadcasting where the output shape would not be the same as the input shape, and I couldn't. NumPy will complain about the broadcasting if the shapes aren't compatible, in both the unpacked and packed cases.

    a = GF2.Random((10, 10))
    b = GF2.Random((1, 10))
    x = np.packbits(a)
    y = np.packbits(b)
    print((a + b).shape)
    print((x + y).shape)

If the shapes aren't broadcastable, then NumPy throws a ValueError:

    a = GF2.Random((10, 10))
    b = GF2.Random((2, 10))
    x = np.packbits(a)
    y = np.packbits(b)
    print((b + a).shape)
    print((y + x).shape)

Was your concern with how I'm choosing the first input, which would cause an issue if the operands were switched (as in the last example)? If so, then I think a max over the inputs' ._axis_count would work.

@mhostetter
Owner

Was your concern with how I'm choosing the first input, which would cause an issue if the operands were switched (as in the last example)? If so, then I think a max over the inputs' ._axis_count would work.

Yes, that was a concern.

Coming back to this. I was trying to find a case of broadcasting where the output shape would not be the same as the input shape, and I couldn't. NumPy will complain about the broadcasting if the shapes aren't compatible, in both the unpacked and packed cases.

Using ufunc methods, like outer or accumulate, can change the broadcasted shape.

    import numpy as np

    a = np.random.randint(0, 2, 10)
    b = np.random.randint(0, 2, 10)
    x = np.packbits(a)
    y = np.packbits(b)
    print((a * b).shape)
    print((x * y).shape)
    print(np.multiply.outer(a, b).shape)
    print(np.multiply.outer(x, y).shape)
    # (10,)
    # (2,)
    # (10, 10)
    # (2, 2)

Also, here's a crazy example of broadcasting over multiple axes.

    import numpy as np

    a = np.random.randint(0, 2, (1, 2, 3))
    b = np.random.randint(0, 2, (2, 2, 1))
    x = np.packbits(a)
    y = np.packbits(b)
    print((a * b).shape)
    print((x * y).shape)
    print(np.multiply.outer(a, b).shape)
    print(np.multiply.outer(x, y).shape)
    # (2, 2, 3)
    # (1,)
    # (1, 2, 3, 2, 2, 1)
    # (1, 1)

I think the question is, can we track the axis elements (or appended zeros) through all of the broadcasting and manipulation?

@amirebrahimi
Author

I think the question is, can we track the axis elements (or appended zeros) through all of the broadcasting and manipulation?

Yes, that is a good question. I'll explore that soon.

Just wanted to give you an update on where I'm at. I started looking at making a bit-packed version of inv, which then had me look at concatenate, which has ultimately led me to consider indexing for both getting and setting items. At first I started with an explicit version (i.e., separate methods that would need to be called), leaving the default __getitem__ and __setitem__ intact, but now I've decided to see how things work if I move back towards implicit unpacking.

Currently, I've got index update rules to handle indexing without defaulting to unpacking the whole matrix. I think I can simplify these rules if I convert them into a canonical form first. There's a test file I had ChatGPT generate to provide examples of all the ways one can index.

My thought in going down this path is that I don't see much utility in the class if at least some of the more common NumPy functions aren't supported in bit-packed form. For us, matrix inverses are one of those.
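The core idea of indexing without unpacking the whole matrix can be sketched for the simplest case, a single row of a row-wise packed matrix. These helper names (get_row, set_row) are hypothetical, not the PR's __getitem__/__setitem__ rules, which handle far more index forms.

```python
import numpy as np

def get_row(packed, i, n_cols):
    """Unpack only row i of a matrix packed with np.packbits(..., axis=-1).

    packed: (M, ceil(n_cols/8)) uint8. Avoids unpacking all M rows
    just to read one.
    """
    return np.unpackbits(packed[i], count=n_cols)

def set_row(packed, i, row_bits):
    """Repack a single unpacked row into the packed matrix in place."""
    packed[i] = np.packbits(np.asarray(row_bits, dtype=np.uint8))
```

Per-row unpacking like this is what makes the >100x single-row speedup below possible relative to unpacking the full matrix.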

@amirebrahimi
Author

amirebrahimi commented Dec 24, 2024

Finished a bit-packed implementation of np.linalg.inv:

    # %%
    a = GF2.Random((1024, 1024), seed=4)
    x = np.packbits(a)
    %timeit np.linalg.inv(a)  # ~3 s
    %timeit np.linalg.inv(x)  # ~734 ms
    assert np.array_equal(a, np.unpackbits(x))

So, there's a 4.5x speedup with 1024×1024 matrices.

Also, I did some tests in general to see if unpacking portions of matrices was worthwhile over unpacking the full matrix (to validate having a custom __getitem__ and __setitem__):

    # %%
    M, N = 1800, 1000
    a = np.random.randint(0, 2, size=(M, N), dtype=np.uint8)
    A = np.packbits(a, axis=-1)  # shape: (M, ceil(N/8))

    # ~434 us
    %timeit np.unpackbits(A, axis=-1, count=N)

    # %%
    # ~3 us
    %timeit np.unpackbits(A[np.random.randint(0, M)], axis=-1, count=N)

So, a >100x speedup -- I think it's worthwhile to keep this approach around. If one wants to get at the raw data, they can just perform a .view(np.ndarray).

I'll plan to look into broadcasting and clean up / normalize the indexing code when I'm back on Thursday.

    # 9. Using np.newaxis (reshaped array assignment)
    arr = GF([1, 0, 1, 1])
    arr = np.packbits(arr)
    reshaped = arr[:, np.newaxis]  # should this use arr's data (as it would without packbits) or a new array?
Author

@amirebrahimi amirebrahimi Dec 24, 2024


Still not sure what to do with this case.

@amirebrahimi
Author

I've cleaned up the indexing routines and have started to look at broadcasting. I found a bug in my inv_bitpacked implementation that I need to fix first though.
