Describe the bug
Task graphs containing coffea.lumi_tools.LumiData fail to pickle/serialize when trying to compute inside a dask.distributed.Client context.
To Reproduce
Setup is as follows.
import awkward as ak
import dask
import dask_awkward as dak
from dask.distributed import Client
from coffea.lumi_tools import LumiData, LumiList

LUMI_CSV_2023 = "utils/lumis/lumi2023.csv"  # produced from brilcalc, see end of "To Reproduce"

runs_eager = ak.Array([368229, 368229, 368229, 368229])
runs = dak.from_awkward(runs_eager, 2)
lumis_eager = ak.Array([74, 74, 74, 74])
lumis = dak.from_awkward(lumis_eager, 2)

def count_lumi(runs, lumis):
    total_lumi = 0
    my_lumilist = LumiList(runs, lumis)
    my_lumidata = LumiData(LUMI_CSV_2023)
    total_lumi += my_lumidata.get_lumi(my_lumilist)
    return total_lumi
The graph computes to a float (coutput_noclient) without a Client, but the same dask.compute call fails inside a Client context; the error is shown in the "Output" section. LUMI_CSV_2023 is produced with brilcalc; I can provide details on how it was made if that is of interest.
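The actual compute snippets did not survive the copy into this report; a plausible sketch of the working and failing calls, reconstructed from the traceback (the name coutput_noclient comes from the report text, everything else is an assumption):

```python
# Build the lazy result: calling count_lumi on dask-awkward arrays
# produces a dask graph, not yet a float.
output = count_lumi(runs, lumis)

# Works: default (local) scheduler yields a float.
coutput_noclient = dask.compute(output)[0]

# Fails: the same compute inside a distributed Client context,
# because the graph must now be serialized to the scheduler/workers.
with Client() as client:
    coutput = dask.compute(output)[0]
```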
Expected behavior
Computing with and without a Client should result in a float.
Output
2025-01-24 12:25:28,650 - distributed.protocol.pickle - ERROR - Failed to serialize <ToPickle: HighLevelGraph with 8 layers.
<dask.highlevelgraph.HighLevelGraph object at 0x7f519ff778d0>
0. Dict-e9470c53-4fde-4b2e-9fba-355b2af184dd
1. lumilist-unique-chunked-45752f79d9f1cd5c491c6273eea610de
2. lumilist-unique-finalize-180fa53edd47b037d837d5cc42a9cad3
3. <dask-awkward.lib.core.ArgsKwargsPackedFunction ob-aa20eed3f086352e6b8d3580ec93efca
4. sum-finalize-37e6b194124cdba8701c0bcccda3d779
5. partitions-a42ffbc4519a9ece50642e0abf538c03
6. getitem-c792600458c5b6f9356df7390e977633
7. add-2371b017cb36d22d5eacb3d8120c99f8
>.
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 60, in dumps
result = pickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_pickle.PicklingError: Can't pickle <function add at 0x7f51c0fbfec0>: it's not the same object as _operator.add
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 65, in dumps
pickler.dump(x)
_pickle.PicklingError: Can't pickle <function add at 0x7f51c0fbfec0>: it's not the same object as _operator.add
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py", line 77, in dumps
result = cloudpickle.dumps(x, **dump_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1537, in dumps
cp.dump(obj)
File "/usr/local/lib/python3.11/site-packages/cloudpickle/cloudpickle.py", line 1303, in dump
return super().dump(obj)
^^^^^^^^^^^^^^^^^
TypeError: cannot pickle '_nrt_python._MemInfo' object
PicklingError Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py:60, in dumps(x, buffer_callback, protocol)
59 try:
---> 60 result = pickle.dumps(x, **dump_kwargs)
61 except Exception:
PicklingError: Can't pickle <function add at 0x7f51c0fbfec0>: it's not the same object as _operator.add
During handling of the above exception, another exception occurred:
PicklingError Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py:65, in dumps(x, buffer_callback, protocol)
64 buffers.clear()
---> 65 pickler.dump(x)
66 result = f.getvalue()
PicklingError: Can't pickle <function add at 0x7f51c0fbfec0>: it's not the same object as _operator.add
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
File /usr/local/lib/python3.11/site-packages/distributed/protocol/serialize.py:366, in serialize(x, serializers, on_error, context, iterate_collection)
365 try:
--> 366 header, frames = dumps(x, context=context) if wants_context else dumps(x)
367 header["serializer"] = name
File /usr/local/lib/python3.11/site-packages/distributed/protocol/serialize.py:78, in pickle_dumps(x, context)
76 writeable.append(not f.readonly)
---> 78 frames[0] = pickle.dumps(
79 x,
80 buffer_callback=buffer_callback,
81 protocol=context.get("pickle-protocol", None) if context else None,
82 )
83 header = {
84 "serializer": "pickle",
85 "writeable": tuple(writeable),
86 }
File /usr/local/lib/python3.11/site-packages/distributed/protocol/pickle.py:77, in dumps(x, buffer_callback, protocol)
76 buffers.clear()
---> 77 result = cloudpickle.dumps(x, **dump_kwargs)
78 except Exception:
File /usr/local/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1537, in dumps(obj, protocol, buffer_callback)
1536 cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1537 cp.dump(obj)
1538 return file.getvalue()
File /usr/local/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1303, in Pickler.dump(self, obj)
1302 try:
-> 1303 return super().dump(obj)
1304 except RuntimeError as e:
TypeError: cannot pickle '_nrt_python._MemInfo' object
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
Cell In[6], line 7
5 print(type(output))
6 print(output)
----> 7 coutput = dask.compute(output)[0]
8 print(type(coutput))
9 print(coutput)
File /usr/local/lib/python3.11/site-packages/dask/base.py:660, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
657 postcomputes.append(x.__dask_postcompute__())
659 with shorten_traceback():
--> 660 results = schedule(dsk, keys, **kwargs)
662 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
File /usr/local/lib/python3.11/site-packages/distributed/protocol/serialize.py:392, in serialize(x, serializers, on_error, context, iterate_collection)
390 except Exception:
391 raise TypeError(msg) from exc
--> 392 raise TypeError(msg, str_x) from exc
393 else: # pragma: nocover
394 raise ValueError(f"{on_error=}; expected 'message' or 'raise'")
TypeError: ('Could not serialize object of type HighLevelGraph', '<ToPickle: HighLevelGraph with 8 layers.\n<dask.highlevelgraph.HighLevelGraph object at 0x7f519ff778d0>\n 0. Dict-e9470c53-4fde-4b2e-9fba-355b2af184dd\n 1. lumilist-unique-chunked-45752f79d9f1cd5c491c6273eea610de\n 2. lumilist-unique-finalize-180fa53edd47b037d837d5cc42a9cad3\n 3. <dask-awkward.lib.core.ArgsKwargsPackedFunction ob-aa20eed3f086352e6b8d3580ec93efca\n 4. sum-finalize-37e6b194124cdba8701c0bcccda3d779\n 5. partitions-a42ffbc4519a9ece50642e0abf538c03\n 6. getitem-c792600458c5b6f9356df7390e977633\n 7. add-2371b017cb36d22d5eacb3d8120c99f8\n>')
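The final TypeError is the telling one: part of the graph holds an object backed by Numba-managed native memory (_nrt_python._MemInfo), and objects carrying such native handles cannot be pickled, so the whole HighLevelGraph fails to serialize to the scheduler. A minimal stdlib-only illustration of that failure mode, using a thread lock as a stand-in for the native handle (the class names here are hypothetical, not from coffea):

```python
import pickle
import threading

class NativeBacked:
    """Stand-in for an object holding an unpicklable native handle,
    analogous to numba's _nrt_python._MemInfo inside LumiData."""
    def __init__(self):
        # Thread locks, like native memory handles, cannot be pickled.
        self._handle = threading.Lock()

try:
    pickle.dumps(NativeBacked())
except TypeError as exc:
    print(f"cannot serialize: {exc}")
```

With the local scheduler the graph is never serialized, which is why the same compute succeeds without a Client.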
Desktop (please complete the following information):
OS: linux
Version: AlmaLinux 9
Additional context
Running within a container with an image inheriting from coffeateam/coffea-dask-almalinux9:2025.1.1-py3.11 and dask_jobqueue version 0.9.0.
Attachment: small_lumi.csv