[Core] cannot serialize polars.LazyFrame #46343

Open
jmakov opened this issue Jun 29, 2024 · 1 comment
Labels: core (Issues that should be addressed in Ray Core), P1 (Issue that should be fixed within a few weeks), question (Just a question :))

Comments

@jmakov
Contributor

jmakov commented Jun 29, 2024

### What happened + What you expected to happen

Cannot send a polars.LazyFrame to a worker: ray.put(lf) raises during serialization. Traceback:

---> 38 self._ref_lf = ray.put(lf)
     39 self._refs_task = []

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/auto_init_hook.py:21, in wrap_auto_init.<locals>.auto_init_wrapper(*args, **kwargs)
     18 @wraps(fn)
     19 def auto_init_wrapper(*args, **kwargs):
     20     auto_init_ray()
---> 21     return fn(*args, **kwargs)

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/client_mode_hook.py:103, in client_mode_hook.<locals>.wrapper(*args, **kwargs)
    101     if func.__name__ != "init" or is_client_mode_enabled_by_default:
    102         return getattr(ray, func.__name__)(*args, **kwargs)
--> 103 return func(*args, **kwargs)

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/worker.py:2720, in put(value, _owner)
   2718 with profiling.profile("ray.put"):
   2719     try:
-> 2720         object_ref = worker.put_object(value, owner_address=serialize_owner_address)
   2721     except ObjectStoreFullError:
   2722         logger.info(
   2723             "Put failed since the value was either too large or the "
   2724             "store was full of pinned objects."
   2725         )

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/worker.py:764, in Worker.put_object(self, value, object_ref, owner_address, _is_experimental_channel)
    759     assert (
    760         object_ref is None
    761     ), "Local Mode does not support inserting with an ObjectRef"
    763 try:
--> 764     serialized_value = self.get_serialization_context().serialize(value)
    765 except TypeError as e:
    766     sio = io.StringIO()

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/serialization.py:519, in SerializationContext.serialize(self, value)
    517     return RawSerializedObject(value)
    518 else:
--> 519     return self._serialize_to_msgpack(value)

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/serialization.py:497, in SerializationContext._serialize_to_msgpack(self, value)
    495 if python_objects:
    496     metadata = ray_constants.OBJECT_METADATA_TYPE_PYTHON
--> 497     pickle5_serialized_object = self._serialize_to_pickle5(
    498         metadata, python_objects
    499     )
    500 else:
    501     pickle5_serialized_object = None

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/serialization.py:444, in SerializationContext._serialize_to_pickle5(self, metadata, value)
    442 except Exception as e:
    443     self.get_and_clear_contained_object_refs()
--> 444     raise e
    445 finally:
    446     self.set_out_of_band_serialization()

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/_private/serialization.py:439, in SerializationContext._serialize_to_pickle5(self, metadata, value)
    437 try:
    438     self.set_in_band_serialization()
--> 439     inband = pickle.dumps(
    440         value, protocol=5, buffer_callback=writer.buffer_callback
    441     )
    442 except Exception as e:
    443     self.get_and_clear_contained_object_refs()

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/cloudpickle/cloudpickle.py:1479, in dumps(obj, protocol, buffer_callback)
   1477 with io.BytesIO() as file:
   1478     cp = Pickler(file, protocol=protocol, buffer_callback=buffer_callback)
-> 1479     cp.dump(obj)
   1480     return file.getvalue()

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/ray/cloudpickle/cloudpickle.py:1245, in Pickler.dump(self, obj)
   1243 def dump(self, obj):
   1244     try:
-> 1245         return super().dump(obj)
   1246     except RuntimeError as e:
   1247         if len(e.args) > 0 and "recursion" in e.args[0]:

File ~/mambaforge-pypy3/envs/test/lib/python3.11/site-packages/polars/lazyframe/frame.py:324, in LazyFrame.__getstate__(self)
    323 def __getstate__(self) -> bytes:
--> 324     return self._ldf.__getstate__()

RuntimeError: BindingsError: "Value(\"the enum variant Expr::RenameAlias cannot be serialized\")"
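
The last frame of the traceback is `LazyFrame.__getstate__`, which pickle calls to obtain the object's state, so the error appears to be raised by polars while Ray's serializer is only passing it along. Below is a minimal sketch of that mechanism using only the standard library. `PlanStandIn` is a hypothetical stand-in invented for illustration, not a polars class, and the error message is copied from the traceback above.

```python
import pickle


class PlanStandIn:
    """Hypothetical stand-in for an object whose state cannot be serialized."""

    def __getstate__(self):
        # Mimics the failure above: the state hook itself raises.
        raise RuntimeError(
            "the enum variant Expr::RenameAlias cannot be serialized"
        )


try:
    pickle.dumps(PlanStandIn(), protocol=5)
    caught = None
except RuntimeError as exc:
    # The exception from __getstate__ propagates unchanged through pickle.dumps.
    caught = exc

print(type(caught).__name__, caught)
```

If this reading is right, any pickle-based transport would hit the same error for such an object, not only `ray.put`.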


### Versions / Dependencies

ray 2.31.0
python 3.11.9

### Reproduction script

import polars
import ray


lf = polars.LazyFrame(...)
ray.put(lf)

### Issue Severity

None

jmakov added the bug (Something that is supposed to be working; but isn't) and triage (Needs triage, eg: priority, bug/not-bug, and owning component) labels Jun 29, 2024
anyscalesam added the core (Issues that should be addressed in Ray Core) label Jul 8, 2024
@jjyao
Collaborator

jjyao commented Jul 8, 2024

Is polars.LazyFrame picklable? If not, Ray cannot serialize it, and you can reach out to the polars maintainers to make it picklable.
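
One way to answer this question outside of Ray is to attempt a pickle round trip directly. The following is a rough sketch, not a Ray or polars API: `pickle_roundtrip_error` is a hypothetical helper name, and it uses the standard-library `pickle`, whereas the traceback shows Ray going through its bundled cloudpickle, which accepts more object types (lambdas, for example). A failure here on a lambda therefore does not imply a failure in Ray, but an error raised from an object's own `__getstate__` would be expected to appear in both.

```python
import pickle


def pickle_roundtrip_error(obj):
    """Return None if obj survives a pickle round trip, else the error text."""
    try:
        pickle.loads(pickle.dumps(obj, protocol=5))
    except Exception as exc:  # serialization hooks may raise any exception type
        return f"{type(exc).__name__}: {exc}"
    return None


# Plain data round-trips; a lambda is rejected by the standard-library pickle.
print(pickle_roundtrip_error({"a": [1, 2, 3]}))
print(pickle_roundtrip_error(lambda x: x))
```

With polars installed, passing the failing LazyFrame from the reproduction script to the helper should show whether the error reproduces without Ray involved.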

jjyao added the question (Just a question :)) and P1 (Issue that should be fixed within a few weeks) labels and removed the bug and triage labels Jul 8, 2024