This repository has been archived by the owner on Feb 8, 2023. It is now read-only.

[BUG] Mars DataFrame merge with Series #32

Open
ChengjieLi28 opened this issue Sep 26, 2022 · 0 comments

Describe the bug
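A Mars DataFrame cannot be merged with a Mars Series. Calling merge with a Series as the right operand raises an AttributeError, because the Series has no dtypes attribute.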

To Reproduce
To help us reproduce this bug, please provide the information below:

  1. Your Python version: 3.9.12
  2. The version of Mars you use: latest
  3. Versions of crucial packages, such as numpy, scipy and pandas: pandas 1.4.3
  4. Full stack of the error.
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 c = df2.merge(r_group_res, left_on=["c2"], right_on=["c2"])[['c1', 'c3']]

File ~/Projects/mars/mars/dataframe/merge/merge.py:1132, in merge(df, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate, method, auto_merge, auto_merge_threshold, bloom_filter, bloom_filter_options)
   1110             raise ValueError(
   1111                 f"Invalid filter {k}, available: {BLOOM_FILTER_ON_OPTIONS}"
   1112             )
   1113 op = DataFrameMerge(
   1114     how=how,
   1115     on=on,
   (...)
   1130     output_types=[OutputType.dataframe],
   1131 )
-> 1132 return op(df, right)

File ~/Projects/mars/mars/core/mode.py:77, in _EnterModeFuncWrapper.__call__.<locals>._inner(*args, **kwargs)
     74 @functools.wraps(func)
     75 def _inner(*args, **kwargs):
     76     with enter_mode(**mode_name_to_value):
---> 77         return func(*args, **kwargs)

File ~/Projects/mars/mars/dataframe/merge/merge.py:193, in DataFrameMerge.__call__(self, left, right)
    192 def __call__(self, left, right):
--> 193     empty_left, empty_right = build_df(left), build_df(right)
    194     # this `merge` will check whether the combination of those arguments is valid
    195     merged = empty_left.merge(
    196         empty_right,
    197         how=self.how,
   (...)
    207         validate=self.validate,
    208     )

File ~/Projects/mars/mars/dataframe/utils.py:573, in build_df(df_obj, fill_value, size, ensure_string)
    570     fill_values = fill_value
    572 for size, fill_value in zip(sizes, fill_values):
--> 573     dtypes = df_obj.dtypes
    574     record = [[_generate_value(dtype, fill_value) for dtype in dtypes]] * size
    575     df = pd.DataFrame(record)

File ~/Projects/mars/mars/core/entity/core.py:140, in Entity.__getattr__(self, attr)
    139 def __getattr__(self, attr):
--> 140     return getattr(self._data, attr)

AttributeError: 'SeriesData' object has no attribute 'dtypes'
  5. Minimized code to reproduce the error.
import mars
import mars.dataframe as md

df1 = md.DataFrame(
    {
        "c1": [3, 4, 5, 3, 5, 4, 1, 2, 3],
        "c2": [1, 3, 4, 5, 6, 5, 4, 4, 4],
        "c3": list("aabaaddce"),
        "c4": list("abaaaddce"),
    }
)

df2 = md.DataFrame(
    {
        "c1": [3, 3, 4, 5, 6, 5, 4, 4, 4],
        "c2": [1, 3, 4, 5, 6, 5, 4, 4, 4],
        "c3": list("aabaaddce"),
        "c4": list("abaaaddce"),
    }
)
r_group_res = df1.groupby(['c1'])['c2'].sum()
c = df2.merge(r_group_res, left_on=["c2"], right_on=["c2"])[['c1', 'c3']]
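
The traceback suggests the failure comes from build_df reading a dtypes attribute on an object that only behaves like a series. The sketch below is a plain-Python illustration of that mechanism, not Mars code: the class bodies and the column_dtypes helper are hypothetical stand-ins, and only the attribute-forwarding __getattr__ is copied from the traceback.

```python
class DataFrameData:
    """Hypothetical stand-in: a frame-like object with one dtype per column."""

    def __init__(self, dtypes):
        self.dtypes = dtypes


class SeriesData:
    """Hypothetical stand-in: a series-like object with a single dtype."""

    def __init__(self, dtype):
        self.dtype = dtype


class Entity:
    """Wrapper that forwards unknown attributes, as in the traceback."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, attr):
        return getattr(self._data, attr)


def column_dtypes(obj):
    """Hypothetical helper that, like build_df, assumes a dtypes attribute."""
    return list(obj.dtypes)


frame = Entity(DataFrameData(["int64", "object"]))
series = Entity(SeriesData("int64"))

frame_dtypes = column_dtypes(frame)  # works: the frame-like object has dtypes

try:
    column_dtypes(series)  # fails: the series-like object only has dtype
except AttributeError as exc:
    error_message = str(exc)
```

Under these assumptions, the series-like path raises the same kind of AttributeError as the one reported above, which points at a missing Series branch before build_df is called rather than at the merge arguments themselves.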

Expected behavior
The equivalent pandas 1.4.3 code runs successfully: pandas accepts a named Series as the right operand of merge. Mars should accept it as well.
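
For comparison, here is a pandas-only sketch of the expected behavior, using the same data with the unused c3/c4 columns trimmed. It also shows that converting the Series with to_frame() first gives the same result in pandas. Whether the same conversion sidesteps the error in Mars is an assumption that has not been verified here.

```python
import pandas as pd

df1 = pd.DataFrame(
    {
        "c1": [3, 4, 5, 3, 5, 4, 1, 2, 3],
        "c2": [1, 3, 4, 5, 6, 5, 4, 4, 4],
    }
)
df2 = pd.DataFrame(
    {
        "c1": [3, 3, 4, 5, 6, 5, 4, 4, 4],
        "c2": [1, 3, 4, 5, 6, 5, 4, 4, 4],
        "c3": list("aabaaddce"),
    }
)

# Named Series "c2", indexed by the group key "c1"
r_group_res = df1.groupby(["c1"])["c2"].sum()

# pandas accepts the named Series directly as the right operand
merged = df2.merge(r_group_res, left_on=["c2"], right_on=["c2"])

# Explicit conversion to a one-column DataFrame before merging
merged_from_frame = df2.merge(
    r_group_res.to_frame(), left_on=["c2"], right_on=["c2"]
)

c = merged[["c1", "c3"]]
```

In pandas both calls produce the same frame, so the explicit to_frame() call is only a possible interim workaround, not a replacement for Series support in merge.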

