Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify ZipLongest implementation #905

Closed
wants to merge 1 commit into from
Closed

Simplify ZipLongest implementation #905

wants to merge 1 commit into from

Conversation

viceroypenguin
Copy link
Contributor

The various Zip* methods currently share implementation in .ZipImpl(). However, the code is simpler for each one when they are written separately; that is, in this case, the variances are more than the commonality. As such, we should allow each method to be implemented separately.

This PR updates the implementation of ZipLongest to be done separately from .ZipImpl().

@viceroypenguin
Copy link
Contributor Author

NB: Depends on #901.

@atifaziz
Copy link
Member

Possibly a duplicate of PR #715 by @Orace?

@viceroypenguin
Copy link
Contributor Author

Duplicate in the sense that both separate the implementations of the three zip functions, yes. However, #715 uses T4 to generate separate implementations of each method for argument count 2-4. Also, #715 and #905 have very different implementations (subject to debate which is better).

@Orace
Copy link
Contributor

Orace commented Jan 24, 2023

Some thoughts about this:

  • In Zip refactoring #715, the TryRead method should be moved to EquiZip.g.cs as a private method.
  • The used pattern probably wastes resources. It's implemented for 4 sources, and for other cases, it uses dummy sources and discards values with a lambda (resulting to memory allocation that can be avoided):
    return ZipLongestImpl(first, second, Enumerable.Repeat(default(object?), int.MaxValue), Enumerable.Repeat(default(object?), int.MaxValue), (a, b, _, _) => resultSelector(a, b), 2);

    It will not scale well if we want to add overloads with more sources.
  • Calls to .MoveNext() after an enumerable is complete should be avoided. Moreover this is tested with TestingSequence
    enumerator.MoveNextCalled += (_, moved) =>
    {
    Assert.That(ended, Is.False, "LINQ operators should not continue iterating a sequence that has terminated.");
    ended = !moved;
    MoveNextCallCount++;
    };

@viceroypenguin
Copy link
Contributor Author

viceroypenguin commented Jan 24, 2023

Some thoughts about this:

  • The used pattern probably wastes resources. It's implemented for 4 sources, and for other cases, it uses dummy sources and discards values with a lambda (resulting to memory allocation that can be avoided):
    return ZipLongestImpl(first, second, Enumerable.Repeat(default(object?), int.MaxValue), Enumerable.Repeat(default(object?), int.MaxValue), (a, b, _, _) => resultSelector(a, b), 2);

I did do some testing, and while the memory usage is negligible (400b vs 176b), it does cost an additional 30-50% to do it the #905 way. Probably has mostly to do with the extra MoveNext calls being done on the dummy enumerables. 🤔

It will not scale well if we want to add overloads with more sources.

Not sure that this is a concern. Asking for more than 4 enumerables to be zipped together seems a little excessive, and should probably be solved by the person requesting it.

  • Calls to .MoveNext() after an enumerable is complete should be avoided. Moreover this is tested with TestingSequence
    enumerator.MoveNextCalled += (_, moved) =>
    {
    Assert.That(ended, Is.False, "LINQ operators should not continue iterating a sequence that has terminated.");
    ended = !moved;
    MoveNextCallCount++;
    };

Already discussed and addressed in #901/#936. I won't repeat the discussion further here.

@viceroypenguin
Copy link
Contributor Author

Hey @Orace, I took your idea and managed to improve performance by another 10-20%. Feel free to take a look at the updated TryRead() and code gen. It's Scriban instead of T4, but should be easy to translate, I hope?

Closing this PR in favor of an updated #715.

@viceroypenguin viceroypenguin deleted the ziplongest-implementation branch January 27, 2023 12:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants