Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ObjectMapper.toJsonNode() Collector Stream Support #4691

Closed

Conversation

rikkarth
Copy link
Contributor

@rikkarth rikkarth commented Sep 10, 2024

Context from previous discussion

FasterXML/jackson#239

I am not 100% sure I quite understand the specific idea/suggestion, but I think that:

  1. Adding more support within JsonNode to help with Java 8 streams (new accessors) would be fine
  2. I am not a big fan of adding helper classes (like JsonNodeUtil or JacksonCollection), ideally extensions would start from existing main API types
  3. This would almost certainly go in jackson-databind.

So: improving ergonomics, +1. Just need to know how, what and where to add pieces.

Description

Based on point #2 (above), I've decided to perform this implementation in the ObjectMapper because it was where it made more sense to me. It naturally falls nice into it, since this class is usually used in conjunction with this kind of operations.

Usage

public class ObjectMapperTest extends DatabindTestUtil {

    @Test
    public void testCollector(){
        final ObjectMapper objectMapper = new ObjectMapper();

        final JsonNode jsonNodeResult = IntStream.range(0, 10)
            .mapToObj(i -> {
                // we save 10 nodes where testNumber is incremented by 1 each time
                ObjectNode objectNode = objectMapper.createObjectNode();
                objectNode.put("testString", "example");
                objectNode.put("testNumber", i);
                objectNode.put("testBoolean", true);

                return objectNode; // returned through the Stream
            })
            .collect(objectMapper.toJsonNode()); // collected into a JsonNode

        System.out.println(jsonNodeResult.toPrettyString()); // I left this here so you can test it locally will remove if idea is approved

        assertEquals(10, jsonNodeResult.size()); // testing that all the nodes are saved accordingly
        jsonNodeResult.forEach(jsonNode -> assertFalse(jsonNode.isEmpty())); // testing that no nodes are empty
    }
}

Implementation

    /**
     * Creates a {@link Collector} that collects {@link JsonNode} elements into an {@link ArrayNode}.
     * <p>
     * This method uses this instance of {@link ObjectMapper} to create an empty {@link ArrayNode} and then adds each
     * {@link JsonNode} to it.
     * </p>
     *
     * @return a {@link Collector} that collects {@link JsonNode} elements into an {@link ArrayNode}
     *
     * @since 3.0
     */
    public Collector<JsonNode, ArrayNode, ArrayNode> toJsonNode() {
        return Collector.of(
            this::createArrayNode, // supplier
            ArrayNode::add, // accumulator
            ArrayNode::addAll // combiner
        );
    }

cc @cowtowncoder

unit-test in ObjectMapperTest to validate base solution
@rikkarth rikkarth force-pushed the feat/toJsonNodeStreamCollector branch from 66004a3 to 4fcaf6f Compare September 10, 2024 19:46
@cowtowncoder
Copy link
Member

I must say I don't quite understand this... need to re-read a few times, maybe I get it.

@rikkarth
Copy link
Contributor Author

I must say I don't quite understand this... need to re-read a few times, maybe I get it.

You mean my implementation? If so, I can break it down and explain it in a clearer way if you wish.

@iProdigy
Copy link
Contributor

Is this use case actually common in the wild? I rarely hold onto JsonNode objects, instead simply deserializing to custom POJO classes. Later I can serialize List<MyPojo> and never need to think about ArrayNode

@rikkarth
Copy link
Contributor Author

In my use case I don't deserialise JsonNodes because we write the output directly to other processes. Deserialising implies a performance and memory overhead that we can't afford since we are dealing with high data volumes in limited amount of memory.

Regardless, having an extra feature which only provides additional support without any side-effects should pose no conflicts to the proposal as a whole, I hope.

@iProdigy
Copy link
Contributor

Deserialising implies a performance and memory overhead that we can't afford since we are dealing with high data volumes in limited amount of memory.

For high-performance within limited memory, you probably want to use jackson's Streaming API rather than Stream<T>

@rikkarth
Copy link
Contributor Author

Deserialising implies a performance and memory overhead that we can't afford since we are dealing with high data volumes in limited amount of memory.

For high-performance within limited memory, you probably want to use jackson's Streaming API rather than Stream<T>

Thank you for the suggestion, I will for sure look into it, but for now this lib provides the level of abstraction and performance target we are looking for. This proposal only aims at providing an additional support to Streams within the usage example provided above.

Is there any technical reason why this method can't be added as an extra layer of support?

It would help people like me not have to implement this method every time we import jackson-databind.

Thank you for your considerations so far, I do appreciate them truly.

@cowtowncoder
Copy link
Member

cowtowncoder commented Sep 11, 2024

@rikkarth No, not the implementation but actual usage, need.

And my question/comment was not meant as "no" but trying to understand the benefit of addition.

@yawkat
Copy link
Member

yawkat commented Sep 12, 2024

I think it's useful. Maybe should be called toArrayNode, though.

@rikkarth
Copy link
Contributor Author

I think it's useful. Maybe should be called toArrayNode, though.

Good point. I named it toJsonNode because the final return value should be JsonNode, but in practice it is an ArrayNode that is being returned as the result of the collection.

I can rename it if the suggestion is approved.

* @return a {@link Collector} that collects {@link JsonNode} elements into an {@link ArrayNode}
*/
public static Collector<JsonNode, ArrayNode, ArrayNode> toJsonNode() {
final JsonNodeFactory jsonNodeFactory = new JsonNodeFactory();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although for most use the default JsonNodeFactory is fine, I think it'd better to have a variant that takes in JsonNodeFactory (or JsonNodeCreator maybe, which it implements), and then this overload that uses

JsonNodeFactory.instance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you check now please?

@cowtowncoder
Copy link
Member

@rikkarth Ok, looks good in general, would be happy to merge. But realized 2 things:

First, before merging, we'd need a CLA (if you haven't sent one earlier -- one is good for all contributions). It's from

https://github.com/FasterXML/jackson/blob/master/contributor-agreement.pdf

and is easiest to print, fill & sign, scan/photo, email to cla at fasterxml dot com.
Once that is received I can merge PR.

But more important question is this: PR is against master, which is for Jackson 3.0. Is this intentional? 3.0 is probably not going to be released (3.0.0 final that is) before end of 2024 so it'll be a while. I would be happy to instead merge this in 2.18, but that would require new PR.
I am ok with 3.0 of course, it's up to you; just let me know how to proceed.

@rikkarth
Copy link
Contributor Author

rikkarth commented Sep 20, 2024

I will perform a new PR for 2.18 and will fill the document (done) as requested at the earliest.

Thank you.

@cowtowncoder
Copy link
Member

Closing in favor of #4709 (which I'll merge forward).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants