Proposal: IO::DupReader #14792

Open
straight-shoota opened this issue Jul 7, 2024 · 2 comments


When working with IO streams, it's often helpful to tap into the data stream and observe what's being sent.
This can be useful for debugging or auditing purposes, but also for calculating hashes or signatures of the streamed data.

An example of this mechanism is IO::Hexdump: It's an IO that wraps another one and dumps all data passing through it into a second IO (STDERR by default), in hex format.

Actually, IO::Hexdump performs two distinct functions: Tapping into an IO, and hex formatting. Either one of them could be useful without the other, but they're unified in a single type and cannot be used independently.

If there were a variant that performs only the hexdump feature (IO::HexdumpWithoutTap in the following example), the entire write functionality of IO::Hexdump could be implemented using IO::MultiWriter:

io_write = IO::Hexdump.new(sink, STDERR, write: true)

# equivalent:

io_write = IO::MultiWriter.new(sink, IO::HexdumpWithoutTap.new(STDERR))

Of course this is a bit less succinct, but not by much. And I think it's very clear.

The great thing about such composition is that it's useful for other purposes as well. You can easily exchange the hexdump for something else. For example, you could capture the data into an IO::Memory for later replay.
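As a minimal sketch of the capture-and-replay idea, the following uses only IO::MultiWriter and IO::Memory from the standard library. The `sink` and `capture` variables are illustrative stand-ins; in real code `sink` would be whatever IO the data is actually meant for.

```crystal
# Hypothetical setup: `sink` stands in for the real destination.
sink = IO::Memory.new
capture = IO::Memory.new

io_write = IO::MultiWriter.new(sink, capture)
io_write << "hello "
io_write << "world"

# Both IOs received the same bytes.
sink.to_s    # => "hello world"
capture.to_s # => "hello world"

# Replay the captured data later.
capture.rewind
replayed = capture.gets_to_end # => "hello world"
```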

In the other direction, the read functionality cannot be composed in the same way, because there is currently no equivalent of IO::MultiWriter for reading.

I'm proposing to add such an IO type. It would have a main source IO to read data from, and it sends all read data to a second IO, in addition to passing it to the caller.

The implementation is pretty trivial:

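class IO::DupReader < IO
  def initialize(@source : IO, @sink : IO)
  end

  def read(slice : Bytes) : Int32
    @source.read(slice).tap do |bytes_read|
      # Forward only the bytes that were actually read.
      @sink.write(slice[0, bytes_read])
    end
  end

  # IO requires #write; this type is read-only.
  def write(slice : Bytes) : Nil
    raise IO::Error.new("Can't write to IO::DupReader")
  end

  delegate :peek, :close, :closed?, :flush, :tty?, :pos, :pos=, :seek, to: @source
end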

This would allow an equivalent of the current integrated IO::Hexdump, which looks very similar to the write variant:

io_read = IO::Hexdump.new(source, STDERR, read: true)

# equivalent:

io_read = IO::DupReader.new(source, IO::HexdumpWithoutTap.new(STDERR))

Again, it's easy to exchange the hexdump formatter for something else.
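For illustration, here is a self-contained sketch that swaps the hexdump for an IO::Memory capture. IO::DupReader is the proposed type, not something in the standard library, so the example defines a reduced version of it; the raising `write` method and the omitted delegates are assumptions made to keep the sketch short.

```crystal
# Sketch of the proposed type; IO::DupReader is not part of the standard library.
class IO::DupReader < IO
  def initialize(@source : IO, @sink : IO)
  end

  def read(slice : Bytes) : Int32
    @source.read(slice).tap do |bytes_read|
      # Forward only the bytes that were actually read.
      @sink.write(slice[0, bytes_read])
    end
  end

  # Assumption: the reader is read-only, so writing raises.
  def write(slice : Bytes) : Nil
    raise IO::Error.new("Can't write to IO::DupReader")
  end
end

source = IO::Memory.new("hello world")
capture = IO::Memory.new
io_read = IO::DupReader.new(source, capture)

# Read only the first five bytes; only those reach the capture IO.
buffer = Bytes.new(5)
io_read.read_fully(buffer)
first = capture.to_s # => "hello"

# Reading the rest forwards the remaining bytes as well.
rest = io_read.gets_to_end # => " world"
capture.to_s # => "hello world"
```

Note that the capture IO only ever sees what the caller has consumed from the source, which is the main difference from a writer-side fan-out like IO::MultiWriter.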

Addendum:

The implementation of IO::HexdumpWithoutTap would also be very simple:

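class IO::HexdumpWithoutTap < IO
  def initialize(@io : IO = STDERR)
  end

  # IO requires #read; this type is write-only.
  def read(slice : Bytes) : Int32
    raise IO::Error.new("Can't read from IO::HexdumpWithoutTap")
  end

  def write(slice : Bytes) : Nil
    return if slice.empty?

    slice.hexdump(@io)
  end

  delegate :peek, :close, :closed?, :flush, :tty?, :pos, :pos=, :seek, to: @io
end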
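To show how this could plug into IO::MultiWriter, here is a small self-contained sketch. IO::HexdumpWithoutTap is the hypothetical type from this proposal (reduced here, with a raising `read` as an assumption), and the `dump` IO::Memory stands in for STDERR so the result can be inspected.

```crystal
# Sketch of the hypothetical formatter-only type; not part of the standard library.
class IO::HexdumpWithoutTap < IO
  def initialize(@io : IO = STDERR)
  end

  # Assumption: the formatter is write-only, so reading raises.
  def read(slice : Bytes) : Int32
    raise IO::Error.new("Can't read from IO::HexdumpWithoutTap")
  end

  def write(slice : Bytes) : Nil
    return if slice.empty?
    slice.hexdump(@io)
  end
end

sink = IO::Memory.new
dump = IO::Memory.new # stands in for STDERR

io_write = IO::MultiWriter.new(sink, IO::HexdumpWithoutTap.new(dump))
io_write << "hello"

# `sink` holds the raw bytes, `dump` holds their hexdump representation.
puts sink.to_s
puts dump.to_s
```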

yxhuvud commented Jul 7, 2024

There is a related function in C (at least on Linux; no idea if it exists on other OSes): man 2 tee. It could at the very least be a possible optimization in certain cases.

I don't know whether IO::Tee would be a better name for it. DupReader is explicit in what it does, but tbh I'd have to read the documentation to figure out what its purpose is. 🤷

It is fairly simple in any case though, so perhaps IO#tee could be an alternative (that is, just a method on IO rather than a class of its own), or even IO.tee (similar to IO.pipe).

I really like the idea of this kind of IO data flow control though.

@straight-shoota

Regarding the name: for tee I'd think of man 1 tee, which is more similar to IO::MultiWriter. It writes all of the input to all outputs, not just the amount that's read by the main output.
