LazyEncoder added
dehesa committed May 6, 2020
1 parent 2db5f56 commit 5cfe55b
Showing 14 changed files with 561 additions and 60 deletions.
67 changes: 53 additions & 14 deletions README.md
@@ -213,7 +213,7 @@ A `CSVWriter` encodes CSV information into a specified target (i.e. a `String`,
for row in input {
try writer.write(row: row)
}
try writer.endFile()
try writer.endEncoding()
```

Alternatively, you may write directly to a buffer in memory and access its `Data` representation.
@@ -223,7 +223,7 @@ A `CSVWriter` encodes CSV information into a specified target (i.e. a `String`,
for row in input.dropFirst() {
try writer.write(row: row)
}
try writer.endFile()
try writer.endEncoding()
let result = try writer.data()
```

@@ -241,7 +241,7 @@ A `CSVWriter` encodes CSV information into a specified target (i.e. a `String`,
try writer.write(fields: input[2])
try writer.endRow()

try writer.endFile()
try writer.endEncoding()
```

`CSVWriter` has a wealth of low-level imperative APIs that let you write one field, write several fields at a time, end a row, write an empty row, etc.
@@ -330,7 +330,7 @@ let decoder = CSVDecoder()
let result = try decoder.decode(CustomType.self, from: data)
```
`CSVDecoder` can decode CSVs represented as a `Data` blob, a `String`, or an actual file in the file system.
`CSVDecoder` can decode CSVs represented as a `Data` blob, a `String`, an actual file in the file system, or an `InputStream` (e.g. `stdin`).
```swift
let decoder = CSVDecoder { $0.bufferingStrategy = .sequential }
@@ -377,33 +377,33 @@ decoder.decimalStrategy = .custom { (decoder) in
</p></details>
<details><summary><code>CSVDecoder.LazySequence</code>.</summary><p>
<details><summary><code>CSVDecoder.LazyDecoder</code>.</summary><p>
A CSV input can be decoded _on demand_ with the decoder's `lazy(from:)` function.
```swift
var sequence = CSVDecoder().lazy(from: fileURL)
let lazyDecoder = try CSVDecoder().lazy(from: fileURL)
while let row = lazyDecoder.next() {
let student = try row.decode(Student.self)
// Do something here
}
```
`LazySequence` conforms to Swift's [`Sequence` protocol](https://developer.apple.com/documentation/swift/sequence), letting you use functionality such as `map()`, `allSatisfy()`, etc. Please note, `LazySequence` cannot be used for repeated access. It _consumes_ the input CSV.
`LazyDecoder` conforms to Swift's [`Sequence` protocol](https://developer.apple.com/documentation/swift/sequence), letting you use functionality such as `map()`, `allSatisfy()`, etc. Please note, `LazyDecoder` cannot be used for repeated access. It _consumes_ the input CSV.
```swift
var sequence = decoder.lazy(from: fileData)
let students = try sequence.map { try $0.decode(Student.self) }
let lazyDecoder = try CSVDecoder(configuration: config).lazy(from: fileData)
let students = try lazyDecoder.map { try $0.decode(Student.self) }
```
A nice benefit of the _lazy_ operation is that it lets you switch how a row is decoded at any point. For example:
```swift
var sequence = decoder.lazy(from: fileString)
let students = zip( 0..<100, sequence) { (_, row) in row.decode(Student.self) }
let teachers = zip(100..<110, sequence) { (_, row) in row.decode(Teacher.self) }
let lazyDecoder = try decoder.lazy(from: fileString)
let students = ( 0..<100).map { _ in try lazyDecoder.decode(Student.self) }
let teachers = (100..<110).map { _ in try lazyDecoder.decode(Teacher.self) }
```
Since `LazySequence` exclusively provides sequential access; setting the buffering strategy to `.sequential` will reduce the decoder's memory usage.
Since `LazyDecoder` exclusively provides sequential access, setting the buffering strategy to `.sequential` will reduce the decoder's memory usage.
```swift
let decoder = CSVDecoder {
@@ -420,7 +420,7 @@ let decoder = CSVDecoder {
```swift
let encoder = CSVEncoder()
let data: Data = try encoder.encode(value)
let data = try encoder.encode(value, into: Data.self)
```
The `Encoder`'s `encode()` function creates a CSV file as a `Data` blob, a `String`, or an actual file in the file system.
@@ -472,6 +472,45 @@ encoder.dataStrategy = .custom { (data, encoder) in
> The `.headers` configuration is required if you are using keyed encoding container.
</p></details>
<details><summary><code>CSVEncoder.LazyEncoder</code>.</summary><p>
A series of codable types can be encoded _on demand_ with the encoder's `lazy(into:)` function.
```swift
let lazyEncoder = try CSVEncoder().lazy(into: Data.self)
for student in students {
try lazyEncoder.encode(student)
}
let data = try lazyEncoder.endEncoding()
```
Call `endEncoding()` once there are no more values to encode. The function returns the encoded CSV.
```swift
let lazyEncoder = try CSVEncoder().lazy(into: String.self)
try students.forEach {
try lazyEncoder.encode($0)
}
let string = try lazyEncoder.endEncoding()
```
A nice benefit of the _lazy_ operation is that it lets you switch how a row is encoded at any point. For example:
```swift
let lazyEncoder = try CSVEncoder(configuration: config).lazy(into: fileURL)
try students.forEach { try lazyEncoder.encode($0) }
try teachers.forEach { try lazyEncoder.encode($0) }
try lazyEncoder.endEncoding()
```
Since `LazyEncoder` exclusively provides sequential encoding, setting the buffering strategy to `.sequential` will reduce the encoder's memory usage.
```swift
let lazyEncoder = try CSVEncoder {
$0.bufferingStrategy = .sequential
}.lazy(into: String.self)
```
</p></details>
</ul>
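The row-by-row encoding flow described in the README section above can be illustrated with a self-contained toy. This is only a sketch: `ToyLazyEncoder`, `AlreadyFinished`, `Student`, and `Teacher` are hypothetical names, the comma-joining logic is a stand-in for real CSV encoding, and throwing after `endEncoding()` is an assumption rather than documented behavior of `CSVEncoder.LazyEncoder`.

```swift
/// Toy row-by-row encoder (a stand-in, not the library's `CSVEncoder.LazyEncoder`).
final class ToyLazyEncoder {
    /// Hypothetical error raised when the encoder is used after being finished.
    struct AlreadyFinished: Error {}

    private var output = ""
    private var isFinished = false

    /// Appends one row. Callers may build rows from different types at any point.
    func encode(_ fields: [String]) throws {
        guard !isFinished else { throw AlreadyFinished() }
        output += fields.joined(separator: ",") + "\n"
    }

    /// Marks the end of the encoding and returns everything written so far.
    func endEncoding() throws -> String {
        guard !isFinished else { throw AlreadyFinished() }
        isFinished = true
        return output
    }
}

struct Student { let name: String; let age: Int }
struct Teacher { let name: String; let subject: String }

let students = [Student(name: "Ann", age: 20)]
let teachers = [Teacher(name: "Kim", subject: "Math")]

let lazyEncoder = ToyLazyEncoder()
for s in students { try lazyEncoder.encode([s.name, String(s.age)]) }
for t in teachers { try lazyEncoder.encode([t.name, t.subject]) }
let csv = try lazyEncoder.endEncoding()
```

The point of the sketch is the call pattern: rows are pushed one at a time, the row type can change midway, and the result is only available once the encoding is explicitly ended.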
2 changes: 1 addition & 1 deletion sources/Delimiter.swift
@@ -7,7 +7,7 @@ public enum Delimiter {
}

extension Delimiter {
/// The delimiter between fields/vlaues.
/// The delimiter between fields/values.
public struct Field: ExpressibleByNilLiteral, ExpressibleByStringLiteral, RawRepresentable {
public let rawValue: String.UnicodeScalarView

17 changes: 17 additions & 0 deletions sources/Deprecated.swift
@@ -75,4 +75,21 @@ extension CSVWriter {
public static func serialize<S:Sequence,C:Collection>(row: S, into fileURL: URL, append: Bool, setter: (_ configuration: inout Configuration) -> Void) throws where S.Element==C, C.Element==String {
try self.encode(rows: row, into: fileURL, append: append, setter: setter)
}

@available(*, deprecated, renamed: "endEncoding()")
public func endFile() throws {
try self.endEncoding()
}
}

extension CSVDecoder {
@available(*, deprecated, renamed: "CSVDecoder.LazyDecoder")
typealias LazySequence = CSVDecoder.LazyDecoder
}

extension CSVEncoder {
@available(*, deprecated, renamed: "encode(_:into:)")
open func encode<T:Encodable>(_ value: T) throws -> Data {
try self.encode(value, into: Data.self)
}
}
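The deprecations above all follow the same forwarding pattern: the old name stays available, is marked with `@available(*, deprecated, renamed:)`, and simply calls the new API. A minimal self-contained sketch of that pattern follows; `RowWriter` and `isFinished` are hypothetical names used for illustration and are not part of the library.

```swift
/// Hypothetical stand-in for a writer type; not the library's `CSVWriter`.
final class RowWriter {
    private(set) var isFinished = false

    /// New name: finishes the encoding process.
    func endEncoding() throws {
        isFinished = true
    }

    /// Old name kept as a thin forwarding shim, so existing call sites keep
    /// compiling while the compiler suggests the new name.
    @available(*, deprecated, renamed: "endEncoding()")
    func endFile() throws {
        try self.endEncoding()
    }
}

let writer = RowWriter()
try writer.endFile()   // still compiles, but produces a deprecation warning
```

Keeping the shim in a dedicated file, as this commit does with `Deprecated.swift`, makes it easy to delete all of the old names in a later release.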
18 changes: 9 additions & 9 deletions sources/declarative/decodable/Decoder.swift
@@ -69,31 +69,31 @@ extension CSVDecoder {
}

extension CSVDecoder {
/// Returns a sequence for decoding each row from a CSV file (given as a `Data` blob).
/// Returns a sequence for decoding row-by-row from a CSV file (given as a `Data` blob).
/// - parameter data: The data blob representing a CSV file.
/// - throws: `CSVError<CSVReader>` exclusively.
open func lazy(from data: Data) throws -> LazySequence {
open func lazy(from data: Data) throws -> LazyDecoder {
let reader = try CSVReader(input: data, configuration: self._configuration.readerConfiguration)
let source = ShadowDecoder.Source(reader: reader, configuration: self._configuration, userInfo: self.userInfo)
return LazySequence(source: source)
return LazyDecoder(source: source)
}

/// Returns a sequence for decoding each row from a CSV file (given as a `String`).
/// Returns a sequence for decoding row-by-row from a CSV file (given as a `String`).
/// - parameter string: A Swift string representing a CSV file.
/// - throws: `CSVError<CSVReader>` exclusively.
open func lazy(from string: String) throws -> LazySequence {
open func lazy(from string: String) throws -> LazyDecoder {
let reader = try CSVReader(input: string, configuration: self._configuration.readerConfiguration)
let source = ShadowDecoder.Source(reader: reader, configuration: self._configuration, userInfo: self.userInfo)
return LazySequence(source: source)
return LazyDecoder(source: source)
}

/// Returns a sequence for decoding each row from a CSV file (being pointed by `url`).
/// Returns a sequence for decoding row-by-row from a CSV file (pointed to by `url`).
/// - parameter url: The URL pointing to the file to decode.
/// - throws: `CSVError<CSVReader>` exclusively.
open func lazy(from url: URL) throws -> LazySequence {
open func lazy(from url: URL) throws -> LazyDecoder {
let reader = try CSVReader(input: url, configuration: self._configuration.readerConfiguration)
let source = ShadowDecoder.Source(reader: reader, configuration: self._configuration, userInfo: self.userInfo)
return LazySequence(source: source)
return LazyDecoder(source: source)
}
}

52 changes: 45 additions & 7 deletions sources/declarative/decodable/DecoderLazy.swift
@@ -1,20 +1,47 @@
extension CSVDecoder {
/// Swift sequence type giving access to all the "undecoded" CSV rows.
/// Lazy decoder allowing declarative row-by-row decoding.
///
/// The CSV rows are read _on-demand_ and only decoded when explicitly told so (unlike the default _decode_ functions).
public struct LazySequence: IteratorProtocol, Sequence {
public final class LazyDecoder: IteratorProtocol, Sequence {
/// The source of the CSV data.
private let _source: ShadowDecoder.Source
/// The row to be read (not decoded) next.
private var _currentIndex: Int = 0
private var _currentIndex: Int
/// A dictionary you use to customize the decoding process by providing contextual information.
public var userInfo: [CodingUserInfoKey:Any] { self._source.userInfo }

/// Designated initializer passing all the required components.
/// - parameter source: The data source for the decoder.
internal init(source: ShadowDecoder.Source) {
self._source = source
self._currentIndex = 0
}

/// Returns a value of the type you specify, decoded from a CSV row.
///
/// This function will throw an error if the file has reached the end. If you are unsure where the CSV file ends, use the `next()` function instead.
/// - parameter type: The type of the value to decode from the supplied file.
/// - returns: A CSV row decoded as a type `T`.
public func decode<T:Decodable>(_ type: T.Type) throws -> T {
guard let rowDecoder = self.next() else { throw CSVDecoder.Error._unexpectedEnd() }
return try rowDecoder.decode(type)
}

/// Returns a value of the type you specify, decoded from a CSV row (if there are still rows to be decoded in the file).
/// - parameter type: The type of the value to decode from the supplied file.
/// - returns: A CSV row decoded as a type `T` or `nil` if the CSV file doesn't contain any more rows.
public func decodeIfPresent<T:Decodable>(_ type: T.Type) throws -> T? {
guard let rowDecoder = self.next() else { return nil }
return try rowDecoder.decode(type)
}

/// Ignores the subsequent row.
public func ignoreRow() {
let _ = self.next()
}

/// Advances to the next row and returns a `LazySequence.Row`, or `nil` if no next row exists.
public mutating func next() -> RowDecoder? {
/// Advances to the next row and returns a `LazyDecoder.RowDecoder`, or `nil` if no next row exists.
public func next() -> RowDecoder? {
guard !self._source.isRowAtEnd(index: self._currentIndex) else { return nil }

defer { self._currentIndex += 1 }
@@ -24,7 +51,7 @@ extension CSVDecoder {
}
}

extension CSVDecoder.LazySequence {
extension CSVDecoder.LazyDecoder {
/// Pointer to a row within a CSV file that is able to decode it to a custom type.
public struct RowDecoder {
/// The representation of the decoding process point-in-time.
@@ -36,10 +63,21 @@ extension CSVDecoder.LazySequence {
self._decoder = decoder
}

/// Returns a value of the type you specify, decoded from CSV row.
/// Returns a value of the type you specify, decoded from a CSV row.
/// - parameter type: The type of the value to decode from the supplied file.
@inline(__always) public func decode<T:Decodable>(_ type: T.Type) throws -> T {
return try T(from: self._decoder)
}
}
}

// MARK: -

fileprivate extension CSVDecoder.Error {
/// Error raised when the end of the file has been reached unexpectedly.
static func _unexpectedEnd() -> CSVError<CSVDecoder> {
.init(.invalidPath,
reason: "There are no more rows to decode. The file is at the end.",
help: "Use next() or decodeIfPresent(_:) instead of decode(_:) if you are unsure where the file ends.")
}
}
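Switching `LazyDecoder` from a `struct` to a `final class` is what allows `next()`, `decode(_:)`, `decodeIfPresent(_:)`, and `ignoreRow()` to be non-mutating, so the decoder can be held in a `let` and captured by closures. The following self-contained sketch mirrors that shape with a toy cursor. `RowCursor` and `UnexpectedEnd` are hypothetical names, and rows are plain string arrays transformed by a closure rather than `Decodable` values.

```swift
/// Hypothetical error for this sketch (the commit throws a `CSVError<CSVDecoder>` instead).
struct UnexpectedEnd: Error {}

/// Toy row cursor with reference semantics, standing in for `LazyDecoder`.
final class RowCursor: IteratorProtocol, Sequence {
    private let rows: [[String]]
    private var currentIndex = 0

    init(rows: [[String]]) { self.rows = rows }

    /// Non-mutating because the type is a class.
    func next() -> [String]? {
        guard currentIndex < rows.count else { return nil }
        defer { currentIndex += 1 }
        return rows[currentIndex]
    }

    /// Consumes one row and throws if no rows are left.
    func decode<T>(_ transform: ([String]) throws -> T) throws -> T {
        guard let row = next() else { throw UnexpectedEnd() }
        return try transform(row)
    }

    /// Consumes one row and returns `nil` if no rows are left.
    func decodeIfPresent<T>(_ transform: ([String]) throws -> T) throws -> T? {
        guard let row = next() else { return nil }
        return try transform(row)
    }

    /// Skips the next row without decoding it.
    func ignoreRow() { _ = next() }
}

let cursor = RowCursor(rows: [["name"], ["Ann"], ["Bob"]])
cursor.ignoreRow()                                  // skip the header row
let first = try cursor.decode { $0[0] }
let second = try cursor.decodeIfPresent { $0[0] }
let third = try cursor.decodeIfPresent { $0[0] }    // no rows left
```

As in the commit, every call consumes a row, so the cursor cannot be used for repeated access to the same input.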
38 changes: 35 additions & 3 deletions sources/declarative/encodable/Encoder.swift
@@ -33,8 +33,9 @@ import Foundation
extension CSVEncoder {
/// Returns a CSV-encoded representation of the value you supply.
/// - parameter value: The value to encode as CSV.
/// - parameter type: The Swift type for a data blob.
/// - returns: `Data` blob with the CSV representation of `value`.
open func encode<T:Encodable>(_ value: T) throws -> Data {
open func encode<T:Encodable>(_ value: T, into type: Data.Type) throws -> Data {
let writer = try CSVWriter(configuration: self._configuration.writerConfiguration)
let sink = try ShadowEncoder.Sink(writer: writer, configuration: self._configuration, userInfo: self.userInfo)
try value.encode(to: ShadowEncoder(sink: sink, codingPath: []))
Expand All @@ -44,9 +45,10 @@ extension CSVEncoder {

/// Returns a CSV-encoded representation of the value you supply.
/// - parameter value: The value to encode as CSV.
/// - parameter type: The Swift type for a string.
/// - returns: `String` with the CSV representation of `value`.
open func encode<T:Encodable>(_ value: T, into: String.Type) throws -> String {
let data = try self.encode(value)
open func encode<T:Encodable>(_ value: T, into type: String.Type) throws -> String {
let data = try self.encode(value, into: Data.self)
let encoding = self._configuration.writerConfiguration.encoding ?? .utf8
return String(data: data, encoding: encoding)!
}
@@ -63,6 +65,36 @@ extension CSVEncoder {
}
}

extension CSVEncoder {
/// Returns an instance that encodes the supplied values row by row.
/// - parameter type: The Swift type for a data blob.
/// - returns: Instance used for _on demand_ encoding.
open func lazy(into type: Data.Type) throws -> LazyEncoder<Data> {
let writer = try CSVWriter(configuration: self._configuration.writerConfiguration)
let sink = try ShadowEncoder.Sink(writer: writer, configuration: self._configuration, userInfo: self.userInfo)
return LazyEncoder<Data>(sink: sink)
}

/// Returns an instance that encodes the supplied values row by row.
/// - parameter type: The Swift type for a string.
/// - returns: Instance used for _on demand_ encoding.
open func lazy(into type: String.Type) throws -> LazyEncoder<String> {
let writer = try CSVWriter(configuration: self._configuration.writerConfiguration)
let sink = try ShadowEncoder.Sink(writer: writer, configuration: self._configuration, userInfo: self.userInfo)
return LazyEncoder<String>(sink: sink)
}

/// Returns an instance that encodes the supplied values row by row.
/// - parameter fileURL: The file receiving the encoded values.
/// - parameter append: If a file already exists at the given URL, this Boolean indicates whether the information is appended to the file (`true`) or the file is overwritten (`false`).
/// - returns: Instance used for _on demand_ encoding.
open func lazy(into fileURL: URL, append: Bool = false) throws -> LazyEncoder<URL> {
let writer = try CSVWriter(fileURL: fileURL, append: append, configuration: self._configuration.writerConfiguration)
let sink = try ShadowEncoder.Sink(writer: writer, configuration: self._configuration, userInfo: self.userInfo)
return LazyEncoder<URL>(sink: sink)
}
}
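Both `encode(_:into:)` and `lazy(into:)` pick their output kind through a metatype argument (`Data.self`, `String.self`) instead of relying on return-type inference. A minimal self-contained sketch of that overloading technique follows. `ToyEncoder` is a hypothetical type that just joins fields with commas, and the UTF-8 conversion is an assumption made for the example (the library reads the encoding from its configuration).

```swift
import Foundation

/// Toy encoder (not the library's `CSVEncoder`) that joins fields with commas.
struct ToyEncoder {
    /// Overload selected by passing `Data.self`.
    func encode(_ fields: [String], into type: Data.Type) -> Data {
        Data((fields.joined(separator: ",") + "\n").utf8)
    }

    /// Overload selected by passing `String.self`; it reuses the `Data` overload.
    func encode(_ fields: [String], into type: String.Type) -> String {
        String(decoding: self.encode(fields, into: Data.self), as: UTF8.self)
    }
}

let encoder = ToyEncoder()
let data = encoder.encode(["a", "b"], into: Data.self)
let string = encoder.encode(["a", "b"], into: String.self)
```

Making the target explicit at the call site avoids ambiguous calls such as `let x = try encoder.encode(value)`, which is presumably why the commit deprecates the overload without `into:`.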

#if canImport(Combine)
import Combine
