Fetch Line JavaScript packages: read remote or local file line by line as async iterator

Read a text file (remote over HTTP(S) or local) line by line as an async iterator, in Node.js, browsers and Deno.

This GitHub monorepo hosts 6 npm packages and 1 Deno module. They all serve a similar, simple purpose: read a text file line by line and return an asynchronous iterable iterator of strings. They all aim to be efficient and fast, and are written in TypeScript (except isomorphic-fetchline, which is JavaScript with a .d.ts type definition). However, their target environments, exact purpose, features and behavior differ.

TLDR: read the "Purpose & environment" table, pick the package you need, have a look at the Usage section, make sure you know at least a little about JavaScript's async / await, and go ahead and use it.

Comparison

Purpose & environment

| Package / Module Name | Rec? | Fetch remote file over HTTP(S): Node.js | Fetch remote file over HTTP(S): Deno | Fetch remote file over HTTP(S): browsers | Read local file: Node.js | Read local file: Deno |
| --- | --- | --- | --- | --- | --- | --- |
| fetchline | 👍 | | ✅ | ✅ | | |
| nodefetchline | 👍 | ✅ | | | | |
| isomorphic-fetchline | | ✅ | | ✅ | | |
| naivefetchline | | | ✅ | ✅ | | |
| getfileline | | | | | ✅ | |
| readlineiter | 👍 | | | | ✅ | |
| readlineiter for Deno | 👍 | | | | | ✅ |

Browsers: Google Chrome, Firefox, Safari, Microsoft Edge, Opera, Samsung Internet.

If you are not sure, use the recommended package (marked in the Rec? column) that matches the environment and purpose you need.

Tested:

  • Node.js: ≥ 12
  • Deno: ≥ v1.2.0
  • Modern browsers (Google Chrome, Firefox, Safari, Microsoft Edge, Opera, Samsung Internet): all latest

Usage

fetchline, nodefetchline, isomorphic-fetchline, and naivefetchline

Examples

npm install fetchline
import fetchline from 'fetchline'
const lineIterator =
  fetchline('https://raw.githubusercontent.com/tomchen/fetchline/main/testfile/crlf_finalnewline')
// This is the same as:
// fetchline(
//   'https://raw.githubusercontent.com/tomchen/fetchline/main/testfile/crlf_finalnewline',
//   {
//     includeLastEmptyLine: true,
//     encoding: 'utf-8',
//     delimiter: /\r?\n/g,
//   }
// )
;(async () => {
  for await (const line of lineIterator) {
    // do something with `line`
  }
})()

Change 'fetchline' to nodefetchline, isomorphic-fetchline, or naivefetchline if you use these packages instead.

For Deno: import fetchline from 'https://raw.githubusercontent.com/tomchen/fetchline/main/packages/fetchline/src/index.ts'

If you use the nodefetchline or isomorphic-fetchline package from CommonJS in Node, replace the import line with a require call such as const nodefetchline = require('nodefetchline').

For browsers:

<script src="https://unpkg.com/fetchline/dist/umd"></script>
<script>
fetchline(...) // same as above
</script>

Details

These four packages have exactly the same interface (parameters and return value):

| Parameter Name | Required? | Type | Default Value | Description |
| --- | --- | --- | --- | --- |
| filepath | Required | string | N/A | URL or path of the text file |
| options | Optional | object | {} | Options, including the following three |
| options.includeLastEmptyLine | Optional | boolean | true | Whether to include the last empty line |
| options.encoding | Optional | string | 'utf-8' | File encoding |
| options.delimiter | Optional | string or RegExp | /\r?\n/g | Delimiter (separator) between lines or other items |

NOTE: do not set options.delimiter to something like /\r\n|\n|\r/g. It causes trouble when one of the chunks of a CRLF (\r\n)-EOL file ends with CR (\r), since the lone CR can then be matched as a delimiter before the LF in the next chunk arrives.

Return value: { AsyncIterableIterator<string> } An asynchronous iterable iterator that yields each line of the text file as a string
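
To see why the note on the delimiter option matters, here is a minimal sketch of chunk-wise splitting. It is not the packages' actual implementation: splitChunks is a hypothetical helper that assumes the text arrives as an array of string chunks and that everything after the last delimiter match is held back until more data arrives.

```javascript
// Hypothetical helper, not part of fetchline: splits text that arrives in
// chunks, holding back the unfinished tail of each chunk.
function splitChunks(chunks, delimiter) {
  const lines = []
  let buffer = ''
  for (const chunk of chunks) {
    buffer += chunk
    const parts = buffer.split(delimiter)
    buffer = parts.pop() // the last part may be an incomplete line
    lines.push(...parts)
  }
  lines.push(buffer) // whatever remains is the final line
  return lines
}

// A CRLF file whose first chunk happens to end right after the CR
const chunks = ['first\r', '\nsecond']

const safe = splitChunks(chunks, /\r?\n/g)
const risky = splitChunks(chunks, /\r\n|\n|\r/g)

console.log(safe)
// → [ 'first', 'second' ]
console.log(risky)
// → [ 'first', '', 'second' ]
```

With the default delimiter, the trailing CR simply stays in the buffer until the LF arrives. With the alternation that also accepts a lone CR, the CR and the LF are each treated as a line ending, which produces a spurious empty line.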

readlineiter, readlineiter for Deno, and getfileline

They have an interface similar to that of the aforementioned fetchline, nodefetchline, isomorphic-fetchline, and naivefetchline, but they take no second parameter: there is no options object and none of the options it contains.

npm install readlineiter
import readlineiter from 'readlineiter' // For Deno: import readlineiter from 'https://raw.githubusercontent.com/tomchen/fetchline/main/packages/readlineiter-deno/mod.ts'
const lineIterator = readlineiter('./crlf_finalnewline')
;(async () => {
  for await (const line of lineIterator) {
    // do something with `line`
  }
})()

Further comparison

Characteristics

| Package / Module Name | ASAP | 0 dependencies | TypeScript |
| --- | --- | --- | --- |
| fetchline | ✅ | ✅ | ✅ |
| nodefetchline | ✅ | ✅ | ✅ |
| isomorphic-fetchline | ✅ | ✅ | .d.ts |
| naivefetchline | ❌ | ✅ | ✅ |
| getfileline | ✅ | ✅ | ✅ |
| readlineiter | ✅ | ✅ | ✅ |
| readlineiter for Deno | ✅ | ✅ | ✅ |

ASAP:

  • The remote file requesting libs resolve with each line's text as soon as possible, i.e. as soon as the chunks that have arrived so far form the next complete line
    • Except for naivefetchline, which is, well, naïve; I really can't blame it
  • The local file reading libs read the file progressively with a pointer, rather than loading the whole file into memory as one string and then splitting it
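
The difference between ASAP and non-ASAP delivery can be sketched as follows. This only illustrates the idea and is not the code of any package here: fakeChunks, linesAsap, linesAllAtOnce and trace are hypothetical names, and the "network" is simulated with short timers.

```javascript
// Hypothetical stand-in for a network response: yields string chunks one at
// a time and records when each one "arrives".
async function* fakeChunks(log) {
  const chunks = ['alpha\nbe', 'ta\ngam', 'ma']
  for (let i = 0; i < chunks.length; i++) {
    await new Promise((resolve) => setTimeout(resolve, 5)) // simulated latency
    log.push(`chunk ${i}`)
    yield chunks[i]
  }
}

// ASAP: yield each line as soon as the chunks received so far complete it.
async function* linesAsap(chunks, delimiter = /\r?\n/) {
  let buffer = ''
  for await (const chunk of chunks) {
    buffer += chunk
    const parts = buffer.split(delimiter)
    buffer = parts.pop() // keep the incomplete tail for the next chunk
    yield* parts
  }
  yield buffer // the last line is complete once the stream ends
}

// Not ASAP: wait for every chunk, then split the whole text once.
async function* linesAllAtOnce(chunks, delimiter = /\r?\n/) {
  let text = ''
  for await (const chunk of chunks) {
    text += chunk
  }
  yield* text.split(delimiter)
}

// Records the order in which chunks arrive and lines are delivered.
async function trace(toLines) {
  const log = []
  for await (const line of toLines(fakeChunks(log))) {
    log.push(`line ${line}`)
  }
  return log
}

;(async () => {
  console.log((await trace(linesAsap)).join(', '))
  // → chunk 0, line alpha, chunk 1, line beta, chunk 2, line gamma
  console.log((await trace(linesAllAtOnce)).join(', '))
  // → chunk 0, chunk 1, chunk 2, line alpha, line beta, line gamma
})()
```

In the ASAP trace, each line is delivered before the next chunk is even requested, which is what makes early processing (or early exit) possible on large files.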

0 dependencies: no external non-dev dependency for npm packages. Note that:

  • Node libraries inevitably use the native Node modules http and https, or fs
  • getfileline and readlineiter also use the native readline module directly and are thus just wrappers, whereas the other packages here use their own low-level methods
  • "readlineiter for Deno" uses Deno Standard Module bufio.ts.
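
A thin wrapper around Node's native readline module, of the kind described in the list above, could look roughly like this. It is a sketch rather than the source of getfileline or readlineiter: readLines and demo are hypothetical names, the temporary file exists only to make the example runnable, and the CommonJS require calls would become import statements in an ES module.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')
const readline = require('readline')

// Hypothetical wrapper: returns an async iterable that yields the lines of a
// local file. A readline Interface is itself async iterable.
function readLines(filepath) {
  return readline.createInterface({
    input: fs.createReadStream(filepath, { encoding: 'utf-8' }),
    crlfDelay: Infinity, // always treat \r followed by \n as a single line break
  })
}

// Writes a small CRLF file to a temporary directory and reads it back.
async function demo() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lines-'))
  const file = path.join(dir, 'sample.txt')
  fs.writeFileSync(file, 'one\r\ntwo\r\n')
  const lines = []
  for await (const line of readLines(file)) {
    lines.push(line)
  }
  return lines
}

demo().then((lines) => console.log(lines))
// → [ 'one', 'two' ]
```

Note that readline does not emit an empty string after the final line ending, so a file that ends with a newline yields no trailing empty line.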

TypeScript: the source code is in TypeScript, except for isomorphic-fetchline's source which is in JavaScript but has type definition (.d.ts).

As for the production / dist files, these packages are all compiled into different module versions where possible: Node.js' CommonJS (which uses require()), ES Module (native JS module with import), and minified UMD, which is suited to browsers.

Parameters and return value

| Package / Module Name | filepath parameter | includeLastEmptyLine option | encoding option | delimiter option | Returns AsyncIterableIterator<string> |
| --- | --- | --- | --- | --- | --- |
| fetchline | ✅ | ✅ | ✅ | ✅ | ✅ |
| nodefetchline | ✅ | ✅ | ✅ | ✅ | ✅ |
| isomorphic-fetchline | ✅ | ✅ | ✅ | ✅ | ✅ |
| naivefetchline | ✅ | ✅ | ✅ | ✅ | ✅ |
| getfileline | ✅ | ❌, never included | ❌, always utf-8 | ❌, always the EOL detected by readline | ✅ |
| readlineiter | ✅ | ❌, never included | ❌, always utf-8 | ❌, always the EOL detected by readline | ✅ |
| readlineiter for Deno | ✅ | ❌, always included | ❌, always utf-8 | ❌, always the EOL detected by bufio.ts | ✅ |

The delimiter used by getfileline and readlineiter is the EOL character detected by the native readline module with its crlfDelay option set to Infinity.
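
The includeLastEmptyLine column is easiest to read with a concrete input. The sketch below rests on an assumption about the option's meaning, namely that it only decides whether the empty string after a final line ending is yielded. splitLines is a hypothetical helper that works on a whole string for brevity, unlike the packages themselves.

```javascript
// Hypothetical helper illustrating the ASSUMED meaning of
// includeLastEmptyLine; it is not taken from the packages' source.
function splitLines(text, includeLastEmptyLine = true) {
  const lines = text.split(/\r?\n/)
  if (!includeLastEmptyLine && lines[lines.length - 1] === '') {
    lines.pop() // drop the empty string that follows the final line ending
  }
  return lines
}

console.log(splitLines('a\r\nb\r\n', true))
// → [ 'a', 'b', '' ]
console.log(splitLines('a\r\nb\r\n', false))
// → [ 'a', 'b' ]
console.log(splitLines('a\r\nb', false))
// → [ 'a', 'b' ]
```

Under that assumption, getfileline and readlineiter behave like the second call, while readlineiter for Deno behaves like the first.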

Tips & thoughts

async / await

Of course, you should know at least a little about async / await, await asyncIterator.next() and for await...of before using the packages here. If you don't, read the corresponding articles on MDN Web Docs. Basically, you can do this:

;(async () => {
  for await (const line of lineIterator) {
    // do something with `line`
  }
})()

Or this:

;(async () => {
  while (true) {
    const { value: line, done: isDone } = await lineIterator.next()
    // check `isDone` first: once the iterator is done, `line` is undefined
    if (isDone) {
      break
    }
    // do something with `line`
  }
})()
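
Because the lines are produced lazily, you can also stop before the end of the file. The sketch below is illustrative only: takeLines is a hypothetical helper and fakeLines is a stand-in for a real line iterator, so that the example runs without a network or a file.

```javascript
// Hypothetical helper: collects at most `limit` lines, then stops iterating.
async function takeLines(lineIterator, limit) {
  const lines = []
  if (limit <= 0) {
    return lines
  }
  for await (const line of lineIterator) {
    lines.push(line)
    if (lines.length >= limit) {
      break // `break` calls lineIterator.return(), so no further lines are requested
    }
  }
  return lines
}

// Stand-in for a real line iterator; records which lines were produced.
async function* fakeLines(produced) {
  for (const line of ['one', 'two', 'three', 'four']) {
    produced.push(line)
    yield line
  }
}

const produced = []
takeLines(fakeLines(produced), 2).then((lines) => console.log(lines, produced))
// → [ 'one', 'two' ] [ 'one', 'two' ]
```

Only the first two lines are ever produced here; the remaining ones are never requested from the generator.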

Line-delimited JSON

These packages, especially fetchline (the first one) for browsers, can help with parsing line-delimited JSON (also known as ndjson, Newline Delimited JSON, or JSON Lines). You could write something like:

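import fetchline from 'fetchline'
const lineIterator = fetchline(lineDelimitedJsonUrl)
;(async () => {
  for await (const line of lineIterator) {
    // skip empty lines, e.g. the last one when the file ends with a newline,
    // because JSON.parse('') throws
    if (line === '') continue
    const lineJson = JSON.parse(line)
    // do something with `lineJson`
  }
})()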
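
If malformed records are a concern, the parsing step can be wrapped so that errors point at the offending line. This is a sketch under stated assumptions, not part of any package here: parseNdjson is a hypothetical helper, and fakeLines stands in for fetchline(lineDelimitedJsonUrl) so that the example runs offline.

```javascript
// Hypothetical helper: turns an async iterator of lines into an async
// iterator of parsed values, reporting the line number of any bad record.
async function* parseNdjson(lineIterator) {
  let lineNumber = 0
  for await (const line of lineIterator) {
    lineNumber += 1
    if (line.trim() === '') {
      continue // blank lines carry no record
    }
    let value
    try {
      value = JSON.parse(line)
    } catch (error) {
      throw new SyntaxError(`Invalid JSON on line ${lineNumber}: ${error.message}`)
    }
    yield value
  }
}

// Stand-in for a real line iterator, so the sketch runs offline.
async function* fakeLines(lines) {
  yield* lines
}

;(async () => {
  for await (const record of parseNdjson(fakeLines(['{"id":1}', '{"id":2}', '']))) {
    console.log(record.id)
  }
  // prints 1, then 2
})()
```

The value is parsed inside the try block but yielded outside it, so that only parse errors are rewrapped and errors raised by the consumer pass through untouched.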

Development

This is a Lerna-powered monorepo with mixed code (TypeScript / JavaScript), mixed module versions (CommonJS, ES Module, UMD), cross-environment support (Node.js, Deno, browsers), and automated tests with GitHub Actions CI. Look at the root package.json and the individual packages' package.json files for the available scripts. To get started, run yarn and then yarn bootstrap. You could also use the repo as an example of a Lerna monorepo.