Documentation: https://flupy.readthedocs.io/en/latest/
Source Code: https://github.com/olirice/flupy
Flupy implements a fluent interface for operating on Python iterables. All flupy methods return generators and are evaluated lazily, so expressions can transform arbitrarily large data in very little memory.
You can think of flupy as a lightweight, zero-dependency, pure-Python alternative to the excellent Apache Spark project.
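To make "fluent" and "lazy" concrete, here is a minimal sketch of how such a wrapper could be built on top of the standard library. The `MiniFluent` class is a hypothetical illustration and is not flupy's actual implementation; it only assumes that each method wraps a lazy iterator and returns a new wrapper so calls can be chained.

```python
from itertools import count, islice


class MiniFluent:
    """Hypothetical, stripped-down fluent wrapper; not flupy's real code."""

    def __init__(self, iterable):
        self._iterator = iter(iterable)

    def __iter__(self):
        return self._iterator

    def map(self, func):
        # map() is lazy: nothing is computed until the result is iterated
        return MiniFluent(map(func, self._iterator))

    def filter(self, predicate):
        # filter() is lazy as well, so chaining stays cheap
        return MiniFluent(filter(predicate, self._iterator))

    def take(self, n):
        # islice stops after n items, which makes infinite sources safe
        return MiniFluent(islice(self._iterator, n))


# Chain over an infinite counter; only the items actually needed are produced
even_squares = (
    MiniFluent(count())
    .map(lambda x: x * x)
    .filter(lambda x: x % 2 == 0)
    .take(4)
)
result = list(even_squares)
print(result)  # [0, 4, 16, 36]
```

Because every step hands back another wrapper around an iterator, no intermediate list is ever built; memory use depends on the size of one item, not on the length of the input.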
Requires Python 3.6+.
Install flupy with pip:
$ pip install flupy
from itertools import count
from flupy import flu
# Processing an infinite sequence in constant memory
pipeline = (
    flu(count())
    .map(lambda x: x**2)
    .filter(lambda x: x % 517 == 0)
    .chunk(5)
    .take(3)
)
for item in pipeline:
    print(item)
# Returns:
# [0, 267289, 1069156, 2405601, 4276624]
# [6682225, 9622404, 13097161, 17106496, 21650409]
# [26728900, 32341969, 38489616, 45171841, 52388644]
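For comparison, the same computation can be approximated with only the standard library. This is a sketch rather than a description of how flupy works internally; the `chunked` helper is a hypothetical stand-in for the `.chunk(5)` step.

```python
from itertools import count, islice


def chunked(iterable, size):
    """Yield successive lists of up to `size` items (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Each stage is a lazy iterator, so the infinite counter is never exhausted
squares = map(lambda x: x**2, count())
multiples = filter(lambda x: x % 517 == 0, squares)
first_three_chunks = list(islice(chunked(multiples, 5), 3))

for chunk in first_three_chunks:
    print(chunk)
```

The nested calls read inside-out, whereas the fluent chain reads top to bottom in the order the data flows.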
The flupy command line interface brings the same syntax for lazy pipelines to your shell. Inputs to the flu command are auto-populated into a Fluent context named _.
$ flu -h
usage: flu [-h] [-f FILE] [-i [IMPORT [IMPORT ...]]] command

flupy: a fluent interface for python

positional arguments:
  command               flupy command to execute on input

optional arguments:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  path to input file
  -i [IMPORT [IMPORT ...]], --import [IMPORT [IMPORT ...]]
                        modules to import
                        Syntax: <module>:<object>:<alias>
                        Examples:
                            'import os' = '-i os'
                            'import os as op_sys' = '-i os::op_sys'
                            'from os import environ' = '-i os:environ'
                            'from os import environ as env' = '-i os:environ:env'
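The `<module>:<object>:<alias>` syntax from the help text can be read as a compact import statement. The sketch below spells out that mapping; `import_spec_to_statement` is a hypothetical helper written for illustration and is not part of flupy or its CLI.

```python
def import_spec_to_statement(spec):
    """Translate '<module>:<object>:<alias>' into an import statement.

    Hypothetical helper for illustration only; not flupy's own parser.
    """
    # Pad with empty strings so missing fields are handled uniformly
    module, obj, alias = (spec.split(":") + ["", ""])[:3]
    statement = f"from {module} import {obj}" if obj else f"import {module}"
    if alias:
        statement += f" as {alias}"
    return statement


# The four examples listed in the help output
for spec in ["os", "os::op_sys", "os:environ", "os:environ:env"]:
    print(f"-i {spec:<15} -> {import_spec_to_statement(spec)}")
```

An empty middle field, as in `os::op_sys`, means "no object", so the alias applies to the module itself.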