Pipelaner

Pipelaner is a high-performance, efficient framework and agent for building data pipelines. Pipeline descriptions are based on the Configuration as Code concept and Apple's Pkl configuration language.

Pipelaner manages data streams through three key entities: Generator, Transform, and Sink.
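Conceptually, data flows Generator → Transform → Sink. The following Go sketch illustrates that shape with plain channels; it is illustrative only and is not Pipelaner's actual API (`generate`, `filterEven`, and `runPipeline` are names invented for this example):

```go
package main

import "fmt"

// generate emits the integers 1..n on out, then closes it (the "Generator").
func generate(n int, out chan<- int) {
	for i := 1; i <= n; i++ {
		out <- i
	}
	close(out)
}

// filterEven passes only even values from in to out (the "Transform").
func filterEven(in <-chan int, out chan<- int) {
	for v := range in {
		if v%2 == 0 {
			out <- v
		}
	}
	close(out)
}

// runPipeline wires generator -> transform -> sink; the "Sink" here
// simply collects results into a slice.
func runPipeline(n int) []int {
	src := make(chan int)
	filtered := make(chan int)
	go generate(n, src)
	go filterEven(src, filtered)

	var sink []int
	for v := range filtered {
		sink = append(sink, v)
	}
	return sink
}

func main() {
	fmt.Println(runPipeline(5)) // [2 4]
}
```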


📖 Contents

  • 📌 Core Entities
  • 📦 Built-in Pipeline Elements
  • 🌐 Scalability
  • 🚀 Examples
  • 🤝 Support
  • 📜 License


📌 Core Entities

Generator

The component responsible for creating or retrieving source data for the pipeline. Generators can produce messages or events, or retrieve data from various sources such as files, databases, or APIs.

  • Example use case:
    Reading data from a file or receiving events via webhooks.

Transform

The component that processes data within the pipeline. Transforms perform operations such as filtering, aggregation, conversion, or cleaning to prepare data for further processing.

  • Example use case:
    Filtering records based on specific conditions or converting data format from JSON to CSV.
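The JSON-to-CSV conversion mentioned above can be sketched as a standalone Go function using only the standard library; this is not a Pipelaner component, and `jsonToCSV` is a name chosen for this example:

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"os"
	"strings"
)

// jsonToCSV converts a JSON array of flat string records into CSV text,
// emitting columns in the given order.
func jsonToCSV(data string, columns []string) (string, error) {
	var records []map[string]string
	if err := json.Unmarshal([]byte(data), &records); err != nil {
		return "", err
	}
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	w.Write(columns) // header row
	for _, rec := range records {
		row := make([]string, len(columns))
		for i, c := range columns {
			row[i] = rec[c]
		}
		w.Write(row)
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	out, err := jsonToCSV(`[{"name":"a","id":"1"},{"name":"b","id":"2"}]`, []string{"id", "name"})
	if err != nil {
		os.Exit(1)
	}
	os.Stdout.WriteString(out) // id,name / 1,a / 2,b
}
```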

Sink

The final destination for the data stream. Sinks send processed data to a target system, such as a database, API, or message queue.

  • Example use case:
    Saving data to PostgreSQL or sending it to a Kafka topic.
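Putting the three entities together: a minimal sketch of a pipeline description in the style of the Kafka configuration shown later in this README. The element types and property names here (`Inputs.Cmd`, `exec`, and so on) are illustrative assumptions, not the exact Pkl schema:

```pkl
// Illustrative sketch; property names are assumptions.
new Inputs.Cmd {
    name = "log-reader"
    exec = "/usr/bin/log stream --style ndjson"
}

new Transforms.Filter {
    name = "errors-only"
    // keep only records matching a condition
}

new Sinks.Console {
    name = "stdout"
}
```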

Basic Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| `name` | String | Unique name of the pipeline element. |
| `threads` | Int | Number of threads for processing messages. Defaults to the value of `GOMAXPROCS`. |
| `outputBufferSize` | Int | Size of the output buffer. Not applicable to Sink components. |
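Assuming the same configuration style as the Kafka example later in this README, the shared parameters might appear on an element like this (the element type is illustrative; the three parameter names come from the table above):

```pkl
new Transforms.Filter {
    name = "filter-errors"    // unique element name
    threads = 4               // defaults to GOMAXPROCS if omitted
    outputBufferSize = 1000   // not applicable to sinks
}
```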

📦 Built-in Pipeline Elements

Generators

| Name | Description |
| --- | --- |
| `cmd` | Reads the output of a command, e.g. `/usr/bin/log stream --style ndjson`. |
| `kafka` | Apache Kafka consumer that streams message values into the pipeline. |
| `pipelaner` | gRPC server that streams values via gRPC. |

Transforms

| Name | Description |
| --- | --- |
| `batch` | Forms batches of data with a specified size. |
| `chunks` | Splits incoming data into chunks. |
| `debounce` | Eliminates "bounce" (frequent repeats) in data. |
| `filter` | Filters data based on specified conditions. |
| `remap` | Reassigns fields or transforms the data structure. |
| `throttling` | Limits the data processing rate. |
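The behavior of the `batch` transform, grouping a stream into fixed-size slices, can be sketched in plain Go. This is an illustration of the idea, not Pipelaner's implementation:

```go
package main

import "fmt"

// batch groups items into slices of at most size elements,
// mirroring what a batching transform does conceptually.
func batch[T any](items []T, size int) [][]T {
	var out [][]T
	for size > 0 && len(items) > 0 {
		n := size
		if len(items) < n {
			n = len(items)
		}
		out = append(out, items[:n])
		items = items[n:]
	}
	return out
}

func main() {
	fmt.Println(batch([]int{1, 2, 3, 4, 5}, 2)) // [[1 2] [3 4] [5]]
}
```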

Sinks

| Name | Description |
| --- | --- |
| `clickhouse` | Sends data to a ClickHouse database. |
| `console` | Outputs data to the console. |
| `http` | Sends data to a specified HTTP endpoint. |
| `kafka` | Publishes data to Apache Kafka. |
| `pipelaner` | Streams data via gRPC to other Pipelaner nodes. |

🌐 Scalability

Single-Node Deployment

For operation on a single host:
(Diagram: single-node deployment.)


Multi-Node Deployment

For distributed data processing across multiple hosts:
(Diagram: multi-node deployment.)

For distributed interaction between nodes, you can use:

  1. gRPC — via generators and sinks with the parameter sourceName: "pipelaner".
  2. Apache Kafka — for reading/writing data via topics.
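For the gRPC option, one node's sink streams to another node's generator. A sketch in the style of the Kafka configuration below; `sourceName: "pipelaner"` comes from the text above, while `host` and `port` are assumed property names used only for illustration:

```pkl
// Node A — sink that streams downstream over gRPC.
new Sinks.Pipelaner {
    sourceName = "pipelaner"
    host = "node-b.internal"   // assumed property
    port = 4545                // assumed property
}

// Node B — generator that receives the gRPC stream.
new Inputs.Pipelaner {
    sourceName = "pipelaner"
    host = "0.0.0.0"
    port = 4545
}
```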

Example configuration using Kafka:

```pkl
new Inputs.Kafka {
    ...
    common {
        ...
        topics {
            "kafka-topic"
        }
    }
}

new Sinks.Kafka {
    ...
    common {
        ...
        topics {
            "kafka-topic"
        }
    }
}
```

🚀 Examples

| Example | Description |
| --- | --- |
| Basic Pipeline | A simple example illustrating the creation of a basic pipeline with prebuilt components. |
| Custom Components | An advanced example showing how to create and integrate custom Generators, Transforms, and Sinks. |

Overview

  1. 🌟 Basic Pipeline
    Learn the fundamentals of creating a pipeline with minimal configuration using ready-to-use components.

  2. 🛠 Custom Components
    Extend Pipelaner’s functionality by developing your own Generators, Transforms, and Sinks.


Each example includes clear configuration files and explanations to help you get started quickly.

💡 Tip: Use these examples as templates to customize and build your own pipelines efficiently.

🤝 Support

If you have questions, suggestions, or encounter any issues, please create an Issue in the repository.
You can also participate in discussions in the Discussions section.


📜 License

This project is licensed under the Apache 2.0 license.
You are free to use, modify, and distribute the code under the terms of the license.
