Add Gateway docs to README #403

Merged 1 commit on Oct 6, 2023
146 changes: 101 additions & 45 deletions README.md
@@ -1,12 +1,12 @@
<div align="center">
<img src="https://github.com/recap-cloud/recap/blob/main/static/recap-logo.png?raw=true" alt="recap">
<img src="https://github.com/recap-build/recap/blob/main/static/recap-logo.png?raw=true" alt="recap">
</div>

## What is Recap?

Recap reads and writes schemas from web services, databases, and schema registries in a standard format.

You can use Recap to build data contract tools, schema transpilers, compatibility checkers, data catalogs, schema registries, metadata caches, and a lot more.
⭐️ _If you like this project, please give it a star! It helps the project get more visibility._

## Table of Contents

@@ -16,6 +16,7 @@ You can use Recap to build data contract tools, schema transpilers, compatibilit
* [Usage](#usage)
* [CLI](#cli)
* [Gateway](#gateway)
* [Registry](#registry)
* [API](#api)
* [Docker](#docker)
* [Schema](#schema)
@@ -55,33 +56,12 @@ See `pyproject.toml` for a list of optional dependencies.

### CLI

Recap comes with a command line interface that can list and read schemas.
Recap comes with a command line interface that can list and read schemas from external systems.

Configure Recap to connect to one or more of your systems:
List the children of a URL:

```bash
recap add my_pg postgresql://user:pass@host:port/dbname
```

List the paths in your system:

```bash
recap ls my_pg
```

```json
[
  "postgres",
  "template0",
  "template1",
  "testdb"
]
```

Recap models Postgres paths as `system/database/schema/table`. Keep drilling down:

```bash
recap ls my_pg/testdb
recap ls postgresql://user:pass@host:port/testdb
```

```json
@@ -93,10 +73,10 @@ recap ls my_pg/testdb
]
```

Now we have a path to a testdb's public schemas:
Keep drilling down:

```bash
recap ls my_pg/testdb/public
recap ls postgresql://user:pass@host:port/testdb/public
```

```json
@@ -105,10 +85,10 @@ recap ls my_pg/testdb/public
]
```
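
Each `recap ls` call above extends the URL path by one segment: database, then schema, then table. The following is a rough sketch of that decomposition using only the Python standard library. It is not Recap's own parser, and `split_system_url` is a hypothetical helper name used for illustration.

```python
from urllib.parse import urlsplit


def split_system_url(url: str):
    """Split a system URL into its connection part and its path segments.

    Hypothetical helper: it only illustrates how the URLs in the CLI
    examples grow one segment per level (database, schema, table).
    """
    parts = urlsplit(url)
    base = f"{parts.scheme}://{parts.netloc}"
    # Drop empty segments produced by leading or trailing slashes.
    segments = [segment for segment in parts.path.split("/") if segment]
    return base, segments


base, segments = split_system_url(
    "postgresql://user:pass@localhost:5432/testdb/public/test_types"
)
print(base)      # connection part of the URL
print(segments)  # database, schema, and table names in order
```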

Read the schema:
Read the schema for the `test_types` table as a Recap struct:

```bash
recap schema my_pg/testdb/public/test_types
recap schema postgresql://user:pass@host:port/testdb/public/test_types
```

```json
@@ -128,39 +108,95 @@ recap schema my_pg/testdb/public/test_types

Recap comes with a stateless HTTP/JSON gateway that can list and read schemas.

Configure Recap to connect to one or more of your systems:
Start the server at [http://localhost:8000](http://localhost:8000):

```bash
recap serve
```

List the schemas in a PostgreSQL database:

```bash
curl http://localhost:8000/gateway/ls/postgresql://user:pass@host:port/testdb
```

```json
["pg_toast","pg_catalog","public","information_schema"]
```

And read a schema:

```bash
recap add my_pg postgresql://user:pass@host:port/dbname
curl http://localhost:8000/gateway/schema/postgresql://user:pass@host:port/testdb/public/test_types
```

```json
{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}
```

The gateway fetches schemas from external systems in real time and returns them as Recap schemas.

An OpenAPI schema is available at [http://localhost:8000/docs](http://localhost:8000/docs).
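
The gateway calls above can also be scripted. Here is a minimal sketch using only the Python standard library. It assumes the `/gateway/ls/` and `/gateway/schema/` routes from the curl examples and a server at `http://localhost:8000`; the helper names `gateway_url` and `fetch_json` are illustrative and not part of Recap.

```python
import json
from urllib.request import urlopen

# Assumed address, taken from the `recap serve` example above.
GATEWAY = "http://localhost:8000/gateway"


def gateway_url(operation: str, system_url: str, base: str = GATEWAY) -> str:
    """Build a gateway URL by appending the system URL to the route,
    mirroring the curl examples."""
    if operation not in ("ls", "schema"):
        raise ValueError(f"unsupported operation: {operation!r}")
    return f"{base.rstrip('/')}/{operation}/{system_url}"


def fetch_json(operation: str, system_url: str):
    """Call the gateway and decode the JSON body.

    Requires a running `recap serve` process and a reachable database.
    """
    with urlopen(gateway_url(operation, system_url)) as response:
        return json.load(response)
```

With a server running, `fetch_json("ls", "postgresql://user:pass@host:port/testdb")` would return the same list the first curl command prints.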

### Registry

You can store schemas in Recap's schema registry.

Start the server at [http://localhost:8000](http://localhost:8000):

```bash
recap serve
```

List the schemas in your system:
Put a schema in the registry:

```bash
curl -X POST \
  -H "Content-Type: application/x-recap+json" \
  -d '{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}' \
  http://localhost:8000/registry/some_schema
```

Get the schema (and version) from the registry:

```bash
$ curl http://localhost:8000/ls/my_pg
curl http://localhost:8000/registry/some_schema
```

```json
["postgres","template0","template1","testdb"]
[{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]},1]
```

And read a schema:
Put a new version of the schema in the registry:

```bash
curl -X POST \
  -H "Content-Type: application/x-recap+json" \
  -d '{"type":"struct","fields":[{"type":"int32","name":"test_int","optional":true}]}' \
  http://localhost:8000/registry/some_schema
```

List schema versions:

```bash
curl http://localhost:8000/schema/my_pg/testdb/public/test_types
curl http://localhost:8000/registry/some_schema/versions
```

```json
{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}
[1,2]
```

The gateway fetches schemas from external systems in real time and returns them as Recap schemas.
Get a specific version of the schema:

```bash
curl http://localhost:8000/registry/some_schema/versions/1
```

```json
[{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]},1]
```

The registry uses [fsspec](https://filesystem-spec.readthedocs.io/en/latest/) to store schemas in a variety of filesystems like S3, GCS, ABS, and the local filesystem. See the [registry](https://recap.build/docs/registry/) docs for more details.

An OpenAPI schema is available at [http://localhost:8000/docs](http://localhost:8000/docs).
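
The registry calls above can be reproduced from Python as well. The sketch below builds the same `POST` request as the curl example and unpacks the `[schema, version]` pair that the registry returns. It assumes the routes, content type, and response shape shown in this section; `put_request` and `parse_versioned` are hypothetical helper names, not Recap APIs.

```python
import json
from typing import Tuple
from urllib.request import Request

# Assumed address, taken from the `recap serve` example above.
REGISTRY = "http://localhost:8000/registry"


def put_request(name: str, schema: dict, base: str = REGISTRY) -> Request:
    """Build (but do not send) the POST request that stores a schema."""
    body = json.dumps(schema, separators=(",", ":")).encode("utf-8")
    return Request(
        f"{base}/{name}",
        data=body,
        headers={"Content-Type": "application/x-recap+json"},
        method="POST",
    )


def parse_versioned(body: str) -> Tuple[dict, int]:
    """Unpack a `[schema, version]` response body as shown in the examples."""
    schema, version = json.loads(body)
    return schema, version
```

Passing the result of `put_request(...)` to `urllib.request.urlopen` sends it to a running server. Keeping construction and sending separate makes the request easy to inspect before it goes out.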

@@ -176,20 +212,20 @@ Read a schema from PostgreSQL:
```python
from recap.clients import create_client

client = create_client("postgresql://user:pass@host:port/dbname")
struct = client.get_schema("testdb", "public", "test_types")
with create_client("postgresql://user:pass@host:port/testdb") as c:
    struct = c.schema("testdb", "public", "test_types")
```

Convert the schema to Avro, Protobuf, and JSON schemas:

```python
from recap.converters.avro import AvroConverter
from recap.converters.protobuf import ProtobufConverter
from recap.converters.json_schema import JsonSchemaConverter
from recap.converters.json_schema import JSONSchemaConverter

avro_schema = AvroConverter().from_recap(struct)
protobuf_schema = ProtobufConverter().from_recap(struct)
json_schema = JsonSchemaConverter().from_recap(struct)
json_schema = JSONSchemaConverter().from_recap(struct)
```

Transpile schemas from one format to another:
@@ -213,14 +249,34 @@ struct = JSONSchemaConverter().to_recap(json_schema)
avro_schema = AvroConverter().from_recap(struct)
```

Store schemas in Recap's schema registry:

```python
from recap.storage.registry import RegistryStorage
from recap.types import StructType, IntType

storage = RegistryStorage("file:///tmp/recap-registry-storage")
version = storage.put(
    "postgresql://localhost:5432/testdb/public/test_table",
    StructType(fields=[IntType(32)])
)
storage.get("postgresql://localhost:5432/testdb/public/test_table")

# Get all versions of a schema
versions = storage.versions("postgresql://localhost:5432/testdb/public/test_table")

# List all schemas in the registry
schemas = storage.ls()
```

### Docker

Recap's gateway is also available as a Docker image:
Recap's gateway and registry are also available as a Docker image:

```bash
docker run \
-p 8000:8000 \
-e "RECAP_SYSTEMS__PG=postgresql://user:pass@localhost:5432/testdb" \
-e 'RECAP_URLS=["postgresql://user:pass@localhost:5432/testdb"]' \
ghcr.io/recap-build/recap:latest
```
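
`RECAP_URLS` holds a JSON list, so its value contains brackets and double quotes that the shell must not interpret. One way to avoid hand-quoting mistakes is to generate the argument. This is a small sketch using the standard library; `recap_urls_env` is a hypothetical helper name.

```python
import json
import shlex


def recap_urls_env(urls):
    """Return a shell-safe `RECAP_URLS=<json list>` argument for `docker run -e`."""
    # json.dumps produces the JSON list; shlex.quote wraps it for POSIX shells.
    return shlex.quote("RECAP_URLS=" + json.dumps(urls))


print(recap_urls_env(["postgresql://user:pass@localhost:5432/testdb"]))
```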
