Tolsto/tap-mongodb

 
 


tap-mongodb

This is a Singer tap that produces JSON-formatted data following the Singer spec from a MongoDB source.

Set up Virtual Environment

python3 -m venv ~/.virtualenvs/tap-mongodb
source ~/.virtualenvs/tap-mongodb/bin/activate

Install tap

pip install -U pip setuptools
pip install tap-mongodb

Set up Config file

Create a JSON file called config.json with the following contents:

{
  "password": "<password>",
  "user": "<username>",
  "host": "<host ip address>",
  "port": "<port>",
  "database": "<database name>"
}

All of the above attributes are required for the tap to connect to your MongoDB instance.
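
Before running the tap, it can help to confirm that a config contains every required attribute. The following is a minimal sketch, assuming the connection is configured through the individual settings rather than a database URL; `REQUIRED_KEYS`, `missing_keys`, and the placeholder values are hypothetical helpers for illustration and are not part of the tap.

```python
import json

# The five attributes the README lists as required (assumes no database URL).
REQUIRED_KEYS = ("password", "user", "host", "port", "database")


def missing_keys(config):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Placeholder values for illustration only; substitute your own.
example_config = {
    "password": "secret",
    "user": "reader",
    "host": "127.0.0.1",
    "port": "27017",
    "database": "test",
}

# An empty list means every required attribute is present.
print(missing_keys(example_config))
print(json.dumps(example_config, indent=2))
```

The printed JSON can be saved as config.json once the placeholder values are replaced.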

Alternatively, you can connect with a database URL. In that case the settings above are ignored and therefore optional. Settings such as replica_set or ssl must then be included directly in the URL instead of being provided separately.

For example, to connect to a database that uses DNS SRV records:

{
  "database_url": "mongodb+srv://user:myRealPassword@cluster0.mongodb.net/test?w=majority&tls=true"
}
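
Credentials that contain characters such as `@` or `/` need to be percent-encoded before they are placed in a connection URL. The sketch below shows one way to assemble such a URL with the Python standard library; `build_database_url` and its parameters are hypothetical names for illustration, and the option names passed in are forwarded to the URL unchanged.

```python
from urllib.parse import quote_plus, urlencode


def build_database_url(user, password, host, database, options=None, srv=True):
    """Assemble a MongoDB connection URL with percent-encoded credentials."""
    scheme = "mongodb+srv" if srv else "mongodb"
    # quote_plus escapes reserved characters such as '@', ':' and '/'.
    url = f"{scheme}://{quote_plus(user)}:{quote_plus(password)}@{host}/{database}"
    if options:
        # Connection options go into the query string of the URL.
        url += "?" + urlencode(options)
    return url


# Placeholder values for illustration only.
print(
    build_database_url(
        "user",
        "p@ss/word",
        "cluster0.mongodb.net",
        "test",
        options={"w": "majority", "tls": "true"},
    )
)
```

The resulting string can be used as the value of "database_url" in config.json.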

The following parameters are optional for your config file:

| Name | Type | Description |
| --- | --- | --- |
| replica_set | string | name of replica set |
| ssl | Boolean | can be set to true to connect using ssl |
| include_schema_in_destination_stream_name | Boolean | forces the stream names to take the form <database_name>_<collection_name> instead of <collection_name> |
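
The effect of include_schema_in_destination_stream_name on stream names can be illustrated with a short sketch. The function `stream_name` below is a hypothetical illustration of the naming rule described in the table, not the tap's actual implementation.

```python
def stream_name(database_name, collection_name, include_schema=False):
    """Illustrate the two stream name forms described in the README."""
    if include_schema:
        # Form used when include_schema_in_destination_stream_name is true.
        return f"{database_name}_{collection_name}"
    # Default form: the collection name on its own.
    return collection_name


# Placeholder database and collection names for illustration only.
print(stream_name("shop", "orders"))
print(stream_name("shop", "orders", include_schema=True))
```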

Run in discovery mode

Run the following command and redirect the output into the catalog file:

tap-mongodb --config ~/config.json --discover > ~/catalog.json

Your catalog file should now look like this:

{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ]
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
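
To check what discovery found, the catalog can be summarized with a few lines of Python. This is a minimal sketch that relies only on the catalog layout shown above; `summarize_catalog` is a hypothetical helper, and the values in `example_catalog` are made-up placeholders.

```python
def summarize_catalog(catalog):
    """Return (tap_stream_id, database name, row count) for each stream."""
    rows = []
    for stream in catalog.get("streams", []):
        top_level = {}
        for entry in stream.get("metadata", []):
            # Stream-level metadata is the entry with an empty breadcrumb.
            if entry.get("breadcrumb") == []:
                top_level = entry.get("metadata", {})
                break
        rows.append(
            (
                stream.get("tap_stream_id"),
                top_level.get("database-name"),
                top_level.get("row-count"),
            )
        )
    return rows


# Placeholder catalog mirroring the structure above; values are invented.
example_catalog = {
    "streams": [
        {
            "table_name": "orders",
            "tap_stream_id": "shop-orders",
            "metadata": [
                {
                    "breadcrumb": [],
                    "metadata": {
                        "row-count": 42,
                        "is-view": False,
                        "database-name": "shop",
                        "table-key-properties": ["_id"],
                        "valid-replication-keys": ["_id"],
                    },
                }
            ],
            "stream": "orders",
            "schema": {"type": "object"},
        }
    ]
}

for row in summarize_catalog(example_catalog):
    print(row)
```

For a real catalog, load the file with `json.load` and pass the resulting dictionary to the function.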

Edit Catalog file

Using valid JSON, edit the catalog.json file.

To select a stream, add the following to the stream's metadata:

"selected": true,
"replication-method": <replication method>,

<replication method> must be either FULL_TABLE or LOG_BASED.

To add a projection to a stream, add the following to the stream's metadata field:

"tap-mongodb.projection": <projection>

For example, if you edit the example stream to select it and add a projection, catalog.json should look like this:

{
  "streams": [
    {
      "table_name": "<table name>",
      "tap_stream_id": "<tap_stream_id>",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "row-count": <int>,
            "is-view": <bool>,
            "database-name": "<database name>",
            "table-key-properties": [
              "_id"
            ],
            "valid-replication-keys": [
              "_id"
            ],
            "selected": true,
            "replication-method": "<replication method>",
            "tap-mongodb.projection": "<projection>"
          }
        }
      ],
      "stream": "<stream name>",
      "schema": {
        "type": "object"
      }
    }
  ]
}
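
The same edit can be scripted instead of done by hand. The sketch below is one possible approach, assuming the catalog layout shown above; `select_stream` is a hypothetical helper, and the projection is treated as an opaque value that is passed through as given.

```python
import copy

# The two replication methods named in the README.
ALLOWED_METHODS = ("FULL_TABLE", "LOG_BASED")


def select_stream(catalog, tap_stream_id, replication_method, projection=None):
    """Return a copy of `catalog` with the given stream selected."""
    if replication_method not in ALLOWED_METHODS:
        raise ValueError(f"replication method must be one of {ALLOWED_METHODS}")
    updated = copy.deepcopy(catalog)  # leave the caller's catalog untouched
    for stream in updated["streams"]:
        if stream["tap_stream_id"] != tap_stream_id:
            continue
        for entry in stream["metadata"]:
            # Only the stream-level entry (empty breadcrumb) is modified.
            if entry["breadcrumb"] == []:
                entry["metadata"]["selected"] = True
                entry["metadata"]["replication-method"] = replication_method
                if projection is not None:
                    entry["metadata"]["tap-mongodb.projection"] = projection
    return updated


# Placeholder catalog for illustration only.
example = {
    "streams": [
        {
            "tap_stream_id": "shop-orders",
            "metadata": [{"breadcrumb": [], "metadata": {"database-name": "shop"}}],
        }
    ]
}

result = select_stream(example, "shop-orders", "FULL_TABLE", projection="<projection>")
print(result["streams"][0]["metadata"][0]["metadata"])
```

Working on a deep copy keeps the discovered catalog intact, so the edited version can be written to a separate file and compared with the original.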

Run in sync mode

tap-mongodb --config ~/config.json --catalog ~/catalog.json

The tap writes bookmarks to stdout. Capture them and pass them to the tap on the next sync with the optional --state state.json parameter.
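
One way to capture the bookmarks is to keep the value of the last STATE message in the tap's output. In the Singer spec, each output line is a JSON message, and STATE messages carry their payload under "value". The sketch below assumes that format; `last_state` is a hypothetical helper, and the bookmark contents in `sample_output` are invented for illustration.

```python
import json


def last_state(lines):
    """Return the value of the last Singer STATE message in `lines`, or None."""
    state = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message
        if isinstance(message, dict) and message.get("type") == "STATE":
            state = message.get("value")
    return state


# Invented sample of tap output; real bookmark contents will differ.
sample_output = [
    '{"type": "RECORD", "stream": "orders", "record": {"_id": "1"}}',
    '{"type": "STATE", "value": {"bookmarks": {"shop-orders": {"version": 1}}}}',
    "not json",
    '{"type": "STATE", "value": {"bookmarks": {"shop-orders": {"version": 2}}}}',
]

print(json.dumps(last_state(sample_output)))
```

Writing the returned value to state.json with `json.dump` produces a file that can be passed through --state on the next run.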

Supplemental MongoDB Info

Local MongoDB Setup

If you haven't yet set up a local MongoDB client, follow these instructions


Copyright © 2019 Stitch
