Commit Watcher

Commit Watcher finds interesting and potentially hazardous commits in git projects. Watch your own projects to make sure you didn't accidentally leak your AWS keys or other credentials, and watch open-source projects you use to find undisclosed security vulnerabilities and patches.

At SourceClear, we want to help you use open-source software safely. Often, when a security vulnerability is discovered and fixed in an open-source project, there isn't a public disclosure about it. In part, this is because the CVE process is onerous and labor-intensive, and notifying all the users of a project isn't possible.

Oh, and about that UI. Commit Watcher is intended to be an API-accessible backend service. The UI is only there for testing, and its functionality is limited to collecting commits and auditing them against a set of rules.

Contributing

Check out the dozens of rules and patterns in the srcclr/commit-watcher-rules repository that help find leaked credentials and potential security issues. Just open an issue or PR in that repo if there's a rule you'd like to see added.

Additionally, if you find a security issue on an open-source project using Commit Watcher, our security research team would love to help verify it. You can open an issue against this repo from the UI, or just drop a link to the offending commit in a new issue.

Setup

Install and configure Ruby using RVM or Rbenv. Use one of these rather than the system's bundled Ruby to avoid permission issues during installation and setup.

RVM: https://rvm.io
Rbenv: https://github.com/rbenv/rbenv

Install MySQL and Redis. On Mac, with Brew, you can do that with this command:

brew install mysql redis

Follow the instructions Brew gives you so the services are started properly.

Install gem dependencies:

gem install bundler
bundle install

Then set up some Rails secrets and passwords:

figaro install
echo "COMMIT_WATCHER_DATABASE_PASSWORD: 'changeme123'" >> config/application.yml
echo "SECRET_KEY_BASE: `rake secret`" >> config/application.yml

The rest of the setup depends on how you want to run Commit Watcher. You can either run it locally, which is good for quick development, or you can run it with Docker.

Optional: Configuring Email Notifications

To use email notifications, set your Gmail username and password with these commands:

echo "GMAIL_USERNAME: 'sah.dude@gmail.com'" >> config/application.yml
echo "GMAIL_PASSWORD: 'urpassbro'" >> config/application.yml

If you'd like to use an email provider other than Gmail, you'll have to change these two files: config/environments/development.rb and config/environments/production.rb.

Running Locally

Create the database, load the schema, and seed it with some sample rules:

rails db:setup

Now you're ready to start Rails with:

rails s

To start processing jobs, in another terminal:

bundle exec sidekiq

Running with Docker / RDS

First, change the root and user passwords in .env.db.

# Not used but should set one for security.
MYSQL_ROOT_PASSWORD=changeme123

# This is for the commit_watcher user.
MYSQL_PASSWORD=changeme123

Second, modify config/database.yml by commenting out socket in favor of host, like this:

  # Use this for local mysql instances
  #socket: /tmp/mysql.sock

  # Use this for Docker
  host: db

Alternatively, for RDS, set up the external RDS URL:

echo "COMMIT_WATCHER_EXTERNAL_DATABASE_URL: 'somedb.rds.amazonaws.com'" >> config/application.yml

Then modify config/database.yml by commenting out both socket and the Docker host in favor of the RDS host, like this:

  # Use this for local mysql instances
  #socket: /tmp/mysql.sock

  # Use this for Docker
  #host: db

  # Use this for External RDS
  host: <%= ENV['COMMIT_WATCHER_EXTERNAL_DATABASE_URL'] %>

And modify docker-compose.yml by commenting out - db in the web: and sidekiq: sections, like this:

  web:
    build: .
    volumes:
      - .:/myapp
    ports:
      - '3000:3000'
    links:
      #- db
      - redis
    ...
  sidekiq:
    build: .
    volumes:
      - .:/myapp
    links:
      #- db
      - redis

Now start everything with:

docker-compose up

This downloads the images and builds the database and Rails app containers. When it has finished building and both containers are running, you should see Rails messages like this:

77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO  WEBrick 1.3.1
77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO  ruby 2.2.2 (2015-04-13) [x86_64-linux]
77bcf6cd5a_commitwatcher_web_1 | [2016-03-09 18:29:36] INFO  WEBrick::HTTPServer#start: pid=1 port=3000

Stop Docker with Ctrl+C so the database can be set up with:

docker-compose run web bundle exec rake db:schema:load db:seed

Now start everything up again with:

docker-compose up

Use

If using Docker, the server will be accessible from the IP address given by:

docker-machine ip default

To crawl any projects, you must set a GitHub API token in the default configuration, which can be edited here: http://localhost:3000/configurations/1/edit.

The web UI contains a dashboard which links to all available pages. It's located here: http://localhost:3000/.

The Sidekiq dashboard is here: http://localhost:3000/sidekiq/cron.

Overview

Every few minutes, any project that hasn't been checked in a while is polled for new commits. These commits are then checked against whatever rules are set up for the project. Any commits that match are recorded and made available at the /commits endpoint.

Everything is broken up into different Sidekiq jobs. There are three:

  1. Selecting projects which need to be polled
  2. Collecting new commits
  3. Auditing a single commit
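
The three stages above can be sketched in plain Ruby. This is only an illustration of the flow, not Commit Watcher's actual code: the Project and Commit structs, the method names, the POLL_INTERVAL value, and the in-memory commit source are all hypothetical stand-ins for the real Sidekiq jobs, database, and GitHub API.

```ruby
# Hypothetical in-memory stand-ins; the real service uses Sidekiq jobs and a database.
Project = Struct.new(:name, :last_checked_at, :rules)
Commit  = Struct.new(:sha, :message)

POLL_INTERVAL = 300 # seconds; an assumed value, not taken from Commit Watcher

# Stage 1: select projects that have not been polled recently.
def select_stale_projects(projects, now)
  projects.select do |project|
    project.last_checked_at.nil? || now - project.last_checked_at >= POLL_INTERVAL
  end
end

# Stage 2: collect new commits (a plain hash stands in for the GitHub API).
def collect_commits(project, source)
  source.fetch(project.name, [])
end

# Stage 3: audit a single commit; returns the names of the rules that matched.
# Only commit messages are checked in this sketch.
def audit_commit(commit, rules)
  rules.select { |_name, pattern| commit.message =~ pattern }.keys
end

# Runs one polling cycle and returns [project, sha, matched_rules] triples.
def run_cycle(projects, source, now)
  matches = []
  select_stale_projects(projects, now).each do |project|
    collect_commits(project, source).each do |commit|
      hits = audit_commit(commit, project.rules)
      matches << [project.name, commit.sha, hits] unless hits.empty?
    end
    project.last_checked_at = now
  end
  matches
end
```

Splitting the work this way means each stage can be retried or scaled on its own, which is presumably why the service uses separate jobs.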

API Access

The API endpoints are similar to the web UI and are documented in the code.

The app must have a hostname to access the API endpoints. In development, this can be done by adding a record to the hosts file (writing to /etc/hosts requires root privileges):

echo "127.0.0.1 api.my_app.dev" | sudo tee -a /etc/hosts

Then the API can be accessed by:

curl http://api.my_app.dev:3000/v1/commits

Rules

Rule types are defined and described in config/rule_types.yml. They are:

  • filename_pattern - Regular expression for a filename
  • changed_code_pattern - Regular expression for a changed line
  • code_pattern - Regular expression for any code in a changed file
  • message_pattern - Regular expression for a commit message
  • author_pattern - Regular expression for a commit author name, normalized to "name <email>"
  • commit_pattern - Combination of code_pattern and message_pattern
  • expression - Boolean expression referencing one or more rules
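
As a rough illustration of how the simpler rule types differ, the sketch below maps a rule type to the strings its pattern would be tested against. The commit hash layout, the SAMPLE_COMMIT data, and the method names are assumptions made for this example and do not reflect Commit Watcher's internal data model. The code_pattern, commit_pattern, and expression types are left out.

```ruby
# Hypothetical commit representation; field names are illustrative only.
SAMPLE_COMMIT = {
  author: "Jane Doe <jane@example.com>",
  message: "Fix buffer overflow in parser",
  files: [
    { name: "config/secrets.yml", changed_lines: ["aws_secret_key: abc123"] },
    { name: "README.txt",         changed_lines: ["Updated docs"] }
  ]
}.freeze

# Returns the strings a pattern of the given rule type is tested against.
def targets_for(rule_type, commit)
  case rule_type
  when "filename_pattern"     then commit[:files].map { |file| file[:name] }
  when "changed_code_pattern" then commit[:files].flat_map { |file| file[:changed_lines] }
  when "message_pattern"      then [commit[:message]]
  when "author_pattern"       then [commit[:author]]
  else raise ArgumentError, "rule type not covered by this sketch: #{rule_type}"
  end
end

# A rule matches when its pattern matches at least one target string.
def rule_matches?(rule_type, pattern, commit)
  targets_for(rule_type, commit).any? { |text| pattern.match?(text) }
end
```

For example, rule_matches?("filename_pattern", /secrets\.yml\z/, SAMPLE_COMMIT) is true because one of the changed files has that name.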

Expression Rules

This is a special rule type that allows for combining multiple rules in a boolean expression. The expression supports three operators: && (and), || (or), and ! (not). Parenthesized sub-expressions are also allowed.

For example, if there are three rules:

  1. is_txt - /\.txt\z/ (filename_pattern)
  2. has_lulz_msg - /\blulz\b/ (message_pattern)
  3. has_42 - /\b42\b/ (code_pattern)

To match commits that either include "lulz" in the commit message and contain at least one text file, or have a file containing the word "42":

(is_txt && has_lulz_msg) || has_42

To match a commit where any file is not a text file and includes "42":

!is_txt && has_42
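
To make the semantics concrete, here is a minimal sketch of an evaluator for such expressions. It is not the parser Commit Watcher uses. It assumes that each referenced rule has already been reduced to a single true/false result for the commit, and that precedence follows the usual convention (! binds tightest, then &&, then ||). The RuleExpression class name is hypothetical.

```ruby
# Sketch of an evaluator for expression rules. Assumes each referenced rule
# has already been reduced to a single true/false result for the commit.
class RuleExpression
  TOKEN = /&&|\|\||!|\(|\)|\w+/

  def initialize(source)
    leftover = source.gsub(TOKEN, "").strip
    raise ArgumentError, "unrecognized input: #{leftover}" unless leftover.empty?
    @tokens = source.scan(TOKEN)
  end

  # results maps rule names (strings) to booleans.
  def evaluate(results)
    @pos = 0
    @results = results
    value = parse_or
    raise ArgumentError, "unexpected token: #{peek}" unless peek.nil?
    value
  end

  private

  def peek
    @tokens[@pos]
  end

  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end

  # or_expr := and_expr ("||" and_expr)*
  def parse_or
    value = parse_and
    while peek == "||"
      advance
      rhs = parse_and
      value = value || rhs
    end
    value
  end

  # and_expr := not_expr ("&&" not_expr)*
  def parse_and
    value = parse_not
    while peek == "&&"
      advance
      rhs = parse_not
      value = value && rhs
    end
    value
  end

  # not_expr := "!" not_expr | primary
  def parse_not
    if peek == "!"
      advance
      return !parse_not
    end
    parse_primary
  end

  # primary := "(" or_expr ")" | rule_name
  def parse_primary
    token = advance
    case token
    when "("
      value = parse_or
      raise ArgumentError, "expected )" unless advance == ")"
      value
    when /\A\w+\z/
      @results.fetch(token) { raise ArgumentError, "unknown rule: #{token}" }
    else
      raise ArgumentError, "unexpected token: #{token.inspect}"
    end
  end
end
```

With results of { "is_txt" => false, "has_lulz_msg" => true, "has_42" => true }, both example expressions above evaluate to true under these assumptions.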

Publications

Automated identification of security issues from commit messages and bug reports, FSE 2017