This document outlines the options used to configure the behavior of github-archive.
github-archive lets you define multiple archival jobs. This is handy if you want to archive the starred repositories of multiple GitHub users.
Jobs are configured in a jobs.yaml or jobs.json file that can be mounted into the Docker container. The file can be written in either YAML or JSON; all examples here use YAML.
accessTokens: # Define GitHub Personal Access Tokens that should be used. Optional in most cases.
- token1
- token2
# Alternatively you can also only specify one:
# accessTokens: single-token
# The org in which repositories will be created. Either org or user is required. They cannot be used together.
org: exampleOrg
# The username of the user under which repositories will be created. Either org or user is required. They cannot be used together.
user: exampleUser
# The URL of the Gitea instance ending with "/api/v1" where the repos will be created
# Access Token to the above configured Gitea instance
accessToken: exampleAccessToken
# Whether Gitea should regularly sync the content of the upstream repo. Optional, default: true
#mirror: true
# Interval at which Gitea should re-sync. Optional, default: 24h
#interval: 24h
# Visibility of the Gitea repo. Optional, default: false
#public: false
# The items that Gitea should migrate. When migrating anything in addition to "wiki", a GitHub access token is required
- wiki
#- labels
#- issues
#- pull-requests
#- releases
#- milestones
- type: starred # Required for this kind of job. Can be "starred" or "repos". See below for a "repos" example.
# Some descriptive name for the job used in logs
name: My Example Job
# Whether this job should actually sync repos. Handy for temporarily disabling a job. Optional, default: true
#active: true
# Cron expression that configures when this job should run. Optional, default: 0 0 * * *
#schedule: 0 0 * * *
# Required. The GitHub username of the user that this job is syncing starred repos from.
user: exampleGitHubUser
# You can optionally override the GitHub access tokens per job. They will be used instead of the ones configured above
# - token1
# - token2
# Option 2: As a single string (commented out)
# accessTokens: single-token-string
# You can optionally override the Gitea mirror settings configured above. They will be merged with the above settings, which lets you configure a different Gitea instance per job or further customize mirroring per job
# org: exampleOrg
# user: exampleUser
# accessToken: exampleAccessToken
# url:
# mirror: true
# public: false
# interval: 24h
# items:
# - wiki
# - labels
# - issues
# - pull-requests
# - releases
# - milestones
- type: repos # Required for this kind of job. Can be "repos" or "starred". See above for a "starred" example.
# Some descriptive name for the job used in logs
name: My Example Job
# Whether this job should actually sync repos. Handy for temporarily disabling a job. Optional, default: true
#active: true
# Cron expression that configures when this job should run. Optional, default: 0 0 * * *
#schedule: 0 0 * * *
# Optional. The GitHub username of the user that this job is syncing repos from. If none is provided, the authenticated user is used. Default: null
user: exampleGitHubUser
# Optional. Different types of filters for repos that should be synced
# Optional. Filter repos by type. Possible options: all, owner, public, private, member. Default: owner
type: owner
# Optional. Filter repos by visibility. Default: private, public
- private
- public
# All other options of the "starred" job type apply here as well, like giteaDestination.
github-archive can archive all sorts of repositories. What is archived is determined by the job's type.
Jobs of the type starred archive repos that you or any configured user have starred on GitHub.
Jobs of the type repos archive repos on GitHub that belong to a specific user.
The following example archives all the private repos the authenticated user owns. If you want to archive repos of the currently authenticated user, don't configure your own GitHub username in githubSource.user; otherwise github-archive can only access your public repos.
accessTokens: gh-token
org: exampleOrg
accessToken: exampleAccessToken
- type: repos
name: My private repos
schedule: 0 0 * * *
# Note: no user is configured here. It is inferred from the GitHub access token in use
# We only want repos we own
type: owner
# Only archive private repos
- private
The following example archives all the public repos of a given GitHub user (here: ghost).
# We don't need a GitHub access token for that, unless we run into rate-limiting issues.
org: exampleOrg
accessToken: exampleAccessToken
- type: repos
name: Ghost's public repos
schedule: 0 0 * * *
# The user from which we want to archive:
user: ghost
# Only repos they own
type: owner
# Only archive public repos
- public
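For comparison, a job of type starred might be sketched along the same lines. This is a hedged sketch, not a verified configuration: the top-level `jobs` key and the nesting of `user` under `githubSource` are assumptions inferred from the option names used in this document (`githubSource.user`, `giteaDestination`), and all org, user, and token values are placeholders.

```yaml
# Sketch only: the top-level "jobs" key and the nesting are assumptions.
giteaDestination:
  org: exampleOrg                  # either org or user, never both
  accessToken: exampleAccessToken  # placeholder
  # url: the Gitea API URL ending with "/api/v1" (value intentionally not shown)
jobs:
  - type: starred                  # archive repos starred by the configured user
    name: Starred by exampleGitHubUser
    schedule: 0 0 * * *            # same as the documented default
    githubSource:
      user: exampleGitHubUser      # required for starred jobs
```

With only the default wiki item being migrated and a user whose details are public, no GitHub access token should be needed for a job like this.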
GitHub access tokens are not always required, for example when the user whose starred repos you want to mirror has their details public. But when you use migration items other than wiki, at least one GitHub access token is required.
The GitHub API will rate limit requests at some point. That's why multiple access tokens can be specified. That will cause Gitea to rotate them when needed.
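A minimal sketch of the token rules above, assuming that `accessTokens` sits under `githubSource` and `items` under `giteaDestination` (the nesting is inferred from the option names in this document, not confirmed). Token and org values are placeholders.

```yaml
# Sketch only: nesting under githubSource / giteaDestination is an assumption.
githubSource:
  accessTokens:        # several tokens, so they can be rotated when rate limits hit
    - token1
    - token2
giteaDestination:
  org: exampleOrg
  accessToken: exampleAccessToken
  items:               # anything beyond "wiki" requires a GitHub access token
    - wiki
    - issues
    - releases
```

If only wiki is migrated and the source user's details are public, the `accessTokens` entry could be left out entirely.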
In addition to the YAML/JSON configuration, some options can be set via environment variables:
: Path to the file on disk where the jobs are configured. This can be a path to a .yaml or .json file, whichever file format you prefer. Default: ./jobs.yaml. Optional.
LOG_LEVEL: Log level used to control the visibility of messages. The possible items are: info
: Optionally write logs to files in addition to stdout. Specify a path on a Docker volume.