docs: Add docs for installation, configuration and development
MohamedBassem committed Mar 20, 2024
1 parent fa30522 commit 2f21ef2
Showing 27 changed files with 284 additions and 462 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/docker.yml
@@ -47,6 +47,6 @@ jobs:
file: docker/Dockerfile
target: ${{ matrix.package }}
push: true
tags: ghcr.io/mohamedbassem/hoarder-${{ matrix.package }}:${{github.event.release.name}}
tags: ghcr.io/mohamedbassem/hoarder-${{ matrix.package }}:${{github.event.release.name}},ghcr.io/mohamedbassem/hoarder-${{ matrix.package }}:release
cache-from: type=gha
cache-to: type=gha,mode=max
75 changes: 7 additions & 68 deletions README.md
@@ -6,54 +6,24 @@ A self-hostable bookmark-everything app with a touch of AI for the data hoarders

## Features

- 🔗 Bookmark links and take simple notes.
- 🔗 Bookmark links, take simple notes and store images.
- ⬇️ Automatic fetching for link titles, descriptions and images.
- 📋 Sort your bookmarks into lists.
- 🔎 Full text search of all the content stored.
- ✨ AI-based (aka chatgpt) automatic tagging.
- 🔖 [Chrome plugin](https://chromewebstore.google.com/detail/hoarder/kgcjekpmcjjogibpjebkhaanilehneje) for quick bookmarking.
- 📱 [iOS shortcut](https://www.icloud.com/shortcuts/78734b46624c4a3297187c85eb50d800) for bookmarking content from the phone. A minimal mobile app might come later.
- 📱 [iOS shortcut](https://www.icloud.com/shortcuts/78734b46624c4a3297187c85eb50d800) for bookmarking content from the phone. A minimal mobile app is in the works.
- 💾 Self-hosting first.
- [Planned] Archiving the content for offline reading.
- [Planned] Store raw images.

**⚠️ This app is under heavy development and it's far from stable.**

## Installation
## Documentation

Docker is the recommended way for deploying the app. A docker compose file is provided.

Run `docker compose up` then head to `http://localhost:3000` to access the app.

> NOTE: You'll need to set the env variable `OPENAI_API_KEY` with your own OpenAI key for automatic tagging to work. Check the next section for config details.
## Configuration

The app is configured with env variables.

| Name             | Default   | Description |
| ---------------- | --------- | ----------- |
| OPENAI_API_KEY   | Not set   | The OpenAI key used for automatic tagging. If not set, automatic tagging won't be enabled. The app currently uses `gpt-3.5-turbo-0125`, which is [extremely cheap](https://openai.com/pricing). You'll be able to bookmark 1000+ links for less than $1. |
| DATA_DIR         | Not set   | The path for the persistent data directory. |
| REDIS_HOST       | localhost | The address of redis used by background jobs. |
| REDIS_PORT       | 6379      | The port of redis used by background jobs. |
| MEILI_ADDR       | Not set   | The address of meilisearch. If not set, search will be disabled. |
| MEILI_MASTER_KEY | Not set   | The master key configured for meilisearch. Not needed in development. |

## Security Considerations

If you're going to give untrusted users access to the app, there are some security considerations you'll need to be aware of, given how the crawler works. The crawler is basically running a browser to fetch the content of the bookmarks. Any untrusted user can submit bookmarks to be crawled from your server, and they'll be able to see the crawling result. This can be abused in multiple ways:

1. Untrusted users can submit crawl requests to websites that you don't want traffic to be coming out of your IPs for.
2. Crawling user-controlled websites can expose your origin IP (and location) even if your service is hosted behind Cloudflare, for example.
3. The crawling requests will be coming out of your own network, which untrusted users can leverage to crawl internal, non-internet-exposed endpoints.

To mitigate those risks, you can do one of the following:

1. Limit access to trusted users
2. Let the browser traffic go through some VPN with restricted network policies.
3. Host the browser container outside of your network.
4. Use a hosted browser as a service (e.g. [browserless](https://browserless.io)). Note: I've never used them before.
- [Installation](https://docs.hoarder.app/installation)
- [Configuration](https://docs.hoarder.app/configuration)
- [Security Considerations](https://docs.hoarder.app/security-considerations)
- [Development](https://docs.hoarder.app/Development/setup)

## Stack

@@ -80,34 +50,3 @@ I'm a systems engineer in my day job (and have been for the past 7 years). I did
- [memos](https://github.com/usememos/memos): I love memos. I have it running on my home server and it's one of my most used self-hosted apps. I, however, don't like the fact that it doesn't preview the content of the links I dump there, and to be honest, it doesn't have to, because that's not what it was designed for. It's just that I dump a lot of links there, and I'd have loved to be able to figure out which link is which just by looking at my timeline. Also, given the variety of things I dump there, I'd have loved it if it did some sort of automatic tagging for what I save there. This is exactly the use case that I'm trying to tackle with Hoarder.
- [Wallabag](https://wallabag.it): Wallabag is a well-established open source read-it-later app written in PHP, and I think it's the common recommendation on Reddit for such apps. To be honest, I didn't give it a real shot, and the UI just felt a bit dated for my liking. Honestly, it's probably much more stable and feature-complete than this app, but where's the fun in that?
- [Shiori](https://github.com/go-shiori/shiori): Shiori is meant to be an open source Pocket clone written in Go. It ticks all the boxes but doesn't have my super sophisticated AI-based tagging. (JK, I only found out about it after I decided to build my own app, so here we are 🤷).

## Development

### Docker

You can bring up the whole development environment with:
`docker compose -f docker/docker-compose.dev.yml up`

### Manual

Or if you have nodejs installed locally, you can do:

- `pnpm install` in the root of the repo.
- `pnpm db:migrate` to run the db migrations.
- `pnpm web` to start the web app.
- Access it over `http://localhost:3000`.
- `pnpm workers` to start the crawler and the openai worker.
- You'll need to have redis running at `localhost:6379` (configurable with env variables).
- An easy way to get redis running is by using docker: `docker run -p 6379:6379 redis`.
- You can run the web app without the workers, but link fetching and automatic tagging won't work.

### Codebase structure

- `packages/db`: Where drizzle's schema lives. Shared between packages.
- `packages/shared`: Shared utilities and code between the workers and the web app.
- `packages/web`: Where the nextjs based web app lives.
- `packages/workers`: Where the background job workers (crawler and openai as of now) run.

### Submitting PRs

- Before submitting PRs, you'll want to run `pnpm format` and include its changes in the commit. Also make sure `pnpm lint` is successful.
6 changes: 5 additions & 1 deletion apps/landing/app/page.tsx
@@ -6,6 +6,7 @@ import screenshot from "@/public/screenshot.png";
import { ExternalLink, Github, PackageOpen } from "lucide-react";

const GITHUB_LINK = "https://github.com/MohamedBassem/hoarder-app";
const DOCS_LINK = "https://docs.hoarder.app";

function NavBar() {
return (
@@ -15,7 +16,10 @@ function NavBar() {
<p className="text-2xl">Hoarder</p>
</div>
<div className="hidden gap-10 sm:flex">
<Link href="#" className="flex justify-center gap-2 text-center">
<Link
href={DOCS_LINK}
className="flex justify-center gap-2 text-center"
>
Docs
</Link>
<Link
14 changes: 12 additions & 2 deletions docker/docker-compose.yml
@@ -1,27 +1,37 @@
version: "3.8"
services:
web:
image: ghcr.io/mohamedbassem/hoarder-web:latest
image: ghcr.io/mohamedbassem/hoarder-web:${HOARDER_VERSION:-release}
restart: unless-stopped
volumes:
- data:/data
ports:
- 3000:3000
env_file:
- .env
environment:
REDIS_HOST: redis
MEILI_ADDR: http://meilisearch:7700
DATA_DIR: /data
redis:
image: redis:7.2-alpine
restart: unless-stopped
volumes:
- redis:/data
meilisearch:
image: getmeili/meilisearch:v1.6
restart: unless-stopped
env_file:
- .env
volumes:
- meilisearch:/meili_data
workers:
image: ghcr.io/mohamedbassem/hoarder-workers:latest
image: ghcr.io/mohamedbassem/hoarder-workers:${HOARDER_VERSION:-release}
restart: unless-stopped
volumes:
- data:/data
env_file:
- .env
environment:
REDIS_HOST: redis
MEILI_ADDR: http://meilisearch:7700
24 changes: 24 additions & 0 deletions docs/docs/01-intro.md
@@ -0,0 +1,24 @@
---
slug: /
---

# Introduction

Hoarder is an open source "Bookmark Everything" app that uses AI for automatically tagging the content you throw at it. The app is built with self-hosting as a first class citizen.

![Screenshot](https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/screenshots/homepage.png)


## Features

- 🔗 Bookmark links, take simple notes and store images.
- ⬇️ Automatic fetching for link titles, descriptions and images.
- 📋 Sort your bookmarks into lists.
- 🔎 Full text search of all the content stored.
- ✨ AI-based (aka chatgpt) automatic tagging.
- 🔖 [Chrome plugin](https://chromewebstore.google.com/detail/hoarder/kgcjekpmcjjogibpjebkhaanilehneje) for quick bookmarking.
- 📱 [iOS shortcut](https://www.icloud.com/shortcuts/78734b46624c4a3297187c85eb50d800) for bookmarking content from the phone. A minimal mobile app is in the works.
- 💾 Self-hosting first.
- [Planned] Archiving the content for offline reading.

**⚠️ This app is under heavy development and it's far from stable.**
64 changes: 64 additions & 0 deletions docs/docs/02-installation.md
@@ -0,0 +1,64 @@
# Installation

## Docker (Recommended)

### Requirements

- Docker
- Docker Compose

### 1. Create a new directory

Create a new directory to host the compose file and env variables.

### 2. Download the compose file

Download the docker compose file provided [here](https://github.com/MohamedBassem/hoarder-app/blob/main/docker/docker-compose.yml).

```
$ wget https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/docker/docker-compose.yml
```

### 3. Populate the environment variables

You can use the env file template provided [here](https://github.com/MohamedBassem/hoarder-app/blob/main/.env.sample) and fill it manually using the documentation [here](/configuration).

```
$ wget https://raw.githubusercontent.com/MohamedBassem/hoarder-app/main/.env.sample
$ mv .env.sample .env
```

Alternatively, here is a minimal `.env` file to use:

```
NEXTAUTH_SECRET=super_random_string
NEXTAUTH_URL=<YOUR DEPLOYED URL>
HOARDER_VERSION=release
MEILI_ADDR=http://meilisearch:7700
MEILI_MASTER_KEY=another_random_string
```

You can use `openssl rand -base64 36` to generate the random strings.
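For example, one way to generate the two random strings and append them to the `.env` file from above:

```
$ echo "NEXTAUTH_SECRET=$(openssl rand -base64 36)" >> .env
$ echo "MEILI_MASTER_KEY=$(openssl rand -base64 36)" >> .env
```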

Persistent storage and the wiring between the different services are already taken care of in the docker compose file.

### 4. Setup OpenAI

To enable automatic tagging, you'll need to configure OpenAI. This is optional, but highly recommended.

- Follow [OpenAI's help](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key) to get an API key.
- Add `OPENAI_API_KEY=<key>` to the env file.

Learn more about the costs of using openai [here](/openai).


### 5. Start the service


Start the service by running:

```
$ docker compose up -d
```
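To verify that everything came up, the standard compose commands work; the service names below come from the compose file (web, redis, meilisearch and workers):

```
$ docker compose ps          # all four services should be up
$ docker compose logs -f web # follow the web app's logs
```

The web container listens on port 3000 (per the compose file), so the app should be reachable at `http://localhost:3000` on the host, or at whatever you set `NEXTAUTH_URL` to behind your reverse proxy.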
14 changes: 14 additions & 0 deletions docs/docs/03-configuration.md
@@ -0,0 +1,14 @@
# Configuration

The app is mainly configured by environment variables. All the used environment variables are listed in [packages/shared/config.ts](https://github.com/MohamedBassem/hoarder-app/blob/main/packages/shared/config.ts). The most important ones are:

| Name             | Required                              | Default   | Description |
| ---------------- | ------------------------------------- | --------- | ----------- |
| DATA_DIR         | Yes                                   | Not set   | The path for the persistent data directory. This is where the db and the uploaded assets live. |
| NEXTAUTH_SECRET  | Yes                                   | Not set   | Random string used to sign the JWT tokens. Generate one with `openssl rand -base64 36`. |
| NEXTAUTH_URL     | Yes                                   | Not set   | The URL the service will be running on, e.g. `https://demo.hoarder.app`. |
| REDIS_HOST       | Yes                                   | localhost | The address of redis used by background jobs. |
| REDIS_PORT       | Yes                                   | 6379      | The port of redis used by background jobs. |
| OPENAI_API_KEY   | No                                    | Not set   | The OpenAI key used for automatic tagging. If not set, automatic tagging won't be enabled. More on that [here](/openai). |
| MEILI_ADDR       | No                                    | Not set   | The address of meilisearch, e.g. `http://meilisearch:7700`. If not set, search will be disabled. |
| MEILI_MASTER_KEY | Only in prod and if search is enabled | Not set   | The master key configured for meilisearch. Not needed in a development environment. Generate one with `openssl rand -base64 36`. |
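
As a concrete example, a production `.env` combining the variables above might look something like this (all values are placeholders; the host names assume the services from the provided docker compose file):

```
DATA_DIR=/data
NEXTAUTH_SECRET=<output of openssl rand -base64 36>
NEXTAUTH_URL=https://demo.hoarder.app
REDIS_HOST=redis
REDIS_PORT=6379
MEILI_ADDR=http://meilisearch:7700
MEILI_MASTER_KEY=<output of openssl rand -base64 36>
OPENAI_API_KEY=<your OpenAI API key>
```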
17 changes: 17 additions & 0 deletions docs/docs/04-quick-sharing.md
@@ -0,0 +1,17 @@
# Quick Sharing Extensions

The whole point of Hoarder is making it easy to hoard content. That's why there are a couple of quick ways to capture content from your phone or browser.

## Mobile Apps

<img src="/img/quick-sharing/mobile.png" alt="mobile screenshot" width="300"/>


- iOS app: TODO
- Android App: The app is built using a cross-platform framework (React Native), so technically the Android app should just work, but I haven't tested it. If there's enough demand, I'll publish it to the Google Play Store.

## Chrome Extensions

<img src="/img/quick-sharing/extension.png" alt="mobile screenshot" width="300"/>

- To quickly bookmark links, you can also use the chrome extension [here](https://chromewebstore.google.com/detail/hoarder/kgcjekpmcjjogibpjebkhaanilehneje).
11 changes: 11 additions & 0 deletions docs/docs/05-openai.md
@@ -0,0 +1,11 @@
# OpenAI Costs

This service uses OpenAI for automatic tagging. This means that you'll incur some costs if automatic tagging is enabled. There are two types of inference that we do:

## Text Tagging

For text tagging, we use the `gpt-3.5-turbo-0125` model. This model is [extremely cheap](https://openai.com/pricing). The cost per inference varies with the size of each article, but roughly, you'll be able to generate tags for 1000+ bookmarks for less than $1.

## Image Tagging

For image uploads, we use the `gpt-4-vision-preview` model for extracting tags from the image. You can learn more about the costs of using this model [here](https://platform.openai.com/docs/guides/vision/calculating-costs). To lower the costs, we're using the low resolution mode (a fixed number of tokens regardless of image size). The gpt-4 model, however, is much more expensive than `gpt-3.5-turbo`. Currently, we're using around 350 tokens per image inference, which ends up costing around $0.01 per inference, so around 10x more expensive than text tagging.
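
As a rough sketch of the arithmetic (the per-token price below is an assumption and changes over time; check [OpenAI's pricing page](https://openai.com/pricing) for current figures):

```
# Image tagging: ~350 tokens/inference at an assumed ~$0.03 per 1K tokens
#   350 / 1000 * $0.03 ≈ $0.01 per image
# Text tagging: 1000+ bookmarks for under $1
#   => under $0.001 per bookmark, i.e. roughly 10x cheaper per inference
```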
68 changes: 68 additions & 0 deletions docs/docs/06-Development/01-setup.md
@@ -0,0 +1,68 @@
# Setup

## Manual Setup
### First Setup

- You'll need to prepare the environment variables for the dev env.
- The easiest approach is to set it up once in the root of the repo and then symlink it into each app directory (see the sketch after this list).
- Start by copying the template by `cp .env.sample .env`.
- The most important env variables to set are:
- `DATA_DIR`: Where the database and assets will be stored. This is the only required env variable. You can use an absolute path so that all apps point to the same dir.
- `REDIS_HOST` and `REDIS_PORT`: Default to `localhost` and `6379`; change them if redis is running on a different address.
- `MEILI_ADDR`: If not set, search will be disabled. You can set it to `http://127.0.0.1:7700` if you run meilisearch using the command below.
- `OPENAI_API_KEY`: If you want to enable auto tag inference in the dev env.
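
A minimal sketch of that first-time setup, assuming redis and meilisearch are run with the docker commands below (the app directory paths in the symlink commands are illustrative; point them at the actual app directories in the repo):

```
$ cp .env.sample .env
$ # edit .env and set at least:
$ #   DATA_DIR=/absolute/path/to/hoarder-data
$ #   MEILI_ADDR=http://127.0.0.1:7700   # optional, enables search
$ #   OPENAI_API_KEY=sk-...              # optional, enables auto tagging
$ ln -s "$(pwd)/.env" apps/web/.env      # hypothetical paths; adjust to the
$ ln -s "$(pwd)/.env" apps/workers/.env  # repo's actual app directories
```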

### Dependencies

#### Redis

Redis is used as the background job queue. The easiest way to get it running is with docker `docker run -p 6379:6379 redis:alpine`.

#### Meilisearch

Meilisearch is the provider for the full text search. You can get it running with `docker run -p 7700:7700 getmeili/meilisearch:v1.6`.

Mount a persistent volume if you want to keep the index data across restarts. You can trigger a re-index of the entire items collection from the admin panel in the web app.
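
For example, one way to run it with a named volume (the volume name here is arbitrary):

```
docker run -p 7700:7700 -v meili_data:/meili_data getmeili/meilisearch:v1.6
```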

#### Chrome

The worker app will automatically start headless chrome on startup for crawling pages. You don't need to do anything there.

### Web App

- Run `pnpm web` in the root of the repo.
- Go to `http://localhost:3000`.

> NOTE: The web app kinda works without any dependencies. However, search won't work unless meilisearch is running. Also, new items added won't get crawled/indexed unless redis is running.
### Workers

- Run `pnpm workers` in the root of the repo.

> NOTE: The workers package requires having redis working as it's the queue provider.
### iOS Mobile App

- `cd apps/mobile`
- `pnpm exec expo prebuild --no-install` to build the app.
- Start the ios simulator.
- `pnpm exec expo run:ios`
- The app will be installed and started in the simulator.

Changing the code will hot reload the app. However, installing new packages requires restarting the expo server.

### Browser Extension

- `cd apps/browser-extension`
- `pnpm dev`
- This will generate a `dist` package
- Go to extension settings in chrome and enable developer mode.
- Press `Load unpacked` and point it to the `dist` directory.
- The plugin will pop up in the plugin list.

In dev mode, opening and closing the plugin menu should reload the code.


## Docker Dev Env

If the manual setup is too much hassle for you, you can use a docker-based dev environment by running `docker compose -f docker/docker-compose.dev.yml up` in the root of the repo. This setup wasn't super reliable for me, though.