The HERA Librarian Server

This directory contains tools needed to run the Librarian server.

The bulk of the code is contained in a module named librarian_server located in this directory. It is installed alongside the hera_librarian client module, and the additional dependencies required to run the server can be installed using

pip install .[server]

from the top level of the repo.

The server state is backed by a database. We use Alembic to manage the evolution of the database schema.

Setting up a Server

A Librarian installation requires:

  1. A bunch of computers that store data (the “stores”)
  2. A database server
  3. A machine running the Librarian web server. This could potentially be one of the stores or the database server.

The server machine and store machines need to be able to SSH into each other without passwords. The machine running the web server needs a Python installation with the following modules:

  1. flask
  2. jinja2
  3. sqlalchemy
  4. flask-sqlalchemy
  5. Whichever database driver sqlalchemy will need to talk to your database server.
  6. aipy
  7. numpy
  8. astropy
  9. tornado (optional, for robust HTTP service)
  10. alembic
  11. pyuvdata
  12. pytz

A standard Anaconda Python installation can provide all of these except aipy and pyuvdata. These are available through pip or the conda-forge channel, like so:

conda install -c conda-forge aipy pyuvdata
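
Before starting the server, it can help to confirm that the modules listed above are importable. The following is a minimal sketch, not part of the Librarian itself. The helper name missing_modules is illustrative, and the import names are assumptions based on the list above (flask-sqlalchemy is imported as flask_sqlalchemy). The database driver is left out because it depends on your database.

```python
import importlib.util

# Import names for the modules listed above; tornado is optional.
REQUIRED = ["flask", "jinja2", "sqlalchemy", "flask_sqlalchemy", "aipy",
            "numpy", "astropy", "alembic", "pyuvdata", "pytz"]
OPTIONAL = ["tornado"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required modules:", missing_modules(REQUIRED))
    print("missing optional modules:", missing_modules(OPTIONAL))
```

An empty list for the required modules means the interpreter can locate all of them; it does not verify their versions.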

To set up a new Librarian server:

  1. Create a database for the Librarian using the system of your choice. The on-site system uses Postgres, so this will always be the best-supported option.
  2. Create a file called server-config.json, using server-config.sample.json as a template. A handful of values need to be set; the most important is the SQLAlchemy database URL used to connect to your database. Once this file is created, export its full path in the LIBRARIAN_CONFIG_PATH environment variable.
  3. From the top level directory, run alembic upgrade head to initialize the database schema using Alembic’s infrastructure.
  4. Finally, run runserver.py to boot the server.
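
Both Alembic and the server rely on LIBRARIAN_CONFIG_PATH pointing at a readable JSON file, so a quick check before steps 3 and 4 can save some debugging. The sketch below is illustrative only. The function name load_librarian_config is hypothetical, and it makes no assumption about which keys the file contains; see server-config.sample.json for those.

```python
import json
import os


def load_librarian_config(environ=None):
    """Load the JSON file named by LIBRARIAN_CONFIG_PATH and return it as a dict."""
    environ = os.environ if environ is None else environ
    path = environ.get("LIBRARIAN_CONFIG_PATH")
    if not path:
        raise RuntimeError("LIBRARIAN_CONFIG_PATH is not set")
    if not os.path.isabs(path):
        raise RuntimeError("LIBRARIAN_CONFIG_PATH should be a full path: %r" % path)
    with open(path) as handle:
        config = json.load(handle)
    if not isinstance(config, dict):
        raise RuntimeError("expected a JSON object at the top level of %s" % path)
    return config


if __name__ == "__main__":
    try:
        print("config keys:", sorted(load_librarian_config()))
    except (RuntimeError, OSError, ValueError) as exc:
        # ValueError covers malformed JSON; OSError covers unreadable files.
        print("config problem:", exc)
```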

Updating the Database Schema

When in the course of human events it becomes necessary for a developer to change the schema of the Librarian database, the developer must use Alembic to manage the schema evolution. The workflow for specifying a schema change is this:

  1. Read the Alembic documentation to familiarize yourself with how it works.
  2. Change the schema in the files in the librarian_server module as needed.
  3. In the main repo directory, run alembic revision --autogenerate -m "$DESCRIPTION_OF_CHANGE", where $DESCRIPTION_OF_CHANGE is a terse description of the change to the schema (quote it so that a multi-word description is passed as a single argument). Alembic will do its best to figure out what you did to the schema and record the changes in a new file in the subdirectory alembic/versions.
  4. Review and edit that new file to make sure that it makes sense, provides default values for new columns as needed, etc.
  5. Add the new file to Git and commit it with your schema change. Ideally the commit introducing the changed schema should change nothing else about the Librarian.
  6. Use the Docker-based test rig to verify that everything works.
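
Step 3 can also be driven from Python, which avoids shell quoting of the description. This is a sketch under stated assumptions: the helper names are hypothetical, the alembic executable is assumed to be on your PATH, and repo_dir is assumed to be the main repo directory.

```python
import subprocess


def revision_command(description):
    """Build the argument list for `alembic revision --autogenerate -m <description>`."""
    description = description.strip()
    if not description:
        raise ValueError("a terse description of the schema change is required")
    # Passing a list (no shell) keeps a multi-word description as one argument.
    return ["alembic", "revision", "--autogenerate", "-m", description]


def autogenerate_revision(description, repo_dir="."):
    """Run Alembic from repo_dir; raises CalledProcessError if Alembic fails."""
    subprocess.run(revision_command(description), cwd=repo_dir, check=True)
```

Calling autogenerate_revision with your description writes the new file under alembic/versions, which you would then review as in step 4.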

The workflow for deploying a schema change on a given Librarian instance is:

  1. Shut down the running Librarian server.
  2. Use Git to pull in a version of the codebase with the changed schema.
  3. From the main repo directory, run alembic upgrade head to update the database schema to the newest version. You may need to set the environment variable LIBRARIAN_CONFIG_PATH to point to the Librarian server configuration file if it has not already been set.
  4. Restart the Librarian server.
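
The upgrade in step 3 can be scripted as sketched below. The helper names are hypothetical and alembic is assumed to be on your PATH. The sketch sets LIBRARIAN_CONFIG_PATH only for the child process, so your shell environment is left untouched.

```python
import os
import subprocess


def upgrade_environment(config_path, base_env=None):
    """Return a copy of the environment with LIBRARIAN_CONFIG_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBRARIAN_CONFIG_PATH"] = os.path.abspath(config_path)
    return env


def upgrade_to_head(config_path, repo_dir="."):
    """Run `alembic upgrade head` from repo_dir against the configured database."""
    if not os.path.isfile(config_path):
        raise FileNotFoundError(config_path)
    subprocess.run(["alembic", "upgrade", "head"], cwd=repo_dir,
                   env=upgrade_environment(config_path), check=True)
```

Stopping and restarting the server (steps 1 and 4) is deliberately left out, since how that is done depends on the installation.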

Obviously, you should not deploy a schema change to one of the production servers (on-site, NRAO) until you are sure that the associated change is one that we want to commit to.

Quick and Dirty Guide for Installing and Testing a Librarian Server

Here is a quick introduction to standing up a librarian server and uploading a file to it. We assume the user is using Postgres as the backing database and is running Ubuntu 18.04. This is a "bare-metal" installation approach, in which the librarian is installed directly on the system. To run inside a container instead, see the Docker Installation Instructions section.

  1. Install the postgres package:
    sudo apt update
    sudo apt install postgresql postgresql-contrib
    
  2. Set up postgres to work with the testing database. The name of the database is defined in the server config file; the name used here matches the one in the sample config file:
    sudo su postgres
    psql -c "create database librarian_test;"
    psql -c "create user <your_username>;"
    exit
    
  3. Clone the librarian repo:
    git clone https://github.com/HERA-Team/librarian.git
    
  4. Install the librarian package and dependencies:
    cd librarian
    pip install .[server]
    
  5. Point to the testing server config file:
    export LIBRARIAN_CONFIG_PATH="$(pwd)/ci/server-config-ci.json"
    
  6. Use alembic to update the database schema:
    alembic upgrade head
    
  7. Copy the high-level client config file to the proper location:
    cp ci/hl_client.cfg ~/.hl_client.cfg
    
  8. Make a scratch directory for the librarian to use as a "store". Stores are defined in the add-stores section of the server config file:
    mkdir /tmp/librarian
    
  9. Launch the librarian server:
    runserver.py
    
  10. In a separate terminal, attempt to upload a file to the running librarian server:
    librarian upload --null-obsid TestUser README.md foo/README.md
    
  11. Verify that the file was successfully uploaded to the librarian. In a web browser, navigate to localhost:21108. When prompted, use the authenticator string "I am a human". After authenticating, you should see README.md listed under "Most Recent Files". Success!
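
If the page in step 11 does not load, a scripted reachability test can tell you whether the server is answering at all. The sketch below only confirms that something responds over HTTP on the port; it does not authenticate or inspect the file listing. The function name server_is_up is hypothetical, and the default URL assumes the localhost:21108 address used above.

```python
import http.client
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:21108/", timeout=5.0):
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is running.
        return True
    except (OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, or a non-HTTP reply.
        return False


if __name__ == "__main__":
    print("Librarian reachable:", server_is_up())
```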

Docker Installation Instructions

As an alternative to the above installation, the librarian server supports running inside a container using Docker. Before building and launching the container, generate an SSH key pair to facilitate interaction between the librarian app and the store. A new key can be generated using the ssh-keygen command:

ssh-keygen -t rsa

The key should have no passphrase associated with it. Once the key pair has been generated, place copies in a folder named secrets inside the container folder. The public key should be saved as id_rsa_pub.txt, and the private key as id_rsa.txt. These files will be mounted inside the running containers as secrets. You should also make sure the keys are accessible on your local machine. In what follows, the private key is assumed to be available at ~/.ssh/id_docker_rsa (for example, by entering that path when ssh-keygen prompts for a file name, or by passing it with the -f option).
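
Copying the key pair into place can be scripted. This is a minimal sketch, assuming the private key is at ~/.ssh/id_docker_rsa, the public half sits next to it with a .pub suffix (ssh-keygen's default naming), and the secrets folder is container/secrets relative to the repo root. The function name install_secrets is hypothetical.

```python
import os
import shutil


def install_secrets(private_key, secrets_dir):
    """Copy a key pair into secrets_dir as id_rsa.txt and id_rsa_pub.txt."""
    public_key = private_key + ".pub"  # ssh-keygen's naming for the public half
    for path in (private_key, public_key):
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
    os.makedirs(secrets_dir, exist_ok=True)
    private_copy = os.path.join(secrets_dir, "id_rsa.txt")
    public_copy = os.path.join(secrets_dir, "id_rsa_pub.txt")
    shutil.copyfile(private_key, private_copy)
    shutil.copyfile(public_key, public_copy)
    # A precaution added by this sketch, not a requirement stated above.
    os.chmod(private_copy, 0o600)
    return private_copy, public_copy


if __name__ == "__main__":
    key = os.path.expanduser("~/.ssh/id_docker_rsa")
    if os.path.isfile(key) and os.path.isfile(key + ".pub"):
        print(install_secrets(key, os.path.join("container", "secrets")))
    else:
        print("key pair not found at", key)
```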

In addition, add the following entry to the ssh config file on your local machine (~/.ssh/config). This allows you to use rsync to read data from and write data to the libstore container:

Host = libstore
  User = root
  Port = 2222
  IdentityFile = ~/.ssh/id_docker_rsa
  HostName = localhost

To test that the store is accessible, wait until the containers are running and then run ssh libstore, which should provide access without requiring a password. If this does not yield a shell inside the running container, the primary functionality of the librarian will not work.
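
If ssh libstore fails, a useful first diagnostic is whether anything is listening on the forwarded port at all. The sketch below tests only TCP reachability of localhost:2222 (the HostName and Port from the config above), not SSH authentication. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host="localhost", port=2222, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


if __name__ == "__main__":
    print("libstore SSH port reachable:", port_is_open())
```

A False result suggests the containers are not running or the port is not published; a True result with a failing ssh libstore points instead at the key or the ssh config entry.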

The full librarian application can be launched by running the following command from the top-level directory:

docker-compose up

This launches the librarian server with the config file specified in docker-compose.yml; edit that file if you want to use an alternative config. A local volume provides long-term storage for both the Postgres database and the librarian's file data. As with the installation above, the server can be accessed on localhost:21108, using the same authenticator.