This directory contains tools needed to run the Librarian server.
The bulk of the code is contained in a module named librarian_server located in this directory. It is installed alongside the hera_librarian client module, and the additional dependencies required to run the server can be installed using
pip install .[server]
from the top level of the repo.
The server state is backed by a database. We use Alembic to manage the evolution of the database schema.
A Librarian installation requires:
- A bunch of computers that store data (the “stores”)
- A database server
- A machine running the Librarian web server. This could potentially be one of the stores or the database server.
The server machine and store machines need to be able to SSH into each other without passwords. The machine running the web server needs a Python installation with the following modules:
- flask
- jinja2
- sqlalchemy
- flask-sqlalchemy
- Whichever database driver sqlalchemy will need to talk to your database server.
- aipy
- numpy
- astropy
- tornado (optional, for robust HTTP service)
- alembic
- pyuvdata
- pytz
A standard Anaconda Python installation can provide all of these except aipy and pyuvdata. These are available through pip or the conda-forge channel, like so:
conda install -c conda-forge aipy pyuvdata
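Before going further, it can help to confirm which of the modules listed above are importable in the Python environment that will run the server. The following is a minimal sketch, not part of the Librarian itself. It assumes the import names match the package names (with flask-sqlalchemy importing as flask_sqlalchemy), it leaves out the database driver because that depends on your database server, and the helper name missing_modules is hypothetical.

```python
import importlib.util

# Import names for the modules listed above. The database driver is left out
# because it depends on which database server you use.
REQUIRED = ["flask", "jinja2", "sqlalchemy", "flask_sqlalchemy", "aipy",
            "numpy", "astropy", "alembic", "pyuvdata", "pytz"]
OPTIONAL = ["tornado"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_modules(names)
        print(f"missing {label} modules:", ", ".join(missing) or "none")
```

Anything reported as missing can then be installed with pip or conda as described above.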
To set up a new Librarian server:
- Create a database for the Librarian using the system of your choice. The on-site system uses Postgres so this will always be the best-supported option.
- Create a file called server-config.json, using server-config.sample.json as a template. There are a handful of values that need to be set, most importantly the SQLAlchemy database URL needed to talk to your database. Once this file is created, you must export its full path to the LIBRARIAN_CONFIG_PATH shell variable.
- From the top level directory, run
alembic upgrade head
to initialize the database schema using Alembic’s infrastructure.
- Finally, run
runserver.py
to boot the server.
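A common stumbling block in the steps above is an unset or relative LIBRARIAN_CONFIG_PATH, or a config file that does not parse. The sketch below checks for those cases before you run Alembic or the server. It assumes the config file is plain JSON holding a single object; the function name load_server_config is hypothetical, and no particular config keys are assumed.

```python
import json
import os


def load_server_config(environ=os.environ):
    """Load the JSON file that LIBRARIAN_CONFIG_PATH points to."""
    path = environ.get("LIBRARIAN_CONFIG_PATH")
    if not path:
        raise RuntimeError("LIBRARIAN_CONFIG_PATH is not set")
    if not os.path.isabs(path):
        raise RuntimeError(f"LIBRARIAN_CONFIG_PATH should be a full path, got {path!r}")
    with open(path) as f:
        config = json.load(f)  # raises ValueError on malformed JSON
    if not isinstance(config, dict):
        raise RuntimeError("expected the config file to hold a JSON object")
    return config


if __name__ == "__main__":
    try:
        config = load_server_config()
    except (RuntimeError, OSError, ValueError) as err:
        print("config problem:", err)
    else:
        print("config keys:", ", ".join(sorted(config)))
```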
When in the course of human events it becomes necessary for a developer to change the schema of the Librarian database, the developer must use Alembic to manage the schema evolution. The workflow for specifying a schema change is this:
- Read the Alembic documentation to familiarize yourself with how it works.
- Change the schema in the files in the librarian_server module as needed.
- In the main repo directory, run
alembic revision --autogenerate -m "$DESCRIPTION_OF_CHANGE"
where $DESCRIPTION_OF_CHANGE is a terse description of the change to the schema. The quotes keep a description containing spaces together as a single argument. Alembic will do its best to figure out what you did to the schema and record the changes in a new file in the subdirectory alembic/versions.
- Review and edit that new file to make sure that it makes sense, provides default values for new columns as needed, etc.
- Add the new file to Git and commit it with your schema change. Ideally the commit introducing the changed schema should change nothing else about the Librarian.
- Use the Docker-based test rig to verify that everything works.
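If you prefer to drive the revision step from Python, passing the description as its own list element avoids shell quoting problems entirely. This is a small sketch under that assumption; the helper name revision_command and the example description are placeholders, and the command is only printed unless you uncomment the last line.

```python
import shlex
import subprocess


def revision_command(description):
    """Build the argv list for an autogenerated Alembic revision."""
    return ["alembic", "revision", "--autogenerate", "-m", description]


if __name__ == "__main__":
    cmd = revision_command("add example column")  # placeholder description
    # shlex.join shows the equivalent, correctly quoted shell command.
    print("would run:", shlex.join(cmd))
    # Uncomment to actually run it from the main repo directory:
    # subprocess.run(cmd, check=True)
```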
The workflow for deploying a schema change on a given Librarian instance is:
- Shut down the running Librarian server.
- Use Git to pull in a version of the codebase with the changed schema.
- From the main repo directory, run
alembic upgrade head
to update the database schema to the newest version. You may need to set the environment variable LIBRARIAN_CONFIG_PATH to point to the Librarian server configuration file if it has not already been set.
- Restart the Librarian server.
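The upgrade step of the deployment workflow can be wrapped so that LIBRARIAN_CONFIG_PATH is always set for the Alembic process, whether or not it is exported in the calling shell. The sketch below is one way to do that, assuming alembic is on the PATH; the function name upgrade_schema and its run parameter (there only to make the function testable) are hypothetical.

```python
import os
import subprocess


def upgrade_schema(config_path, repo_dir=".", run=subprocess.run):
    """Run `alembic upgrade head` in repo_dir with LIBRARIAN_CONFIG_PATH set."""
    # Copy the current environment and point the variable at the full path.
    env = dict(os.environ, LIBRARIAN_CONFIG_PATH=os.path.abspath(config_path))
    return run(["alembic", "upgrade", "head"], cwd=repo_dir, env=env, check=True)


# Example call (not executed here), from the main repo directory:
# upgrade_schema("server-config.json")
```

Because check=True is passed, a failed upgrade raises subprocess.CalledProcessError, which is a useful signal not to restart the server yet.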
Obviously, you should not deploy a schema change to one of the production servers (on-site, NRAO) until you are sure that the associated change is one that we want to commit to.
Here is a quick introduction to standing up a librarian server and uploading a file to it. We assume the user is using Postgres as the backing database and is running Ubuntu 18.04. This is a "bare-metal" installation approach, in which the librarian is installed directly on the system. Alternatively, refer to the instructions on running a Docker installation to run the server inside a container.
- Install the postgres package:
sudo apt update
sudo apt install postgresql postgresql-contrib
- Set up postgres to work with the testing database. Note the name of the
database is defined in the server config file. This database name matches the
one defined in the sample config file:
sudo su postgres
psql -c "create database librarian_test;"
psql -c "create user <your_username>;"
exit
- Clone the librarian repo:
git clone https://github.com/HERA-Team/librarian.git
- Install the librarian package and dependencies:
cd librarian
pip install .[server]
- Point to the testing server config file:
export LIBRARIAN_CONFIG_PATH=`pwd`/ci/server-config-ci.json
- Use alembic to update the database schema:
alembic upgrade head
- Copy the high-level client config file to the proper location:
cp ci/hl_client.cfg ~/.hl_client.cfg
- Make a scratch directory for the librarian to use as a "store". These are
defined in the add-stores section of the server config file:
mkdir /tmp/librarian
- Launch the librarian server:
runserver.py
- In a separate terminal, attempt to upload a file to the running librarian
server:
librarian upload --null-obsid TestUser README.md foo/README.md
- Verify that the file was successfully uploaded to the librarian. In a web
browser, navigate to localhost:21108. You should use the authenticator string
"I am a human". After authenticating, you should see README.md listed under
"Most Recent Files". Success!
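If the upload or the browser check fails, a first thing to rule out is that nothing is listening on the server port. The sketch below only tests whether a TCP connection to localhost:21108 succeeds; it does not confirm that the listener is actually the Librarian. The function name port_is_open is hypothetical.

```python
import socket


def port_is_open(host="localhost", port=21108, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    state = "is" if port_is_open() else "is not"
    print(f"something {state} listening on localhost:21108")
```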
As an alternative to the above installation, the librarian server supports
running inside of a container using Docker. Before building and launching the
container, an ssh key pair should be generated to facilitate interaction
between the librarian app and the store. A new ssh key can be generated using
the ssh-keygen command:
ssh-keygen -t rsa
The key should have no password associated with it. Once the key pair has been
generated, copies should be placed into a folder called secrets inside the
container folder. The public key should be saved as id_rsa_pub.txt, and the
private key should be saved as id_rsa.txt. These files will be mounted inside
of the running containers as secrets. You should also make sure they are
accessible on your local machine. In what follows, they are assumed to be
available at ~/.ssh/id_docker_rsa.
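The copy-and-rename step described above can be sketched as follows. This assumes ssh-keygen wrote the public key next to the private key with a .pub suffix (its default behaviour) and that the secrets folder is container/secrets relative to the top level of the repo; both are assumptions, and the function name stage_keys is hypothetical.

```python
import os
import shutil


def stage_keys(private_key, secrets_dir):
    """Copy a key pair into secrets_dir under the file names given above."""
    private_key = os.path.expanduser(private_key)
    os.makedirs(secrets_dir, exist_ok=True)
    copies = {
        private_key: os.path.join(secrets_dir, "id_rsa.txt"),
        # Assumption: the public key sits beside the private key as "<name>.pub".
        private_key + ".pub": os.path.join(secrets_dir, "id_rsa_pub.txt"),
    }
    for src, dst in copies.items():
        shutil.copyfile(src, dst)
    return sorted(copies.values())


# Example call (not executed here), from the top level of the repo:
# stage_keys("~/.ssh/id_docker_rsa", "container/secrets")
```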
In addition, the following changes should be made to the ssh config file on your
local machine (~/.ssh/config). This allows you to use rsync to read and write
data from the libstore container:
Host = libstore
User = root
Port = 2222
IdentityFile = ~/.ssh/id_docker_rsa
HostName = localhost
To test that the store is accessible, once the containers are running, try to
ssh libstore, which should provide access without requiring a password. If
this does not yield a shell inside of the running container, then the primary
functionality of the librarian will not work.
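If ssh libstore does not behave as described, one thing to check is whether the libstore entry is actually present in ~/.ssh/config with the values shown above. The sketch below is a deliberately simplified parser, not a full implementation of the ssh_config format: it treats the whole Host value as a single alias and ignores Match blocks and Include directives. The function name parse_ssh_hosts is hypothetical.

```python
import os
import re

# ssh_config accepts both "Key value" and "Key = value".
_OPTION = re.compile(r"([^\s=]+)(?:\s*=\s*|\s+)(.*)")


def parse_ssh_hosts(text):
    """Parse ssh_config-style text into {host alias: {option: value}}."""
    hosts, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        match = _OPTION.match(line)
        if not line or line.startswith("#") or not match:
            continue
        key, value = match.group(1).lower(), match.group(2).strip()
        if key == "host":
            current = hosts.setdefault(value, {})
        elif current is not None:
            current[key] = value
    return hosts


if __name__ == "__main__":
    path = os.path.expanduser("~/.ssh/config")
    text = ""
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    entry = parse_ssh_hosts(text).get("libstore")
    print("libstore entry:", entry if entry else "not found")
```

Option names are lower-cased because ssh treats them case-insensitively, so the entry above would show up with keys such as port and hostname.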
The full librarian application can be launched by running the following command from the top-level directory:
docker-compose up
This will launch the librarian server using the config file specified in
docker-compose.yml, which may be modified to use an alternative file if
desired. A local volume provides long-term storage for both the database data
(for postgres) and the librarian file data. As with the installation above,
the server can be accessed on localhost:21108, using the same authenticator.