This repository has been archived by the owner on Apr 16, 2019. It is now read-only.

How to Run Newsdiffs

Thomas Puppe edited this page Jul 6, 2015 · 2 revisions

NewsDiffs

Requirements

You need to have the following installed on your machine:

  • Git
  • Python 2.6 or later
  • Django and other Python libraries
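A quick way to confirm the command-line prerequisites is to check that each tool is on your PATH. The following is a minimal sketch, not part of NewsDiffs; the helper name missing_tools is hypothetical, and it only checks that the commands exist, not their versions.

```shell
# Hypothetical helper: print every tool from the argument list
# that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Lists whichever of these commands are not installed (prints nothing if all are present).
missing_tools git python pip
```

If the command prints nothing, Git, Python and pip are all available.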

It is recommended to install the required Django version and the rest of the required packages via

$ pip install -r requirements.txt

This ensures that the required versions are installed.

Initial setup

The initial setup initializes the Newsdiffs project. You have to run these commands regardless of the environment you are working in.

$ python website/manage.py syncdb && python website/manage.py migrate 
$ mkdir articles

Running NewsDiffs Locally

After running the initial setup you can start Newsdiffs directly with Django's built-in web server. This is mostly done for development and testing purposes. To start the web server for testing:

$ python website/manage.py runserver 0.0.0.0:8000

and visit http://localhost:8000/
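If you are working on a machine without a browser, you can check from the shell that the server answers. This is a small sketch that assumes curl is installed; the function name site_is_up is hypothetical.

```shell
# Hypothetical helper: succeed only if the URL answers with an HTTP
# success status within five seconds.
site_is_up() {
    curl --silent --fail --output /dev/null --max-time 5 "$1"
}

if site_is_up http://localhost:8000/; then
    echo "NewsDiffs is reachable"
else
    echo "NewsDiffs is not reachable"
fi
```

The same function can be pointed at any other address used later on this page.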

Running the scraper

Do the initial setup above. You will also need additional Python libraries. If you used the pip install -r requirements.txt command, they should already be installed.

Note that we need two versions of BeautifulSoup, both 3.2 and 4.0; some websites are parsed correctly in only one version.

Then run

$ python website/manage.py scraper

This will populate the articles repository with a list of current news articles. This is a snapshot at a single time, so the website will not yet have any changes. To get changes, wait some time (say, 3 hours) and run 'python website/manage.py scraper' again. If any of the articles have changed in the intervening time, the website should display the associated changes. The scraper will log progress to /tmp/newsdiffs_logging (which is overwritten each run) and errors to /tmp/newsdiffs/logging_errs (which is cumulative). To run the scraper every hour, run something like:

$ while true; do python website/manage.py scraper; sleep 60m; done

or make a cron job.
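For the cron variant, the entry has to change into the repository directory first, because the scraper command is given relative to it. The sketch below only prints a suitable crontab line; the function name cron_entry and the NEWSDIFFS_DIR variable are assumptions for illustration, and the path is a placeholder.

```shell
# Placeholder for the directory that contains website/manage.py.
NEWSDIFFS_DIR="${NEWSDIFFS_DIR:-/path/to/newsdiffs}"

# Hypothetical helper: print a crontab line that runs the scraper
# at minute 0 of every hour from the given directory.
cron_entry() {
    printf '0 * * * * cd %s && python website/manage.py scraper\n' "$1"
}

cron_entry "$NEWSDIFFS_DIR"
```

Paste the printed line into the editor opened by crontab -e.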


Deploying Newsdiffs in a production environment

Now that we can run Newsdiffs locally, we want to deploy it in a production environment. This part shows the basics of deploying Newsdiffs with Django, uWSGI and nginx on a Debian system. The tutorial will not explain every aspect of nginx or uWSGI, and there are other ways to deploy Newsdiffs.

One of the prerequisites is that pip is installed and you have enough privileges to execute the commands. You should also have Python installed (on Debian, Python is installed by default). First, install Django and Beautiful Soup via pip:

$ pip install -r /path/to/newsdiffs/requirements.txt

Now you should have everything installed to start Newsdiffs locally. Before we deploy Newsdiffs we have to initialize it. For that we will execute the commands described above in “Initial setup”:

$ python website/manage.py syncdb && python website/manage.py migrate
$ mkdir articles

After initializing Newsdiffs, we start it to check that everything works up to this point. With the following command we should be able to see Newsdiffs in our browser.

$ python website/manage.py runserver 0.0.0.0:8000

Now we should be able to visit http://localhost:8000 and see Newsdiffs. To stop the server, press Ctrl+C.

uWSGI

Now that Newsdiffs is running, we want to serve it with uWSGI. For this we have to install uWSGI. How you install it is up to you and your environment; we will install it with pip:

$ pip install uwsgi

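If the build fails because the Python development headers are missing, install them and run the pip command again: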

$ sudo apt-get install python-dev

uWSGI is now installed. To test that everything is working, run the following command:

$ uwsgi --http :8080 --chdir /path/to/newsdiffs -w /path/to/newsdiffs/website/newsdiffs.wsgi

You should now be able to visit Newsdiffs via http://localhost:8080.

To deploy newsdiffs we will use uWSGI. We will run uWSGI in the Emperor mode, which allows a master process to manage separate applications automatically given a set of configuration files.

Create the directories that will hold your configuration files. Since this is a global process, we will create /etc/uwsgi/apps-available and /etc/uwsgi/apps-enabled to store them. Move into apps-available after you create it:

$ sudo mkdir -p /etc/uwsgi/apps-enabled
$ sudo mkdir -p /etc/uwsgi/apps-available
$ cd /etc/uwsgi/apps-available

In the apps-available folder we will create newsdiffs.ini, which holds the whole configuration for running newsdiffs with uWSGI. uWSGI needs one configuration file for each project it serves. It accepts configuration files in a variety of formats, but we will use .ini files due to their simplicity. Create the file and open it in your text editor:

sudo nano newsdiffs.ini

[uwsgi]
# project root (full path)
chdir = /path/to/newsdiffs
# the virtualenv (full path)
# home            = /path/to/virtualenv
module = website.wsgi:application
master = true
processes = 5
# must match the uwsgi_pass path in the nginx server block
socket          = /path/to/newsdiffs/newsdiffs.sock
chmod-socket = 664
# remove the socket file when uWSGI exits
vacuum = true
uid = www-data
gid = www-data

The last step is to create a symbolic link from /etc/uwsgi/apps-available/newsdiffs.ini to /etc/uwsgi/apps-enabled/:

sudo ln -s /etc/uwsgi/apps-available/newsdiffs.ini /etc/uwsgi/apps-enabled/
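A mistyped directory name in the link command leaves a dangling symlink that uWSGI silently ignores. As a hedged sketch, the helper below refuses to create the link unless the configuration file exists; enable_app is a hypothetical name, and the apps-available/apps-enabled layout is the one created earlier on this page.

```shell
# Hypothetical helper: link <base>/apps-available/<app>.ini into
# <base>/apps-enabled/, but only if the .ini file really exists.
enable_app() {
    base="$1"
    app="$2"
    if [ ! -f "$base/apps-available/$app.ini" ]; then
        echo "no such config: $base/apps-available/$app.ini" >&2
        return 1
    fi
    ln -sf "$base/apps-available/$app.ini" "$base/apps-enabled/$app.ini"
}
```

Run as root, the call for this setup would be enable_app /etc/uwsgi newsdiffs.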

The newsdiffs project is now configured to run with uWSGI. You can test the configuration by starting uWSGI in Emperor mode against /etc/uwsgi/apps-enabled. Depending on your environment, there are different ways to create an upstart script for uWSGI.

The next step is to install and configure nginx. First, install nginx:

sudo apt-get install nginx

Once Nginx is installed, we can create a server block configuration file for the project:

sudo nano /etc/nginx/sites-available/newsdiffs

server {
    listen 80;
    server_name domain.com www.domain.com;

    location = /favicon.ico { access_log off; log_not_found off; }

    location / {
        include         uwsgi_params;
        uwsgi_pass      unix:/path/to/newsdiffs/newsdiffs.sock;
    }
}

Save and close the file when you are finished. Next, link the new configuration file to Nginx's sites-enabled directory to enable it:

sudo ln -s /etc/nginx/sites-available/newsdiffs /etc/nginx/sites-enabled

Check the configuration syntax by typing:

sudo service nginx configtest

If no syntax errors are detected, you can restart your Nginx service to load the new configuration:

sudo service nginx restart

You should now be able to reach newsdiffs.
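If nginx answers with 502 Bad Gateway instead, a common cause is that the socket path in the uWSGI .ini file differs from the one in the nginx server block. The sketch below compares the two; the function names are hypothetical, and it assumes one socket line per file written in the same style as the examples on this page.

```shell
# Hypothetical helper: print the socket path set in a uWSGI .ini file.
ini_socket() {
    sed -n 's/^[[:space:]]*socket[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical helper: print the unix socket path that an nginx
# server block hands requests to via uwsgi_pass.
nginx_socket() {
    sed -n 's/^[[:space:]]*uwsgi_pass[[:space:]]*unix:\([^;]*\);.*$/\1/p' "$1" | head -n 1
}

# Succeed only if the .ini file defines a socket and nginx uses the same path.
sockets_match() {
    [ -n "$(ini_socket "$1")" ] && [ "$(ini_socket "$1")" = "$(nginx_socket "$2")" ]
}
```

For this setup the check would be sockets_match /etc/uwsgi/apps-available/newsdiffs.ini /etc/nginx/sites-available/newsdiffs.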

If you want to use