CMUEberlyCenter/eberly-docuscope-classroom

DocuScope Classroom Analysis tools

A Dockerized service for analyzing text that has been processed by CMU_Sidecar/docuscope-tag.

Administration and Support

For any questions regarding the overall project or the language model used, please contact suguru@cmu.edu.

The project code is supported and maintained by the Eberly Center at Carnegie Mellon University. For help with this fork, project, or service, please contact eberly-assist@andrew.cmu.edu.

Requirements

  1. Docker
  2. Access to a MySQL version 8.0 database.
  3. CMU_Sidecar/docuscope-tag to tag documents in the database.
  4. CMU_Sidecar/docuscope-olilti to populate the database using LTI.
  5. The default_tones.json.gz file, generated by the ds_tones utility in CMU_Sidecar/docuscope-dictionary-tools/docuscope-tones, which is used to translate LATs (identifiers used by the tagger) to Clusters (identifiers used in the common_dict.json file).
  6. The common_dict.json file that defines the hierarchical structure of Categories/Subcategories/Clusters and their descriptions. See api/common_dictionary_schema.json for details.

Configuration

The following environment variables should be set so that DocuScope Classroom can access the required services. The defaults are reasonable for a development environment where everything is hosted locally; they are not values that should be used in a production environment.

| Variable | Description | Default |
| --- | --- | --- |
| DICTIONARY_HOME | Path to the base directory of the necessary runtime dictionary files specified above. | <Application's base directory>/dictionary |
| DB_HOST | Hostname of the MySQL database for storing processed documents. | 127.0.0.1 |
| DB_PORT | Port of the MySQL document database. | 3306 |
| DB_PASSWORD | Password for accessing the document database. [1] | None [2] |
| DB_USER | Username for accessing the document database. [1] | docuscope |
| MYSQL_DATABASE | Identifier for the document database. | docuscope |
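
As an illustration of how these variables and the _FILE mechanism from footnote 1 might be resolved, here is a minimal Python sketch. It is not the application's actual code. The helper names read_setting and load_db_settings are hypothetical, and three behaviors are assumptions: that _FILE is appended as a suffix (for example DB_PASSWORD_FILE), that a secret file takes precedence over the plain variable, and that the lookup is applied uniformly to every database variable.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_setting(name: str, default: Optional[str] = None,
                 env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the setting `name`, preferring a `<name>_FILE` secret file.

    Assumption: the file variant wins when both are set.
    """
    file_path = env.get(f"{name}_FILE")
    if file_path:
        # Docker secrets are mounted as files; drop surrounding whitespace
        # such as a trailing newline.
        return Path(file_path).read_text(encoding="utf-8").strip()
    return env.get(name, default)


def load_db_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect database settings, falling back to the documented defaults."""
    return {
        "host": read_setting("DB_HOST", "127.0.0.1", env),
        "port": int(read_setting("DB_PORT", "3306", env)),
        "user": read_setting("DB_USER", "docuscope", env),
        # No default password is provided (see footnote 2).
        "password": read_setting("DB_PASSWORD", None, env),
        "database": read_setting("MYSQL_DATABASE", "docuscope", env),
    }
```

Calling load_db_settings() with no arguments reads the process environment; passing a plain dictionary instead makes the lookup easy to try out without touching real environment variables.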

The required DocuScope language model files (common_dict.json and default_tones.json.gz) should be located in DICTIONARY_HOME.
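
Before starting the service, it can be useful to confirm that both files are actually present. The following Python sketch shows one way to do that. The function name missing_dictionary_files is hypothetical and is not part of the project; the check only tests for the files' existence and makes no assumption about their contents.

```python
from pathlib import Path
from typing import List

# File names taken from the requirements listed in this README.
REQUIRED_FILES = ("common_dict.json", "default_tones.json.gz")


def missing_dictionary_files(dictionary_home: str) -> List[str]:
    """Return the required file names that are absent from `dictionary_home`."""
    home = Path(dictionary_home)
    return [name for name in REQUIRED_FILES if not (home / name).is_file()]
```

An empty list means both files were found; otherwise the returned names indicate what still needs to be copied into DICTIONARY_HOME.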

To customize the interface, replace the contents of the app/static/assets directory. The defaults are found in classroom/src/assets.

Usage

  1. Build the Docker image: docker build -t <tag> .
     When deployed, the service is bound to port 80 of the Docker container.

Acknowledgements

This project was partially funded by the A.W. Mellon Foundation, Carnegie Mellon University's Simon Initiative Seed Grant, and the Berkman Faculty Development Fund.


Footnotes

  1. It is recommended to use Docker secrets to supply these values. The application can retrieve a value from a specified file if the environment variable has the _FILE affix added.

  2. Passwords intentionally default to None for security reasons.