An Intelligent System for Disaster Management
Explore the docs »
View Demo
·
Report Bug
·
Request Feature
Calamity is a general pipeline for feature extraction from tweet datasets. It uses Akka Actors to parse tweet-sets with thread-safe concurrency and extracts 50 features such as sentiment, objectivity, tense, length, verified status, and many more. The text is tokenised and encoded with word embeddings, bringing the total to 500 features, which are exposed via a Play Framework API and can then be consumed in Jupyter. Igel provides access to all scikit-learn models and allows you to train/fit, test, and use models without writing any code. The results are then evaluated against the TRECIS-2020A evaluation script.
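As a simplified sketch of the batching approach described above — standing in for the Akka Actor pipeline with plain `java.util.concurrent`, and with hypothetical feature names rather than Calamity's actual 50 features — each worker receives a batch of tweet lines and emits a small per-tweet feature map:

```java
import java.util.*;
import java.util.concurrent.*;

public class FeatureSketch {
    // Extract a few toy features from one tweet. The names here are
    // illustrative only, not Calamity's real feature set.
    static Map<String, Integer> extractFeatures(String tweet) {
        Map<String, Integer> features = new LinkedHashMap<>();
        String trimmed = tweet.trim();
        features.put("length", tweet.length());
        features.put("tokenCount", trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length);
        features.put("hasUrl", tweet.contains("http") ? 1 : 0);
        return features;
    }

    // Process the dataset in fixed-size batches on a thread pool,
    // analogous to handing N lines to each Akka Actor.
    static List<Map<String, Integer>> processAll(List<String> tweets, int batchSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<List<Map<String, Integer>>>> futures = new ArrayList<>();
        for (int i = 0; i < tweets.size(); i += batchSize) {
            List<String> batch = tweets.subList(i, Math.min(i + batchSize, tweets.size()));
            futures.add(pool.submit(() -> {
                List<Map<String, Integer>> out = new ArrayList<>();
                for (String t : batch) out.add(extractFeatures(t));
                return out;
            }));
        }
        List<Map<String, Integer>> results = new ArrayList<>();
        for (Future<List<Map<String, Integer>>> f : futures) results.addAll(f.get());
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<String> tweets = List.of(
                "Flooding on Main St http://t.co/x",
                "Need water in sector 4");
        System.out.println(processAll(tweets, 1));
    }
}
```

Unlike this thread-pool sketch, the real pipeline uses Akka's message-passing model, which avoids shared mutable state between workers.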
- Play Framework
- Java
- sbt
- Akka
- Jupyter
There is an accompanying report which gives greater insight into the uses for this project.
For more examples, please refer to the Documentation
Internationally, civil protection, police forces and emergency response agencies are under increasing pressure to respond more quickly and effectively to emergency situations. The mass adoption of mobile internet-enabled devices, paired with widespread use of social media platforms for communication and coordination, has created new ways for the public on the ground to contact response services.
Moreover, a recent study reported that 63% of people expect responders to answer calls for help on social media. With the rise of social media, emergency service operators are now expected to monitor those channels and answer questions from the public. However, they do not have adequate tools or manpower to effectively monitor social media, due to the large volume of information posted on these platforms and the need to categorise, cross-reference and verify that information.
For instance, for a flash flooding event, feeds might include ‘requests for food/water’, ‘reports of road blockages’, and ‘evacuation requests’. In this way, during an emergency, individual emergency management operators and other stakeholders can register to access the subset of feeds within their domain of responsibility, giving them access to relevant social media content.
During the project the student will learn how to use state-of-the-art machine learning techniques to classify social media posts in real time by the information they contain.
The product will be evaluated via automatic evaluation of categorisation accuracy using text collections built for the TREC Incident Streams evaluation initiative.
Open your favourite terminal and enter the following:
$ curl -s https://github.com/glasgowm148/Calamity/docs/install.sh | bash
If the environment needs tweaking before Calamity can be installed, run.sh
configures environment variables for the number of embeddings, the dataset location, and how many lines to give to each Actor before running the code.
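A minimal sketch of what run.sh might look like — the variable names below are illustrative placeholders, not Calamity's actual ones; check run.sh itself for the real configuration:

```shell
#!/usr/bin/env bash
# Hypothetical variable names — see run.sh for the actual ones.
export NUM_EMBEDDINGS=500        # word-embedding dimensions to load
export DATASET_PATH=data/trecis  # location of the tweet dataset
export LINES_PER_ACTOR=1000      # lines handed to each Akka Actor

sbt run
```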
| Command | Description |
|---|---|
| `sbt` | Enter the sbt console |
| `sbt run` | Run the application |
| `sbt clean` | Delete generated build files |
| `sbt reload` | Reload changes to build.sbt |
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Mark Glasgow - markglasgow@gmail.com
- TREC Incident Streams provided the dataset and evaluation script.
- Ark Tweet NLP - Twokenizer and Part-of-Speech tagger.
- Keyword Extraction in Java - Implementation of several algorithms for keyword extraction, including TextRank, TF-IDF, and a combination of the two. Word segmentation and stop-word filtering are performed using HanLP.
- GloVe: Global Vectors for Word Representation
- java-text-embedding - provides GloVe word embeddings that developers can use directly within their projects.
This project was initially built as part of my Honours Individual Project Dissertation at the University of Glasgow.