An Intelligent System for Disaster Management
Explore the docs »
View Demo
·
Report Bug
·
Request Feature
Calamity is a general pipeline for feature extraction from tweet datasets. It uses Akka Actors to parse tweet-sets with thread-safe concurrency and extracts 50 features such as sentiment, objectivity, tense, length, verified status, and many more. The text is tokenised and encoded with word embeddings, bringing the total to 500 features, which are exposed via a Play Framework API and can then be consumed in Jupyter. Igel provides access to all scikit-learn models and allows you to train/fit, test, and use models without writing any code. The results are then evaluated against the TRECIS-2020A evaluation script.
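As a simplified sketch of the batching approach described above — standing in for the Akka Actor pipeline with plain `java.util.concurrent`, and with hypothetical feature names rather than Calamity's actual 50 features — each worker receives a batch of tweet lines and emits a small per-tweet feature map:

```java
import java.util.*;
import java.util.concurrent.*;

public class FeatureSketch {
    // Extract a few toy features from one tweet. The names here are
    // illustrative only, not Calamity's real feature set.
    static Map<String, Integer> extractFeatures(String tweet) {
        Map<String, Integer> features = new LinkedHashMap<>();
        String trimmed = tweet.trim();
        features.put("length", tweet.length());
        features.put("tokenCount", trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length);
        features.put("hasUrl", tweet.contains("http") ? 1 : 0);
        return features;
    }

    // Process the dataset in fixed-size batches on a thread pool,
    // analogous to handing N lines to each Akka Actor.
    static List<Map<String, Integer>> processAll(List<String> tweets, int batchSize)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<List<Map<String, Integer>>>> futures = new ArrayList<>();
        for (int i = 0; i < tweets.size(); i += batchSize) {
            List<String> batch = tweets.subList(i, Math.min(i + batchSize, tweets.size()));
            futures.add(pool.submit(() -> {
                List<Map<String, Integer>> out = new ArrayList<>();
                for (String t : batch) out.add(extractFeatures(t));
                return out;
            }));
        }
        List<Map<String, Integer>> results = new ArrayList<>();
        for (Future<List<Map<String, Integer>>> f : futures) results.addAll(f.get());
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) throws Exception {
        List<String> tweets = List.of(
                "Flooding on Main St http://t.co/x",
                "Need water in sector 4");
        System.out.println(processAll(tweets, 1));
    }
}
```

Unlike this thread-pool sketch, the real pipeline uses Akka's message-passing model, which avoids shared mutable state between workers.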
- Play Framework
- Java
- sbt
- Akka
- Jupyter
There is an accompanying report which gives greater insight into the uses for this project.
For more examples, please refer to the Documentation
Internationally, civil protection, police forces and emergency response agencies are under increasing pressure to respond more quickly and effectively to emergency situations. The mass adoption of mobile internet-enabled devices, paired with widespread use of social media platforms for communication and coordination, has created new ways for the public on the ground to contact response services.
Moreover, a recent study reported that 63% of people expect responders to answer calls for help on social media. With the rise of social media, emergency service operators are now expected to monitor those channels and answer questions from the public. However, they do not have adequate tools or manpower to effectively monitor social media, due to the large volume of information posted on these platforms and the need to categorise, cross-reference and verify that information.
For instance, for a flash flooding event, feeds might include ‘requests for food/water’, ‘reports of road blockages’, and ‘evacuation requests’. In this way, during an emergency, individual emergency management operators and other stakeholders can register to access the subset of feeds within their domain of responsibility, giving them access to relevant social media content.
During the project the student will learn how to use state-of-the-art machine learning techniques to classify social media posts in real time by the information they contain.
The product will be evaluated via automatic evaluation of categorisation accuracy using text collections built for the TREC Incident Streams evaluation initiative.
Open your favourite terminal and enter the following:
$ curl -s https://github.com/glasgowm148/Calamity/docs/install.sh | bash
If the environment needs tweaking before Calamity can be installed, run.sh
configures environment variables for the number of embeddings, the dataset location, and how many lines to give to each Actor before running the code.
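A minimal sketch of what run.sh might look like — the variable names below are illustrative placeholders, not Calamity's actual ones; check run.sh itself for the real configuration:

```shell
#!/usr/bin/env bash
# Hypothetical variable names — see run.sh for the actual ones.
export NUM_EMBEDDINGS=500        # word-embedding dimensions to load
export DATASET_PATH=data/trecis  # location of the tweet dataset
export LINES_PER_ACTOR=1000      # lines handed to each Akka Actor

sbt run
```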
| Command | Description |
|---|---|
| `sbt` | Enter the sbt console |
| `sbt run` | Run the application |
| `sbt clean` | Delete generated build files |
| `sbt reload` | Reload changes to build.sbt |
See the open issues for a list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
1. Fork the Project
2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Mark Glasgow - markglasgow@gmail.com
- TREC Incident Streams provided the dataset and evaluation script.
- Ark Tweet NLP - Twokenizer and Part-of-Speech tagger.
- Keyword Extraction in Java - Implementation of several algorithms for keyword extraction, including TextRank, TF-IDF, and a combination of the two. Word segmentation and stop-word filtering are performed using HanLP.
- GloVe: Global Vectors for Word Representation
- java-text-embedding - provides GloVe word embeddings that developers can use directly within their projects.
This project was initially built as part of my Honours Individual Project Dissertation at the University of Glasgow.