Logo

Life analyser

This is a Telegram bot that parses my Telegram channel of daily reports. Based on this data, it generates a summary of past reports, predicts various well-being indicators for the next day, and recommends the best activities for the next day.
Explore the docs »

View Demo · Report Bug · Request Feature

About The Project

My channel screen shot

Every day, I make a list of tasks in the Telegram channel. At the end of the day, I mark each task with an emoji to show whether I completed it, and I rate my productivity, the interest of the day, and my stress level on a 10-point scale.
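A report in this format can be parsed with a small sketch. The exact emojis (`✅`/`❌`) and the `metric: N/10` line layout are my assumptions about the channel format, not its confirmed structure:

```python
import re

DONE, NOT_DONE = "✅", "❌"  # assumed completion emojis

def parse_report(text: str) -> dict:
    """Parse a daily report: task lines ending in an emoji, plus 1-10 scores."""
    tasks, scores = [], {}
    for line in text.splitlines():
        line = line.strip()
        if line.endswith((DONE, NOT_DONE)):
            # Strip the trailing emoji and record completion status.
            tasks.append({"task": line[:-1].strip(), "done": line.endswith(DONE)})
        else:
            m = re.match(r"(productivity|interest|stress):\s*(\d+)/10", line, re.I)
            if m:
                scores[m.group(1).lower()] = int(m.group(2))
    return {"tasks": tasks, "scores": scores}

# Example:
# parse_report("write report ✅\ngym ❌\nproductivity: 7/10")
```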

So I decided to create a chatbot that parses my Telegram channel, my Google Calendar, and data from my fitness bracelet, and uses this information to make predictions, give advice, and check that my reports are filled out correctly.

Feature 1: N-day summary and autoreports

The model summarizes my activities, emotions, and other signals over the week and generates a report.
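A weekly report can be grounded in simple aggregates computed over the parsed daily reports. A minimal sketch; the field names (`tasks`, `done`, `scores`) are my assumed parse output, not the repo's actual schema:

```python
from statistics import mean

def summarize(days: list[dict]) -> dict:
    """Aggregate N daily reports into a completion rate and average scores."""
    all_tasks = [t for d in days for t in d["tasks"]]
    done_rate = sum(t["done"] for t in all_tasks) / len(all_tasks)
    # Average each score (productivity, interest, stress) across the period.
    avg = {k: mean(d["scores"][k] for d in days) for k in days[0]["scores"]}
    return {"completion_rate": done_rate, "avg_scores": avg}
```

These statistics can then be passed to a language model as context for the natural-language summary.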

Activity type recognition

I have a small dataset of 560 activities to classify. For this task, I tried several models: fasttext+gb, fasttext+BiLSTM, and roberta-base. The best score was achieved with roberta-base, fine-tuned with layer freezing and other specific tricks from the articles [1], [2], [3].

| Model | Accuracy (%) |
|-------|--------------|
| fasttext+gb | 82 |
| fasttext+BiLSTM | 88 |
| roberta-base | 92 |
| roberta-base with additional train data | 95 |
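The layer-freezing trick from [1] can be sketched as follows. The split (freezing the embeddings and the lower 8 of roberta-base's 12 encoder layers, training the rest plus the classification head) is an illustrative choice, not necessarily the repo's setting:

```python
# Sketch: freeze the lower part of roberta-base before fine-tuning on a
# small dataset, so only the top layers and the classifier head are updated.
from transformers import RobertaForSequenceClassification

def freeze_lower_layers(model: RobertaForSequenceClassification,
                        n_frozen: int = 8) -> RobertaForSequenceClassification:
    """Disable gradients for the embeddings and the first n_frozen encoder layers."""
    for p in model.roberta.embeddings.parameters():
        p.requires_grad = False
    for layer in model.roberta.encoder.layer[:n_frozen]:
        for p in layer.parameters():
            p.requires_grad = False
    return model

# Usage (downloads pretrained weights):
# model = RobertaForSequenceClassification.from_pretrained("roberta-base",
#                                                          num_labels=10)
# model = freeze_lower_layers(model, n_frozen=8)
```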

Sentiment detection

For this task, I used the off-the-shelf bertweet-sentiment-analysis model from Hugging Face in a zero-shot fashion (no fine-tuning on my data).
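The model's per-sentence labels can be reduced to a single numeric mood signal per day. The NEG/NEU/POS label set matches the bertweet-base-sentiment-analysis model card on the Hub; the mapping to a [-1, 1] score is my own assumption:

```python
# Assumed mapping from model labels to a signed mood score.
LABEL_TO_SCORE = {"NEG": -1.0, "NEU": 0.0, "POS": 1.0}

def mood_score(predictions: list[dict]) -> float:
    """Average label scores weighted by the model's confidence."""
    total = sum(LABEL_TO_SCORE[p["label"]] * p["score"] for p in predictions)
    return total / len(predictions)

# Usage (downloads the model):
# from transformers import pipeline
# clf = pipeline("sentiment-analysis",
#                model="finiteautomata/bertweet-base-sentiment-analysis")
# mood_score(clf(["great workout today", "deadline stress"]))
```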

Feature 2: Activities recommendation system

My recommendation system consists of two parts: (1) activity type recommendation and (2) activity generation. First, one model predicts the top 3 activity types for the next day; then a second model generates several activities for each predicted type.

Activity type recommendations:

I used pretrained fasttext embeddings and an LSTM. For each day, I aggregate activity embeddings with average pooling, then concatenate the pooled embedding with additional numeric features. This gives a day-representation embedding; the day embeddings are fed into the LSTM sequentially in date order.
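The pooling-and-concatenation step can be sketched with NumPy. The dimensions (300-d fasttext vectors, two numeric features such as productivity and stress) are illustrative assumptions:

```python
import numpy as np

def day_embedding(activity_embs: np.ndarray,
                  numeric_feats: np.ndarray) -> np.ndarray:
    """Build one day's representation.

    activity_embs: (n_activities, emb_dim) fasttext vectors for the day.
    numeric_feats: (n_feats,) extra signals, e.g. productivity and stress.
    Returns a (emb_dim + n_feats,) vector.
    """
    pooled = activity_embs.mean(axis=0)             # average pooling
    return np.concatenate([pooled, numeric_feats])  # append numeric features

# The per-day vectors, sorted by date, form the sequence fed to the LSTM.
```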

Activities generation:

Once I have the recommended activity types, I train adapters for the flan-T5 model [4] to generate activities. For each activity type, I generate two activities: one based on the user's positive past experience, and one that is new, something the user has never tried before.
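The two generation modes can be driven by different prompt templates. These templates are hypothetical illustrations of the idea; the actual adapter prompts in the repo may differ:

```python
def build_prompts(activity_type: str, liked_examples: list[str]) -> dict:
    """Build the two prompts for one recommended activity type.

    liked_examples: past activities of this type the user rated positively.
    """
    history = "; ".join(liked_examples)
    return {
        # Mode 1: lean on the user's positive experience.
        "familiar": (f"Suggest a '{activity_type}' activity similar to ones "
                     f"the user enjoyed: {history}."),
        # Mode 2: propose something the user has never tried.
        "novel": (f"Suggest a '{activity_type}' activity the user has never "
                  f"tried before. Avoid: {history}."),
    }
```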

Architecture:

lstm recsys

My model results:

lstm predictions

Project structure

The project has the following structure:

  • assitant/: .py scripts for data parsing and preprocessing
  • assitant/models: .py scripts with model training and inference modules
  • assitant/bot: .py Telegram bot scripts

(back to top)

Roadmap

  • Telegram channel data fetcher and parser

  • Database mechanics and data markup

  • Activity classifier

  • Sentiment detector

  • Day score predictor

  • Activity recommender

  • LLM-based activity generator

  • LLM-based chat bot with intent recognition

  • LLM-based daily/weekly summaries generator

  • Activity detection from images

  • Anomaly detection

Contacts

Telegram: @my_name_is_nikita_hey
Mail: tttonyalpha@gmail.com

(back to top)

License

Distributed under the MIT License. See LICENSE.txt for more information.

(back to top)

References

[1] Revisiting Few-sample BERT Fine-tuning. Tianyi Zhang, Felix Wu, Arzoo Katiyar, Kilian Q. Weinberger, Yoav Artzi.
arXiv:2006.05987

[2] Investigating Transferability in Pretrained Language Models. Alex Tamkin, Trisha Singh, Davide Giovanardi, Noah Goodman.
arXiv:2004.14975

[3] Universal Language Model Fine-tuning for Text Classification. Jeremy Howard, Sebastian Ruder.
arXiv:1801.06146

[4] Scaling Instruction-Finetuned Language Models. Hyung Won Chung, Le Hou, Shayne Longpre, et al. (Google).
arXiv:2210.11416