Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete users that haven't logged in for too long #647

Merged
merged 57 commits into from
Aug 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
57 commits
Select commit Hold shift + click to select a range
5a9ead0
Add deletion_date to users in DB
Splines May 31, 2024
cf17656
Set deletion date in user cleaner
Splines May 31, 2024
0b064ac
Remove old user cleaner code & add first unit test
Splines May 31, 2024
b3f3ec8
Don't just mount `spec` folder, but complete `app` folder for tests
Splines May 31, 2024
9923e79
Revert "Don't just mount `spec` folder, but complete `app` folder for…
Splines May 31, 2024
93d03ee
Add more unit tests for UserCleaner
Splines May 31, 2024
f43e327
Improve tests for deletion of generic users only
Splines Jun 1, 2024
200b2fa
Split test for deletion of users
Splines Jun 1, 2024
f77f26c
Unset deletion date after successful user login
Splines Jun 1, 2024
833aeef
Implement un-setting the deletion date & add tests
Splines Jun 1, 2024
4d7dec9
Rewrite deletion unsetting to reuse existing method
Splines Jun 1, 2024
80413c3
Merge branch 'dev' into feature/user-cleaner
Splines Jun 1, 2024
e338f37
Send initial warning mail about deletion in 40 days
Splines Jun 1, 2024
a770f42
Send additional warning mails when deletion date is near
Splines Jun 1, 2024
5cb47ec
Add schema comment pointing to UserCleaner
Splines Jun 4, 2024
6e8e30f
Test mail delivery
Splines Jun 4, 2024
6905fea
Add "!" to unsafe method
Splines Jun 4, 2024
5da46bb
Use Rails logger to log info about deleted users
Splines Jun 4, 2024
70e0804
Move UserCleaner from models to workers
Splines Jun 4, 2024
9dd69dc
Setup user cleaner cron job to run daily
Splines Jun 4, 2024
465db62
Improve user cleaner doc strings & refactor
Splines Jun 4, 2024
9f8e0e1
Remove user ghost hash remnants
Splines Jun 4, 2024
288ab03
Avoid unnecessary `to_date` conversion
Splines Jun 4, 2024
c4305f5
Improve comment for `after_authentication` hook
Splines Jun 4, 2024
c65a4f0
Fix wording in comment
Splines Jun 4, 2024
18bd4f4
Make better use of `let` RSpec keyword
Splines Jun 5, 2024
184e6a5
Limit the number of users deleted in one run
Splines Jun 5, 2024
cc33802
Simplify unsetting deletion date logic
Splines Jun 5, 2024
7100176
Merge branch 'dev' into feature/user-cleaner
Splines Jun 5, 2024
08cae2a
Define `active_users` to reduce SQL operations
Splines Jun 11, 2024
eda3d7b
Use `7.1` as Rails version specifier in migrations
Splines Jun 11, 2024
df99587
Explicitly set `INACTIVE_USER_THRESHOLD` in tests
Splines Jun 11, 2024
918baf1
Make `MAX_DELETIONS_PER_RUN` an env variable
Splines Jun 11, 2024
2878859
Don't send deletion warning mails to non-generic users
Splines Jun 11, 2024
d970c4f
Send a final deletion mail once the user is deleted
Splines Jun 11, 2024
d65228a
Move user cleaner file back to models folder
Splines Jun 17, 2024
f118486
Rename user cleaner files
Splines Jun 17, 2024
2504169
Count users without last_sign_in_at date as inactive
Splines Jun 17, 2024
05555c2
Improve docstring for `inactive_users`
Splines Jun 17, 2024
badacd7
Shorten user cleaner comment in docker.env
Splines Jul 7, 2024
e52bb41
Introduce `PRODUCTION_NAME` env variable
Splines Jul 7, 2024
6269619
Don't run user cleaner for mampf-experimental/dev
Splines Jul 7, 2024
dd2571f
Merge branch 'dev' into feature/user-cleaner
Splines Jul 7, 2024
afe4ff3
Rewrite logic for when user cleaner is started
Splines Jul 10, 2024
50055d1
Merge branch 'dev' into feature/user-cleaner
Splines Aug 13, 2024
c9a3ca2
Fix rubocop "where range"
Splines Aug 13, 2024
90aa662
Try to use after instead of append_after
Splines Aug 13, 2024
8a76791
Revert "Try to use after instead of append_after"
Splines Aug 13, 2024
069032b
Try to not use cached docker in CI/CD unit tests
Splines Aug 13, 2024
7b148c6
Revert "Try to not use cached docker in CI/CD unit tests"
Splines Aug 13, 2024
3b5cc1c
Move transaction strategy to before each block
Splines Aug 13, 2024
2557aed
Use around each in database cleaner config
Splines Aug 13, 2024
6dfcd96
Use version 2.1.0 of database cleaner
Splines Aug 13, 2024
bd5df2f
Revert "Use version 2.1.0 of database cleaner"
Splines Aug 13, 2024
f08b75e
Add missing require "rails_helper"
Splines Aug 13, 2024
819756e
Try deletion strategy of database cleaner
Splines Aug 13, 2024
04f454f
Merge branch 'dev' into feature/user-cleaner
Splines Aug 13, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 0 additions & 8 deletions app/mailers/mathi_mailer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,6 @@ class MathiMailer < ApplicationMailer
default from: DefaultSetting::PROJECT_EMAIL
layout false

def ghost_email(user)
return if user.ghost_hash.nil?

@name = user.name
@hash = user.ghost_hash
mail(to: user.email, subject: t("mailer.hash_mail_subject"))
end

def data_request_email(user)
@mail = user.email
@id = user.id
Expand Down
28 changes: 28 additions & 0 deletions app/mailers/user_cleaner_mailer.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
class UserCleanerMailer < ApplicationMailer
layout "warning_mail_layout"

# Creates an email to inform a user that their account will be deleted.
#
# @param [Integer] num_days_until_deletion:
# The number of days until the account will be deleted.
def pending_deletion_email(num_days_until_deletion)
user = params[:user]
sender = "#{t("mailer.warning")} <#{DefaultSetting::PROJECT_EMAIL}>"
I18n.locale = user.locale

@num_days_until_deletion = num_days_until_deletion
subject = t("mailer.pending_deletion_subject",
num_days_until_deletion: @num_days_until_deletion)
mail(from: sender, to: user.email, subject: subject, priority: "high")
end

# Creates an email to inform a user that their account has been deleted.
def deletion_email
user = params[:user]
sender = "#{t("mailer.warning")} <#{DefaultSetting::PROJECT_EMAIL}>"
I18n.locale = user.locale

subject = t("mailer.deletion_subject")
mail(from: sender, to: user.email, subject: subject, priority: "high")
end
end
219 changes: 117 additions & 102 deletions app/models/user_cleaner.rb
Original file line number Diff line number Diff line change
@@ -1,125 +1,140 @@
# PORO class that removes users with inactive emails
# Deletes inactive users from the database.
# See [1] for a description of how the flow works on a high level.
#
# Users have a deletion_date field that is nil by default. It is set to a future
# date if the user has been inactive for too long (i.e. hasn't logged in).
# Before the deletion date is reached, we send warning mails. If users log in
# before the deletion date, that date is reset to nil such that the user is not
# deleted. If the user is still inactive on the deletion date, the user is
# ultimately deleted.
#
# [1] https://github.com/MaMpf-HD/mampf/issues/410#issuecomment-2136875776
class UserCleaner
attr_accessor :imap, :email_dict, :hash_dict

def login
@imap = Net::IMAP.new(ENV.fetch("IMAPSERVER"), port: 993, ssl: true)
@imap.authenticate("LOGIN", ENV.fetch("PROJECT_EMAIL_USERNAME"),
ENV.fetch("PROJECT_EMAIL_PASSWORD"))
# The maximum number of users that can be deleted in one run.
# This is equivalent to the maximum of number of deletion dates set in one run.
#
# This flag can be used to prevent too many mails from being sent at once.
# Keep in mind that the mail server also handles other mails, e.g. notification
# mails etc., so we might want to set the limit very low here such that our
# mail server is not marked as "spam server".
#
# Note that this is just a soft limit, i.e. the actual number of deletion
# warning mails sent on a given day might be higher than this number:
# - If on a given day the cronjob is not run (for whatever reason),
# we have more users with a deletion date lying in the past than
# MAX_DELETIONS_PER_RUN. However, we don't send an additional mail once
# the user is deleted, so this shouldn't be a problem.
# - Despite that there cannot be more than MAX_DELETIONS_PER_RUN users with the
# same deletion date, warning mails might be sent on the same date to users
# with varying deletion dates, since the 40-, 14-, 7- and 2-day warning mails
# can overlap temporally.
MAX_DELETIONS_PER_RUN = ENV.fetch("MAX_DELETIONS_PER_RUN").to_i

# The threshold for inactive users. Users who have not logged in for this time
# are considered inactive.
INACTIVE_USER_THRESHOLD = 6.months

# Returns all users who have been inactive for INACTIVE_USER_THRESHOLD months,
# i.e. their last sign-in date is more than INACTIVE_USER_THRESHOLD months ago.
#
# Users without a last_sign_in_at date are also considered inactive. This is
# the case for users who have never logged in since PR #553 was merged.
def inactive_users
User.where(last_sign_in_at: ...INACTIVE_USER_THRESHOLD.ago)
.or(User.where(last_sign_in_at: nil))
end

def logout
@imap.logout
# Returns all users who have been active in the last INACTIVE_USER_THRESHOLD months,
# i.e. their last sign-in date is less than INACTIVE_USER_THRESHOLD months ago.
def active_users
User.where(last_sign_in_at: INACTIVE_USER_THRESHOLD.ago..)
end

def search_emails_and_hashes
@email_dict = {}
@hash_dict = {}
@imap.examine(ENV.fetch("PROJECT_EMAIL_MAILBOX"))
# Mails containing multiple email addresses (Subject: "Undelivered Mail Returned to Sender")
@imap.search(["SUBJECT",
"Undelivered Mail Returned to Sender"]).each do |message_id|
body = @imap.fetch(message_id,
"BODY[TEXT]")[0].attr["BODY[TEXT]"].squeeze(" ")
# rubocop:disable Layout/LineLength
next unless (match = body.scan(/([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})[\s\S]*?([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})[\s\S]*?User has moved to ERROR: Account expired/))
# rubocop:enable Layout/LineLength

match = match.flatten.uniq
match.each do |email|
add_mail(email, message_id)

try_get_hash(body, email)
end
end
# Mails containing single email addresses (Subject: "Delivery Status Notification (Failure)")
# define array containing all used regex patterns
patterns = [
'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})>[\s\S]*?Unknown recipient',
# rubocop:disable Layout/LineLength
'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4})>[\s\S]*?User unknown in virtual mailbox table'
# rubocop:enable Layout/LineLength
]

@imap.search(["SUBJECT",
"Delivery Status Notification (Failure)"]).each do |message_id|
body = @imap.fetch(message_id,
"BODY[TEXT]")[0].attr["BODY[TEXT]"].squeeze(" ")
patterns.each do |pattern|
next unless (match = body.scan(/#{pattern}/))

match = match.flatten.uniq
match.each do |email|
add_mail(email, message_id)

try_get_hash(body, email)
end
# Sets the deletion date for inactive users and sends an initial warning mail.
#
# This method finds all inactive users whose deletion date is nil (not set yet)
# and updates their deletion date to be 40 days from the current date.
#
# The maximum number of deletion dates set in one run is limited by
# MAX_DELETIONS_PER_RUN.
def set_deletion_date_for_inactive_users
inactive_users.where(deletion_date: nil)
.limit(MAX_DELETIONS_PER_RUN)
.find_each do |user|
user.update(deletion_date: Date.current + 40.days)

if user.generic? # rubocop:disable Style/IfUnlessModifier
UserCleanerMailer.with(user: user).pending_deletion_email(40).deliver_later
end
end
end

def add_mail(email, message_id)
@email_dict = {} if @email_dict.blank?
if @email_dict.key?(email)
@email_dict[email] << message_id
else
@email_dict[email] = [message_id]
end
# Unsets the deletion date for users who have been active recently.
#
# This method finds all users whose deletion date is set and unsets it if the
# user has been active recently.
#
# Note that this method just serves as a safety measure. The deletion date
# should be unset after every successful user sign-in, see the Warden callback
# in `config/initializers/after_sign_in.rb`. If for some reason, the callback
# does not work, this method will prevent active users from being deleted
# as a last resort.
def unset_deletion_date_for_recently_active_users
active_users.where.not(deletion_date: nil).update(deletion_date: nil)
end

def try_get_hash(body, email)
@hash_dict = {} if @hash_dict.blank?
begin
hash = body.match(/Hash:([a-zA-Z0-9]*)/).captures
@hash_dict[email] = hash
rescue StandardError
nil
# Deletes all users whose deletion date is in the past or present.
#
# Technically, there should never be users with a deletion date in the past
# since the cron job is run daily and should delete users on the day of their
# deletion date. Should the cron job not run for some reason, we also delete
# users with a deletion date in the past via this method.
#
# The deletion date for the users must have been set beforehand by calling
# `set_deletion_date_for_inactive_users`.
#
# Even after having called this method, there might still exist users with a
# deletion date in the future, as we only delete generic users.
def delete_users_according_to_deletion_date!
num_deleted_users = 0

User.where(deletion_date: ..Date.current).find_each do |user|
next unless user.generic?

UserCleanerMailer.with(user: user).deletion_email.deliver_later
user.destroy
num_deleted_users += 1
end
end

def send_hashes
@emails = @email_dict.keys
@users = User.where(email: @emails)

@users.each do |user|
user.update(ghost_hash: Digest::SHA256.hexdigest(Time.now.to_i.to_s))
MathiMailer.ghost_email(user).deliver_now
move_mail(@email_dict[user])
end
Rails.logger.info("UserCleaner deleted #{num_deleted_users} stale users")
end

def delete_ghosts
# @hash_dict.each do |mail, hash|
# u = User.find_by(email: mail, ghost_hash: hash)
# move_mail(@email_dict[mail]) if u.present? && @email_dict.present?
# u.destroy! if u&.generic?
# end
end

def move_mail(message_ids, attempt = 0)
return if message_ids.blank?
# Sends additional warning mails to users whose deletion date is near.
#
# In addition to the initial warning mail 40 days before deletion, this method
# sends warning mails 14, 7 and 2 days before the account is deleted.
def send_additional_warning_mails
User.where.not(deletion_date: nil).find_each do |user|
next unless user.generic?

message_ids = Array(message_ids)
return if attempt > 3
num_days_until_deletion = (user.deletion_date - Date.current).to_i

begin
@imap.examine(ENV.fetch("PROJECT_EMAIL_MAILBOX"))
@imap.move(message_ids, "Other Users/mampf/handled_bounces")
rescue Net::IMAP::BadResponseError
move_mail(message_ids, attempt + 1)
if [14, 7, 2].include?(num_days_until_deletion)
UserCleanerMailer.with(user: user)
.pending_deletion_email(num_days_until_deletion)
.deliver_later
end
end
end

def clean!
# TODO: Implement new user cleaner logic
# login
# search_emails_and_hashes
# return if @email_dict.blank?
# Handles inactive users according to the deletion policy documented
# in the UserCleaner class description. Brief: users that haven't logged in
# to MaMpf for too long will be deleted.
def handle_inactive_users!
set_deletion_date_for_inactive_users
unset_deletion_date_for_recently_active_users
delete_users_according_to_deletion_date!

# send_hashes
# sleep(10)
# search_emails_and_hashes
# delete_ghosts
# logout
send_additional_warning_mails
end
end
18 changes: 18 additions & 0 deletions app/views/layouts/warning_mail_layout.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
<!DOCTYPE html>
<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<%= stylesheet_link_tag :email %>
</head>

<body>
<div class="container text-center">
<%= email_image_tag('/MaMpf-Logo_96x96.png', class: 'img-fluid mb-2 pt-4 pb-3' ) %>
<div class="jumbotron">
<%= yield %>
</div>
</div>
</body>

</html>
1 change: 0 additions & 1 deletion app/views/mathi_mailer/ghost_email.html.erb

This file was deleted.

10 changes: 10 additions & 0 deletions app/views/user_cleaner_mailer/deletion_email.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
<div class="container">
<div class="blockquote text-start">
<h3>
<%= t('mailer.deletion_subject') %>
</h3>
<p>
<%= t('mailer.deletion_body_html') %>
</p>
</div>
</div>
12 changes: 12 additions & 0 deletions app/views/user_cleaner_mailer/pending_deletion_email.html.erb
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
<div class="container">
<div class="blockquote text-start">
<h3>
<%= t('mailer.pending_deletion_subject',
num_days_until_deletion: @num_days_until_deletion) %>
</h3>
<p>
<%= t('mailer.pending_deletion_body_html',
num_days_until_deletion: @num_days_until_deletion) %>
</p>
</div>
</div>
13 changes: 13 additions & 0 deletions app/workers/user_cleaner.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
class UserCleanerJob
include Sidekiq::Worker

def perform
# Only run this job in production, not for mampf-experimental or mampf-dev.
# Note that Rails.env.production? is not sufficient in this context
# as both mampf-experimental and mampf-dev also run in production mode.
production_name = ENV.fetch("PRODUCTION_NAME", nil)
return unless production_name == "mampf"

Splines marked this conversation as resolved.
Show resolved Hide resolved
UserCleaner.new.handle_inactive_users!
end
end
8 changes: 0 additions & 8 deletions app/workers/user_cleaner_job.rb

This file was deleted.

5 changes: 5 additions & 0 deletions config/environments/production.rb
Original file line number Diff line number Diff line change
Expand Up @@ -78,4 +78,9 @@

# Do not dump schema after migrations.
config.active_record.dump_schema_after_migration = false

config.after_initialize do
production_name = ENV.fetch("PRODUCTION_NAME", nil)
Rails.logger.info("PRODUCTION_NAME: #{production_name}")
end
end
10 changes: 10 additions & 0 deletions config/initializers/after_sign_in.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
# Callback documentation for Warden (used by Devise):
# https://github.com/wardencommunity/warden/wiki/Callbacks#after_authentication
Warden::Manager.after_authentication do |user, _auth, _opts|
# User might have not logged in for a long time, which is why they have a
# deletion date set. If the user logs in, we unset this date to prevent the
# user from being deleted. See the UserCleaner class for more information.
# In case this callback does not work, a safety net is provided by the method
# `unset_deletion_date_for_recently_active_users` in the UserCleaner class.
user.deletion_date = nil
end
16 changes: 16 additions & 0 deletions config/locales/de.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3053,6 +3053,22 @@ de:
en: 'Englisch'
de: 'Deutsch'
mailer:
warning: "MaMpf Warning"
pending_deletion_subject: "%{num_days_until_deletion} Tage bis zur
Löschung Deines MaMpf-Accounts"
deletion_subject: "Account gelöscht"
pending_deletion_body_html: >
Wir haben festgestellt, dass Du Dich in den letzten 6 Monaten nicht in
MaMpf eingeloggt hast. Dein Account wird in %{num_days_until_deletion}
Tagen <strong>gelöscht, sofern Du Dich nicht vorher einloggst</strong>.
Bitte beachte, dass nach der Löschung Deines Accounts alle bei MaMpf
gespeicherten Daten von Dir unwiderruflich verloren gehen.
deletion_body_html: >
<strong>Dein MaMpf-Account wurde vollständig gelöscht, da Du Dich seit
mehr als 6 Monaten nicht eingeloggt hast.</strong> Beachte, dass wir zuvor
mehrere Erinnerungen gesendet haben. Alle Deine bei MaMpf gespeicherten
Daten sind unwiderruflich verloren gegangen. Vielen Dank, dass Du MaMpf
genutzt hast.
notification: 'MaMpf-Benachrichtigung'
notification_header: 'Benachrichtigung von MaMpf'
medium_subject: 'Neues Medium'
Expand Down
Loading