Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

media rating sync #1681

Merged
merged 12 commits into from
Jan 24, 2024
Merged

Conversation

whatdoineed2do
Copy link
Contributor

@whatdoineed2do whatdoineed2do commented Nov 3, 2023

Closes #1678

  • read rating from local media on scan
  • new library api to automatically update local files when rating changes

Test:

# create dummy test files
cd /svr/Music
ffmpeg -f lavfi -i "sine=frequency=500:duration=5" -af "afade=t=out:st=3:d=2" -ac 2 -ar 48000 foo.mp3
ffmpeg -f lavfi -i "sine=frequency=500:duration=5" -af "afade=t=out:st=3:d=2" -ac 2 -ar 48000 foo.flac
ffmpeg -f lavfi -i "sine=frequency=500:duration=5" -af "afade=t=out:st=3:d=2" -ac 2 -ar 48000 bar.mp3
ffmpeg -f lavfi -i "sine=frequency=500:duration=5" -af "afade=t=out:st=3:d=2" -ac 2 -ar 48000 bar.m4a

# wait for server to find files

# edge case test
chmod 000 foo.mp3

# set rating, m4a will not  work
# will see in logs:
#     scan: Updating rating to 60 on '/.../bar.mp3'
#     scan: No permissions to update metadata on '/.../foo.mp3' - skipping
#     scan: Unsupported metadata (aac) update for 'rating' on '/.../bar.m4a' (aac) - skipping
#     scan: Unsupported media (url) to update metadata on 'https://....' (id=1251)
#     scan: No known DB entry for media, '99999' to update metadata
IDS=$(sqlite3 /.../songs.db  'select id from files where path like "%bar.%" or path like "%foo.%"')

for ID in ${IDS}; do 
  curl -XPUT http://localhost:3689/api/library/tracks/${ID}?rating=60
done

# verify rating value
for i in foo.mp3 foo.flac bar.mp3 bar.m4a; do ffmpeg -hide_banner -i $i; done

# verify server inotifications still work, will see usual "cache: DAAP cache updated"
touch bar.mp3

# verify handling of non local streams - will see in log:  scan: Ignoring non local media (...) requested for rating write
ID=$(sqlite3 /.../songs.db  'select id from files where path like "http%" limit 1')

curl -XPUT http://localhost:3689/api/library/tracks/${ID}?rating=60

@whatdoineed2do
Copy link
Contributor Author

@ejurgensen - can you provide feedback on the inotify handling part? This seems to do what we want with specific inotify removal per file being updated.

I did find an (odd?) situation with that my testing db inotify tbl has less lines than my files tbl and i dont know what causes this or if this is normal.

@ejurgensen
Copy link
Member

I have received a row of notifications for amendments to this PR, please don't create PRs for work in progress. It's of course fine to ask for feedback on a draft, but then mark it as a draft and continue working on another branch.

@ejurgensen ejurgensen marked this pull request as draft November 5, 2023 20:18
@whatdoineed2do
Copy link
Contributor Author

i'm done with changes

@whatdoineed2do
Copy link
Contributor Author

One more fix for inotify removal but this should be ready for review

@whatdoineed2do whatdoineed2do marked this pull request as ready for review November 7, 2023 20:33
@ejurgensen
Copy link
Member

I still don't understand the purpose of the extra endpoint? Is it because you don't want to write the metadata automatically?

@whatdoineed2do
Copy link
Contributor Author

Yes, didnt want auto update. Auto updating at the db update seems too low level to also manipulate the media file esp if there are errors to report.

Also it may not be something a user wants or enables if media is on a read only filesystem (NFS mount like one of my RPis) so I choose the option that required a user request for this to happen.

We could use the existing endpoint which accepts rating updates in jsonapi_reply_library_tracks_put_byid() when a user can force an rating update but this makes it asymetrical for aglorithmic rating updates - ie the media file still becomes out of snyc with what the server/db maintains.

@ejurgensen
Copy link
Member

I think the sync should be automatic, but with a setting/config option that makes it disabled by default. The implementation that does the update should be in the library layer (either an extension of filescanner_ffmpeg or separate module), so not in the API.

@whatdoineed2do
Copy link
Contributor Author

Let me know if this is more what you're after.

The rating update to the media file happens after the db rating is applied and iterates for all the library sources asking them to sync metadata, where (atm) only file scanner implements this. All ffmpeg related items are in filescanner_ffmpeg

I'll cleanup the commits if you're ok with this approach.

To test, find an id of the local supported media file (mp3 or flac):

curl -XPUT http://localhost:3689/api/library/tracks/1239?rating=60

I haven't quited tested the mpd auto-rating update but its through the same path using the virtual_path supplied

@ejurgensen
Copy link
Member

Yes, this structure looks good. Have you looked into if there are any interfaces that update the rating via db_file_update()? I guess that would be challenging to catch and sync.

@whatdoineed2do
Copy link
Contributor Author

db_file_update() used in

  • spotify
  • rss scanner
  • itunes
  • REST endpoint

Looking at these it doesnt makes sense for the remote items (top 3 in list) to try to update remote media.

The REST endpoint may be the only one of interest (jsonapi_reply_library_tracks_put() library_media_save() -> db_file_update()) so i've kept it out of the db layer and left that change to the library layer with an additional commit - looking at the web ui, i dont see rating udpates being sent this way (at the moment)

src/library.c Outdated Show resolved Hide resolved
@whatdoineed2do
Copy link
Contributor Author

@ejurgensen - any more thoughts on this PR? Otherwise I'll squash the commits for the merge

@ejurgensen
Copy link
Member

Overall it looks fine, but I need to find time for a more detailed review. You're welcome to squash.

owntone.conf.in Outdated Show resolved Hide resolved
src/conffile.c Outdated Show resolved Hide resolved
src/db.c Show resolved Hide resolved
src/db.c Outdated

ret = db_query_run(query, 1, 0);

if (ret == 0)
{
db_admin_setint64(DB_ADMIN_DB_MODIFIED, (int64_t) time(NULL));
listener_notify(LISTENER_RATING);

if (cfg_getbool(cfg_getsec(cfg, "library"), "rating_sync"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of this, the db module should generally not be calliing into the library, and especially not the sources. I see there is a LISTENER_RATING, couldn't that be used?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see there are calls to db_file_rating_update_xxx() in dacp, json and mpd. Ideally, they should probably be replaced with library_rating_update_xxx(), and then the library can call the db.

src/library.c Outdated Show resolved Hide resolved
src/library/filescanner.c Outdated Show resolved Hide resolved
src/library/filescanner.c Outdated Show resolved Hide resolved
src/library/filescanner.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Show resolved Hide resolved
Copy link
Member

@ejurgensen ejurgensen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's some more comments. If I understand the implementation correctly, it involves rewriting the entire file, which would seem a bit risky. Is that the only way to save metadata with ffmpeg?

src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
src/library/filescanner_ffmpeg.c Outdated Show resolved Hide resolved
@whatdoineed2do
Copy link
Contributor Author

Here's some more comments. If I understand the implementation correctly, it involves rewriting the entire file, which would seem a bit risky. Is that the only way to save metadata with ffmpeg?

Yes, this appears to be correct. not seen any code to suggest otherwise and its how ffmpeg command line does it too (separate input and output files)

@whatdoineed2do whatdoineed2do force-pushed the db-media-rating-sync branch 2 times, most recently from 790b6d4 to b73f8b3 Compare December 2, 2023 19:23
@whatdoineed2do
Copy link
Contributor Author

@ejurgensen - let me know if you have any additional comments on the review fixes, otherwise i'll squash the commits in prep for merge

@ejurgensen
Copy link
Member

You're welcome to do that. I will go through it again and just make final edits myself before merging.

@ejurgensen
Copy link
Member

I started going through this, but there are a few things in filescanner_ffmpeg_write_rating() I couldn't understand:

  1. There is check that the format supports metadata updates, but the actual implementation doesn't check format (container), it checks codecs. That seems strange. What's the reason for that? Did you take this check from somewhere?
  2. A few lines after, safe_snprintf_cat() is used, but not for concatenation, since rating is just an empty string at that point. Why isn't snprintf used?
  3. Why isn't the temp file being stored in /tmp?

You don't need to make any fixes concerning these, I'm fixing it up anyway. Just need to understand the above.

@whatdoineed2do
Copy link
Contributor Author

1. There is check that the format supports metadata updates, but the actual implementation doesn't check format (container), it checks codecs. That seems strange. What's the reason for that? Did you take this check from somewhere?

do it based on the audio codec but this is prob suboptimal to check against codec vs container - however neither AVFormatContext and AVInputFormat give a what appears to be clean what to determine container format other than C-strings unless I've missed something on the libavformat api that could do this cleaner.

2. A few lines after, safe_snprintf_cat() is used, but not for concatenation, since rating is just an empty string at that point. Why isn't snprintf used?

yes, this looks like an oversight when sprintf will do

3. Why isn't the temp file being stored in /tmp?

I am using ffmpeg to write the file and then using rename(2) to replace the original file - the problem with rename() is that it expects the file move to be on the same device/filesystem, and /tmp and the library is not necessarily the case - creating file /tmp leads to EXDEV which would need me to manually byte copy a file to the target location: example below:

scan: Failed to replace library rating file '/export/public/Music/test/bar.mp3' with temp '/tmp/PVGFYw.metadata' - Invalid cross-device link

@ejurgensen
Copy link
Member

Thanks for the info. Locally, I’ve made the adjustments that I want to include before merging (plus fixed the conflicts), but how/when to trigger the write to the files is giving me headaches. I’m not happy with the PR's hooks in db_file_rating_update() and library_media_save(). In additon to being somewhat ugly, I think the hook in db_file_rating_update means that the write will be carried out in the httpd and mpd threads? If so, that’s no good.

A better solution should also work when the rating updates happen via db_file_inc_playcount etc, so that there is no chance that the ratings in the db and in the file diverge.

However, all the alternatives that I can think of also have their problems, so for now I have to think some more about it.

@whatdoineed2do
Copy link
Contributor Author

I noticed that updated files don't have the same owner and permissions as the original, and I don't think that's ok. Can you try to find a solution for that?

Let me take a look over the next few days. Unless we're running as root this might not be fixable (any one can add files to the library) - there was a check on the ownership of the file at the start of the rating update method so the server euid doesn't have write permissions, it bails out but yes file mask needs to check.

I also noticed that my updated file was smaller than the original, which concerns me a bit. Is something being discarded? I couldn't see what that might be, though.

I seem to remember fixing a bug in TagLib a few yrs ago around metadata, depending on what wrote the original, coulld have metadata overallocated (a hole of additional updates instead need to rewrite the entire file) - perhaps this is what is being clawed back by ffmpeg?

The most important test for me was validating that the audio stream hash is identical between the original and the new file; this was done via: ffmpeg -hide_banner -i foo.mp3 -c:a copy -bsf:a null -f hash -. Using exiftool -v3 will also display the file offsets of various blocks that might be able to show anything odd.

@whatdoineed2do
Copy link
Contributor Author

rebased with master and updated for ownership/file mask although this is done as a best efforts ...

To ensure the ownership of the replacement file is correct, the server needs to be running as root but the install/setup seem to default to owntone (as per config) and the server drops privs to the config value as it starts even if started as root. Regardless, i've tested this with the general.uid = root and works as expected

@ejurgensen
Copy link
Member

Looks a bit too hacky to me, and like you say, it won't work in some cases. There must be a cleaner way of doing this.

@whatdoineed2do
Copy link
Contributor Author

The alternative to chmod is to set umask to match st_mode ahead of mkstemp but the ownership item will still be a problem (can anyone other than root change ownership anyway?)

I went with the commit approach to keep the matching logic together.

@ejurgensen
Copy link
Member

ejurgensen commented Dec 22, 2023

I suggest you look into how a normal tagging app like kid3 does this. Or I guess any app that edits a file. I can't imagine that they mess around with chown/chmod/umask.

Btw, start by just sharing possible ideas. Don't push anything to the branch until we have agreed.

@whatdoineed2do
Copy link
Contributor Author

whatdoineed2do commented Dec 22, 2023

No unfortunately, we have two options already outlined since there are no other unixy provisions for what you're requiring:

  • file mask - set umask(2) ahead of the file creation or chmod(2)
  • file ownership - process must have effective uid as root and then we must chown(2) the resulting file - this is the ONLY way to change ownership of file by the process; there is no system call available that lets us create a file 'as another user'.

To verify this, you can see what mirroring utils (rsync/tar/install as root) do: mkdir /tmp/dir ; touch /tmp/foo.txt ; strace install -owner nobody /tmp/foo.txt /tmp/dir - the final operations are a lchown() followed by a chmod()

I would take this approach and match most utils in unix: if you copy a file under unix, the resulting file will belong to you - even as superuser (excluding mirroring rsync, tar, install..) the duplication makes no provisions to reflect ownership of original (or even file mask). We should certainly ensure the filemask (on fedora, root user creates files as 0600) which means we replace the last commit with

stat(original_file, &ts);
orig_mask = umask(st.st_mode & 0777);
mkstemp(...);
umask(orig_mask);

and not do anything with ownership unless we go with more or less what is already in the last commit.

EDIT:
Took a quick look a kid3 which is C++ - i dont use it so can't comment on the files generated; however as a quick look through it, I see uses tablig and id3tag - the former uses fopen(3C) for handling its files and the latter uses std::ofstream - neither have any handling of permissions and thus, relies on the existing process umask.

https://github.com/taglib/taglib/blob/1a1ee8b54fa0a130f2b6dc9b629b7dea970b3eb7/taglib/toolkit/tfilestream.cpp#L100..108.

@whatdoineed2do
Copy link
Contributor Author

actually, fchmod() is a little cleaner than umask() that gives potential race condition:

  fd = mkstemp(new_rating_file);
  if (fd < 0)
    {
      DPRINTF(E_WARN, L_SCAN, "Failed to create temp rating file '%s': %s\n", new_rating_file, strerror(errno));
      ret = -1;
      goto end;
    }

  // best efforts: use existing mode, otherwise something sensible and if that
  // fails, it'll be the process's _current_ umask
  fchmod(fd, stat(in_fmt_ctx->url, &st) == 0 ? st.st_mode & 0777 : 0644);

  ret = file_write_rating(in_fmt_ctx, new_rating_file);

ownership change still problem for outlined previous

@whatdoineed2do
Copy link
Contributor Author

@ejurgensen - any conclusive thoughts on this based on my comments above:

The additional commit would be as follows

@@ -20,6 +20,7 @@
 # include <config.h>
 #endif
 
+#include <sys/stat.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -956,6 +957,7 @@ write_metadata_ffmpeg(struct media_file_info *mfi)
   int max_rating;
   int file_rating;
   int fd = -1;
+  struct stat st;
   int ret;
 
   // ffmpeg's metadata update are limited - some formats do not support rating
@@ -1008,6 +1010,14 @@ write_metadata_ffmpeg(struct media_file_info *mfi)
       goto end;
     }
 
+  ret = stat(in_fmt_ctx->url, &st);
+  if (ret == 0)
+    {
+      // best efforts...
+      fchmod(fd, st.st_mode & 0777);
+      fchown(fd, st.st_uid, st.st_gid);
+    }
+
   ret = file_write_rating(in_fmt_ctx, new_rating_file);
   if (ret < 0)
     {

@ejurgensen
Copy link
Member

I'll have a look at it

@ejurgensen
Copy link
Member

I think the way to go about this is to just write directly to the media file. So abandon writing to a copy and so also abandon renaming/replacing/chmod-ing. It will require an extra file copy, because the source file first has to be copied to tmp file to be read from, but I think it will be more clean. I will add a commit to the PR with this.

@ejurgensen
Copy link
Member

ejurgensen commented Dec 30, 2023

I've made an update to the PR with the different method of updating the file. This method should also mean that the file creation time remains correct, which was also an issue with the other method.

The disabling of inotify might also not be necessary any more.

I won't merge right now, it will be after the next release, so in a few weeks.

@whatdoineed2do
Copy link
Contributor Author

i've added one more commit to this - the sendfile() copies bytes but as you observed previosuly the resulting new file may be smaller size. the ftruncate() ensures theres not junk at the in place overwritten file

@ejurgensen
Copy link
Member

ejurgensen commented Jan 1, 2024

It's not really required with the changed method, since fast_copy() isn't overwriting anything, it's just making a copy of the source. Then ffmpeg does the overwriting from that copy. However, maybe I will change it so that it's ffmpeg first, and then have fast_copy() overwrite to the file. Then ftruncate() will indeed be needed.

whatdoineed2do/Ray and others added 12 commits January 23, 2024 23:17
…dia update changes (at this time 'rating')

Implemented for filescanner via ffmpeg
…on - WIP

Need to find a better solution for the hook
Doesn't catch automatic rating updates, but maybe better to exclude them since
they can lead to whole-playlist file rewriting.
The previous method of writing to a copy and then renaming doesn't assure
that file ownership and mask is preserved.
Not really necessary with the new way of writing the rating update (no risk of
looping).
@ejurgensen ejurgensen force-pushed the db-media-rating-sync branch from 7fd0b97 to ffa2986 Compare January 23, 2024 22:37
@ejurgensen ejurgensen merged commit 2dc448f into owntone:master Jan 24, 2024
4 checks passed
@ejurgensen
Copy link
Member

Squashed and merged now, thanks for the contribution. As you might notice, I removed the disabling of inotify, since with the current solution it isn't really needed. Of course, the file will now be rescanned when the rating is written, but there won't be a loop.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ID3 rating tag sync
2 participants