
Multiple proposed changes to resolve retriggering and increase configurability #7

Open · irakhlin wants to merge 18 commits into main

Conversation

@irakhlin commented Dec 2, 2023

Hello,
This is still a work in progress, but I made a couple of minor and a couple of larger changes across the script. Let me know if you like any of the ideas; I'm more than happy to clean them up, but in this current configuration it does seem to work better for me.

  1. Change the processing of each event to use a thread instead of a process. I see little benefit from a process, and it may just be harder to manage down the road.
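A minimal sketch of the thread-per-event idea, assuming a hypothetical `process_event` handler (the names here are illustrative, not the PR's actual identifiers):

```python
# Handle each Frigate event on a worker thread instead of spawning a
# separate process per event. The work is I/O-bound (HTTP calls to
# Frigate and the GPT API), so threads are sufficient.
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)

def process_event(event_id: str) -> str:
    # Placeholder for the real clip/snapshot + GPT processing.
    return f"processed {event_id}"

def on_event(event_id: str):
    # Submit to the pool instead of starting a multiprocessing.Process.
    return executor.submit(process_event, event_id)

future = on_event("1701500000.123-abc")
print(future.result())  # prints "processed 1701500000.123-abc"
```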

  2. Prevent "stationary" objects from re-triggering a new detection.

  3. Prevent "updated" but already processed events from being triggered again. My guess is the following code was meant to prevent this case:
    if "summary" in payload["after"] and payload["after"]["summary"]:
    Unfortunately I don't see how that check would ever be hit, as the summary is published to MQTT_SUMMARY_TOPIC but only MQTT_FRIGATE_TOPIC is subscribed.
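To illustrate why the guard can never fire (the topic values below are assumptions; only the variable names come from the script):

```python
MQTT_FRIGATE_TOPIC = "frigate/events"          # the only topic subscribed
MQTT_SUMMARY_TOPIC = "frigate/events/summary"  # where summaries are published

def already_summarized(payload: dict) -> bool:
    # The existing guard, as quoted above.
    after = payload.get("after", {})
    return "summary" in after and bool(after["summary"])

# Summaries only ever appear on MQTT_SUMMARY_TOPIC, so messages arriving
# on MQTT_FRIGATE_TOPIC never carry a "summary" field and this guard
# always returns False in practice.
```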

  4. There is a long-standing issue where a single 'object' will often trigger multiple, almost identical events. Frigate actually has a task that cleans these up in the UI, but the result I was seeing was that multiple identical events with different event_ids were being processed.

  5. The above issue, and the general problem of multiple or unneeded event processing, is handled by two global TTLCache dictionaries:

    • ongoing_tasks: allows a max size of 1000 "tasks" that have either been processed or blacklisted to be cached, with a TTL of 4 hours before eviction. The key is the event_id, giving us an easy way to prevent the same event from being processed twice.
    • recent_triggered_cameras: allows a max size of 20, with a TTL configured by the value of CAMERA_DEBOUNCE_TIME. The idea here is a configurable "cooldown" timer per camera. When an ongoing_tasks entry is added and processed, the camera_name is added as a key to recent_triggered_cameras. Because entries are auto-evicted after CAMERA_DEBOUNCE_TIME seconds, we can suppress processing of further events from that camera during the cooldown.
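The PR uses cachetools.TTLCache for both dictionaries; a minimal stdlib sketch of the same behavior (sizes mirror the description, and the 30-second debounce value is just an example):

```python
import time

class SimpleTTLCache:
    """Stdlib sketch of cachetools.TTLCache: bounded size + per-entry TTL."""

    def __init__(self, maxsize: int, ttl: float):
        self.maxsize, self.ttl = maxsize, ttl
        self._data = {}  # key -> (value, expiry timestamp)

    def __setitem__(self, key, value):
        self._evict()
        if key not in self._data and len(self._data) >= self.maxsize:
            # Drop the oldest entry when full (cachetools evicts similarly).
            self._data.pop(next(iter(self._data)))
        self._data[key] = (value, time.monotonic() + self.ttl)

    def __contains__(self, key):
        self._evict()
        return key in self._data

    def _evict(self):
        now = time.monotonic()
        self._data = {k: v for k, v in self._data.items() if v[1] > now}

# Mirrors the two caches described above:
ongoing_tasks = SimpleTTLCache(maxsize=1000, ttl=4 * 3600)     # keyed by event_id
recent_triggered_cameras = SimpleTTLCache(maxsize=20, ttl=30)  # ttl = CAMERA_DEBOUNCE_TIME

ongoing_tasks["1701500000.123-abc"] = "processed"
if "1701500000.123-abc" in ongoing_tasks:
    pass  # duplicate event_id: skip processing
```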
  6. I additionally saw a large number of errors on my Frigate server from API requests for a video clip before one was available, and when the script was able to get a clip, it was often too short or incomplete. For this reason I introduced the following options (these are not meant to be used all together; they offer a few choices for how the event is processed):

    • process_when_complete: if enabled, the event is only processed once Frigate sets the event type to "end".
    • process_clip: enabled by default; this is the current method of obtaining events, by pulling the video clip and splitting it into frames. If this is used, a new config value (event_wait_time: GAP_SECS * 3) sets how long to wait before trying to pull the clip. If it is enabled together with process_when_complete, the wait is skipped because the clip should be mostly complete.
    • process_snapshot: disabled by default, but this is the additional processing method I have been using. Frigate provides a snapshot endpoint that allows pulling jpg stills of the running event, and it accepts a resize parameter. This is used in conjunction with the max_frames_to_process setting to loop and wait until the needed number of stills is obtained before making the GPT API request. The added benefit is that the images are kept in memory and never written to disk. This option is very useful for processing an event that has NOT yet ended.
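A sketch of the in-memory snapshot loop, using only the stdlib. The base URL, polling interval, and exact endpoint query string are assumptions; only the setting names come from the PR:

```python
import time
import urllib.request

FRIGATE_URL = "http://frigate:5000"  # assumed base URL
RESIZED_FRAME_SIZE = 480             # maps to resized_frame_size
MAX_FRAMES_TO_PROCESS = 4            # maps to max_frames_to_process

def snapshot_url(event_id: str) -> str:
    # Frigate serves stills of a running event here; `h` resizes the jpg.
    return f"{FRIGATE_URL}/api/events/{event_id}/snapshot.jpg?h={RESIZED_FRAME_SIZE}"

def collect_snapshots(event_id: str, interval: float = 1.0) -> list:
    frames = []  # jpg bytes kept in memory, never written to disk
    while len(frames) < MAX_FRAMES_TO_PROCESS:
        try:
            with urllib.request.urlopen(snapshot_url(event_id), timeout=5) as resp:
                frames.append(resp.read())
        except OSError:
            pass  # snapshot not ready yet; keep polling
        time.sleep(interval)
    return frames
```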
  7. Added configurable parameters for some additional values you can check out in the sample config, including: resized_frame_size, max_response_token, max_frames_to_process, min_frames_to_process, low_detail, update_frigate_sublabel.

  8. I added a small system message to the GPT API request and slightly modified how the prompt is generated. I was seeing an issue where GPT responses were not consistently valid JSON; with these changes it seems to be a bit better.
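The system-message approach could look roughly like this; the wording and the `build_messages` helper are illustrative assumptions, not the PR's actual text:

```python
# Steer the model toward valid JSON with an explicit system message,
# since JSON mode is not available for the vision model.
def build_messages(frames_description: str) -> list:
    system = (
        "You are a security event summarizer. "
        "Respond ONLY with a JSON object, with no prose or markdown fences."
    )
    user = f"Describe these surveillance frames: {frames_description}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_messages("4 frames from camera 'driveway'")
```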

@mhaowork (Owner) commented Dec 3, 2023

Thanks for the thoughtful changes!

1/ 👍 RE: why it used processes instead of threads: originally, video decoding was handled within the process, so the process model was better since it's CPU-intensive. But the decoding work has since been offloaded to ffmpeg (commit e4fcc6c) as a subprocess, so the difference doesn't matter anymore.
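A sketch of the ffmpeg-as-subprocess approach: the heavy decoding runs in the child process, so a Python thread per event is enough. The flags shown are standard ffmpeg options, but the actual command in commit e4fcc6c may differ:

```python
import subprocess

def ffmpeg_frame_cmd(clip_path: str, out_pattern: str, fps: float = 0.5) -> list:
    # Extract one frame every 1/fps seconds from the event clip.
    return [
        "ffmpeg", "-i", clip_path,
        "-vf", f"fps={fps}",
        "-y", out_pattern,
    ]

def extract_frames(clip_path: str, out_pattern: str = "frame_%03d.jpg") -> None:
    # CPU-intensive decoding happens in the child process, not this thread.
    subprocess.run(ffmpeg_frame_cmd(clip_path, out_pattern), check=True)
```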

3/ This was to prevent a feedback loop: a message containing a summary would be received by the script itself and re-processed without an explicit check. I know this seems like a dumb issue, but maybe there's a better fix? I dunno yet.

4 & 5/ Love this. I experienced the issue as well.

6/ The additional modes sound great! One small suggestion would be combining some of the parameters for simplicity, e.g.
process_mode: snapshot OR clip

7/ Not exactly sure what these mean. But I'd recommend breaking them into another PR if possible. Smaller PRs make discussions easier IMO.

8/ 👍. I ran into the bad JSON problem too. OpenAI's newly introduced JSON mode isn't available for the GPT-4V model yet, unfortunately.

I went over your diffs briefly and I think they make sense overall. But your base branch isn't up to date, which makes it a bit harder to review. It'd also be great if you could break this PR down into smaller chunks if possible, because some of these changes are clearly great and ready to be merged while some need more discussion.

Thank you again!
