Hello, may I ask if ATST Frame can publicly disclose the models and scripts used for inference? #12

Open
Angelalilyer opened this issue Jun 28, 2024 · 12 comments · May be fixed by #13

@Angelalilyer

I only found the training code, and my test dataset is unlabeled. I would like to use ATST-Frame to detect sound events in a new dataset. Do you have inference code and models that you could share? Looking forward to your reply!

@lmaxwell
Contributor

Hi,
Currently, only the self-supervised pretrained checkpoints are public in this repo, but we are happy to make the fine-tuned checkpoint public.
Since you are targeting sound event detection, do you mean the checkpoint and inference code for ATST-Frame fine-tuned on strongly-labeled AudioSet?

@Angelalilyer
Author

Thanks for your reply!
Yes, that is exactly what I need! When will you release the checkpoint and inference code for ATST-Frame fine-tuned on strongly-labeled AudioSet?

@SaoYear
Member

SaoYear commented Jul 3, 2024

Hi, I will let you know once the upload has finished ; )

@SaoYear
Member

SaoYear commented Jul 4, 2024

@Angelalilyer You could try this checkpoint file, hope it helps!

@Angelalilyer
Author

> @Angelalilyer You could try this checkpoint file, hope it helps!

Thank you so much!!

@SaoYear SaoYear closed this as completed Jul 8, 2024
@Angelalilyer
Author

> @Angelalilyer You could try this checkpoint file, hope it helps!

Is complete inference code available? I tried to modify "audiossl/audiossl/methods/atstframe/downstream/train_strong.py", but it kept reporting errors.
Sorry to bother you again~

@lmaxwell
Contributor

lmaxwell commented Jul 9, 2024

Did you solve the problem?

@Angelalilyer
Author

> Did you solve the problem?

I tried to write the inference code myself, but I couldn't get it to output predicted labels. Is there relatively complete inference code for ATST-Frame (AudioSet strong labels)? Thanks again for your help!

@lmaxwell lmaxwell linked a pull request Jul 9, 2024 that will close this issue
@lmaxwell
Contributor

lmaxwell commented Jul 9, 2024

I wrote a quick solution in a new pull request, #13. Can you test it?
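
The code in #13 is not reproduced in this thread. As a generic illustration only (not the code from #13, and not necessarily how ATST-Frame post-processes its output), per-frame class probabilities from a sound event detection model can be turned into labeled events by thresholding. The function name, the `threshold` value, the `hop_seconds` frame spacing, and the `/m/example` label below are all hypothetical choices made for this sketch.

```python
def frames_to_events(probs, labels, threshold=0.5, hop_seconds=0.04):
    """Turn per-frame class probabilities into (label, onset, offset) events.

    probs: list of frames, each a list with one probability per label.
    threshold and hop_seconds are illustrative assumptions, not repo values.
    """
    events = []
    for c, label in enumerate(labels):
        start = None
        for t, frame in enumerate(probs):
            active = frame[c] >= threshold
            if active and start is None:
                start = t  # event begins at this frame
            elif not active and start is not None:
                events.append((label, start * hop_seconds, t * hop_seconds))
                start = None  # event ended on the previous frame
        if start is not None:  # event is still active at the last frame
            events.append((label, start * hop_seconds, len(probs) * hop_seconds))
    return events


# Toy input: 4 frames, 2 classes ("/m/example" is a made-up label).
probs = [[0.1, 0.9], [0.7, 0.8], [0.6, 0.2], [0.2, 0.1]]
events = frames_to_events(probs, ["/m/0c1tlg", "/m/example"], hop_seconds=1.0)
print(events)
```

In practice a smoothing step such as a median filter is often applied before thresholding; it is left out here to keep the sketch short.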

@lmaxwell lmaxwell reopened this Jul 9, 2024
@Angelalilyer
Author

> I wrote a quick solution in a new pull request, #13. Can you test it?

Thanks!!
I can run this code, but some labels in the "Labels" list have no corresponding names in the "ontology.json" of strongly-labeled AudioSet, for example "/m/0c1tlg".

@SaoYear
Member

SaoYear commented Jul 9, 2024

The strong AudioSet includes some extra MIDs that are not in the original AudioSet ontology. You can refer to the official page and download the mid_to_display_name.tsv file.

According to the file, /m/0c1tlg refers to Electric rotor drone, quadcopter.
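
A minimal sketch of that lookup, assuming mid_to_display_name.tsv holds one MID and one display name per line, separated by a tab. The helper names are hypothetical, and the in-memory sample stands in for the real file; only the drone row is taken from this thread.

```python
import csv
import io


def load_mid_names(tsv_file):
    """Read (MID, display name) rows from a tab-separated file object."""
    names = {}
    # QUOTE_NONE keeps any quote characters in display names as literal text.
    for row in csv.reader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 2:  # skip blank or malformed rows
            names[row[0].strip()] = row[1].strip()
    return names


def to_display_names(mids, names):
    """Map each MID to its display name, keeping the raw MID if unknown."""
    return [names.get(mid, mid) for mid in mids]


# In-memory stand-in for the downloaded file.
sample = io.StringIO("/m/0c1tlg\tElectric rotor drone, quadcopter\n")
names = load_mid_names(sample)
print(to_display_names(["/m/0c1tlg", "/m/unknown"], names))
```

With the downloaded file, the same function can be called on `open("mid_to_display_name.tsv", newline="", encoding="utf-8")`. Falling back to the raw MID makes any label that is still unmapped easy to spot in the output.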

@Angelalilyer
Author

> The strong AudioSet includes some extra MIDs that are not in the original AudioSet ontology. You can refer to the official page and download the mid_to_display_name.tsv file.
>
> According to the file, /m/0c1tlg refers to Electric rotor drone, quadcopter.

Thank you very much for your help! The problem has been solved~
