DatasetMaker

An app version of the DatasetMaker by Hollowstrawberry

Installation

You need to install python 3.10.9 in order for this program to work.

Then download this repo, you can clone it https://github.com/Ganzosupremo/DatasetMaker.git. Or download it as a zip file.

That's it.

How to Use

Double click the launch_app.bat file, it should open a terminal and start the app.

This app has 5 different tabs. I'll explain what each tab does, you can get more information inside the app.

Project Setup

This tab is for setting up your project name, this will create a folder with the name of your chosing at the Pictures directory inside a Lora folder. The structure is like so ./Pictures/Loras/project_name. Btw you can specify subfolders with the same project name like so: /project_name/subfolder_name. The images will be saved in this subfolder.

Scrape Images

We'll grab the images from Gelbooru. Just input the tags you want in the first field, and the app will do the rest. You can also specify how many images you want to download in total in the second field.

Image Curation

Next, we'll look for duplicated images and delete them from the dataset, you can adjust how similar two images must be for deletion with the slider.

Tagging Images

Now, we'll tag the images, you can use waifu diffusion and blip captioning, depending on the type of your images. Waifu diffusion is used for anime and blip captioning is usually used for photos. You can specify the batch size on the first field.

In the second field you can specify how many tags will be showed to you in the UI after tagging, put 0 for every single one.

In the third field, you can blacklist the tags you don't want in your images, leave empty for default blacklisted tags. With the slider you can adjust the level of confidence to add a tag, higher confidence means more tags.

Curate Dataset's Tags

In this tab you can input an activation tag for your dataset as well remove the tags you don't want, like for example generic tags, i.e. hair color, hair length, eyes color, etc.

Extras

In this tab you can analyze the tags in your dataset, count how many files you have in your dataset and delete the non-image files that could be found in your dataset.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
.idea		.idea
blip		blip
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app_gui.py		app_gui.py
caption_dataset.py		caption_dataset.py
curate_dataset_tags.py		curate_dataset_tags.py
curate_images.py		curate_images.py
extras.py		extras.py
icon.ico		icon.ico
launch_app.bat		launch_app.bat
main.py		main.py
make_captions.py		make_captions.py
requirements.txt		requirements.txt
scrape_images.py		scrape_images.py
tag_images_by_wd14_tagger.py		tag_images_by_wd14_tagger.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DatasetMaker

Installation

How to Use

Project Setup

Scrape Images

Image Curation

Tagging Images

Curate Dataset's Tags

Extras

That's IT

About

Releases

Packages

Languages

License

Ganzosupremo/DatasetMaker

Folders and files

Latest commit

History

Repository files navigation

DatasetMaker

Installation

How to Use

Project Setup

Scrape Images

Image Curation

Tagging Images

Curate Dataset's Tags

Extras

That's IT

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages