Create dataset from scratch #1

Open
kYroL01 opened this issue May 3, 2016 · 6 comments

Comments

@kYroL01
Contributor

kYroL01 commented May 3, 2016

First of all, thank you @HamedMP for the huge work.
In my opinion, a nice way to extend ImageFlow would be to add the ability to create a new dataset from scratch.
It could be very useful to anyone who needs to build their own CNN for image recognition that is not based on a pre-built dataset (e.g. CIFAR, MNIST or ImageNet).
What do you think?

@HamedMP
Owner

HamedMP commented May 3, 2016

Thank you @kYroL01

Actually, that is the reason I created ImageFlow: to enable you to work with your very own data. That was the problem in my case, so I built this solution (not as complete as I wanted) and decided to publish it.

For example, we have used this library for a car detector problem, where both the dataset and the problem were entirely our own.

I would be happy if you could explain in more detail what it lacks that makes it difficult to use with your dataset.

@kYroL01
Contributor Author

kYroL01 commented May 3, 2016

Thanks for the fast reply.
Unfortunately I have not explored ImageFlow in depth yet, so sorry if I say something wrong.
Let me try to explain myself better:
I'm creating my own CNN to recognize three categories of images, and I have to create my own dataset.
I followed the TensorFlow tutorial for creating the model, and it was useful for understanding how to create a CNN based on an existing model.
The problem is that I can't find any reasonable, illustrative tutorial on how to create a personal dataset, so while looking on GitHub I found your nice project.
But as I read here, you convert a directory that contains images AND LABELS, whereas I have no labels at the beginning, just images in three subfolders (my categories).
After that, your library is very useful for converting the images and labels into data tensors to pass to the TensorFlow model.

I also read this, but it is not 100% of what I need.

Any suggestions? I hope I was clear.

Thanks

@HamedMP
Owner

HamedMP commented May 3, 2016

Your categories are your LABELS. Since training a CNN is supervised learning, you have to provide it with labels so that it can measure its errors and improve.

So if you have 3 categories, you should assign labels to them, e.g. 1, 2, 3 or A, B, C, and convert everything to tfrecords if you want to enjoy the queueing features.
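
To illustrate the label assignment described above, here is a minimal sketch in plain Python. The helper name `build_label_map` and the example category names are hypothetical, not part of ImageFlow. It numbers the categories from 0 in sorted order, which is an assumption on my side; starting at 1, as in the example above, would work just as well as long as the mapping stays consistent.

```python
def build_label_map(category_names):
    """Map each distinct category name to an integer id, in sorted order."""
    unique_names = sorted(set(category_names))
    return {name: index for index, name in enumerate(unique_names)}


# Hypothetical category names, used only for illustration.
label_map = build_label_map(["cats", "dogs", "birds"])
print(label_map)  # {'birds': 0, 'cats': 1, 'dogs': 2}
```

Sorting the names first keeps the mapping stable between runs, so the same category always gets the same integer.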

@kYroL01
Contributor Author

kYroL01 commented May 3, 2016

Yes, of course, I understand that my categories are the labels.
My question is: how can I build my dataset so that it says "this category is my label"?
Once that is done, I can convert the images and labels using ImageFlow.
The problem is at the very beginning.

@HamedMP
Owner

HamedMP commented May 3, 2016

It can be done with a simple bit of programming.

For example, read the full path of each image and put it into images_array.
At the same time, read the name of the folder that contains the image, let's say CAT1, and append it to another array named labels_array.

You can do this in Python by getting the full path of each image, splitting it on '/', and accessing returned_array[-2], which is the folder name; returned_array[-1] is the file name.
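
The approach above could be sketched roughly as follows, using only the Python standard library. The function name `collect_images_and_labels`, the list of image extensions, and the `dataset` directory name are assumptions for illustration; none of them come from ImageFlow. The sketch splits on `os.sep` instead of a hard-coded '/' so that it also works on Windows.

```python
import os

# Assumed set of image file types; adjust to your own data.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def collect_images_and_labels(root_dir):
    """Walk root_dir and return (images_array, labels_array).

    Each label is the name of the folder that directly contains the image,
    i.e. the element at index -2 of the split path.
    """
    images_array = []
    labels_array = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in sorted(filenames):
            if not filename.lower().endswith(IMAGE_EXTENSIONS):
                continue  # skip anything that is not an image
            full_path = os.path.join(dirpath, filename)
            returned_array = os.path.normpath(full_path).split(os.sep)
            images_array.append(full_path)
            labels_array.append(returned_array[-2])  # [-1] is the file name
    return images_array, labels_array


if __name__ == "__main__":
    # "dataset" is a hypothetical folder with one subfolder per category.
    images, labels = collect_images_and_labels("dataset")
    for path, label in zip(images, labels):
        print(label, path)
```

Note that an image placed directly in the root folder would be labeled with the root folder's name, so this sketch assumes every image sits inside a category subfolder.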

@kYroL01
Contributor Author

kYroL01 commented May 3, 2016

OK, I'll try. I think I understand.
And, regarding my initial question, do you think this is a feature you could add to ImageFlow?

Thank you for the suggestion and for the time spent.


2 participants