Create dataset from scratch #1

kYroL01 · 2016-05-03T13:17:29Z

First of all, thank you @HamedMP for the huge amount of work.
In my opinion, a nice feature to extend ImageFlow would be the ability to create a new dataset from scratch.
It could be very useful to anyone who needs to build their own CNN for image recognition that is not based on a pre-built dataset (e.g. CIFAR, MNIST or ImageNet).
What do you think?

HamedMP · 2016-05-03T13:22:51Z

Thank you @kYroL01

Actually, that is the reason I created ImageFlow: to enable you to work with your very own data. That was the problem in my case, so I built this solution (not as complete as I wanted) and decided to publish it.

For example, we have used this library for a car detector problem, where both the dataset and the problem were entirely our own.

I would be happy if you could explain more about what it lacks that makes it difficult to use with your dataset.

kYroL01 · 2016-05-03T13:50:14Z

Thanks for the fast reply.
Unfortunately I have not explored ImageFlow in depth yet, so sorry if I say something wrong.
Let me try to explain myself better:
I'm creating my own CNN to recognize three categories of images, and I have to create my own dataset.
I followed the TensorFlow tutorial for creating the model, and it was useful for understanding how to build a CNN based on an already existing model.
The problem is that I can't find any reasonable, illustrative tutorial on how to create a personal dataset, so, looking on GitHub, I found your nice project.
But as I read here, you convert a directory that contains images AND LABELS, whereas I have no labels to begin with, just images in three subfolders (my categories).
After that, your library would be very useful for converting the images and labels to data tensors to pass to the TensorFlow model.

I also read this, but it is not 100% of what I need.

Any suggestions? I hope I was clear.

Thanks

HamedMP · 2016-05-03T13:56:57Z

Your categories are your LABELS. Since training a CNN is supervised learning, you have to provide it with labels so that it can learn from its errors and improve.

So if you have 3 categories, you should assign labels to them, e.g. 1, 2, 3 or A, B, C, ..., and convert them to tfrecords if you want to enjoy the queueing features.
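
As a rough illustration of assigning labels to categories, the sketch below maps category names to integer ids in plain Python. The helper names `build_label_index` and `encode_labels` are hypothetical and not part of ImageFlow, and starting the ids at 0 instead of 1 is only an assumption made for convenience.

```python
def build_label_index(category_names):
    """Map each distinct category name to an integer id.

    Hypothetical helper, not an ImageFlow function. Names are sorted so
    that the same set of categories always gets the same ids.
    """
    unique_names = sorted(set(category_names))
    return {name: index for index, name in enumerate(unique_names)}


def encode_labels(category_names, label_index):
    """Replace every category name with its integer id."""
    return [label_index[name] for name in category_names]


if __name__ == "__main__":
    # One category name per image, e.g. taken from the image's folder.
    names = ["CAT2", "CAT1", "CAT3", "CAT1"]
    label_index = build_label_index(names)
    print(label_index)
    print(encode_labels(names, label_index))
```

Keeping the name-to-id mapping around makes it possible to translate a predicted id back to a category name later.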

kYroL01 · 2016-05-03T14:03:08Z

Yes, of course, I understand that my categories are the labels.
My question is: how can I create my dataset so that it says "this category is my label"?
Once that is done, I can convert the images and labels using ImageFlow.
The problem is at the very beginning.

HamedMP · 2016-05-03T14:09:43Z

It can be done with a simple bit of programming.

For example, you read the full path of each image and put it into images_array.
At the same time, you read the name of the folder that contains it, let's say CAT1, ..., and append it to another array named labels_array.

You can do this in Python by getting the full path of each image, splitting it on '/', and accessing the returned_array[-2] element, which will be the folder name. returned_array[-1] will be the file name.
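
A minimal sketch of that idea, assuming a layout where each category is a subfolder of one root directory. The function name `collect_images_and_labels` and the extension filter are assumptions for illustration, not part of ImageFlow. It splits on `os.sep` rather than a hard-coded '/' so that it also works on Windows.

```python
import os

# Assumption: only these file extensions count as images.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def collect_images_and_labels(root_dir):
    """Return (images_array, labels_array) for images under root_dir.

    Hypothetical helper: the label of each image is the name of the
    folder that directly contains it.
    """
    images_array = []
    labels_array = []
    # Sorting keeps the order deterministic across runs.
    for dirpath, _dirnames, filenames in sorted(os.walk(root_dir)):
        for filename in sorted(filenames):
            if not filename.lower().endswith(IMAGE_EXTENSIONS):
                continue  # skip files that are not images
            full_path = os.path.join(dirpath, filename)
            # Split the path into components: [-1] is the file name,
            # [-2] is the folder name, which serves as the label.
            returned_array = os.path.normpath(full_path).split(os.sep)
            images_array.append(full_path)
            labels_array.append(returned_array[-2])
    return images_array, labels_array
```

Called on a directory containing `CAT1/`, `CAT2/` and `CAT3/`, this yields two parallel lists: the image paths and, at the same positions, their folder names.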

kYroL01 · 2016-05-03T14:14:14Z

OK, I'll try. I think I understand.
And, related to my initial question, do you think this is a feature you could add to ImageFlow?

Thank you for the suggestion and for the time spent.

HamedMP added the enhancement label May 3, 2016
