I'm working on signature verification, and there were a number of things I wanted to do using Keras/OpenCV/PIL but couldn't find relevant information on. I have loaded the dataset folder using keras.preprocessing.image_dataset_from_directory
and now need to:
Since I'm working in Keras, I thought of using its functions but couldn't find any. How can I auto-crop/extract a signature from the dataset I've loaded? As for image augmentation, should I do it in the image preprocessing stage, or implement it in the CNN model I am using? I am new to image processing and Keras.
Also, because I loaded the entire training folder as one dataset, the labels are "Genuine" and "Forged". However, there are multiple genuine and forged signatures per person, and there are multiple people. How do I divide the data?
Organize your directories as follows:
main_dir
-train_dir
--person1_fake_dir
---person1 fake image
---person1 fake image
---etc
--person1_real_dir
---person1 real image
---person1 real image
---etc
--person2_fake_dir
---person2 fake image
---person2 fake image
---etc
--person2_real_dir
---person2 real image
---person2 real image
---etc
.
.
.
--personN_fake_dir
---personN fake image
---personN fake image
---etc
--personN_real_dir
---personN real image
---personN real image
---etc
-test_dir
--same structure as train_dir, but put the test images here
-valid_dir
--same structure as train_dir, but put the validation images here
If you have N persons, you will have 2 x N classes (one fake class and one real class per person).
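If your images currently sit in one flat folder, a small script can sort them into this layout. The following is only a sketch: the file naming scheme (`<person>_<real|fake>_<anything>.<ext>`) and the function name `sort_into_class_dirs` are assumptions made up for illustration, so adapt the pattern to however your files are actually named.

```python
import re
import shutil
from pathlib import Path

# Hypothetical naming scheme: "<person>_<real|fake>_<anything>.<ext>"
NAME_PATTERN = re.compile(r"^(?P<person>[^_]+)_(?P<kind>real|fake)_.+$")

def sort_into_class_dirs(flat_dir, split_dir):
    """Copy each image into split_dir/<person>_<kind>_dir/ and return the class names."""
    flat_dir, split_dir = Path(flat_dir), Path(split_dir)
    classes = set()
    for image_path in sorted(flat_dir.iterdir()):
        match = NAME_PATTERN.match(image_path.stem)
        if not image_path.is_file() or match is None:
            continue  # skip sub-directories and files that do not follow the scheme
        class_name = f"{match['person']}_{match['kind']}_dir"
        class_dir = split_dir / class_name
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image_path, class_dir / image_path.name)
        classes.add(class_name)
    return sorted(classes)
```

Run it once per split (train, test, valid), pointing split_dir at the matching directory. The length of the returned list should equal 2 x N if every person has both real and fake images in that split.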
You can then use tf.keras.preprocessing.image.ImageDataGenerator().flow_from_directory() to input your data. Documentation is here. You don't have to worry about cropping the images; just set target_size in flow_from_directory to something like (256,256) and every image will be resized to that shape. The code below shows the rest of what you need.
import tensorflow as tf

# rescale (not resize) maps pixel values from [0, 255] to [0, 1]
data_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255)
train_gen = data_gen.flow_from_directory(train_dir, target_size=(224, 224), color_mode='grayscale')
# shuffle=False keeps validation and test samples in a fixed order
valid_gen = data_gen.flow_from_directory(valid_dir, target_size=(224, 224), color_mode='grayscale', shuffle=False)
test_gen = data_gen.flow_from_directory(test_dir, target_size=(224, 224), color_mode='grayscale', shuffle=False)
# model is assumed to be a Keras model you have already built, with 2 x N softmax outputs
model.compile(optimizer=tf.keras.optimizers.Adam(), loss=tf.keras.losses.CategoricalCrossentropy(), metrics=['accuracy'])
history = model.fit(train_gen, validation_data=valid_gen, epochs=20, verbose=1)
accuracy = model.evaluate(test_gen)[1] * 100
print('Model accuracy is ', accuracy)
Note that your model will not be able to tell fake from real in the general case. It should work for persons 1 through N, the people it was trained on. You could try putting all the fake images in one class directory and all the real images in another and training on that, but I suspect it will not do well at telling real from fake in the general case.
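With 2 x N classes, a prediction is just a class index, so you need to map it back to a person and a real/fake flag. Here is a minimal sketch, assuming the class directories are named as in the layout above (`<person>_<real|fake>_dir`); the helper name `decode_prediction` is made up for illustration. The mapping itself is what flow_from_directory exposes as the `class_indices` attribute of the generator.

```python
def decode_prediction(class_indices, predicted_index):
    """Map a predicted class index back to (person, is_real)."""
    # Invert the name -> index mapping so we can look up by index
    index_to_name = {index: name for name, index in class_indices.items()}
    class_name = index_to_name[predicted_index]
    # Directory names follow the layout above: "<person>_<real|fake>_dir"
    person, kind, _ = class_name.rsplit("_", 2)
    return person, kind == "real"

# Example mapping in the same form as train_gen.class_indices
class_indices = {
    "person1_fake_dir": 0,
    "person1_real_dir": 1,
    "person2_fake_dir": 2,
    "person2_real_dir": 3,
}
print(decode_prediction(class_indices, 3))  # ('person2', True)
```

In practice the predicted index would come from taking the argmax over each row of model.predict(test_gen). This only decodes predictions for the persons seen in training; it does not change the limitation described above.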
Hi! Thanks for your answer! It solved most of my queries. To enable the model to tell fake signatures from real ones, I will give it a few real signatures and a fake signature of another person as input.