raphael fbc6ee8187 remodels siamese network with vgg16 and 100x100 input
- worse performance than with initial design
- vgg16 pretrained weights are used for the base
  network, which is then piped into a custom head
  model, which
    - flattens the layer (previously done in the base model)
    + Dense Layer
    + Normalization
    + Activation
- training split of the fruits-360 dataset used, same as the previous model
- maximum prediction level around 0.95 after ca. 60 epochs
2021-07-28 19:02:48 +02:00
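The base/head design described in the commit note above could be sketched in Keras roughly as follows. This is a sketch, not the actual model: the embedding size (128) and the sigmoid activation are assumptions, and `weights=None` is used here as a stand-in for the pretrained ImageNet weights the notes mention (`weights='imagenet'`).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_base_network(weights=None, embedding_dim=128):
    # the notes use pretrained VGG16 weights (weights='imagenet');
    # weights=None avoids the download for a quick structural check
    vgg = tf.keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=(100, 100, 3))
    x = layers.Flatten()(vgg.output)        # flatten (previously done in the base model)
    x = layers.Dense(embedding_dim)(x)      # Dense layer
    x = layers.BatchNormalization()(x)      # Normalization
    x = layers.Activation('sigmoid')(x)     # Activation
    return Model(vgg.input, x)
```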


the steps taken so far, which led to a successful detection of an image

  • train the model defined in mnist_siamese_example, which uses the 'siamese.py' module to create a siamese Keras model.

    • in this mnist siamese example, the data collection has been updated from the mnist drawing sample to the fruit sample. Lots of work went into setting the arrays up correctly, because the example from Towards Data Science did not correctly separate the classes. He had originally used 91 classes for teaching and the rest for testing, whereas I now use images of every class for both training and testing.

    • The images were shrunk down to 28 x 28 so the model defined in the siamese example could be used without adaptation

    • in this example, there are two training runs going on: first he trains the siamese model (which is saved under 'siamese_checkpoint'), and then he retrains a new model based on this one, with some additional layers on top

      I'm not yet sure what these do [todo] but I'll figure it out.
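The class separation described above (using images of every class for both training and testing, rather than splitting by class) could be sketched as a pair-building helper like the hypothetical one below; `make_pairs` and its sampling strategy (one positive and one negative pair per image) are assumptions for illustration, not the actual script.

```python
import numpy as np

def make_pairs(images, labels, rng=None):
    """Build (pair, target) arrays: one positive and one negative pair per image."""
    rng = rng if rng is not None else np.random.default_rng(0)
    class_idx = {c: np.flatnonzero(labels == c) for c in np.unique(labels)}
    pairs, targets = [], []
    for i, label in enumerate(labels):
        # positive pair: an image of the same class
        j = rng.choice(class_idx[label])
        pairs.append((images[i], images[j]))
        targets.append(1)
        # negative pair: an image of a different class
        other = rng.choice([c for c in class_idx if c != label])
        k = rng.choice(class_idx[other])
        pairs.append((images[i], images[k]))
        targets.append(0)
    return np.array(pairs), np.array(targets)
```

Because every class contributes pairs, the same helper can be run on the training split and the test split independently.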

  • after you've successfully trained the model, it's now saved to 'model_checkpoint' or 'siamese_checkpoint'

  • note that the current model design has removed the second training step; it now only creates 'siamese_checkpoint'

  • The following steps can be used to classify two images. Note that this has so far only been tested with images in a 'pdb' shell from the mnist_siamese_example script

import numpy as np
import tensorflow.keras as keras
from PIL import Image

model = keras.models.load_model('./siamese_checkpoint')
# note that the double division by 255 is only there because the model was
# trained with this double division; it depends on the input scaling, of course
image1 = np.asarray(Image.open('../towards/data/fruits-360/Training/Avocado/r_254_100.jpg').convert('RGB').resize((28, 28))) / 255 / 255
image2 = np.asarray(Image.open('../towards/data/fruits-360/Training/Avocado/r_250_100.jpg').convert('RGB').resize((28, 28))) / 255 / 255

# note here that the cast to np.array is necessary - otherwise the input vector is malformed
output = model.predict([np.array([image2]), np.array([image1])])

print(output)
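The scaling and the decision step above could be factored into small helpers like these; `preprocess` and `is_same_class` are hypothetical names, and the 0.5 threshold is an assumption based on matching pairs scoring around 0.95 in these notes.

```python
import numpy as np

def preprocess(img):
    """Scale a uint8 RGB array the same way the model saw it during
    training (the double division by 255 noted above)."""
    return np.asarray(img, dtype=np.float32) / 255 / 255

def is_same_class(score, threshold=0.5):
    # threshold is an assumption; matching pairs scored ca. 0.95 here
    return score >= threshold
```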