Module 1: Data Visualization

Convolutional Neural Network: Individual Presentation (Gibbs’ Reflective Cycle)

Description

For this assignment, I developed an object recognition model using the CIFAR-10 dataset, focusing on Convolutional Neural Networks, data augmentation, and transfer learning. I began with a deliberately weak baseline CNN to set a low benchmark, then improved it by increasing architectural depth, tuning learning rate and batch size, adding regularisation, and introducing augmentation. Finally, I applied transfer learning using MobileNetV2 to compare a pretrained model against my custom network.
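As an illustration of what the augmentation step can look like for 32x32 CIFAR-10 images, here is a minimal NumPy sketch. It is not the project's actual code: the specific transforms (random horizontal flip plus a random crop from a zero-padded copy), the `augment` function name, and the 4-pixel padding are assumptions chosen because they are common for this dataset.

```python
import numpy as np


def augment(image, rng, pad=4):
    """Return a randomly flipped and shifted copy of one H x W x C image.

    Hypothetical helper for illustration; the transforms and the pad size
    are assumptions, not taken from the project.
    """
    h, w, _ = image.shape
    # Flip left-to-right with probability 0.5.
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    # Zero-pad the borders, then crop back to the original size at a random
    # offset, which shifts the content by up to `pad` pixels in each direction.
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="constant")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w, :]


rng = np.random.default_rng(0)
# Stand-in for a CIFAR-10 image: random uint8 pixels in three colour channels.
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
augmented = augment(image, rng)
```

Each call produces a slightly different view of the same image, so the network sees more variation per epoch without any new labelled data.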

The project was completed individually, which, after a difficult previous group project, allowed me to plan the work in clear stages: learning the theory first, building and experimenting next, and only then shaping the presentation and transcript.

Feelings

I felt excited and slightly intimidated at the start, since image recognition was new to me. I spent the first week learning the concepts, reading articles, watching tutorials, and explaining CNNs to a colleague to test my understanding. His questions exposed gaps that I then closed by returning to sources and interactive demos. I found it genuinely fascinating that a system that processes ones and zeros can “see” by operating on pixel values in red, green, and blue channels. Early misclassifications were revealing and a bit amusing: for example, bright yellow cars, trucks, and planes were predicted as frogs, which suggested the network had latched onto colour cues rather than shape.

Evaluation

Investing time early in understanding CNNs paid off: implementation and tuning felt purposeful rather than trial and error. Building the slide deck framework early gave me a clear narrative: dataset, preprocessing, baseline, tuning, augmentation, transfer learning, comparison. Technically, accuracy progressed from about 57 percent for the baseline, to about 73 percent after tuning, to about 91 percent with MobileNetV2, which made the improvements easy to communicate.

Not everything worked first time. My initial transfer learning run stalled at around 14 percent accuracy, barely above the 10 percent expected from random guessing across ten classes, because I had not frozen and unfrozen layers correctly and my learning rate was too high for fine-tuning. Augmentation helped robustness but did not dramatically raise accuracy, which made sense on reflection because CIFAR-10 is balanced and relatively clean.
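The freeze-then-fine-tune pattern behind that fix can be sketched as follows. This is a minimal illustration, not the project's actual code. It assumes TensorFlow/Keras. The function names, the 96-pixel input size, the 20 unfrozen layers, the dropout rate, and both learning rates are illustrative assumptions. The sketch also assumes raw pixel values in the range 0 to 255.

```python
import tensorflow as tf


def build_transfer_model(weights="imagenet", input_size=96, num_classes=10):
    """Phase 1: a frozen MobileNetV2 base with a new classification head.

    Hypothetical helper; every hyperparameter here is an assumption.
    """
    base = tf.keras.applications.MobileNetV2(
        input_shape=(input_size, input_size, 3),
        include_top=False,
        weights=weights,
    )
    base.trainable = False  # freeze all pretrained layers

    inputs = tf.keras.Input(shape=(32, 32, 3))
    # Upscale CIFAR-10 images to a size the pretrained base supports.
    x = tf.keras.layers.Resizing(input_size, input_size)(inputs)
    # Map assumed [0, 255] pixels to [-1, 1], the range MobileNetV2 expects.
    x = tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1.0)(x)
    # training=False keeps batch normalisation layers in inference mode.
    x = base(x, training=False)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model, base


def unfreeze_top_layers(model, base, n_layers=20, learning_rate=1e-5):
    """Phase 2: unfreeze only the last n_layers of the base and recompile
    with a much smaller learning rate so pretrained weights change slowly."""
    base.trainable = True
    for layer in base.layers[:-n_layers]:
        layer.trainable = False
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Typical use (commented out because it downloads weights and trains):
# model, base = build_transfer_model()
# model.fit(x_train, y_train, epochs=5, validation_split=0.1)
# model = unfreeze_top_layers(model, base)
# model.fit(x_train, y_train, epochs=5, validation_split=0.1)
```

Recompiling after changing `trainable` matters because Keras only picks up the new set of trainable weights at compile time. The lower learning rate in the second phase is what keeps fine-tuning from overwriting the pretrained features.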

Analysis

The project clarified why CNNs outperform classical methods like SVM or KNN on images. CNNs learn hierarchical spatial features directly from pixels (edges in shallow layers, textures and shapes in deeper layers), which removes the need for manual feature engineering. Building a custom CNN taught me how architecture and hyperparameters shape learning, while transfer learning showed how pretrained features from large datasets such as ImageNet can be adapted quickly and effectively. MobileNetV2 reached strong accuracy with modest tuning, consistent with findings that better ImageNet models tend to transfer well (Kornblith, Shlens and Le, 2019).
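To make the idea of an edge feature in a shallow layer concrete, here is a small NumPy sketch of a single convolution filter responding to a vertical edge. It is an illustration under stated assumptions: the hand-written Sobel kernel stands in for a filter that a CNN would learn from data, and `conv2d_valid` is a hypothetical helper, not a library function.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    This is cross-correlation, which is what deep learning libraries
    compute under the name "convolution".
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Toy 6x6 greyscale image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Sobel kernel that responds to left-to-right increases in brightness.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])

response = conv2d_valid(image, sobel_x)
print(response[0])  # [0. 4. 4. 0.]
```

The response is zero over the flat regions and large only where the window straddles the dark-to-bright boundary. A trained CNN stacks many such filters, and deeper layers combine their outputs into textures and shapes.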

Working alone removed coordination overhead and let me focus on depth, although it also meant fewer alternative perspectives to challenge my choices as I went.

Conclusion

I finished with a working understanding of how and why CNNs, augmentation, and transfer learning improve performance, not just code that runs. The step-by-step gains created a clear story for the presentation, and the final results validated the approach. The individual format suited careful learning, although it also highlighted the potential value of diverse viewpoints.

Action Plan

This module, and especially this project, deepened my fascination with machine learning. It is striking that electronic systems operating on binary values can, through mathematics, perceive patterns and make sense of images. I plan to explore other architectures and learning paradigms, for example ResNets, DenseNets, and self-supervised learning, and to add interpretability tools such as Grad-CAM to see what the network attends to.

Although I was relieved to work individually after the last project, I also wonder how much better the presentation could have been with a team, different perspectives, and shared experience. I want to combine the depth I gained here with future collaborative practice, learning the less intuitive but essential skills of guiding discussion, integrating differing ideas and skill sets, and managing trade-offs in a group while still maintaining technical rigour.

References

Bergstra, J. and Bengio, Y. (2012) ‘Random search for hyper-parameter optimization’, Journal of Machine Learning Research, 13(1), pp. 281–305.
Codebasics (2025) Image classification using CNN (CIFAR10 dataset) | Deep Learning Tutorial 24 (TensorFlow & Python) [YouTube video]. Available at: https://youtu.be/7HPwo4wnJeA.
Ekman, M. (2024) Learning Deep Learning: From Perceptron to Large Language Models [online video]. Pearson, O’Reilly. Available at: https://learning.oreilly.com/home/.
Goodfellow, I., Bengio, Y. and Courville, A. (2016) Deep Learning. Cambridge, MA: MIT Press.
IBM (no date) Convolutional Neural Networks. Available at: https://www.ibm.com/think/topics/convolutional-neural-networks.
Kohavi, R. (1995) ‘A study of cross-validation and bootstrap for accuracy estimation and model selection’, in Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), pp. 1137–1145.
Kornblith, S., Shlens, J. and Le, Q.V. (2019) ‘Do better ImageNet models transfer better?’, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2661–2671.
Krizhevsky, A. (2009) Learning multiple layers of features from tiny images. Technical Report, University of Toronto.
Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012) ‘ImageNet classification with deep convolutional neural networks’, Advances in Neural Information Processing Systems, 25, pp. 1097–1105.
LeCun, Y., Bengio, Y. and Hinton, G. (2015) ‘Deep learning’, Nature, 521(7553), pp. 436–444.
Nair, V. and Hinton, G.E. (2010) ‘Rectified linear units improve restricted Boltzmann machines’, in Proceedings of the 27th International Conference on Machine Learning (ICML), Omnipress.
Pan, S.J. and Yang, Q. (2010) ‘A survey on transfer learning’, IEEE Transactions on Knowledge and Data Engineering, 22(10), pp. 1345–1359.
Polo Club (no date) CNN Explainer. Available at: https://poloclub.github.io/cnn-explainer/.
Prathammodi001 (no date) ‘Convolutional Neural Networks for Dummies, A Step by Step CNN Tutorial’, Medium. Available at: https://medium.com/@prathammodi001/....
Shorten, C. and Khoshgoftaar, T.M. (2019) ‘A survey on image data augmentation for deep learning’, Journal of Big Data, 6(60). doi: 10.1186/s40537-019-0197-0.
Sidana, N. (2025) ‘Using Convolutional Neural Networks on the CIFAR-10 Dataset’, Medium, 12 February. Available at: https://medium.com/@nsidana123/....
TensorFlow (no date) CIFAR-10 dataset. Available at: https://www.tensorflow.org/datasets/catalog/cifar10.
Towards Data Science (no date) ‘Deep Learning with CIFAR-10 Image Classification’. Available at: https://towardsdatascience.com/....
Ultralytics (2024) CIFAR-10 Dataset. Available at: https://docs.ultralytics.com/datasets/classify/cifar10/.
Yosinski, J., Clune, J., Bengio, Y. and Lipson, H. (2014) ‘How transferable are features in deep neural networks?’, Advances in Neural Information Processing Systems, 27, pp. 3320–3328.