Building Convolutional Neural Networks in Excel
This blog covers work I’ve done in Excel to build a handwritten digit classifier: basically, a spreadsheet that can read handwritten digits at close to human levels of accuracy. This required a convolutional neural network, the engine behind just about all machine learning related to images. I’m unaware of anyone else who has done this in Excel, so please let me know if you come across others.
I have been deeply involved with financial analysis and mathematical models for most of my career but began re-tooling with machine learning over the last two years. I’m currently working through Andrew Ng’s brilliant Deep Learning course on Coursera and I’ve reached Course 4, Week 3. Throughout the course, I’ve been building out the neural net architectures he describes in Excel. Excel is not “yet” the right medium to build convolutional neural nets for real-world applications. However, I know Excel and find it easier to construct these on a spreadsheet rather than in a new language. I certainly hope this state won’t last long.
Building in Excel may be slow, but it makes neural nets very transparent: you are visually confronted with the layers, the neurons, their associated parameters and the related calculations. Excel gives a less abstract view of a neural net than vectorised Python code, and it helped me immensely in developing an understanding of these fantastic new tools.
Excel lets you see inside the machine, but for me an even more useful element was the bugs and miswiring that I introduced. Neural networks are very robust to bugs. In fact, they often continue to learn but fail in odd and interesting ways. Figuring out the failure from the learning behaviour really forced me to think about the maths and the structure. I would also argue that the slow speed of Excel gives you time to think as the failures manifest themselves. For a deeper understanding of neural nets, go to Andrew Ng. My intuitions are still weak but certainly improving.
The classic beginner’s exercise for deep learning is to build an MNIST digit classifier. MNIST is Yann LeCun’s data set of 60,000 handwritten digits (0 to 9) with an associated 10,000-digit test set. Yann has made this data available to all, and there are plenty of examples in higher-level languages.
My initial attempts at classifiers using plain vanilla fully connected neural networks are on my YouTube channel below. Mike Pallister was faster off the mark than me, and his work is also worth a look.
Single Digit Classification https://www.youtube.com/watch?v=4P5r0tT7Hsc&t=33s
Batch Digit Classification https://www.youtube.com/watch?v=bJcv9vi4Gqg&t=6s
However, this blog relates to last week’s modelling of a convolutional neural net, again with ADAM optimisation and batch capability. After several silly but informative mistakes, I completed the model below last night and would welcome your thoughts.
The model has very few layers, neurons and parameters; it’s tiny in comparison to real-world examples. In full: there are two convolutional layers, each followed by max pooling. Layer 1 takes the images of the handwritten digits from 28h x 28w pixels to 24h x 24w x 4c (4 channels), and then to 12h x 12w x 4c after max pooling. Layer 2 condenses these to 8h x 8w x 8c, and then to 4h x 4w x 8c after max pooling. The final two fully connected layers have 15 and 10 neurons respectively, giving a grand total of 936 convolutional parameters and 2,095 in the fully connected layers. This doesn’t sound like a lot of space to capture the vagaries of human handwriting, but it is enough. After its first 100k iterations I’m seeing 98.75% accuracy on the training data and 98% on the test data. I would love to know what human accuracy levels on this data are, but from my experience it’s not much more than this.
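As a rough check on the layer sizes and parameter counts quoted above, the following Python sketch redoes the arithmetic. It assumes 5 x 5 kernels with stride 1 and no padding (which is what a drop from 28 to 24 and from 12 to 8 implies) and 2 x 2 max pooling; none of these is stated explicitly in the description. The figure of 936 convolutional parameters is reproduced if every individual 5 x 5 kernel carries its own bias, which is also an assumption. The more conventional layout, with one bias per output channel, would give 912 instead.

```python
# Shape and parameter bookkeeping for the network described above.
# ASSUMPTIONS (not stated in the post): 5x5 kernels, stride 1, no padding,
# 2x2 non-overlapping max pooling.

KERNEL = 5
POOL = 2
CONV_LAYERS = [(1, 4), (4, 8)]   # (input channels, output channels)
FC_LAYERS = [15, 10]             # neurons in the two fully connected layers

size = 28                        # MNIST images are 28 x 28 pixels
shapes = []                      # (height, width, channels) after each step
kernel_count = 0                 # individual 5x5 kernels, one per channel pair
conv_params_shared_bias = 0      # conventional: one bias per output channel

for c_in, c_out in CONV_LAYERS:
    size = size - KERNEL + 1     # "valid" convolution, stride 1
    shapes.append((size, size, c_out))
    size = size // POOL          # max pooling halves height and width
    shapes.append((size, size, c_out))
    kernel_count += c_in * c_out
    conv_params_shared_bias += KERNEL * KERNEL * c_in * c_out + c_out

# ASSUMPTION: one bias per 5x5 kernel, i.e. 25 weights + 1 bias each.
conv_params_bias_per_kernel = kernel_count * (KERNEL * KERNEL + 1)

# Fully connected layers: weights plus one bias per neuron.
fc_params = 0
inputs = size * size * CONV_LAYERS[-1][1]   # flattened 4 x 4 x 8 volume
for neurons in FC_LAYERS:
    fc_params += inputs * neurons + neurons
    inputs = neurons

print("shapes:", shapes)
print("conv params, bias per kernel:", conv_params_bias_per_kernel)
print("conv params, bias per output channel:", conv_params_shared_bias)
print("fully connected params:", fc_params)
```

Under these assumptions the intermediate shapes come out as 24 x 24 x 4, 12 x 12 x 4, 8 x 8 x 8 and 4 x 4 x 8, and the fully connected count is 2,095, matching the figures in the text.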
I’d certainly welcome any questions you may have, and I will be releasing some of the base models on GitHub. Any expressions of interest before then would also be welcome.
These videos and the associated material, if released, are available to you under a Creative Commons Attribution-NonCommercial 4.0 International Licence, the details of which can be found here: https://creativecommons.org/licenses/by-nc/4.0/legalcode