Part 3 - Come back Google Colab, all is forgiven

After experimenting with RunPod, not liking it, and blowing a whole $10 on it, I've decided to go back to Google Colab, save my models properly, and pay for a bit of quality GPU time. Maybe I'll like Colab better that way.

MAUI DEV STORIES

Stephen Moreton-Howell

5/17/2026, 2 min read

In part 1 of my MAUI dev stories series I established a pipeline in which a neural network model is created and trained in Google Colab, exported using ONNX, and imported into a .NET MAUI app. My next step was to do something more serious than classifying arrays of 10 numbers. I decided on face generation using a Generative Adversarial Network (GAN). For this kind of task, the Wasserstein GAN with Gradient Penalty (WGAN-GP) is a good choice. I'll go into why later. First, though, I'll show how it went when I created a WGAN-GP, trained it on 10,000 celebrity faces, and tweaked the hyper-parameters to try to get it to learn smoothly.

Trying to train a non-trivial AI model

Badly Drawn Boys and Girls

The task is simple to describe. The model used to achieve it, less so. I want to take a whole load of pictures of human faces and use them to teach the model (the AI) what the important structures of a human face are. Then it can generate new human faces. Now, here in the year 2026, this is something that's already been going on for a while. So there are some standard ways to do it and standard free data sets that can be used for that training. As I said, the standard model I'm using here is the WGAN-GP. The training data is a set of 10,000 pictures of celebrities (presumably just because those pictures are in the public domain). Here's a sample from that training data:

And here is a snapshot from a little way into the training of an early attempt at a model:

They're recognizable as attempted human faces, but still a long way from looking normal. As you can see from the caption, that's "epoch 31". I'll go into the concept of an epoch later, but basically it's a stage in the training process, and in this particular training process there were 200 of them. So there was still a way to go, but I stopped it here because it was clear that the model wasn't getting any better. This is where you have to stop and tweak the model's parameters to see if it learns better. If that doesn't just magically work, you have to actually try to understand what those parameters mean. So in this blog I'll show the code for the model and explain a bit about what the parameters do.
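As a rough preview of the epoch idea, here is a minimal sketch in plain Python. It assumes the usual definition of an epoch as one full pass over the training set in small batches. The 10,000 images and 200 epochs come from the run described above, but the batch size of 64 and the helper names (iterate_batches, count_training_steps) are assumptions made up for illustration, not values or code from the actual training run.

```python
import math


def iterate_batches(num_samples, batch_size):
    """Yield (start, end) index ranges that cover the dataset exactly once."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)


def count_training_steps(num_samples, batch_size, num_epochs):
    """Walk through every batch of every epoch and count the update steps."""
    steps = 0
    for _epoch in range(num_epochs):
        for _start, _end in iterate_batches(num_samples, batch_size):
            # A real training loop would update the model on this batch here.
            steps += 1
    return steps


NUM_IMAGES = 10_000  # size of the training set, from the post
NUM_EPOCHS = 200     # planned number of epochs, from the post
BATCH_SIZE = 64      # ASSUMPTION: illustrative value only

batches_per_epoch = math.ceil(NUM_IMAGES / BATCH_SIZE)
total_steps = count_training_steps(NUM_IMAGES, BATCH_SIZE, NUM_EPOCHS)
steps_by_epoch_31 = batches_per_epoch * 31

print(f"batches per epoch: {batches_per_epoch}")
print(f"steps in the full run: {total_steps}")
print(f"steps completed by the end of epoch 31: {steps_by_epoch_31}")
```

Under that assumed batch size, stopping at epoch 31 means abandoning the run with roughly 15% of the planned passes over the data completed.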

In the course of all this, hopefully we'll end up with a model that can learn to generate human faces reasonably well.

To be continued.