{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "# unit 5.0 - Tips and tricks for training neural nets\n",
        "\n",
        "\n",
        "## How to size your network and data?\n",
        "\n",
        "\n",
        "Suppose that you are trying to create a neural network for example to categorize medical images. Suppose you have a dataset split into:\n",
        "\n",
        "- train --> used to train the neural network\n",
        "- dev --> the validation set to check your performance and adjust parameters and architecture\n",
        "- test --> not used, saved for final check\n",
        "\n",
        "At the beginning of your training you may have a \"desired error\" or \"desired accuracy\". This is the target error or accuracy that is the the upper bound of your training. Usually one can take \"human-level\" as a proxy for  \"desired\" error or accuracy. But it does not need to be.\n",
        "\n",
        "### Situation 1\n",
        "\n",
        "After creating a neural network (nn) architecture, selecting a training method and all hyper-parameters, suppose you get this situation (S1):\n",
        "\n",
        "- desired error: 5%\n",
        "- nn-train error: 10%\n",
        "- nn-dev error: 12%\n",
        "\n",
        "The difference from the desired error to nn-train error can be called \"bias\" of the learning algorithm. In this case there is a high-bias.\n",
        "\n",
        "### Situation 2\n",
        "\n",
        "In a different situation (S2), you instead get:\n",
        "\n",
        "- desired error: 5%\n",
        "- nn-train error: 6%\n",
        "- nn-dev error: 12%\n",
        "\n",
        "The difference from nn-train error to nn-dev error can be called \"variance\" of the learning algorithm. In this case there is a high-variance.\n",
        "\n",
        "### Situation 3:\n",
        "\n",
        "In a yet different situation (S3), you instead get:\n",
        "\n",
        "- desired error: 5%\n",
        "- nn-train error: 10%\n",
        "- nn-dev error: 18%\n",
        "\n",
        "In this case there is high-bias and also high-variance.\n",
        "\n",
        "## What can we do to adjust our learning algorithm?\n",
        "\n",
        "### Part 1: high training error\n",
        "\n",
        "When you just started to craft a neural network architecture and learning technique, you may get a high training error and \"high bias\". In this case, we can use one of the following techniques:\n",
        "\n",
        "- train a bigger model\n",
        "- train longer\n",
        "- use a new model architecture\n",
        "\n",
        "Continue to try one or multiple of these techniques until the train error is closer to the desired value. \n",
        "\n",
        "### Part 2: high dev error\n",
        "\n",
        "As a second step, your train error may be low, but now you have \"high-variance\". In this case we are over-fitting the data. The solution is:\n",
        "\n",
        "- more data\n",
        "- add regularization to lower over-fitting\n",
        "- use a new model architecture\n",
        "\n",
        "Continue to try one or multiple of these techniques until the dev error is close to the train error. Then you ARE DONE!\n",
        "\n",
        "## Summary\n",
        "\n",
        "In general the two ingredient that always help neural network training are:\n",
        "\n",
        "- bigger model\n",
        "- more data\n",
        "\n",
        "They are guaranteed to decrease the error of both train and test, provided that the dataset is correctly setup and balanced.\n",
        "\n",
        "\n",
        "\n",
        "## Reference\n",
        "\n",
        "Inspired by: [this lecture](https://www.youtube.com/watch?v=F1ka6a13S9I)."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.10.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}