Tutorial 1: Train your first model with NNFabrik¶
In this tutorial, we will go through one example usage of the nnfabrik pipeline.
The goal in this tutorial is to train multiple models with different hyperparameters (batchsize, number of layer, learning rate, etc.) in a parallel fashion, and to save the corresponding results in a centralized database table.
We have already defined our three functions dataset-function, model-function and trainer-function. Those are the basic functions which, if used together, train a simple MLP model on the well-known MNIST dataset. Therefore, what we are left with is to fill up the corresponding Datajoint tables with different hyper-parameter values and train models for every possible combination of those hyper-parameters.
The Fabrikant Table¶
The Fabrikant table keeps a record of the users that interact with a specific schema. It is simply an extra level of information to know who is accountable for the entries in the Dataset, Model, and Trainer tables. simply add your information as follows:
fabrikant_info = dict(fabrikant_name="Your Name", email="your@email.com", affiliation="thelab", dj_username="yourname")
Fabrikant().insert1(fabrikant_info)
The Dataset Table¶
Here we need to specify dataset function and the arguments passed to the dataset function, dataset config.
The dataset function is specified as a string. the structure of this string is important since under the hood nnfabrik performs a dynamic import by parsing this string. For example, if you can import the function as
from nnfabrik.datasets import toy_dataset_fn
then you should specify the dataset function as:
"nnfabrik.datasets.toy_dataset_fn"
# specify dataset function as string (the function must be importable) as well as the dataset config
dataset_fn = "nnfabrik.examples.mnist.dataset.mnist_dataset_fn"
dataset_config = dict(batch_size=64) # we specify all the inputs except the ones required by nnfabrik
Dataset().add_entry(dataset_fn=dataset_fn, dataset_config=dataset_config,
dataset_fabrikant="Your Name", dataset_comment="A comment about the dataset!")
Tip
Since nnfabrik would need to import the function, the dataset function must be importable.
Also note that dataset_config
is a dictionary that contains all the arguments that are not required by nnfabrik.
The Model Table¶
Here we need to specify dataset function and the arguments passed to the dataset function, dataset config. Everything explained for the dataset function applied to model function as well.:
# specify model function as string (the function must be importable) as well as the model config
model_fn = "nnfabrik.examples.mnist.model.mnist_model_fn"
model_config = dict(h_dim=5) # we specify all the inputs except the ones required by nnfabrik
Model().add_entry(model_fn=model_fn, model_config=model_config,
model_fabrikant="Your Name", model_comment="A comment about the model!");
Let’s also try h_dim = 15
:
model_config = dict(h_dim=15) # we specify all the inputs except the ones required by nnfabrik
Model().add_entry(model_fn=model_fn, model_config=model_config,
model_fabrikant="Your Name", model_comment="A comment about the model!");
The Trainer Table¶
Here we need to specify trainer function and the arguments passed to the trainer function, trainer config. Everything explained for the dataset function applies to trainer function as well.
# specify trainer function as string (the function must be importable) as well as the trainer config
trainer_fn = "nnfabrik.examples.mnist.trainer.mnist_trainer_fn"
trainer_config = dict(epochs=1) # we specify all the inputs except the ones required by nnfabrik
Trainer().add_entry(trainer_fn=trainer_fn, trainer_config=trainer_config,
trainer_fabrikant="Your Name", trainer_comment="A comment about the trainer!");
The Seed Table¶
Now we have one final table to fill up before we start training our models with all the combinations in Dataset, Model, and Trainer tables. That table is the Seed table.
Seed().insert1({'seed': 2020})
The TrainedModel Table¶
Once we have bunch of trained models, the downstream analysis might be different for each specific project. For this reason, we keep the TrainedModel tables separate from the tables provided in the library. However, the process of creating your own TrainedModel becomes very easy with the template(s) provided by nnfabrik.
Create your TrainedModel table¶
Inheritance of the TrainedModelBase
template
from nnfabrik.templates.trained_model import TrainedModelBase
from nnfabrik.examples import nnfabrik
@schema
class TrainedModel(TrainedModelBase):
table_comment = "Trained models"
nnfabrik = nnfabrik
Populate (fill up) the TrainedModel table¶
Calling populate
on this table fills all combinations of Trainer
, Dataset
, Model
and
Seed
(unless we restrict it)
TrainedModel.populate(display_progress=True)