Originally published here: https://www.notion.so/lankinen/2021-My-working-methods-458bba133c49484ab371b648b873ff6c

I work best when everything is in order and I don’t need to waste time in the middle of a task to e.g. find certain notes or files. Over the past 2 years I have been focusing on the way I work; I have learned a few things and took…

This article is about how I tested a hypothesis I had by building a website in 3 days. It’s not my first time building something to test a business hypothesis, but this time I wanted to do it faster and document the process.


The hypothesis is that people enjoy commenting 💩…



High school math is enough to understand deep learning.

Lots of data is not needed; some record-breaking results have been achieved with <50 items.

You don’t need an expensive computer; state-of-the-art results can be achieved for free.

Deep learning is the best-known approach in these areas.

Deep learning is the same as…


Language model = a model that tries to predict the next word of a sentence.

A language model works well as the base model for transfer learning because it already knows something about language: it can predict the next word of a sentence.
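To make “predict the next word” concrete, here is a toy bigram model in plain Python. This is an illustration only, far simpler than the neural language models discussed here, and all names in it are made up for the example:

```python
from collections import Counter, defaultdict

def train_bigram(text):
    follows = defaultdict(Counter)            # word -> counts of words that follow it
    words = text.split()
    for a, b in zip(words, words[1:]):
        follows[a][b] += 1
    return follows

def predict_next(model, word):
    # predict the most frequent follower of `word`
    return model[word].most_common(1)[0][0]

model = train_bigram("the cat sat on the mat and the cat slept")
predict_next(model, "the")   # → "cat" ("the" is followed by "cat" twice, "mat" once)
```

A real language model generalizes far beyond raw counts, but the task it is trained on is exactly this one.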

A Wikipedia language model is often the starting point.

The base language model should be…


Weight decay (L2 regularization)

The idea is to add the sum of all the weights squared to the loss function. This way the model tries to keep the weights as small as possible because bigger weights will increase the final loss.

loss_with_wd = loss + wd * (parameters**2).sum()
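To see the effect, here is a minimal plain-Python sketch with a single-parameter toy model (not fastai code; the model and numbers are made up for illustration). The wd * w**2 penalty adds 2 * wd * w to the gradient, which pulls the weight toward zero:

```python
def grad(w, x, y, wd):
    # d/dw of (w*x - y)**2 + wd * w**2  =  2*x*(w*x - y) + 2*wd*w
    return 2 * x * (w * x - y) + 2 * wd * w

def train(wd, steps=500, lr=0.01):
    w = 5.0                       # deliberately large starting weight
    for _ in range(steps):
        w -= lr * grad(w, x=1.0, y=1.0, wd=wd)
    return w

w_plain = train(wd=0.0)   # converges to the data's solution, w = 1
w_decay = train(wd=1.0)   # the penalty pulls the weight down, toward w = 0.5
```

With the penalty the model settles on a smaller weight than the data alone would choose, which is exactly the regularizing effect described above.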



learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fine_tune(2, base_lr=0.1)

Learning rate finder helps to pick the best learning rate. The idea is to change the learning rate after every mini-batch and then plot the loss. A good learning rate is somewhere between the steepest point and the minimum. So for example, based…
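The sweep itself can be sketched in plain Python on a toy quadratic loss (an illustration of the idea, not the fastai implementation):

```python
def lr_sweep(start=1e-4, stop=10.0, steps=60):
    w = 4.0                                   # toy loss: loss(w) = w**2, minimum at 0
    factor = (stop / start) ** (1 / (steps - 1))
    lr, history = start, []
    for _ in range(steps):
        history.append((lr, w * w))           # record (lr, loss) to plot later
        w -= lr * 2 * w                       # one "mini-batch" step; gradient of w**2 is 2*w
        lr *= factor                          # grow the learning rate exponentially
    return history

history = lr_sweep()
# Tiny learning rates barely move the loss, moderate ones drop it fast,
# and once the rate is too large the steps overshoot and the loss explodes.
```

In fastai the sweep is run with `learn.lr_find()`, which produces the plot this paragraph describes.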


Create dataset

train_x = torch.cat([stacked_threes, stacked_sevens]).view(-1, 28*28)
train_y = tensor([1] * len(threes) + [0] * len(sevens)).unsqueeze(1)
print(train_x.shape, train_y.shape)
# torch.Size([12396, 784]) torch.Size([12396, 1])
dset = list(zip(train_x, train_y))
x, y = dset[0]
print(x.shape, y)
# torch.Size([784]) tensor([1])

Create weights

def init_params(size, variance=1.0):
    return (torch.randn(size) * variance).requires_grad_()

weights = init_params((28*28, 1))
bias = init_params(1)
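For intuition, the same initialization and the linear prediction it feeds can be sketched in plain Python (the name linear1 and the dummy image are illustrative only):

```python
import random

def init_params(size, variance=1.0):
    # Gaussian-random starting weights, mirroring the torch version above
    return [random.gauss(0, 1) * variance for _ in range(size)]

weights = init_params(28 * 28)
bias = init_params(1)[0]

def linear1(x):
    # prediction for one flattened 28x28 image: dot(weights, x) + bias
    return sum(wi * xi for wi, xi in zip(weights, x)) + bias

x = [0.5] * (28 * 28)          # a dummy "image" with constant pixel values
pred = linear1(x)
```

The torch version does the same thing as a single matrix multiply, with `requires_grad_()` so the gradients needed for training are tracked.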


# Load saved learner
learn_inf = load_learner(path/'export.pkl')
# Predict given image (img is an image path or PILImage; returns the
# predicted class, its index, and the per-class probabilities)
pred, pred_idx, probs = learn_inf.predict(img)
# See labels
learn_inf.dls.vocab

Jupyter Notebook widgets are created using IPython widgets (the ipywidgets library).

Deploying on a CPU is easier than on a GPU in most cases. A GPU is needed only if the deployed model still requires a lot of computation…


Classification = Predict label (e.g. dog or cat)

Regression = Predict number (e.g. how old)

The last time we looked at this code:

valid_pct=0.2 (validation percent) means that the loader is going to set aside 20% of the given data during training. This data is then used to measure how…
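Conceptually the split can be sketched like this (an illustration of the idea, not the fastai implementation):

```python
import random

def split_data(items, valid_pct=0.2, seed=42):
    items = list(items)
    random.Random(seed).shuffle(items)        # shuffle before splitting
    n_valid = int(len(items) * valid_pct)     # e.g. 20% held out
    return items[n_valid:], items[:n_valid]   # (train, valid)

train, valid = split_data(range(100), valid_pct=0.2)
# 80 items to train on, 20 held out for validation
```

The held-out items are never used to update the weights, so the metrics computed on them estimate how the model performs on data it has not seen.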
