- dataset
- target (y): what we want to predict
- features (x1, x2, x3, etc.): the "inputs" that help determine y
- model: predicts y depending on the xs and some parameters
- cost function: measures the error between y and the predictions
- minimization algorithm: minimizes that error
Also note that m is the number of rows in the dataset and n is the number of features.
- dataset:
| y (price) | x1 (size) | x2 (place) | x3 (quality) |
|---|---|---|---|
| 300k | 150 m² | Paris | 4 |
| 200k | 100 m² | Lyon | 2 |
| 250k | 125 m² | Bordeaux | 3 |
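A minimal sketch of how this dataset could be held in plain Python lists (the variable names are illustrative; a categorical feature like the city would need to be encoded as numbers before training, which is skipped here):

```python
# Toy representation of the dataset above.
y = [300_000, 200_000, 250_000]     # target: price
x1 = [150, 100, 125]                # size in m²
x2 = ["Paris", "Lyon", "Bordeaux"]  # place (categorical: needs numeric encoding)
x3 = [4, 2, 3]                      # quality

m = len(y)  # m = 3 rows in the dataset
n = 3       # n = 3 features (x1, x2, x3)
```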
- model: f(x) = ax + b (an affine function)
We don't know a and b; it is the machine's role to determine them. At the beginning, we plug in random values and plot the resulting line on a graph (see the sketch below).
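A minimal sketch of that idea, keeping a single feature x (the size) and using Python's random module for the initial guess:

```python
import random

# Random starting values: the machine's job is to improve these.
a = random.uniform(-1, 1)
b = random.uniform(-1, 1)

def f(x):
    """Affine model: predicts y from a single feature x."""
    return a * x + b

# With random a and b, the prediction is wrong at first;
# plotting f over the dataset would show a badly placed line.
print(f(150))  # prediction for a 150 m² house
```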
- cost function (conventionally called J)
For that, let's use the Euclidean distance between two points on the graph: a point of the dataset and the point on the line with the same x. The squared distance between the two points is (f(x_i) - y_i)^2.
We can now define J and make the parameters a and b vary:
J(a, b) = (1/2m) * sum from i=1 to m of (f(x_i) - y_i)^2
where i is the index of the row (example) in the dataset. This function is called the Mean Squared Error.
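In code, J follows directly from that formula. A sketch assuming x and y are two lists of length m (as in the dataset snippet above):

```python
def cost(a, b, x, y):
    """Mean Squared Error: J(a, b) = (1/2m) * sum((f(x_i) - y_i)^2)."""
    m = len(y)
    return sum((a * x[i] + b - y[i]) ** 2 for i in range(m)) / (2 * m)
```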
- minimization algorithm
One of these is Gradient Descent: it finds the minimum of any convex function (like the squared error, which has no multiple local minima).
Alpha is the learning rate: it acts like a step size used to move toward the minimum in gradient descent (see the sketch below).
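A sketch of gradient descent for this affine model. The gradients below are the partial derivatives of J with respect to a and b; alpha and n_iterations are hypothetical hyperparameters to be tuned:

```python
def gradient_descent(x, y, alpha=0.01, n_iterations=1000):
    """Repeatedly steps a and b in the direction that lowers J."""
    m = len(y)
    a, b = 0.0, 0.0  # could also start from random values
    for _ in range(n_iterations):
        errors = [a * x[i] + b - y[i] for i in range(m)]
        grad_a = sum(errors[i] * x[i] for i in range(m)) / m  # dJ/da
        grad_b = sum(errors) / m                              # dJ/db
        a -= alpha * grad_a  # step of size alpha against the gradient
        b -= alpha * grad_b
    return a, b
```

In practice, alpha has to be small enough for the steps to converge (and features are often scaled first); too large and the updates overshoot the minimum.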
A matrix of dimension m x n has m rows and n columns.
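For example, with NumPy (assuming it is installed), the feature matrix of the dataset above has shape (m, n) = (3, 3); the city encoding used here is purely hypothetical:

```python
import numpy as np

# One row per example, one column per feature.
X = np.array([
    [150, 0, 4],  # Paris encoded as 0
    [100, 1, 2],  # Lyon encoded as 1
    [125, 2, 3],  # Bordeaux encoded as 2
])
print(X.shape)  # (3, 3): m = 3 rows, n = 3 columns
```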