2.3. The Relationship Between Linear Regression and Polynomial Regression
There is a close relationship between linear regression models and polynomial regression models. Consider a multiple linear regression model which takes 3 input variables.
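For example, we could predict exam mark (\(y\)) using time spent studying (\(x_1\)), assignment mark (\(x_2\)) and attendance (\(x_3\)). This model would take the form:
\[y = \beta_0 + \beta_1 \textcolor{blue}{x_1} + \beta_2 \textcolor{blue}{x_2} + \beta_3 \textcolor{blue}{x_3}\]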
A polynomial model of degree 3 would take the form:
\[y = \beta_0 + \beta_1 \textcolor{blue}{x} + \beta_2 \textcolor{blue}{x^2} + \beta_3 \textcolor{blue}{x^3}\]
The difference is that the polynomial model has only one input variable, and we raise the values of that variable to successively higher powers.
For example, we could predict exam mark (\(y\)) using time spent studying (\(x\)), time spent studying squared (\(x^2\)) and time spent studying cubed (\(x^3\)).
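To make the form above concrete, the following minimal sketch evaluates a degree-3 polynomial model with NumPy. The function name `predict_cubic` and the coefficient values are made up for illustration; they do not come from a fitted model.

```python
import numpy as np

def predict_cubic(x, beta):
    """Evaluate beta_0 + beta_1*x + beta_2*x^2 + beta_3*x^3 for each x."""
    x = np.asarray(x, dtype=float)
    beta_0, beta_1, beta_2, beta_3 = beta
    return beta_0 + beta_1 * x + beta_2 * x**2 + beta_3 * x**3

# Hypothetical coefficients, chosen only to illustrate the formula
beta = (1.0, 2.0, 0.5, -0.1)

# Hypothetical study times in hours
hours = [0.0, 1.0, 2.0]
predictions = predict_cubic(hours, beta)
print(predictions)
```

Although the prediction is a curve in \(x\), it is still a weighted sum of the columns \(x\), \(x^2\) and \(x^3\), which is why it behaves like a linear regression with three inputs.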
We can build a polynomial regression model by adapting the code we used to build a multiple linear regression model. We just need to think about the x values we pass into the function .fit().
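As a rough sketch of this idea, assuming the earlier code used scikit-learn's `LinearRegression` (the text here only mentions `.fit()`), the same call can fit a cubic once the x values are expanded into columns. The data below is synthetic, generated from a known cubic so the result can be checked; it is not the exam data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic inputs (not real data): a single input variable
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Synthetic targets from a known cubic: y = 2 + 3x - x^2 + 0.5x^3
y = 2 + 3 * x - x**2 + 0.5 * x**3

# Expand the single variable into the columns x, x^2, x^3
X_poly = np.column_stack([x, x**2, x**3])

# Same estimator and same .fit() call as for multiple linear regression
model = LinearRegression()
model.fit(X_poly, y)

print(model.intercept_)  # close to 2
print(model.coef_)       # close to [3, -1, 0.5]
```

Only the array passed to `.fit()` changes; the estimator itself is unaware that the columns are powers of one variable.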
For a multiple linear regression model we would feed in a 2D array with \(n\) rows, one for each sample, and a column for each input variable. E.g.
| \(x_1\) | \(x_2\) | \(x_3\) |
|---|---|---|
| Time Spent Studying | Assignment Mark | Attendance |
| 4.5 | 73 | 93 |
| 8 | 89 | 100 |
| 1.5 | 65 | 74 |
| 3.5 | 66 | 88 |
| 5.5 | 67 | 84 |
[[4.5, 73, 93], [8, 89, 100], [1.5, 65, 74], [3.5, 66, 88], [5.5, 67, 84]]
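A quick way to check this layout is to load the list into a NumPy array and inspect its shape. This is only a sketch; using NumPy here is an assumption, since the text shows a plain nested list.

```python
import numpy as np

# Rows are samples; columns are time spent studying, assignment mark, attendance
X = np.array([
    [4.5, 73, 93],
    [8, 89, 100],
    [1.5, 65, 74],
    [3.5, 66, 88],
    [5.5, 67, 84],
])

# shape is (number of samples, number of input variables)
n_samples, n_features = X.shape
print(n_samples, n_features)  # 5 3

# Each column holds one input variable across all samples
study_time = X[:, 0]
print(study_time)
```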
For a polynomial regression model we would feed in a 2D array with \(n\) rows, one for each sample, and a column for each power of the input variable. E.g.
| \(x\) | \(x^2\) | \(x^3\) |
|---|---|---|
| Time Spent Studying | Time Spent Studying Squared | Time Spent Studying Cubed |
| 4.5 | 20.25 | 91.125 |
| 8 | 64 | 512 |
| 1.5 | 2.25 | 3.375 |
| 3.5 | 12.25 | 42.875 |
| 5.5 | 30.25 | 166.375 |
[
[4.5, 20.25, 91.125],
[8, 64, 512],
[1.5, 2.25, 3.375],
[3.5, 12.25, 42.875],
[5.5, 30.25, 166.375],
]
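Rather than typing the powers by hand, an array of this kind could be generated from the single study-time column. The sketch below shows two possible ways, NumPy's `column_stack` and scikit-learn's `PolynomialFeatures`. The variable names are illustrative, and whether the course uses either approach is an assumption.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Time spent studying for the five samples in the table
x = np.array([4.5, 8, 1.5, 3.5, 5.5])

# Option 1: stack the powers as columns by hand
X_manual = np.column_stack([x, x**2, x**3])

# Option 2: let scikit-learn generate the powers.
# PolynomialFeatures expects a 2D input, so reshape x to a single column.
# include_bias=False drops the constant column of ones, because
# LinearRegression fits the intercept itself by default.
poly = PolynomialFeatures(degree=3, include_bias=False)
X_sklearn = poly.fit_transform(x.reshape(-1, 1))

print(X_manual)
print(np.allclose(X_manual, X_sklearn))  # True
```

Either array can then be passed to `.fit()` in place of the multi-variable array used for multiple linear regression.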