Discover new regression models
In unit 2, you looked at fitting a straight line to data points. But regression can fit many kinds of relationships. Examples include relationships with multiple factors and relationships where the importance of one factor depends on another.
Experiment with models
Regression models are often chosen because they:
- Work with small data samples.
- Are robust.
- Are easy to interpret.
- Come in many varieties.
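Linear regression is the simplest form of regression, with no limit to the number of features used. Linear regression comes in many forms. Names are often derived from the number of features used and the shape of the curve that fits: for example, simple linear regression (one feature), multiple linear regression (several features), and polynomial regression (a curved fit).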
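To make the idea of several features concrete, here is a minimal sketch of multiple linear regression in plain Python. It isn't the code used in the exercise. The helper fit_linear, the feature names, and the rental figures are all invented for illustration, and the fit uses ordinary least squares solved through the normal equations.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    rows: one list of feature values per example.
    Returns [intercept, coef_1, ..., coef_n].
    """
    # Prepend a constant 1.0 so the first weight acts as the intercept.
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(weights, features):
    """Apply the fitted weights to one example."""
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))


# Invented data: [temperature, is_weekend] -> rentals.
# Generated from rentals = 10 + 3 * temperature + 20 * is_weekend.
rows = [[10, 0], [20, 0], [15, 1], [25, 1], [30, 0]]
rentals = [40, 70, 75, 105, 100]

weights = fit_linear(rows, rentals)
estimate = predict(weights, [18, 1])  # a hypothetical 18-degree weekend day
```

Because the invented data follow the generating formula exactly, the fitted weights come out very close to 10, 3, and 20. Adding another feature only means adding another column to rows. A polynomial fit works the same way with a column such as temperature squared.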
Decision trees take a step-by-step approach to predicting a variable. In our bicycle example, a decision tree might first split the data into spring/summer and autumn/winter. It might then split again based on the day of the week. Spring/summer Mondays might have a bike rental rate of 100 per day, while autumn/winter Mondays might have a rental rate of 20 per day.
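A decision tree like this one can be read as a chain of nested questions. The sketch below writes the bicycle example by hand rather than learning it from data. The function name predict_rentals and the values for days other than Monday are hypothetical. Only the two Monday rates come from the example above.

```python
def predict_rentals(season, day):
    """Hand-written stand-in for a learned decision tree."""
    # First split: which half of the year is it?
    if season in ("spring", "summer"):
        # Second split: day of the week.
        if day == "Monday":
            return 100  # rate from the example
        return 120      # hypothetical rate for other days
    else:
        if day == "Monday":
            return 20   # rate from the example
        return 35       # hypothetical rate for other days


summer_monday = predict_rentals("summer", "Monday")
winter_monday = predict_rentals("winter", "Monday")
```

A trained tree differs mainly in that the algorithm chooses the questions and the values at each leaf from the data, instead of a person writing them down.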
Ensemble algorithms construct not just one decision tree but a large number of trees, and then combine their predictions. This approach allows for better predictions on more complex data. Ensemble algorithms, such as random forest, are widely used in machine learning and science because of their strong prediction abilities.
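The sketch below shows the core idea of an ensemble in plain Python: train many small trees on random resamples of the data, then average their predictions. It's a simplification, not a full random forest. Each tree here is limited to a single split, there's only one feature, and the data and function names are invented for illustration.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing squared error."""
    best = None
    # Skip the smallest value so both sides of the split are non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, t, left_mean, right_mean)
    if best is None:
        # Every x in the sample is identical: predict the overall mean.
        mean = sum(ys) / len(ys)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean


def fit_ensemble(xs, ys, n_trees=50, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_ensemble(stumps, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Invented data: temperature -> rentals.
temps = [5, 8, 12, 15, 18, 22, 25, 28]
rentals = [15, 20, 30, 45, 70, 90, 100, 110]

forest = fit_ensemble(temps, rentals)
cold_day = predict_ensemble(forest, 5)
warm_day = predict_ensemble(forest, 28)
```

Each stump alone can only predict two values. Because the stumps see different resamples and pick different thresholds, their average can follow the data more closely than any single stump. A random forest typically also uses deeper trees and considers a random subset of features at each split.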
As a data scientist, you'll often experiment with different models. In the following exercise, you'll try several types of models and compare how they perform on the same data.