In supervised learning, the system learns from labeled data: for every training example we already know the correct output. In other words, we have input variables and an output variable, and we need a function that connects them. The inputs are the independent variables, and the output is the dependent variable. Fitting that function to a set of labeled examples is called training. The goal is a mapping function accurate enough to predict the output when new data is introduced. This is an iterative process: each time the algorithm makes predictions, we evaluate how well it performed, and we repeat the training until the performance is good enough.
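The train-then-evaluate loop can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the data values are made up, and the "model" is a hypothetical one-parameter function y = w * x chosen for brevity, not a method the article prescribes.

```python
# Toy labeled data (made up for illustration): every input x has a known output y.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [2.0, 4.0, 6.0, 8.0]
test_x = [5.0, 6.0]      # held-out examples the model never sees during training
test_y = [10.0, 13.0]


def fit_slope(xs, ys):
    """Learn w in y = w * x by least squares (a line through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_absolute_error(w, xs, ys):
    """Average size of the prediction error over labeled examples."""
    return sum(abs(w * x - y) for x, y in zip(xs, ys)) / len(xs)


w = fit_slope(train_x, train_y)                       # training step
test_error = mean_absolute_error(w, test_x, test_y)   # evaluation step
print(f"learned w = {w:.2f}, held-out error = {test_error:.2f}")
```

If the held-out error is too large, we would change the model or gather more data and train again, which is the repetition described above.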

Regression

Regression is a supervised learning technique, so there is both an input variable and an output variable. The important point is that the output variable, i.e. the dependent variable, is a continuous numerical value.

Consider the following scenario to better understand regression:

Assume you have two variables: “Hours studied” and “Marks scored.” We want to know how a student’s marks change as a function of the number of hours he or she studies, i.e. “Marks scored” is the dependent variable, and “Hours studied” is the independent variable.

Based on this information, I now want to know, “How many hours should a student study to score 60 marks?” Regression techniques come into play here. If the data show that each additional hour studied adds about ten marks, the regression model would learn that relationship and estimate that a student must study for six hours to score 60 marks.

It’s important to remember that the dependent variable, “Marks scored,” is a continuous numerical variable.
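The hours-and-marks example can be worked through with simple linear regression (ordinary least squares on one input). The sketch below assumes a made-up data set in which every extra hour adds exactly ten marks; the numbers and the helper name fit_line are illustrative, not taken from real students.

```python
# Made-up training data: hours studied (input) and marks scored (output).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
marks = [10.0, 20.0, 30.0, 40.0, 50.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, marks)

# Invert the fitted line to answer: how many hours are needed for 60 marks?
target_marks = 60.0
hours_needed = (target_marks - intercept) / slope
print(f"marks per extra hour: {slope:.1f}, hours needed for 60: {hours_needed:.1f}")
```

On this toy data the fitted slope is ten marks per hour, so the model estimates six hours for 60 marks. Real data would be noisier, and the estimate would come with some error.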

This is how regression algorithms function in practice. Let’s move on to classification, the next type of supervised learning technique.

 

Classification

Classification algorithms also require both input and output data. The difference from regression is that the output variable, also known as the dependent variable, is categorical.

To further grasp classification, consider the following example.

Consider the following three variables: “Whether or not the person has lung cancer,” “Weight of the person,” and “Number of cigarettes smoked per day.” We want to predict whether a person has lung cancer based on his or her weight and the number of cigarettes smoked per day, i.e. “Has lung cancer” is the dependent variable, whereas “Weight” and “Number of cigarettes smoked per day” are the independent variables.
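To make the categorical output concrete, here is a minimal sketch of one possible classifier, k-nearest neighbours, applied to this kind of data. The article does not name a specific algorithm at this point, so k-nearest neighbours is an assumption, and the records below are invented purely for illustration and carry no medical meaning.

```python
import math
from collections import Counter

# Invented records: (weight in kg, cigarettes per day) -> label "yes"/"no".
records = [
    ((60.0, 0.0), "no"),
    ((72.0, 2.0), "no"),
    ((85.0, 0.0), "no"),
    ((65.0, 20.0), "yes"),
    ((78.0, 25.0), "yes"),
    ((90.0, 30.0), "yes"),
]


def classify(person, labeled_records, k=3):
    """Predict a label by majority vote among the k closest labeled records."""
    by_distance = sorted(labeled_records, key=lambda rec: math.dist(person, rec[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(classify((70.0, 22.0), records))  # a heavy smoker in this toy data
print(classify((70.0, 1.0), records))   # a near non-smoker in this toy data
```

Unlike the regression example, the prediction is one of a fixed set of categories rather than a number. In practice the two features would usually be rescaled first so that neither dominates the distance.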

 

Decision Tree Classifier

The decision tree is a popular machine learning classifier. As the name implies, a decision tree has an inverted tree-like form. The root node is at the very top of the tree, and the leaf nodes are at the very bottom. Every internal node has a test condition, and the tree branches to its left or right child based on the result of that test; the leaf nodes hold the final prediction.

Let’s look at an example using a decision tree. Based on a set of test conditions, we’re attempting to determine whether a person would watch the film “Avengers.”

“Likes action flicks” is the test condition on the root node in this case. If the condition is true, you go to the left child; otherwise, you go to the right child. If you enjoy action movies, there is another test condition on the left child: “Movie length greater than 2 hours.” If this holds, i.e., you are fine watching a film that lasts more than 2 hours, you go to the left child again. There you meet another test condition, “Likes Robert Downey Jr.” If this is also true, the tree predicts that you would watch “Avengers.” That is how a decision tree classifier works.
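The walk down the tree can be sketched as a small hand-built structure. The three test conditions come from the example above; the class names, the answer keys, and the “will not watch” outcome on every right branch are assumptions, since the text only describes the left path. A real decision tree classifier would learn these tests from labeled data instead of having them written by hand.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # final prediction stored at the bottom of the tree


@dataclass
class Node:
    question: str                   # key looked up in the viewer's answers
    if_true: Union["Node", Leaf]    # left child: test condition holds
    if_false: Union["Node", Leaf]   # right child: test condition fails


# Right-branch outcomes are assumed; the article only spells out the left path.
tree = Node(
    "likes_action",
    Node(
        "ok_with_over_2_hours",
        Node("likes_rdj", Leaf("will watch"), Leaf("will not watch")),
        Leaf("will not watch"),
    ),
    Leaf("will not watch"),
)


def predict(node, answers):
    """Follow test conditions from the root until a leaf is reached."""
    while isinstance(node, Node):
        node = node.if_true if answers[node.question] else node.if_false
    return node.label


fan = {"likes_action": True, "ok_with_over_2_hours": True, "likes_rdj": True}
print(predict(tree, fan))
```

Each prediction touches only the nodes on one root-to-leaf path, which is why a viewer who dislikes action films is classified after a single test.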

Now that you know what supervised learning is, let’s move on to unsupervised learning in this machine learning tutorial.