How do leaf nodes behave in regression decision trees?

  • #1
fog37
TL;DR Summary
understand how decision trees and leaf nodes behave in the case of regression...
Hello.

Decision trees are really cool. They can be used for either regression or classification. They are built from nodes, and each internal node represents an if-then statement that gets evaluated to be either true or false. Does that mean there are always and only two edges/branches coming out of an internal node (leaf nodes don't have outgoing edges)? Or are there situations in which there can be more than two edges?

In the case of classification trees, the leaf nodes are the output nodes, each with a single class output (there can be more leaf nodes than there are classes). In the case of regression trees, how do the leaf nodes behave? The goal is to predict a numerical output (e.g., the price of a house). How many leaf nodes are there? One for each possible numerical value? That would be impossible. I know the tree gets trained with a finite number of examples/instances and that the tree structure and decision statements are formed from them...

Thank you for any clarification.
 
  • #2
I have never heard of using a decision tree for regression. Do you have a source for this?
 
  • #4
fog37 said:
TL;DR Summary: understand how decision trees and leaf nodes behave in the case of regression...

In the case of regression trees, how do the leaf nodes behave? The goal is to predict a numerical output (ex: the price of a house). How many leaf nodes are there? One for each possible numerical value?
It looks like the leaves themselves can assume continuous outputs. So you would only need one leaf per regression parameter.
 
  • #5
Dale said:
It looks like the leaves themselves can assume continuous outputs. So you would only need one leaf per regression parameter.
Thank you. Let me see if I understand correctly. In the example figure below, I notice that the leaf nodes show specific amounts, i.e. the "value" on the last line. What if the inputs are such that the predicted value is none of the values mentioned in the leaf nodes? That is my dilemma. It seems that there is a finite number of leaf nodes, each with its own value...

[Attached figure: an example regression tree in which each leaf node lists a numeric "value"]
 
  • #6
Sorry, I cannot help you. Literally all I know about it is that one page that you cited where it says "Continuous output means that the output/result is not discrete, i.e., it is not represented just by a discrete, known set of numbers or values".

If you need more technical information then you need to find a more technical source. If you have a more technical source that has the information you need then I can help you understand it, but there simply is not any more information available there than the quote.
 
  • #7
Dale said:
Sorry, I cannot help you. Literally all I know about it is that one page that you cited where it says "Continuous output means that the output/result is not discrete, i.e., it is not represented just by a discrete, known set of numbers or values".

If you need more technical information then you need to find a more technical source. If you have a more technical source that has the information you need then I can help you understand it, but there simply is not any more information available there than the quote.
No worries.

After some research, I learned that, in a regression decision tree, each leaf node's output is the average of the target values of the training instances that reached that leaf by following the sequence of if-then statements down the tree. So the possible numerical outputs form a finite set, one value per leaf, and a new input is simply assigned the value of the leaf it lands in...
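To make this concrete, here is a minimal sketch (assuming scikit-learn's DecisionTreeRegressor and made-up data; the thread itself doesn't name a library). It fits a small tree and checks that the value stored in each leaf equals the mean of the training targets routed to that leaf, so a fitted tree can only produce a finite set of predictions.

Python:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(50, 250, size=(200, 1))          # hypothetical house sizes
y = 1000 * X[:, 0] + rng.normal(0, 5000, 200)    # hypothetical noisy prices

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

# apply() returns, for each sample, the id of the leaf it ends up in.
leaf_ids = tree.apply(X)

# For every leaf, the stored prediction equals the mean of the training targets
# that reached it (with the default squared-error criterion).
for leaf in np.unique(leaf_ids):
    stored = tree.tree_.value[leaf, 0, 0]
    print(leaf, stored, y[leaf_ids == leaf].mean())   # the two numbers agree

# A new input therefore receives one of a finite set of values, one per leaf.
print(tree.get_n_leaves(), "leaves ->", np.unique(tree.predict(X)).size, "distinct predictions")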
 

FAQ: How do leaf nodes behave in regression decision trees?

What is the role of leaf nodes in regression decision trees?

Leaf nodes in regression decision trees represent the final output or prediction of the model. They contain the predicted value for the target variable, which is typically the mean or median of the target values of the data points that fall into that node.
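As an illustration of mean versus median leaf values, the following hedged sketch (scikit-learn assumed, toy data invented here) fits two depth-1 trees: with the default squared-error criterion a leaf stores the mean of its targets, while with the absolute-error criterion it stores the median.

Python:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y = np.array([100.0, 110.0, 180.0, 300.0, 310.0, 400.0])

mean_tree = DecisionTreeRegressor(criterion="squared_error", max_depth=1).fit(X, y)
median_tree = DecisionTreeRegressor(criterion="absolute_error", max_depth=1).fit(X, y)

# Both trees split between x = 3 and x = 10 on this data; only the leaf values differ.
print(mean_tree.predict([[2.0]]))    # mean of {100, 110, 180}   -> [130.]
print(median_tree.predict([[2.0]]))  # median of {100, 110, 180} -> [110.]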

How are the values in leaf nodes determined?

The values in leaf nodes are determined by aggregating the target values of all the data points that reach that node. Common aggregation methods include calculating the mean or median of these target values, which serves as the predicted value for any new data point that falls into this leaf node.
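The aggregation step can also be written out by hand. The sketch below (plain NumPy, one feature, invented numbers) scans candidate thresholds for a single split and then stores the mean of the targets on each side as the two leaf values, which is essentially what a tree-growing algorithm does at every leaf.

Python:
import numpy as np

X = np.array([60.0, 75.0, 80.0, 120.0, 150.0, 200.0])     # hypothetical sizes (sorted)
y = np.array([150.0, 180.0, 190.0, 320.0, 400.0, 520.0])  # hypothetical prices

best = None
for t in (X[:-1] + X[1:]) / 2:            # candidate thresholds between neighbouring values
    left, right = y[X <= t], y[X > t]
    sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
    if best is None or sse < best[0]:
        best = (sse, t, left.mean(), right.mean())

_, threshold, left_value, right_value = best
print(f"split at x <= {threshold}: left leaf predicts {left_value}, right leaf predicts {right_value}")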

What happens if a leaf node has very few data points?

If a leaf node has very few data points, it can lead to overfitting, where the model becomes too tailored to the specific data points in that node and loses generalizability. To mitigate this, techniques such as pruning or setting a minimum number of data points per leaf node are used.
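Two of the safeguards mentioned above can be expressed directly as fit-time settings. This sketch (scikit-learn assumed, synthetic noisy data) compares an unconstrained tree, a tree with a minimum leaf size, and a cost-complexity-pruned tree.

Python:
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.3, 300)      # noisy synthetic target

overfit = DecisionTreeRegressor(random_state=0).fit(X, y)                  # grows until leaves are pure
constrained = DecisionTreeRegressor(min_samples_leaf=20, random_state=0).fit(X, y)
pruned = DecisionTreeRegressor(ccp_alpha=0.01, random_state=0).fit(X, y)   # cost-complexity pruning

# The unconstrained tree typically ends up with close to one leaf per sample;
# the constrained and pruned trees have far fewer, larger leaves.
print(overfit.get_n_leaves(), constrained.get_n_leaves(), pruned.get_n_leaves())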

Can leaf nodes handle missing values in the data?

How missing values are handled depends on the implementation of the regression-tree algorithm. Some implementations impute them before training, for example by replacing a missing entry with the feature's mean or most frequent value, while others use surrogate splits to find the best alternative way to route an instance down the tree when the feature used at a split is missing.
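The imputation route described above can be sketched as a small pipeline (scikit-learn assumed, toy data invented here); surrogate splits, used by CART-style implementations such as rpart, are a separate mechanism and are not shown.

Python:
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeRegressor

X = np.array([[100.0], [120.0], [np.nan], [200.0], [np.nan], [250.0]])
y = np.array([150.0, 180.0, 200.0, 320.0, 350.0, 400.0])

# Missing entries are replaced by the feature mean before the tree ever sees them.
model = make_pipeline(SimpleImputer(strategy="mean"),
                      DecisionTreeRegressor(max_depth=2, random_state=0))
model.fit(X, y)
print(model.predict([[np.nan]]))   # the NaN is imputed first, then routed to a leaf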

How do leaf nodes affect the interpretability of regression decision trees?

Leaf nodes contribute to the interpretability of regression decision trees by providing clear and specific predictions based on the input features. Each path from the root to a leaf node represents a set of conditions that lead to a particular prediction, making it easier to understand how the model makes decisions.
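Those root-to-leaf paths can be printed directly. A small sketch (scikit-learn assumed, with hypothetical feature names "size" and "rooms") uses export_text to show each chain of conditions ending in a leaf's predicted value.

Python:
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

X = np.array([[60, 1], [75, 2], [80, 2], [120, 3], [150, 3], [200, 4]], dtype=float)
y = np.array([150.0, 180.0, 190.0, 320.0, 400.0, 520.0])

tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["size", "rooms"]))
# Each printed branch is a readable if-then path, e.g.
# |--- size <= 100.00
# |   |--- ... value: [...]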
