I had a question on the interaction depth parameter in gbm in R. This may be a noob question, for which I apologize, but how does the parameter, which I believe denotes the number of terminal nodes in a tree, basically indicate X-way interaction among the predictors?
Link between interaction.depth and the number of terminal nodes
One has to see interaction.depth as the number of split nodes. An interaction.depth fixed at k will result in trees with k+1 terminal nodes (omitting the NA nodes), so we have:
interaction.depth = #{TerminalNodes} − 1
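The counting argument can be checked with a small sketch that does not touch gbm at all: a tree starts as a single root leaf, and every split replaces one leaf with two, so the number of terminal nodes grows by one per split. The function name `terminal_nodes_after_splits` is hypothetical and chosen only for illustration.

```python
def terminal_nodes_after_splits(k: int) -> int:
    """Count the leaves of a binary tree grown by k successive splits."""
    if k < 0:
        raise ValueError("k must be non-negative")
    leaves = 1  # the unsplit root is itself a terminal node
    for _ in range(k):
        leaves += 1  # a split removes one leaf and adds two children
    return leaves


# k plays the role of interaction.depth in this sketch.
for k in range(1, 5):
    print(k, terminal_nodes_after_splits(k))
```

Whatever leaf is chosen for each split, the count is the same, which is why k splits always give k+1 terminal nodes.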
Link between interaction.depth and the interaction order
The link between interaction.depth and the interaction order is less direct.
Instead of reasoning with interaction.depth, let's reason with the number of terminal nodes, which we will call J.
Example:
Say you have J=4 terminal nodes (interaction.depth=3). Assuming each split uses a different feature, you can either:
- do the first split on the root, then the second split on the left node of the root and the third split on the right node of the root. Every path from the root to a terminal node crosses two splits, so the interaction order for this tree will be 2.
- do the first split on the root, then the second split on the left (respectively right) node of the root, and a third split on the left (respectively right) child created by that second split. The longest path now crosses three splits, so the interaction order for this tree will be 3.
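The two shapes above can be compared with a minimal sketch. It assumes a toy tree encoding invented for this illustration (not gbm's internal format): `None` is a terminal node and a tuple `(feature, left, right)` is a split. The interaction order is taken as the largest number of distinct features met on any path from the root to a terminal node. The feature names `x1`, `x2`, `x3` and both helper functions are hypothetical.

```python
def count_leaves(tree) -> int:
    """Number of terminal nodes (None entries) in the toy tree."""
    if tree is None:
        return 1
    _, left, right = tree
    return count_leaves(left) + count_leaves(right)


def interaction_order(tree, seen=frozenset()) -> int:
    """Largest number of distinct features on any root-to-leaf path."""
    if tree is None:
        return len(seen)
    feature, left, right = tree
    seen = seen | {feature}  # features used so far on this path
    return max(interaction_order(left, seen), interaction_order(right, seen))


# First case: split the root, then split both of its children.
balanced = ("x1", ("x2", None, None), ("x3", None, None))

# Second case: split the root, its left child, then that child's left child.
chain = ("x1", ("x2", ("x3", None, None), None), None)

print(count_leaves(balanced), interaction_order(balanced))  # 4 2
print(count_leaves(chain), interaction_order(chain))        # 4 3
```

Both trees have J=4 terminal nodes, yet their interaction orders differ. If a path reuses the same feature, for example by replacing `x3` with `x1` in `chain`, the order drops, because only distinct features count.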
So you cannot know in advance what the interaction order between your features will be in a given tree. However, it is possible to put an upper bound on this value. Let P be the interaction order of the features in a given tree. We have:
P≤min(J−1,n)
with n being the number of predictors (input variables), since a path cannot involve more distinct features than exist. For more details, see Section 7 of Friedman's original article.
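The bound can be checked by brute force for small trees. The sketch below enumerates every binary tree shape with J terminal nodes and records the number of splits on its longest root-to-leaf path. It assumes that a path with d splits can involve at most min(d, number of available predictors) distinct features, and it reads n in the bound as that number of predictors. The names `longest_path_lengths` and `max_interaction_order` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def longest_path_lengths(j: int) -> frozenset:
    """Possible longest-path split counts over all tree shapes with j leaves."""
    if j == 1:
        return frozenset({0})  # a lone leaf has no splits above it
    lengths = set()
    for left_leaves in range(1, j):  # leaves assigned to the left subtree
        for d_left in longest_path_lengths(left_leaves):
            for d_right in longest_path_lengths(j - left_leaves):
                lengths.add(1 + max(d_left, d_right))
    return frozenset(lengths)


def max_interaction_order(j: int, n_features: int) -> int:
    """Highest interaction order reachable with j leaves and n_features predictors."""
    return max(min(d, n_features) for d in longest_path_lengths(j))


print(sorted(longest_path_lengths(4)))  # [2, 3]

# The maximum matches min(J - 1, n) for every small case tried here.
for j in range(2, 8):
    for n_features in (1, 2, 3, 10):
        assert max_interaction_order(j, n_features) == min(j - 1, n_features)
```

For J=4 the longest path has either 2 or 3 splits, which matches the two cases in the example, and the largest reachable order never exceeds min(J−1, n).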