论文标题
估计决策树的可学习性,具有各种样本复杂性
Estimating decision tree learnability with polylogarithmic sample complexity
论文作者
论文摘要
We show that top-down decision tree learning heuristics are amenable to highly efficient learnability estimation: for monotone target functions, the error of the decision tree hypothesis constructed by these heuristics can be estimated with polylogarithmically many labeled examples, exponentially smaller than the number necessary to run these heuristics, and indeed, exponentially smaller than information-theoretic minimum required to learn a good decision tree.这增加了一小部分但不断增长的基本学习算法清单,这些算法已被证明是可以根据可学习性估算的。 在此结果的途中,我们设计和分析了自上而下的决策树学习启发式方法的样本高效的Minibatch版本,并表明它们具有与全批批次版本相同的可证明的保证。我们进一步提供了这些启发式方法的“活跃本地”版本:鉴于一个测试点$ x^\ star $,我们展示了如何使用polygarithMarithMarithMarithMarithMarithMarithMarithMarithMarithMarithMarithMainslimed示例来计算的标签$ t(X^\ star)$,指数要小于学习$ t $所需的数字。
We show that top-down decision tree learning heuristics are amenable to highly efficient learnability estimation: for monotone target functions, the error of the decision tree hypothesis constructed by these heuristics can be estimated with polylogarithmically many labeled examples, exponentially smaller than the number necessary to run these heuristics, and indeed, exponentially smaller than information-theoretic minimum required to learn a good decision tree. This adds to a small but growing list of fundamental learning algorithms that have been shown to be amenable to learnability estimation. En route to this result, we design and analyze sample-efficient minibatch versions of top-down decision tree learning heuristics and show that they achieve the same provable guarantees as the full-batch versions. We further give "active local" versions of these heuristics: given a test point $x^\star$, we show how the label $T(x^\star)$ of the decision tree hypothesis $T$ can be computed with polylogarithmically many labeled examples, exponentially smaller than the number necessary to learn $T$.
