我能从决策树中的训练树中提取基本的决策规则(或“决策路径”)作为文本列表吗?
喜欢的东西:
if A>0.4 then if B<0.2 then if C>0.8 then class='X'
我能从决策树中的训练树中提取基本的决策规则(或“决策路径”)作为文本列表吗?
喜欢的东西:
if A>0.4 then if B<0.2 then if C>0.8 then class='X'
当前回答
从这个答案中,您可以得到一个可读且高效的表示:https://stackoverflow.com/a/65939892/3746632
输出如下所示。X为一维向量,表示单个实例的特征。
from numba import jit,njit
@njit
def predict(X):
ret = 0
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
ret += 1
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
ret += 1
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
return ret/10
其他回答
Scikit learn在0.21版(2019年5月)中引入了一个名为export_text的有趣的新方法,用于从树中提取规则。这里的文档。不再需要创建自定义函数。
一旦你适应了你的模型,你只需要两行代码。首先,导入export_text:
from sklearn.tree import export_text
其次,创建一个包含规则的对象。为了使规则看起来更具可读性,使用feature_names参数并传递一个特性名称列表。例如,如果你的模型是model,你的特征是在一个名为X_train的数据框架中命名的,你可以创建一个名为tree_rules的对象:
tree_rules = export_text(model, feature_names=list(X_train.columns))
然后打印或保存tree_rules。输出如下所示:
|--- Age <= 0.63
| |--- EstimatedSalary <= 0.61
| | |--- Age <= -0.16
| | | |--- class: 0
| | |--- Age > -0.16
| | | |--- EstimatedSalary <= -0.06
| | | | |--- class: 0
| | | |--- EstimatedSalary > -0.06
| | | | |--- EstimatedSalary <= 0.40
| | | | | |--- EstimatedSalary <= 0.03
| | | | | | |--- class: 1
下面是一种使用SKompiler库将整个树转换为单个(不一定太容易读懂)python表达式的方法:
from skompiler import skompile
skompile(dtree.predict).to('python/code')
从这个答案中,您可以得到一个可读且高效的表示:https://stackoverflow.com/a/65939892/3746632
输出如下所示。X为一维向量,表示单个实例的特征。
from numba import jit,njit
@njit
def predict(X):
ret = 0
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
ret += 1
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
ret += 1
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
ret += 1
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
ret += 1
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
if X[0] <= 0.5: # if w_pizza <= 0.5
if X[1] <= 0.5: # if w_mexico <= 0.5
if X[2] <= 0.5: # if w_reusable <= 0.5
ret += 1
else: # if w_reusable > 0.5
pass
else: # if w_mexico > 0.5
pass
else: # if w_pizza > 0.5
pass
return ret/10
我已经经历过这些了,但我需要把规则写成这种形式
if A>0.4 then if B<0.2 then if C>0.8 then class='X'
所以我改编了@paulkernfeld的答案(谢谢),你可以根据自己的需要定制
def tree_to_code(tree, feature_names, Y):
tree_ = tree.tree_
feature_name = [
feature_names[i] if i != _tree.TREE_UNDEFINED else "undefined!"
for i in tree_.feature
]
pathto=dict()
global k
k = 0
def recurse(node, depth, parent):
global k
indent = " " * depth
if tree_.feature[node] != _tree.TREE_UNDEFINED:
name = feature_name[node]
threshold = tree_.threshold[node]
s= "{} <= {} ".format( name, threshold, node )
if node == 0:
pathto[node]=s
else:
pathto[node]=pathto[parent]+' & ' +s
recurse(tree_.children_left[node], depth + 1, node)
s="{} > {}".format( name, threshold)
if node == 0:
pathto[node]=s
else:
pathto[node]=pathto[parent]+' & ' +s
recurse(tree_.children_right[node], depth + 1, node)
else:
k=k+1
print(k,')',pathto[parent], tree_.value[node])
recurse(0, 1, 0)
现在可以使用export_text了。
from sklearn.tree import export_text
r = export_text(loan_tree, feature_names=(list(X_train.columns)))
print(r)
来自[sklearn][1]的完整示例
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.tree import export_text
iris = load_iris()
X = iris['data']
y = iris['target']
decision_tree = DecisionTreeClassifier(random_state=0, max_depth=2)
decision_tree = decision_tree.fit(X, y)
r = export_text(decision_tree, feature_names=iris['feature_names'])
print(r)