Too many cooks spoil the broth.
Even back in 1575, George Gascoigne already knew that a sumptuous bowl of broth can’t be achieved with too many cooks in the kitchen. The wisdom of that proverb carries into the modern day, yes, even into machine learning.
Have you ever wondered why your model’s performance hits a plateau no matter how you fine-tune those hyperparameters? Or, even worse, why you see only a mediocre improvement in performance after using the most accurate dataset you could find? Well, the culprit might actually be the predictors (columns) you use to train your models.
Ideally, predictors should be statistically relevant to the output a model intends to predict, and they should be carefully hand-picked to ensure the best possible performance. This article will give you a brief walkthrough of what feature selection is all about, accompanied by some practical examples in Python.
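
As a quick taste of what that hand-picking can look like in practice, here is a minimal sketch (assuming scikit-learn is installed; the article’s own examples come later) using `SelectKBest`, which scores each predictor against the target with an ANOVA F-test and keeps only the most statistically relevant columns:

```python
# A minimal sketch of statistically driven feature selection,
# assuming scikit-learn is available in your environment.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# f_classif runs an ANOVA F-test between each predictor and the target;
# higher scores mean the column is more statistically relevant.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

print("Original shape:", X.shape)          # (150, 4)
print("Reduced shape:", X_selected.shape)  # (150, 2)
print("Kept columns:", selector.get_support(indices=True))
```

Dropping the weaker predictors this way is often enough to let the “cooks” that remain actually improve the broth.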
