Alan Lee secsilm

Using Ensemble Learning to Improve Machine Learning Performance

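This post is a translation of an article on ensemble models recommended by PythonWeekly. The original, Ensemble Learning to Improve Machine Learning Results, was published on Medium by Vadim Smolyakov on August 22, 2017. Vadim Smolyakov is a graduate student at MIT with a passion for data science and machine learning.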

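Ensemble learning improves machine learning results by combining several models. Compared with a single model, this approach can noticeably improve predictive performance. That is why ensemble models are favored in many well-known machine learning competitions, such as the Netflix Competition, KDD 2009, and Kaggle.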

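Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), decrease bias (boosting), or improve predictions (stacking).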
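
To make the variance-reduction idea behind bagging concrete, here is a minimal sketch using only the Python standard library. It is not taken from the original article: the toy data (`y = sin(x)` plus Gaussian noise), the choice of a 1-nearest-neighbor regressor as the high-variance base model, and all function names and default parameters are assumptions made for illustration.

```python
import math
import random


def nearest_neighbor_predict(train, x):
    """1-NN regression: return the target of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def bagged_predict(bootstrap_sets, x):
    """Average the 1-NN predictions made from each bootstrap resample."""
    preds = [nearest_neighbor_predict(sample, x) for sample in bootstrap_sets]
    return sum(preds) / len(preds)


def compare(seed=0, n_train=300, n_test=150, n_models=30, noise=0.5):
    """Return (single_mse, bagged_mse) measured against the noise-free target."""
    rng = random.Random(seed)
    # Toy data (an assumption for illustration): y = sin(x) + Gaussian noise.
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_train)]
    train = [(x, math.sin(x) + rng.gauss(0.0, noise)) for x in xs]
    # Each bootstrap set draws n_train points from train with replacement.
    bootstrap_sets = [rng.choices(train, k=n_train) for _ in range(n_models)]
    test_xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_test)]

    single_se = bagged_se = 0.0
    for x in test_xs:
        truth = math.sin(x)
        single_se += (nearest_neighbor_predict(train, x) - truth) ** 2
        bagged_se += (bagged_predict(bootstrap_sets, x) - truth) ** 2
    return single_se / n_test, bagged_se / n_test


if __name__ == "__main__":
    single_mse, bagged_mse = compare()
    print(f"single 1-NN MSE: {single_mse:.3f}")
    print(f"bagged 1-NN MSE: {bagged_mse:.3f}")
```

A single 1-NN model copies the noise of whichever training point happens to be closest. Averaging over bootstrap resamples spreads the prediction across several nearby points, so the noise partly cancels. In practice one would likely reach for a library implementation rather than hand-rolled code like this.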

Ensemble methods can be divided into two groups:

	index token token_len
	75787 "。





	" 7
	19066 "。

	import argparse
	import subprocess
	from pathlib import Path

	from loguru import logger

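	parser = argparse.ArgumentParser(description='Merge Mijia camera videos, grouped by day.')
	parser.add_argument('indir', help='Directory containing the original Mijia camera videos.')
	parser.add_argument('--outdir', default='./', help='Directory for the merged videos; created if it does not exist. Defaults to the current directory.')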
	args = parser.parse_args()

	from openpyxl import load_workbook


	def read_color(f):
	    """Print the value and fill color of every cell in the active sheet."""
	    wb = load_workbook(f)
	    ws = wb.active
	    for row in ws.iter_rows():
	        for cell in row:
	            print(f"cell value={cell.value}, cell color={cell.fill.start_color.index}")

	import re
	import shutil
	import warnings
	import zipfile
	from pathlib import Path

	# Directory containing the zip files (raw string so the backslashes are kept literally)
	in_path = Path(r'D:\BaiduYunDownload\Jay Chou')
	# Directory to extract into
	out_path = Path(r'D:\BaiduYunDownload')


	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="utf-8">
	<title>color_scatter.py example</title>

	<link rel="stylesheet" href="https://cdn.pydata.org/bokeh/release/bokeh-0.12.14.min.css" type="text/css" />

	'''
	This code is an example of adding an image on top of a map using cartopy.
	The generated image can be found here: https://i.imgur.com/aTY1rYY.png
	'''

	import matplotlib.pyplot as plt
	import cartopy.crs as crs
	from matplotlib.offsetbox import AnnotationBbox, OffsetImage
	from PIL import Image