Source: GitHub
Computer Vision
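Natural Language Processing
NLTK: A leading platform for building Python programs that work with human language data. (http://www.nltk.org/)
Pattern: A web mining module for Python, with tools for natural language processing, machine learning, and more. (http://www.clips.ua.ac.be/pattern)
TextBlob: Provides a consistent API for common natural language processing tasks. It is built on NLTK and Pattern and works well with both. (http://textblob.readthedocs.io/en/dev/)
jieba: A Chinese word segmentation tool. (https://github.com/fxsjy/jieba#jieba-1)
SnowNLP: A library for processing Chinese text. (https://github.com/isnowfy/snownlp)
loso: Another Chinese word segmentation library. (https://github.com/fangpenlin/loso)
genius: A Chinese word segmentation library based on conditional random fields. (https://github.com/duanhongyi/genius)
nut: A natural language understanding toolkit. (https://github.com/pprett/nut)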
General-Purpose Machine Learning
Bayesian Methods for Hackers: An e-book on probabilistic programming in Python. (https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers)
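MLlib in Apache Spark: The distributed machine learning library for Spark. (http://spark.apache.org/docs/latest/mllib-guide.html)
scikit-learn: A machine learning module built on SciPy. (http://scikit-learn.github.io/stable)
graphlab-create: A library with a variety of machine learning modules (regression, clustering, recommender systems, graph analytics, etc.), built on a disk-backed DataFrame. (http://graphlab.com/products/create/docs/)
BigML: A library that connects to external servers. (https://bigml.com/)
pattern: A web mining module for Python. (https://github.com/clips/pattern)
NuPIC: The Numenta Platform for Intelligent Computing. (https://github.com/numenta/nupic)
Pylearn2: A machine learning library based on Theano. (https://github.com/lisa-lab/pylearn2)
hebel: A GPU-accelerated deep learning library written in Python. (https://github.com/hannes-brt/hebel)
gensim: A topic modeling tool. (https://github.com/RaRe-Technologies/gensim)
PyBrain: Another machine learning library. (https://github.com/pybrain/pybrain)
Crab: An extensible, fast recommender engine. (https://github.com/muricoca/crab)
python-recsys: A recommender system implemented in Python. (https://github.com/ocelma/python-recsys)
thinking bayes: A book on Bayesian analysis. (https://github.com/AllenDowney/ThinkBayes)
Restricted Boltzmann Machines: Restricted Boltzmann machines implemented in Python. (https://github.com/echen/restricted-boltzmann-machines)
Bolt: An online learning toolbox. (https://github.com/pprett/bolt)
CoverTree: A Python implementation of cover trees, intended as a convenient replacement for scipy.spatial.kdtree. (https://github.com/patvarilly/CoverTree)
nilearn: A machine learning library for neuroimaging in Python. (https://github.com/nilearn/nilearn)
Shogun: A machine learning toolbox. (https://github.com/shogun-toolbox/shogun)
Pyevolve: A genetic algorithm framework. (https://github.com/perone/Pyevolve)
Caffe: A deep learning framework designed with clean code, readability, and speed in mind. (http://caffe.berkeleyvision.org/)
breze: A Theano-based library for deep and recurrent neural networks. (https://github.com/breze-no-salt/breze)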
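Data Analysis / Data Visualization
SciPy: A Python-based ecosystem of open-source software for mathematics, science, and engineering. (https://www.scipy.org/)
NumPy: The fundamental package for scientific computing with Python. (http://www.numpy.org/)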
Numba: An LLVM-based JIT compiler for Python, written by the developers of Cython and NumPy and aimed at scientific computing. (http://numba.pydata.org/)
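NetworkX: High-productivity software for working with complex networks. (https://networkx.github.io/)
Pandas: A library providing high-performance, easy-to-use data structures and data analysis tools. (http://pandas.pydata.org/)
Open Mining: A business intelligence tool in Python (a web interface for Pandas). (https://github.com/mining/mining)
PyMC: A Markov chain Monte Carlo (MCMC) sampling toolkit. (https://github.com/pymc-devs/pymc)
zipline: An algorithmic trading library for Python. (https://github.com/quantopian/zipline)
PyDy: Short for Python Dynamics. Supports a dynamics modeling workflow built on NumPy, SciPy, IPython, and matplotlib. (http://www.pydy.org/)
SymPy: A Python library for symbolic mathematics. (https://github.com/sympy/sympy)
statsmodels: A statistical modeling and econometrics library for Python. (https://github.com/statsmodels/statsmodels)
astropy: A community-developed Python library for astronomy. (http://www.astropy.org/)
matplotlib: A 2D plotting library for Python. (http://matplotlib.org/)
bokeh: An interactive web plotting library for Python. (https://github.com/bokeh/bokeh)
plotly: A collaborative web plotting library for Python and matplotlib. (https://plot.ly/python/)
vincent: Translates Python data structures into the Vega visualization grammar. (https://github.com/wrobstory/vincent)
d3py: A plotting library for Python, based on D3.js. (https://github.com/mikedewar/d3py)
ggplot: Offers the same API as ggplot2 for R. (https://github.com/yhat/ggpy)
Kartograph.py: A Python library for rendering attractive SVG maps. (https://github.com/kartograph/kartograph.py)
pygal: An SVG chart generator for Python. (http://pygal.org/en/stable/)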
pycascading: (https://github.com/twitter/pycascading)
Misc Scripts / IPython Notebooks / Codebases
pattern_classification:(https://github.com/rasbt/pattern_classification)
thinking stats 2:(https://github.com/Wavelets/ThinkStats2)
hyperopt:(https://github.com/hyperopt/hyperopt-sklearn)
nupic: (https://github.com/numenta/nupic)
2012-paper-diginorm:(https://github.com/dib-lab/2012-paper-diginorm)
ipython-notebooks:(https://github.com/ogrisel/notebooks)
decision-weights:(https://github.com/CamDavidsonPilon/decision-weights)
Sarah Palin LDA: Topic modeling of the Sarah Palin emails. (https://github.com/Wavelets/sarah-palin-lda)
Diffusion Segmentation: A collection of image segmentation algorithms based on diffusion methods. (https://github.com/Wavelets/diffusion-segmentation)
Scipy Tutorials: SciPy tutorials. Outdated; see scipy-lecture-notes instead. (https://github.com/Wavelets/scipy-tutorials)
Crab: A recommendation engine library for Python. (https://github.com/marcelcaraciolo/crab)
BayesPy: Bayesian inference tools in Python. (https://github.com/maxsklar/BayesPy)
scikit-learn tutorials: A series of notebooks for learning scikit-learn. (https://github.com/GaelVaroquaux/scikit-learn-tutorial)
sentiment-analyzer: A Twitter sentiment analyzer. (https://github.com/madhusudancs/sentiment-analyzer)
group-lasso: Experiments with coordinate descent algorithms applied to the (sparse) group lasso model. (https://github.com/fabianp/group_lasso)
mne-python-notebooks: IPython notebooks for EEG/MEG data processing with mne-python. (https://github.com/mne-tools/mne-python-notebooks)
pandas cookbook: Recipes for using Python's pandas library. (https://github.com/jvns/pandas-cookbook)
climin: An optimization library for machine learning, with Python implementations of gradient descent, LBFGS, rmsprop, adadelta, and other algorithms. (https://github.com/BRML/climin)
Kaggle Competition Source Code
(https://github.com/hammer/wikichallenge)
(https://github.com/amueller/kaggle_insults)
(https://github.com/MLWave/kaggle_acquire-valued-shoppers-challenge)
(https://github.com/zygmuntz/kaggle-cifar)
(https://github.com/zygmuntz/kaggle-blackbox)
(https://github.com/zygmuntz/kaggle-accelerometer)
(https://github.com/zygmuntz/kaggle-advertised-salaries)
(https://github.com/zygmuntz/kaggle-amazon)
(https://github.com/zygmuntz/kaggle-bestbuy_big)
(https://github.com/zygmuntz/kaggle-bestbuy_small)
(https://github.com/kastnerkyle/kaggle-dogs-vs-cats)
(https://github.com/benanne/kaggle-galaxies)
(https://github.com/zygmuntz/kaggle-gender)
(https://github.com/zygmuntz/kaggle-merck)
(https://github.com/zygmuntz/wine-quality)