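Environment: Mac OS X Yosemite 10.10.3
Installing Scrapy
Scrapy is the framework you will almost certainly end up using when learning to write crawlers in Python, so let's get straight to it.
Open a terminal and run: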
sudo easy_install pip
pip and easy_install are both Python package management tools; pip is the successor to easy_install.
Then install Scrapy from the terminal:
sudo pip install Scrapy
If the command succeeds, Scrapy is installed. Things often don't go that smoothly, though, and you may well hit an error like this:
/private/tmp/pip-build-9RYtLC/lxml/src/lxml/includes/etree_defs.h:14:10: fatal error: 'libxml/xmlversion.h' file not found
#include "libxml/xmlversion.h"
^
1 error generated.
error: command 'cc' failed with exit status 1
----------------------------------------
Command "/usr/bin/python -c "import setuptools, tokenize;__file__='/private/tmp/pip-build-9RYtLC/lxml/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-544HZx-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /private/tmp/pip-build-9RYtLC/lxml
There are several possible fixes:
1. Install or update the command line developer tools from the terminal:
xcode-select --install
2. Set the C_INCLUDE_PATH environment variable so the compiler can find the libxml2 headers (export it in the same shell session before re-running the install):
export C_INCLUDE_PATH=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/libxml2:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include/libxml2/libxml:/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk/usr/include
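3. Following the official documentation, build lxml (the Scrapy dependency that fails to compile above) with statically linked libraries, then re-run the Scrapy install: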
STATIC_DEPS=true pip install lxml
These three fixes are usually enough to get Scrapy installed. If it still fails, look for the related thread on StackOverflow.
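The long C_INCLUDE_PATH value in fix 2 is just three directories under one SDK root, joined with colons. The sketch below rebuilds it from the SDK path, which may help if your Xcode SDK lives somewhere else. The helper name `libxml2_include_path` is hypothetical, and the SDK path is the one from this article, so treat it as an assumption for your own machine.

```python
import posixpath


def libxml2_include_path(sdk_root):
    """Join the three include directories from fix 2 into one C_INCLUDE_PATH value."""
    include = posixpath.join(sdk_root, "usr", "include")
    dirs = [
        posixpath.join(include, "libxml2"),            # libxml2 headers
        posixpath.join(include, "libxml2", "libxml"),  # libxml/xmlversion.h lives here
        include,                                       # the SDK's general include dir
    ]
    return ":".join(dirs)


# SDK path taken from the article; adjust it to the SDK installed on your Mac.
sdk = ("/Applications/Xcode.app/Contents/Developer/Platforms/"
       "MacOSX.platform/Developer/SDKs/MacOSX10.10.sdk")
print("export C_INCLUDE_PATH=" + libxml2_include_path(sdk))
```

Paste the printed line into the terminal before running the install command again.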
Installing PIL
PIL is Python's image processing library; when learning to write crawlers it is handy for processing CAPTCHAs.
Run in the terminal:
sudo pip install pil
Hmm, an error:
/Library/Python/2.7/site-packages/pip-6.1.1-py2.7.egg/pip/_vendor/requests/packages/urllib3/util/ssl_.py:79: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Could not find a version that satisfies the requirement pil (from versions: )
Some externally hosted files were ignored as access to them may be unreliable (use --allow-external pil to allow).
No matching distribution found for pil
The output does suggest adding the option --allow-external pil,
so adjust the command and run it again:
sudo pip install PIL --allow-external PIL
Good, the installation starts. Wait, it seems to have failed again!
_imagingft.c:73:10: fatal error: 'freetype/fterrors.h' file not found
#include <freetype/fterrors.h>
^
1 error generated.
error: Setup script exited with error: command 'cc' failed with exit status 1
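The compiler cannot find the file freetype/fterrors.h. Searching Baidu for a fix, many articles suggest running: ln -s /usr/local/include/freetype2 /usr/local/include/freetype
I tried that, and it did not work.
Opening /usr/local/include in Finder, I found a freetype2 directory but no freetype directory. So why not make a copy of freetype2 and rename the copy to freetype? That is exactly what I did. Then I re-ran the PIL install command in the terminal: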
sudo pip install PIL --allow-external PIL
And this time the installation succeeded.
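The build fails because `#include <freetype/fterrors.h>` needs a freetype directory inside one of the compiler's include directories. Before copying or linking anything, you can check where the header actually is. This is only a rough sketch: `find_header` is a hypothetical helper, and the directories listed are assumptions about a typical setup, not a complete list of what the compiler searches.

```python
import os


def find_header(header, include_dirs):
    """Return the first directory in include_dirs that contains header, or None."""
    for directory in include_dirs:
        if os.path.isfile(os.path.join(directory, header)):
            return directory
    return None


# Assumed search locations; your compiler may look in other directories too.
candidates = ["/usr/local/include", "/usr/include"]

# The path the PIL build asks for, and the path under the freetype2 directory.
print(find_header("freetype/fterrors.h", candidates))
print(find_header("freetype2/fterrors.h", candidates))
```

If the first lookup prints None while the second prints a directory, the header exists only under freetype2, which is the situation described above.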
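Installing BeautifulSoup
First, download the latest package, beautifulsoup4 4.3.2, from the official site, extract it, and cd into the extracted directory in the terminal.
Then run: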
sudo python setup.py install
Good, the installation succeeded.
For usage, see the official BeautifulSoup documentation.
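With all three packages installed, a quick way to confirm that Python can actually import them is to try each import and print the version it reports. This is a minimal sketch: `installed_version` is a hypothetical helper, and the import names scrapy, PIL and bs4 are assumed to be the module names these packages install.

```python
import importlib


def installed_version(name):
    """Return the module's __version__, "unknown" if it has none, or None if the import fails."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    # Assumed import names for Scrapy, PIL and beautifulsoup4.
    for name in ("scrapy", "PIL", "bs4"):
        version = installed_version(name)
        status = "missing" if version is None else "ok ({})".format(version)
        print("{}: {}".format(name, status))
```

A package reported as missing means the install step for it did not take effect for the interpreter you are running.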
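Appendix:
easy_install usage:
Install: easy_install PackageName
Remove: easy_install -m PackageName (this only unregisters the package from easy-install.pth; the leftover egg has to be deleted by hand)
Upgrade: easy_install -U PackageName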
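pip usage:
Install: pip install PackageName
Remove: pip uninstall PackageName
Upgrade: pip install -U PackageName
Search: pip search PackageName (PyPI has since disabled the search API this command relies on, so it no longer works against pypi.org)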
Reader comments
Exception:
Traceback (most recent call last):
File "/Library/Python/2.7/site-packages/pip/basecommand.py", line 209, in main
status = self.run(options, args)
File "/Library/Python/2.7/site-packages/pip/commands/install.py", line 317, in run
prefix=options.prefix_path,
File "/Library/Python/2.7/site-packages/pip/req/req_set.py", line 726, in install
requirement.uninstall(auto_confirm=True)
File "/Library/Python/2.7/site-packages/pip/req/req_install.py", line 746, in uninstall
paths_to_remove.remove(auto_confirm)
File "/Library/Python/2.7/site-packages/pip/req/req_uninstall.py", line 115, in remove
renames(path, new_path)
File "/Library/Python/2.7/site-packages/pip/utils/__init__.py", line 267, in renames
shutil.move(old, new)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 302, in move
copy2(src, real_dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 131, in copy2
copystat(src, dst)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/shutil.py", line 103, in copystat
os.chflags(dst, st.st_flags)
OSError: [Errno 1] Operation not permitted: '/var/folders/hv/rnxvykcd4dq98fm44qqh7_sm0000gn/T/pip-49Rk9y-uninstall/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six-1.4.1-py2.7.egg-info'
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/scrapy/__init__.py", line 48, in <module>
from scrapy.spiders import Spider
File "/Library/Python/2.7/site-packages/scrapy/spiders/__init__.py", line 10, in <module>
from scrapy.http import Request
File "/Library/Python/2.7/site-packages/scrapy/http/__init__.py", line 12, in <module>
from scrapy.http.request.rpc import XmlRpcRequest
File "/Library/Python/2.7/site-packages/scrapy/http/request/rpc.py", line 7, in <module>
from six.moves import xmlrpc_client as xmlrpclib
ImportError: cannot import name xmlrpc_client
Is a package failing to import?
sudo rm -rf /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/six*
sudo pip install six
I found this fix on StackOverflow. I had the same problem and it solved it for me; hope it helps you too.
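Both tracebacks above involve the copy of six under /System/Library/Frameworks/Python.framework. To see which copy of a module Python really imports, you can print the file it was loaded from. This is only a sketch: `module_location` and `is_system_copy` are hypothetical helpers, and the prefix is taken from the paths in the traceback.

```python
import importlib


def module_location(name):
    """Return the file a module was loaded from, or None for modules without one."""
    module = importlib.import_module(name)
    return getattr(module, "__file__", None)


def is_system_copy(path,
                   system_prefix="/System/Library/Frameworks/Python.framework"):
    """True if the path sits under the OS-provided Python framework directory."""
    return path is not None and path.startswith(system_prefix)


if __name__ == "__main__":
    try:
        location = module_location("six")
        print("six is loaded from:", location)
        print("system-provided copy:", is_system_copy(location))
    except ImportError:
        print("six is not installed")
```

If the system-provided copy is the one being loaded, that would be consistent with the ImportError in the second traceback.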