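This is a fairly simple demo: first fetch the source of the item listing page on the official Dota 2 site ("http://www.dota2.com/items/") and print it to the console, then pick an arbitrary image from that page and save it locally. Since no regular expressions are used, the image is picked by hand (a bit crude).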
The implementation relies mainly on the java.net.URL class. First, fetch the page source:
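import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class UrlTest {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.dota2.com/items/");
        URLConnection uc = url.openConnection();
        // Decode the response as UTF-8 and read characters, not raw bytes;
        // casting single bytes to char would garble multi-byte characters.
        try (InputStream is = uc.getInputStream();
             InputStreamReader isr = new InputStreamReader(is, "utf-8")) {
            int i;
            while ((i = isr.read()) != -1) {
                System.out.print((char) i);
            }
        }
    }
}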
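A side note on decoding: the response arrives as bytes, and in UTF-8 every non-ASCII character occupies more than one byte, so text is best decoded through a Reader rather than by casting individual bytes to char. The sketch below illustrates this with an in-memory stream instead of a network connection. The class name DecodeDemo and the helper readAsUtf8 are hypothetical names introduced only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

class DecodeDemo {
    // Hypothetical helper: reads the whole stream and decodes it as UTF-8 text.
    static String readAsUtf8(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two Chinese characters (written as Unicode escapes) followed by ASCII text.
        byte[] bytes = "\u7269\u54c1 items".getBytes(StandardCharsets.UTF_8);
        System.out.println(readAsUtf8(new ByteArrayInputStream(bytes)));
    }
}
```

In the crawler, the stream returned by uc.getInputStream() could be passed to such a helper to obtain the page as a single String for further processing.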
Next, pick the src value of one img tag from the HTML printed to the console and change the code accordingly:
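import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

public class UrlTest {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://cdn.dota2.com/apps/dota2/images/items/tome_of_knowledge_lg.png");
        URLConnection uc = url.openConnection();
        // Image data is binary, so copy raw bytes; no InputStreamReader is needed.
        // try-with-resources closes both streams when the copy finishes.
        // Note: the source is a PNG, so a .png extension would be more accurate.
        try (InputStream is = uc.getInputStream();
             OutputStream os = new FileOutputStream("C:\\Users\\Administrator\\Desktop\\123.jpg")) {
            int i;
            while ((i = is.read()) != -1) {
                os.write(i);
            }
        }
    }
}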
Run it, and a new image appears on the desktop:
(Image: the downloaded picture)
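The manual step of picking an image URL could probably be automated with a regular expression. The following is a minimal sketch, assuming the page source is already available as a String. The class ImgSrcExtractor, the method extractImgSrcs, and the sample markup are hypothetical and not part of the original demo. A regex is only a rough approximation of HTML parsing and may miss unquoted or otherwise unusual attributes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImgSrcExtractor {
    // Matches the quoted src attribute of an <img> tag (either " or ' quotes).
    // Requiring whitespace before "src" avoids matching attributes like data-src.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Hypothetical helper: returns every img src value found, in document order.
    static List<String> extractImgSrcs(String html) {
        List<String> srcs = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            srcs.add(m.group(2));
        }
        return srcs;
    }

    public static void main(String[] args) {
        // Sample markup made up for illustration; in the demo this would be
        // the page source fetched from the site.
        String html = "<div><img src=\"http://cdn.dota2.com/apps/dota2/images/items/tome_of_knowledge_lg.png\" alt=\"\"></div>";
        for (String src : extractImgSrcs(html)) {
            System.out.println(src);
        }
    }
}
```

Each URL returned by extractImgSrcs could then be passed to the download code above in place of the hard-coded address. Relative src values would first need to be resolved against the page URL.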