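When I pick cover images for WeChat Official Account posts, I don't have a large stock of pictures to draw on, and downloading a few from the web each time is tedious. So I searched through online tutorials and found some code for downloading images in bulk. The steps are below.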
1. On the image site, right-click and choose Inspect.

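2. After you choose Inspect, the browser shows the structure of the page, as in the figure below. Find the div class code and note the value after the "=" sign, which here is media_list. That value goes into the html_nodes function in the code of the next step.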

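3. The code is below. Two things need to be changed:
- The image address after url: it is the address of the page that contains many images.
- The text after div. in the html_nodes function: it is the value found after div class in the page structure shown above.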
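library(rvest)       # read_html(), html_nodes(), html_attr() and %>%
library(downloader)  # assumed source of the download() function used in the loop further down
url <- 'https://pixabay.com/images/search/?cat=education'
picture <- read_html(url) %>% html_nodes("div.media_list") %>% html_nodes("img") %>% html_attr("src")
head(picture, 20)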
[1] "https://cdn.pixabay.com/photo/2019/11/19/22/24/watch-4638673__340.jpg"
[2] "https://cdn.pixabay.com/photo/2019/11/04/10/15/book-4600757__340.jpg"
[3] "https://cdn.pixabay.com/photo/2019/11/13/17/04/ringtones-4624296__340.jpg"
[4] "https://cdn.pixabay.com/photo/2019/09/30/14/29/books-4515917__340.jpg"
[5] "https://cdn.pixabay.com/photo/2019/11/21/23/55/dog-4643784__340.jpg"
[6] "https://cdn.pixabay.com/photo/2019/11/20/19/08/confused-4640878__340.jpg"
[7] "https://cdn.pixabay.com/photo/2019/11/03/16/29/woman-4599055__340.png"
[8] "https://cdn.pixabay.com/photo/2019/11/13/11/21/dog-4623296__340.jpg"
[9] "https://cdn.pixabay.com/photo/2019/11/17/16/57/statue-4632875__340.jpg"
[10] "https://cdn.pixabay.com/photo/2019/11/08/12/33/book-4611273__340.jpg"
[11] "https://cdn.pixabay.com/photo/2019/11/17/13/07/read-4632334__340.jpg"
[12] "https://cdn.pixabay.com/photo/2017/03/07/13/02/thought-2123970__340.jpg"
[13] "https://cdn.pixabay.com/photo/2019/09/28/12/36/auto-4510652__340.jpg"
[14] "https://cdn.pixabay.com/photo/2019/11/11/07/38/office-4617459__340.png"
[15] "https://cdn.pixabay.com/photo/2019/11/20/12/08/drawing-4639897__340.jpg"
[16] "https://cdn.pixabay.com/photo/2019/08/14/18/51/school-bus-4406479__340.jpg"
[17] "/static/img/blank.gif"
[18] "/static/img/blank.gif"
[19] "/static/img/blank.gif"
[20] "/static/img/blank.gif"
# Download each address in turn; mode = "wb" writes the file in binary mode
for (i in seq_along(picture)) {
  download(picture[i], paste("/Users/mengmeng/Downloads/脚本杂货铺/批量下载图片/批量下载图片1/picture", i, ".jpg", sep = ""), mode = "wb")
}
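The head(picture, 20) output above ends in relative placeholder paths such as /static/img/blank.gif, which cannot be downloaded as they stand. Below is a minimal sketch of filtering such entries out before the loop runs. It uses a short hand-written vector in place of the scraped one, and the helper name keep_absolute_urls is made up for this illustration.

```r
# Hypothetical helper: keep only entries that are absolute http(s) addresses.
keep_absolute_urls <- function(x) {
  x <- x[!is.na(x)]            # html_attr() gives NA when an attribute is missing
  x[grepl("^https?://", x)]    # drop relative paths such as "/static/img/blank.gif"
}

# Hand-written sample standing in for the scraped vector
sample_picture <- c(
  "https://cdn.pixabay.com/photo/2019/11/19/22/24/watch-4638673__340.jpg",
  "https://cdn.pixabay.com/photo/2019/11/03/16/29/woman-4599055__340.png",
  "/static/img/blank.gif",
  NA
)

valid_picture <- keep_absolute_urls(sample_picture)
print(valid_picture)
```

With the real data, the loop would then run over the filtered vector instead of picture.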
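Earlier I had only looked at head(picture,10), and the download produced just 16 images, which puzzled me. Running head(picture,20) showed that every address after the 16th really is invalid. I don't think the code is at fault. My guess is that this site (it is a foreign image site) has some protection against bulk downloading, although placeholder paths like /static/img/blank.gif are also what a lazily loaded page puts in src until the real image is fetched. So I switched to another site, right-clicked and chose Inspect again, and this time the div I found was <div class="indexpic">, as shown in the figure below.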

The code is as follows:
url <- 'https://www.quanjing.com/creative/'  # switched to a different image site
picture <- read_html(url) %>% html_nodes("div.indexpic") %>% html_nodes("img") %>% html_attr("src")  # the only change is indexpic
head(picture, 51)
> head(picture,51)
[1] "https://webimg.quanjing.com/creative/1.jpg" "https://webimg.quanjing.com/creative/2.jpg"
[3] "https://webimg.quanjing.com/creative/3.jpg" "https://webimg.quanjing.com/creative/4.jpg"
[5] "https://webimg.quanjing.com/creative/5.jpg" "https://webimg.quanjing.com/creative/6.jpg"
[7] "https://webimg.quanjing.com/creative/7.jpg" "https://webimg.quanjing.com/creative/8.jpg"
[9] "https://webimg.quanjing.com/creative/9.jpg" "https://webimg.quanjing.com/creative/10.jpg"
[11] "https://webimg.quanjing.com/creative/11.jpg" "https://webimg.quanjing.com/creative/12.jpg"
[13] "https://webimg.quanjing.com/creative/13.jpg" "https://webimg.quanjing.com/creative/14.jpg"
[15] "https://webimg.quanjing.com/creative/15.jpg" "https://webimg.quanjing.com/creative/1547145496707.png"
[17] "https://webimg.quanjing.com/creative/1547146188597.jpg" "https://webimg.quanjing.com/creative/1547145999629.jpg"
[19] "https://webimg.quanjing.com/creative/24.jpg" "https://webimg.quanjing.com/creative/25.jpg"
[21] "https://webimg.quanjing.com/creative/26.jpg" "https://webimg.quanjing.com/creative/2625120517.jpg"
[23] "https://webimg.quanjing.com/creative/5625120517.jpg" "https://webimg.quanjing.com/creative/2525120617.jpg"
[25] "https://webimg.quanjing.com/creative/0203010610.jpg" "https://webimg.quanjing.com/creative/0925120817.jpg"
[27] "https://webimg.quanjing.com/creative/0003011510.jpg" "https://webimg.quanjing.com/creative/2803011910.jpg"
[29] "https://webimg.quanjing.com/creative/1203010810.jpg" "https://webimg.quanjing.com/creative/0525121017.jpg"
[31] "https://webimg.quanjing.com/creative/3125121017.jpg" "https://webimg.quanjing.com/creative/3725121117.jpg"
[33] "https://webimg.quanjing.com/creative/0525121217.jpg" "https://webimg.quanjing.com/creative/0825121117.jpg"
[35] "https://webimg.quanjing.com/creative/4525121217.jpg" "https://webimg.quanjing.com/creative/1546619045476.png"
[37] "https://webimg.quanjing.com/creative/0625120417.jpg" "https://webimg.quanjing.com/creative/1546618925012.png"
[39] "https://webimg.quanjing.com/creative/5525120217.jpg" "https://webimg.quanjing.com/creative/2825120217.jpg"
[41] "https://webimg.quanjing.com/creative/4003014609.jpg" "https://webimg.quanjing.com/creative/3025123018.jpg"
[43] "https://webimg.quanjing.com/creative/3425120918.jpg" "https://webimg.quanjing.com/creative/0425121118.jpg"
[45] "https://webimg.quanjing.com/creative/2425121318.jpg" "https://webimg.quanjing.com/creative/1525121518.jpg"
[47] "https://webimg.quanjing.com/creative/1425121718.jpg" "https://webimg.quanjing.com/creative/5325121818.jpg"
[49] "https://webimg.quanjing.com/creative/4925122218.jpg" "https://webimg.quanjing.com/creative/2625122518.jpg"
[51] "https://webimg.quanjing.com/creative/4925122718.jpg"
# Same loop as before; only the target folder differs
for (i in seq_along(picture)) {
  download(picture[i], paste("/Users/mengmeng/Downloads/脚本杂货铺/批量下载图片/批量下载图片2/picture", i, ".jpg", sep = ""), mode = "wb")
}
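The loop above saves every file as pictureN.jpg, although some of the listed addresses end in .png. Here is a small sketch of building destination names that keep each address's own extension. The helper name make_dest_paths and the folder name pictures are placeholders chosen for the example, and nothing is downloaded.

```r
# Hypothetical helper: build one destination path per URL, keeping its extension.
make_dest_paths <- function(urls, out_dir) {
  ext <- tools::file_ext(urls)   # "jpg", "png", ... ("" if no extension is found)
  ext[ext == ""] <- "jpg"        # assumed fallback when the URL has no extension
  file.path(out_dir, sprintf("picture%d.%s", seq_along(urls), ext))
}

# Two addresses taken from the output listed above
urls <- c(
  "https://webimg.quanjing.com/creative/15.jpg",
  "https://webimg.quanjing.com/creative/1547145496707.png"
)

dest <- make_dest_paths(urls, "pictures")  # "pictures" is a placeholder folder
print(dest)
```

Inside the loop, dest[i] would take the place of the paste(...) expression.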

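This time the bulk download succeeded completely: picture holds 51 addresses, head(picture,51) shows that all of them are valid, and 51 images were downloaded in the end. Only sample results are shown here; for what the code means and why it works, see the references below.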
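As a rough illustration of what the selector chain in both scripts does, the sketch below runs it on a small hand-written HTML string instead of a live page. The markup and the example.com addresses are invented for the example; only the class name indexpic comes from the steps above.

```r
library(rvest)  # provides read_html(), html_nodes(), html_attr() and %>%

# Invented markup: one div with class "indexpic" and one unrelated div
page <- read_html('<html><body>
  <div class="indexpic">
    <img src="https://example.com/a.jpg">
    <img src="https://example.com/b.png">
  </div>
  <div class="other">
    <img src="https://example.com/ignored.jpg">
  </div>
</body></html>')

picture <- page %>%
  html_nodes("div.indexpic") %>%  # every <div> carrying the class indexpic
  html_nodes("img") %>%           # every <img> inside those divs
  html_attr("src")                # the src attribute of each image

print(picture)
```

The image inside the second div is left out because its parent does not match div.indexpic, which is why choosing the right class name in the Inspect step matters.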