BeautifulSoup: Parsing HTML
Last updated: 2022-04-02 02:16:43
## BeautifulSoup: Parsing HTML
> Documentation: [Chinese documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/index.zh.html)

Example code:
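```
import re

import bs4

# Sample document from the Beautiful Soup documentation.
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = bs4.BeautifulSoup(html_doc, "html.parser")

print(soup.a)           # the first <a> tag (the "Elsie" link)
print(soup.a.string)    # the text content of that <a> tag
print(soup.a['href'])   # the value of its href attribute
print(soup.find(id='link3'))                 # the first tag whose id is "link3"
print(soup.find('a', class_='sister'))       # class is a Python keyword, hence the trailing underscore
print(soup.find_all('a', class_='sister'))   # every matching <a> tag, as a list
print(soup.find('p', {'class': 'story'}).get_text())   # attrs must be a dict, not a set
print(soup.find_all("a", href=re.compile(r'^http://example\.com')))   # match attribute values by regex
print(soup.find_all("input", type=re.compile('text')))  # this document has no <input> tags
```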
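The calls above return different kinds of results: `find()` gives a single tag, or `None` when nothing matches, while `find_all()` gives a list of tags. The sketch below shows one way to handle both. The trimmed-down markup and the helper name `extract_links` are assumptions made for illustration only; they are not part of Beautiful Soup.

```python
import re

import bs4

# Hypothetical sample markup: three links in the style of the example above.
html_doc = """
<p class="story">
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>
</p>
"""

soup = bs4.BeautifulSoup(html_doc, "html.parser")


def extract_links(soup, pattern):
    """Return (text, href) pairs for every <a> whose href matches pattern.

    Illustrative helper, not a Beautiful Soup API.
    """
    tags = soup.find_all("a", href=re.compile(pattern))
    return [(tag.get_text(), tag["href"]) for tag in tags]


# find_all() returns a list, so it can be iterated directly.
links = extract_links(soup, r"^http://example\.com")
for text, href in links:
    print(text, href)

# find() returns None when nothing matches, so check before using the result.
missing = soup.find(id="link4")
if missing is None:
    print("no tag with id=link4")

# find() returns only the first match, even when several tags qualify.
first = soup.find("a", class_="sister")
print(first["id"])
```

Checking for `None` before chaining calls such as `.get_text()` onto a `find()` result avoids an `AttributeError` when the page does not contain the expected tag.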