Scraping the Web

Scraping the Web the Naive Method

The Naive Method of scraping the web locates content by a hard-coded (static) tag name and hard-coded (static) attributes, given as key and value pairs.

Using BeautifulSoup4, we can extract the text enclosed by the following HTML tag:

<div class="location"> Some text in here... </div>

s.find('div', attrs={'class': 'location'}).text.strip()

Here s is assumed to be the BeautifulSoup object built from the page's HTML, the static tag is 'div', and the static attribute key and value pair is 'class': 'location'.
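To make the one-liner above runnable end to end, here is a minimal sketch. It assumes the HTML is already available as a string (the variable name html is illustrative, not from the original) and uses Python's built-in "html.parser" backend so that no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Assumption: the page source is already in memory as a string.
html = '<div class="location"> Some text in here... </div>'

# Parse the HTML; "html.parser" ships with the Python standard library.
s = BeautifulSoup(html, "html.parser")

# Static tag 'div' plus static attribute pair 'class': 'location'.
location = s.find('div', attrs={'class': 'location'}).text.strip()

print(location)  # Some text in here...
```

In a real scraper the html string would typically come from an HTTP response body rather than a literal.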

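The advantage of the Naive Method is that it is very precise: it matches exactly the tag and attributes you specify. The disadvantage is that it must be maintained continually, because the web page's HTML format may change over time, and a hard-coded tag or attribute that no longer matches will no longer find the element.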
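The maintenance cost can be illustrated with a small sketch. The helper find_text and the "redesigned" markup below are hypothetical, invented only for this illustration. When nothing matches, find returns None, so chaining .text directly would raise an AttributeError; checking for None first lets the scraper report the mismatch instead of crashing.

```python
from bs4 import BeautifulSoup


def find_text(soup, tag, attrs):
    """Hypothetical helper: return the stripped text of the first match, or None."""
    element = soup.find(tag, attrs=attrs)
    if element is None:
        # The static tag/attribute pair no longer matches the page.
        return None
    return element.text.strip()


# The markup the scraper was written against.
old_page = BeautifulSoup(
    '<div class="location"> Some text in here... </div>', "html.parser"
)
# Assumed redesign: same text, but a different tag and class name.
new_page = BeautifulSoup(
    '<span class="place"> Some text in here... </span>', "html.parser"
)

selector = {'class': 'location'}
print(find_text(old_page, 'div', selector))  # Some text in here...
print(find_text(new_page, 'div', selector))  # None
```

A None result here is a signal that the static tag or attributes need to be updated to match the new page format.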
