How to find HTML selectors for your web scraper
Why we need to define selectors
When we scrape a website, we need to define one or more selectors to tell the web scraper what information we want to retrieve.
A) If you know HTML
Knowing the markup language makes things easier. Open the page in your browser, inspect the element you want (in most browsers, right-click it and choose Inspect), and write down a selector that matches it.
Some examples:
// Get the main title
h1
// Get subtitles
h2
// Get subtitles with the class "header"
h2.header
// Get the subtitle with the id "header" (an id should be unique within a page)
h2#header
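To see what these selectors actually pick out, here is a minimal sketch in Python that uses only the standard library. The sample page and the names SAMPLE_HTML, SelectorMatcher, and select are invented for this illustration. It only understands the three selector forms shown above and does not handle nested elements with the same tag; a real scraper would normally rely on a full selector engine.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this illustration.
SAMPLE_HTML = """
<html><body>
  <h1>Main title</h1>
  <h2 class="header">First subtitle</h2>
  <h2 id="header">Second subtitle</h2>
  <h2>Third subtitle</h2>
</body></html>
"""


def parse_selector(selector):
    """Split 'tag', 'tag.class' or 'tag#id' into (tag, class, id)."""
    if "#" in selector:
        tag, element_id = selector.split("#", 1)
        return tag, None, element_id
    if "." in selector:
        tag, class_name = selector.split(".", 1)
        return tag, class_name, None
    return selector, None, None


class SelectorMatcher(HTMLParser):
    """Collects the text of every element that matches one simple selector."""

    def __init__(self, selector):
        super().__init__()
        self.wanted_tag, self.wanted_class, self.wanted_id = parse_selector(selector)
        self.capturing = False
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # A class attribute may hold several space-separated class names.
        classes = (attributes.get("class") or "").split()
        if tag != self.wanted_tag:
            return
        if self.wanted_class is not None and self.wanted_class not in classes:
            return
        if self.wanted_id is not None and attributes.get("id") != self.wanted_id:
            return
        self.capturing = True
        self.matches.append("")

    def handle_endtag(self, tag):
        if tag == self.wanted_tag:
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.matches[-1] += data


def select(html, selector):
    """Return the stripped text of each element matched by the selector."""
    matcher = SelectorMatcher(selector)
    matcher.feed(html)
    matcher.close()
    return [text.strip() for text in matcher.matches]


for example in ("h1", "h2", "h2.header", "h2#header"):
    print(example, select(SAMPLE_HTML, example))
```

On this sample page, h2 matches all three subtitles, while h2.header and h2#header each match exactly one.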
B) If you don't know HTML
In this case, several tools and browser extensions can work out the selectors for you.
For example, if you use Google Chrome, you can install an extension called SelectorGadget.
Once it is installed, click the element you want to scrape, and the extension will suggest a selector for it and highlight everything that selector matches, so you can confirm it picks out only what you want.
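Whichever tool produces the selector, it can help to check how many elements it matches before putting it into a scraper. The following is a rough sketch in standard-library Python, assuming the same three selector forms used earlier (tag, tag.class, tag#id). The page fragment and the names PAGE, MatchCounter, and count_matches are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, invented for this illustration.
PAGE = """
<div class="product"><span class="price">10</span></div>
<div class="product"><span class="price">12</span></div>
<div id="footer"><span>Contact</span></div>
"""


class MatchCounter(HTMLParser):
    """Counts opening tags that match a 'tag', 'tag.class' or 'tag#id' selector."""

    def __init__(self, selector):
        super().__init__()
        self.match_count = 0
        self.wanted_tag, self.wanted_class, self.wanted_id = selector, None, None
        if "#" in selector:
            self.wanted_tag, self.wanted_id = selector.split("#", 1)
        elif "." in selector:
            self.wanted_tag, self.wanted_class = selector.split(".", 1)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag != self.wanted_tag:
            return
        if self.wanted_class is not None and self.wanted_class not in classes:
            return
        if self.wanted_id is not None and attributes.get("id") != self.wanted_id:
            return
        self.match_count += 1


def count_matches(html, selector):
    """Return how many elements in the HTML match the selector."""
    counter = MatchCounter(selector)
    counter.feed(html)
    counter.close()
    return counter.match_count


for candidate in ("span.price", "div#footer", "span"):
    print(candidate, count_matches(PAGE, candidate))
```

A count of one means the selector points at a single element; a higher count means the scraper will return a list of elements, which may or may not be what you want.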
With these tips you should be able to find the selectors you need to create the perfect web scraper.