Web Scraping using Python | Weather Data

Krupa Patel
3 min read · Jul 26, 2021

In this blog, I will illustrate how to scrape weather forecast data from the National Weather Service website.

Web Scraping is a method of extracting unstructured data (HTML format) from websites and making sense of it by transforming it into structured data (a database or spreadsheet).

Getting Started-

We are using the Python language along with its libraries requests, pandas, and BeautifulSoup for web scraping.

Requests- A Python HTTP library that makes HTTP requests simpler. We just need to pass the URL as an argument, and get() fetches all the information from it.

Pandas- A library used to manipulate and analyse data. It is commonly used to store the extracted data and save it in the desired format.
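For example, a few rows of forecast-like data (made-up values, just for illustration) can be collected into a DataFrame and dumped as CSV text:

```python
import pandas as pd

# Made-up forecast rows for illustration.
df = pd.DataFrame({
    "Period": ["Today", "Tonight"],
    "Temperature": ["High: 78 F", "Low: 62 F"],
})
csv_text = df.to_csv(index=False)  # index=False drops the row-number column
print(csv_text)
```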

BeautifulSoup- Another powerful Python library for pulling data out of HTML/XML files. It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping.
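A minimal example of how BeautifulSoup parses markup (the snippet below is invented for illustration):

```python
from bs4 import BeautifulSoup

html = '<p class="temp temp-low">Low: 62 F</p>'
soup = BeautifulSoup(html, "html.parser")
temp = soup.find(class_="temp").get_text()  # matches any tag with class "temp"
print(temp)  # Low: 62 F
```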

1. Find the URL that you want to scrape :

We will be scraping weather forecasts from the National Weather Service. The first step is to find the page we want to scrape. We’ll extract weather information about Chicago from this page.

We’ll extract data about the extended forecast.

The page has information about the extended forecast for the next week, including the time of day, temperature, and a brief description of the conditions.

2. Inspecting the Page :

The data is usually nested in tags, so we inspect the page to see under which tag the data we want to scrape is nested. To inspect the page, just right-click on the element and click “Inspect”.

3. Find the data you want to extract :

We’ll extract the name of the forecast item, the short description, and the temperature first, since they’re all similar.
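Each forecast item on the page is wrapped in markup roughly like the following (a simplified sketch of what inspecting the page reveals; the real markup has more attributes):

```python
from bs4 import BeautifulSoup

# Simplified sketch of one forecast item's markup.
item_html = """
<div class="tombstone-container">
  <p class="period-name">Tonight</p>
  <p class="short-desc">Mostly Clear</p>
  <p class="temp temp-low">Low: 62 F</p>
</div>
"""
item = BeautifulSoup(item_html, "html.parser")
print(item.find(class_="period-name").get_text())  # Tonight
print(item.find(class_="short-desc").get_text())   # Mostly Clear
```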

4. Write the code :

Here, I am using Google Colab for this code.

Import libraries.

import requests
from bs4 import BeautifulSoup
import pandas as pd

5. Run the code and extract the data

Open the URL and extract the data.

url = "https://forecast.weather.gov/MapClick.php?lat=41.884250000000065&lon=-87.63244999999995#.XtpdeOfhXIX"
r = requests.get(url)

Using the find() and find_all() methods in BeautifulSoup, we extract the data and store it in variables.

soup = BeautifulSoup(r.content, "html.parser")
week = soup.find(id="seven-day-forecast-body")
items = week.find_all("div", class_="tombstone-container")

period_name = [item.find(class_="period-name").get_text() for item in items]
short_desc = [item.find(class_="short-desc").get_text() for item in items]
temp = [item.find(class_="temp").get_text() for item in items]
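The comprehensions above assume every container holds all three tags; if one is missing, find() returns None and get_text() raises AttributeError. A more defensive variant (a sketch of my own, not from the original post):

```python
from bs4 import BeautifulSoup

def safe_text(item, cls):
    """Return the text of the first tag with the given class, or "" if absent."""
    tag = item.find(class_=cls)
    return tag.get_text() if tag is not None else ""

# Example with one complete and one incomplete container.
html = """
<div class="tombstone-container"><p class="temp">High: 78 F</p></div>
<div class="tombstone-container"></div>
"""
items = BeautifulSoup(html, "html.parser").find_all("div", class_="tombstone-container")
temp = [safe_text(item, "temp") for item in items]
print(temp)  # ['High: 78 F', '']
```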

6. Store the data in the required format

The lists are transformed into a DataFrame with pandas. df.to_csv is used to convert the DataFrame to a CSV (Comma Separated Values) file.

df = pd.DataFrame({"Period": period_name, "Short Description": short_desc, "Temperature": temp})

df.to_csv("18IT090_WeatherData.csv")
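To confirm the file was written correctly, it can be read straight back into a DataFrame. Here is a self-contained sketch with made-up rows standing in for the scraped data:

```python
import pandas as pd

# Made-up rows standing in for the scraped forecast data.
df = pd.DataFrame({
    "Period": ["Tonight"],
    "Short Description": ["Mostly Clear"],
    "Temperature": ["Low: 62 F"],
})
df.to_csv("18IT090_WeatherData.csv")

# index_col=0 restores the unnamed index column that to_csv wrote out.
df2 = pd.read_csv("18IT090_WeatherData.csv", index_col=0)
print(df2)
```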

You can check my code here.

This is just basic code that scrapes the weather data found on the website into a CSV file, which can then be used to visualize the data in a meaningful way. I would appreciate feedback or suggestions. :)

