Python · Selenium · Web Scraping

How to Scrape Meta & Facebook Ads with Python: A Step-by-Step Tutorial

12 min read

In the world of digital marketing, competitive intelligence is gold. Understanding your competitors' advertising strategies—what creatives they're running, what copy they're testing, and which landing pages they're driving traffic to—can provide an invaluable edge. The Meta Ad Library is a public repository of this information, but manually sifting through it is inefficient and impossible to scale.

This is where web scraping comes in. By building a custom scraper, you can automate the collection of this public ad data, turning a manual chore into an automated intelligence pipeline. This step-by-step guide walks you through building a robust Facebook Ads scraper with Python, no login credentials required, and shows how to tackle the specific challenges that modern, dynamic websites present.

Why is Scraping the Ad Library Challenging?

Scraping a simple, static website is one thing, but Meta's platform is another beast entirely. A simple HTTP request with libraries like requests won't work because the content is:

  • Dynamically Loaded: Ads are loaded with JavaScript as you scroll down the page. The initial HTML your script receives doesn't contain all the data you see in your browser.
  • Reliant on "Infinite Scroll": There are no "next page" buttons. New content is only loaded when you scroll to the bottom, a mechanism designed for human users, not simple bots.

To succeed, we need a tool that can automate a real web browser to mimic human behavior. For this task, Selenium is the perfect choice.
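
You can see the problem firsthand by fetching the page with a plain HTTP request (pip install requests). A minimal sketch, using the same search URL we'll use later in this tutorial: the response is mostly an application shell, and the ad content you see in a browser simply isn't in it.

import requests

url = ("https://www.facebook.com/ads/library/"
       "?active_status=all&ad_type=all&country=ALL&q=skincare"
       "&search_type=keyword_unordered&media_type=all")
html = requests.get(url, timeout=30).text

# The ads are injected by JavaScript after the page loads, so a naive
# search of the raw HTML for ad markers comes up empty (or nearly so)
print("Ad content present?", "Library ID" in html)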

Prerequisites

Before we begin, make sure you have the following:

  • Python 3.8+ installed (current Selenium releases no longer support older versions)
  • pip, the Python package installer
  • A basic understanding of HTML and CSS selectors

Step 1: Setting Up Your Python Environment

First, let's install the necessary libraries. Open your terminal or command prompt and run the following commands:

pip install selenium
pip install beautifulsoup4
pip install pandas
  • Selenium: The core library for automating browser actions
  • BeautifulSoup4: A fantastic library for parsing HTML and extracting data from it
  • Pandas: A powerful data analysis library that makes it incredibly easy to save our scraped data to a CSV file

You will also need a WebDriver, the bridge between your Python script and the web browser. We'll use ChromeDriver. If you're on Selenium 4.6 or newer, the bundled Selenium Manager can download a matching driver for you automatically; otherwise, download the version that matches your Google Chrome browser version and place the executable in a known location on your computer.
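
Before moving on, it's worth a quick smoke test. A minimal sketch, assuming Selenium 4.6+ so no driver path is needed:

from selenium import webdriver

driver = webdriver.Chrome()  # Selenium Manager resolves a matching ChromeDriver on 4.6+
driver.get("https://example.com")
print(driver.title)  # expect "Example Domain"
driver.quit()

If this prints the page title, your environment is ready.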

Step 2: Initializing the Scraper and Navigating

Create a new Python file (e.g., ad_scraper.py). We'll start by importing the libraries and writing the code to launch a browser and navigate to the Ad Library.

import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup

# --- Configuration ---
# Path to your ChromeDriver executable (optional on Selenium 4.6+, see above)
DRIVER_PATH = '/path/to/your/chromedriver'
# Example URL for scraping ads related to the keyword 'skincare'
AD_LIBRARY_URL = "https://www.facebook.com/ads/library/?active_status=all&ad_type=all&country=ALL&q=skincare&search_type=keyword_unordered&media_type=all"
OUTPUT_FILENAME = 'facebook_ads_data.csv'

# --- Main Script ---
print("Launching browser...")
# Selenium 4 removed the old `executable_path` argument; pass a Service instead
driver = webdriver.Chrome(service=Service(DRIVER_PATH))

print(f"Navigating to: {AD_LIBRARY_URL}")
driver.get(AD_LIBRARY_URL)

# Wait for the page to load initial content
time.sleep(5)
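
The fixed sleep works, but an explicit wait is more reliable: it returns as soon as the content actually appears and fails loudly if it never does. A hedged sketch; the div[role='main'] locator is an assumption you should verify in the inspector:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 15 seconds for the main content region to exist
# (locator is an assumption; confirm it against the live page)
WebDriverWait(driver, 15).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "div[role='main']"))
)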

Step 3: Handling the Infinite Scroll

This is the most critical part of the scraper. We need to programmatically scroll down the page to force the website to load more ads. We'll do this in a loop until no new ads are loaded.

print("Starting to scroll down the page...")
last_height = driver.execute_script("return document.body.scrollHeight")
ads_scraped = []
unique_ad_texts = set()  # To track unique ads and avoid duplicates

# Set a limit for scrolls to prevent an infinite loop
scroll_attempts = 0
MAX_SCROLLS = 50

while scroll_attempts < MAX_SCROLLS:
    # Scroll to the bottom of the page
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    
    # Wait for new content to load
    time.sleep(3) 

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        print("Reached the end of the page or no new content loaded.")
        break  # Exit if no more content is loaded
    last_height = new_height
    scroll_attempts += 1
    print(f"Scroll attempt {scroll_attempts}/{MAX_SCROLLS} successful. New page height: {new_height}")

    # --- Data extraction logic will go here in the next step ---

Step 4: Finding Selectors and Extracting Ad Data

While the page is scrolled, we need to extract the data. To find the right CSS selectors, go to the Ad Library in your browser, right-click on an ad element (like the ad text or image), and click "Inspect." This opens the developer tools and shows you the HTML. Look for repeating parent containers that hold all the information for a single ad.

Disclaimer: These selectors are examples and will change as Meta updates its website. You will need to inspect the page and update them periodically for your scraper to work.
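
Because those auto-generated class names churn so often, it can pay to anchor on something more semantic where possible. One hedged approach, assuming ad cards still display a "Library ID: …" label, is to locate that text and work outward from it:

import re

soup = BeautifulSoup(driver.page_source, 'html.parser')

# Text anchors like the "Library ID" label tend to outlive obfuscated
# class names (assumption: the label is still rendered on each ad card)
id_labels = soup.find_all(string=re.compile(r"Library ID", re.I))
print(f"Found {len(id_labels)} ads via the Library ID label")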

Now, let's add the extraction logic inside our while loop:

# --- This code goes INSIDE the `while` loop from Step 3 ---

# Get the page source and parse it with BeautifulSoup
soup = BeautifulSoup(driver.page_source, 'html.parser')

# Find all ad containers (this selector is an example and WILL change)
ad_containers = soup.find_all('div', class_='x1a2a7pz x1a5g62h x1fw500j x1unuyjp x1iorvi4')

print(f"Found {len(ad_containers)} potential ad containers on this scroll.")

for ad in ad_containers:
    ad_data = {}

    # Extract ad text
    try:
        # This selector is an example; find the correct one for ad copy
        ad_text_element = ad.find('div', class_='_7j2a')
        ad_data['text'] = ad_text_element.get_text(strip=True) if ad_text_element else 'N/A'
    except AttributeError:
        ad_data['text'] = 'N/A'

    # Extract image URL
    try:
        # This selector is an example; find the correct one for the ad image
        img_element = ad.find('img', class_='xt7dq6l xl1xv1r')
        ad_data['image_url'] = img_element.get('src', 'N/A') if img_element else 'N/A'
    except AttributeError:
        ad_data['image_url'] = 'N/A'

    # Extract landing page URL
    try:
        # This selector is an example; find the correct one for the ad link
        link_element = ad.find('a', {'rel': 'noopener nofollow'})
        ad_data['landing_page'] = link_element.get('href', 'N/A') if link_element else 'N/A'
    except AttributeError:
        ad_data['landing_page'] = 'N/A'

    # Avoid duplicates by checking the ad text
    if ad_data['text'] != 'N/A' and ad_data['text'] not in unique_ad_texts:
        ads_scraped.append(ad_data)
        unique_ad_texts.add(ad_data['text'])

print(f"Total unique ads scraped so far: {len(ads_scraped)}")

Step 5: Saving the Data and Closing the Browser

Once the loop finishes, we can save our collected data to a CSV file using pandas and gracefully close the browser.

# --- This code goes AFTER the `while` loop ---

print("Scraping finished. Closing browser.")
driver.quit()

# Create a DataFrame and save to CSV
if ads_scraped:
    df = pd.DataFrame(ads_scraped)
    df.to_csv(OUTPUT_FILENAME, index=False, encoding='utf-8')
    print(f"Successfully saved {len(ads_scraped)} ads to {OUTPUT_FILENAME}")
else:
    print("No ads were scraped. The output file was not created.")

Best Practices for Robust Scraping

  • Be Respectful: Don't hammer the server with requests. The time.sleep() calls in our script are essential to mimic human behavior; randomizing their duration helps even more (see the sketch after this list).
  • Use Proxies: For large-scale scraping, your IP address might get temporarily blocked. A pool of rotating residential proxies can help you avoid this; the sketch below shows how to route Chrome through one.
  • Handle Errors: Our try-except blocks are a simple form of error handling. For production scrapers, you'd want more robust logic for missing elements, timeouts, and retries.
  • Stay Updated: Scrapers are fragile. Meta will inevitably change its website layout, which will break your selectors. Be prepared to inspect the page and update your code regularly.
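
Here is a minimal sketch of the first two practices: jittered delays and a proxy-routed browser. The proxy address is a placeholder, and note that authenticated proxies need extra tooling (a browser extension or a library like selenium-wire):

import random
import time

from selenium import webdriver

# Vary the pause between actions so the traffic looks less mechanical
time.sleep(random.uniform(2.5, 5.0))

# Route Chrome through a proxy (placeholder endpoint; use your provider's)
options = webdriver.ChromeOptions()
options.add_argument("--proxy-server=http://proxy.example.com:8000")
driver = webdriver.Chrome(options=options)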

Building your own scraper gives you unparalleled control over your competitive research. With Python, Selenium, and BeautifulSoup, you have a powerful toolkit to unlock the valuable public data within the Meta Ad Library.

Want to skip the setup?

Building and maintaining scrapers takes time. If you need reliable Facebook Ads data without the hassle, try AdScraping.

Try AdScraping Free