Web Scraping With Swift Tutorial

In this tutorial, I’ll walk you through how to scrape web content using Swift, with the help of the SwiftSoup library. SwiftSoup is like jQuery for Swift, making it easy to parse HTML and extract the data you need. Whether you’re new to Swift or just want to try scraping in a new language, this guide will help you get started quickly and efficiently. Let’s dive in!

Why Choose Swift for Web Scraping?

Swift is primarily known for iOS and macOS development, but it is also well suited to tasks like web scraping. While it’s not as commonly used as Python for this purpose, Swift has several advantages that make it appealing:

  • Performance: Swift is fast and efficient, which is particularly beneficial for handling large amounts of data.
  • Native Support: Swift runs natively on macOS, which makes it a great option for developers working within Apple’s ecosystem.
  • Type Safety: Swift is a type-safe language, meaning that errors can be caught during compile time rather than runtime. This makes your scraping script more robust and less prone to bugs.

⚡ Alternative Option: Use a No-Code Scraper

If you’re looking to scrape data at scale or avoid dealing with browser setup, proxy rotation, or anti-bot challenges, you might want to consider a hosted solution like a Web Scraper API. It’s a fully managed tool that handles JavaScript rendering, CAPTCHA solving, and IP rotation out of the box.

While this tutorial focuses on building a scraper with Swift, tools like Bright Data or ScrapingBee can be a practical alternative for developers who prefer to focus on data analysis rather than infrastructure. It’s especially useful when working with complex or protected websites.

Prerequisites

Before you begin, ensure that you have the following:

  1. Swift Installed: If you’re on macOS, you can install Xcode, which includes Swift. On Windows or Linux, you can download Swift from the official site and follow the installation instructions.
  2. Swift Package Manager (SPM): This tool will help you manage libraries such as SwiftSoup, which you’ll need for parsing HTML.

Once you have Swift installed, open your terminal and verify it by typing:

swift --version

If you see something like “Swift version 5.9.2,” you’re ready to go!

Setting Up Your Swift Project

Step 1: Create a New Swift Project

First, create a new directory for your project and navigate into it. Then, initialize a new Swift command-line tool:

mkdir SwiftScraper
cd SwiftScraper
swift package init --name SwiftScraper --type executable

This will create a new directory with the necessary files for your Swift project. You’ll see a Package.swift file and a Sources folder containing main.swift.

Step 2: Open Your Project

You can open your project in Xcode or any other Swift-compatible IDE. The main.swift file inside the Sources folder contains a simple “Hello, World!” program, which you’ll replace with your web scraping code.

Step 3: Install SwiftSoup

Next, you’ll need to install SwiftSoup, a library that will help you parse and extract data from HTML. Open the Package.swift file and add SwiftSoup as a dependency:

// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "SwiftScraper",
    dependencies: [
        .package(url: "https://github.com/scinfu/SwiftSoup.git", from: "2.6.0")
    ],
    targets: [
        .executableTarget(
            name: "SwiftScraper",
            dependencies: [
                .product(name: "SwiftSoup", package: "SwiftSoup")
            ]
        ),
    ]
)

After updating Package.swift, run the following command to install the package:

swift package update

This will download and integrate SwiftSoup into your project.

Performing Web Scraping with Swift

Now that you’ve set up your project and installed SwiftSoup, it’s time to start scraping data from a website. We’ll use the website “https://scrapeme.live/shop/” as an example. This site has a list of products that we can extract.

Step 1: Fetch HTML Content

To scrape a webpage, the first step is to retrieve the HTML content. Swift provides a native way to make HTTP requests using URLSession, but for simplicity, we’ll use the String initializer to directly fetch the HTML:

import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // needed for URL loading on Linux
#endif
import SwiftSoup

let url = URL(string: "https://scrapeme.live/shop/")!
let html = try! String(contentsOf: url, encoding: .utf8)

In this code snippet, we create a URL object pointing to the target page and use String(contentsOf:) to fetch the HTML content.

Step 2: Parse HTML Content

Once you have the HTML content, the next step is to parse it. SwiftSoup makes this easy with the parse function. This function converts the HTML string into a Document object that you can manipulate:

let document = try! SwiftSoup.parse(html)

At this point, the entire webpage has been loaded and parsed into a structure that you can query and extract data from.

Step 3: Extract Data

Now that the HTML is parsed, you can begin extracting the data you need. For example, you can extract all product names, prices, and URLs from the page. We’ll use CSS selectors to select the relevant elements. Here’s how you can extract data for a single product:

let product = try document.select("li.product").first()!
let url = try product.select("a").first()!.attr("href")
let image = try product.select("img").first()!.attr("src")
let name = try product.select("h2").first()!.text()
let price = try product.select("span").first()!.text()

print("URL: \(url)")
print("Image: \(image)")
print("Name: \(name)")
print("Price: \(price)")

In this example:

  • select("li.product") selects all product list items.
  • select("a"), select("img"), select("h2"), and select("span") target the specific product details: the URL, image, name, and price.
  • attr("href") and attr("src") extract attribute values, while text() gets the text content.
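Keep in mind that attr("href") may return a relative URL. Foundation’s URL(string:relativeTo:) can resolve such values against the page you fetched. Here is a small sketch (the example paths are made up for illustration):

```swift
import Foundation

let base = URL(string: "https://scrapeme.live/shop/")!

// Resolve a possibly relative href against the page URL.
func absoluteURL(_ href: String, relativeTo base: URL) -> String? {
    URL(string: href, relativeTo: base)?.absoluteString
}

// A relative path is joined onto the base; an absolute href passes through unchanged.
print(absoluteURL("page/2/", relativeTo: base) ?? "invalid")
print(absoluteURL("https://example.com/item", relativeTo: base) ?? "invalid")
```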

Step 4: Extract Multiple Products

The page contains multiple products, so you’ll need to loop through all product elements to extract data for each one. First, define a simple Product struct to hold each item, then iterate:

struct Product {
    let url: String
    let image: String
    let name: String
    let price: String
}

var products: [Product] = []
let productElements = try document.select("li.product")
for element in productElements.array() {
    let url = try element.select("a").first()!.attr("href")
    let image = try element.select("img").first()!.attr("src")
    let name = try element.select("h2").first()!.text()
    let price = try element.select("span").first()!.text()
    products.append(Product(url: url, image: image, name: name, price: price))
}

for product in products {
    print("\(product.name) - \(product.price) - \(product.url)")
}

In this code, we use array() to convert the elements into an array, then iterate over them to extract the necessary data for each product.

Step 5: Save Data to a CSV File

Finally, you might want to save the scraped data to a CSV file for easy analysis. To do this, you can use the CSV.swift library. Add it to your Package.swift dependencies:

.package(url: "https://github.com/yaslab/CSV.swift.git", from: "2.4.3"),

Then, import it at the top of your main.swift:

import CSV

Now you can write the scraped data to a CSV file:

let stream = OutputStream(toFileAtPath: "products.csv", append: false)!
let csv = try! CSVWriter(stream: stream)
try! csv.write(row: ["URL", "Image", "Name", "Price"])
for product in products {
    try! csv.write(row: [product.url, product.image, product.name, product.price])
}
csv.stream.close()

This code creates a CSV file named products.csv and writes the product data into it. Each row will contain the URL, image URL, name, and price of a product.
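If you’d rather avoid the extra dependency, the same export can be sketched with plain Foundation. This version handles basic RFC 4180-style quoting; the Product type is redeclared here so the snippet stands alone:

```swift
import Foundation

struct Product {
    let url: String
    let image: String
    let name: String
    let price: String
}

// Quote a field if it contains a comma, quote, or newline (RFC 4180 style).
func csvField(_ value: String) -> String {
    if value.contains(",") || value.contains("\"") || value.contains("\n") {
        return "\"" + value.replacingOccurrences(of: "\"", with: "\"\"") + "\""
    }
    return value
}

// Build the full CSV text: a header row followed by one row per product.
func csvText(for products: [Product]) -> String {
    var lines = ["URL,Image,Name,Price"]
    for p in products {
        lines.append([p.url, p.image, p.name, p.price].map(csvField).joined(separator: ","))
    }
    return lines.joined(separator: "\n")
}

let sample = [Product(url: "https://scrapeme.live/shop/bulbasaur/",
                      image: "https://scrapeme.live/bulbasaur.png",
                      name: "Bulbasaur", price: "£63.00")]
let text = csvText(for: sample)
print(text)
// Write it out with: try text.write(toFile: "products.csv", atomically: true, encoding: .utf8)
```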

Handling Pagination and Crawling Multiple Pages

Many websites, especially e-commerce sites, have pagination, which means that the product data is spread across multiple pages. In this case, you’ll need to crawl through multiple pages to gather all the data.

Here’s how you can handle pagination:

  1. Fetch the URL of the first page.
  2. Parse the HTML content of the page.
  3. Extract the links to the next page(s).
  4. Repeat the process for the next page until all pages are scraped.

For example, if the pagination links are located in anchor tags with the class page-numbers, you can extract the next page links like this:

let paginationLinks = try document.select("a.page-numbers")
for link in paginationLinks.array() {
    let nextPageUrl = try link.attr("href")
    print("Next page: \(nextPageUrl)")
}
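The four steps above amount to a breadth-first crawl with a visited set, so each page is scraped exactly once even when pages link back to each other. In this sketch, the fetch-and-parse step is stubbed out with a dictionary mapping each page to the pagination links found on it (real code would fetch and parse each URL instead):

```swift
import Foundation

// Stand-in for fetch-and-parse: page URL -> pagination links found on it.
let linkGraph: [String: [String]] = [
    "https://scrapeme.live/shop/": ["https://scrapeme.live/shop/page/2/"],
    "https://scrapeme.live/shop/page/2/": ["https://scrapeme.live/shop/",
                                           "https://scrapeme.live/shop/page/3/"],
    "https://scrapeme.live/shop/page/3/": ["https://scrapeme.live/shop/page/2/"],
]

// Breadth-first crawl: returns the pages in the order they were visited.
func crawl(from start: String, links: [String: [String]]) -> [String] {
    var visited: Set<String> = []
    var queue = [start]
    var order: [String] = []
    while !queue.isEmpty {
        let url = queue.removeFirst()
        guard !visited.contains(url) else { continue }
        visited.insert(url)
        order.append(url)  // here you would scrape products from this page
        queue.append(contentsOf: links[url] ?? [])
    }
    return order
}

// Visits each page exactly once, despite the back-links.
print(crawl(from: "https://scrapeme.live/shop/", links: linkGraph))
```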

Conclusion

Swift offers a fast and efficient platform for web scraping, and with libraries like SwiftSoup, you can easily parse and extract data from HTML content. While web scraping can be complex, especially when dealing with pagination and anti-bot measures, Swift provides all the tools you need to build powerful and efficient web scrapers.
