Case study

Webshop Product Analysis

Web scraper development

This case study walks through the development of a web scraper: the challenges I ran into, the solutions I chose, and the outcome.

The web scraper lets the client collect product data from various online stores automatically. Previously, they did this by hand.

Type

  • Web scraper, data analysis

Role

  • Software developer
  • Maintenance
  • CI/CD

Timeline

  • 2024 - Present

Client

  • External client

Challenges and goals

The client for this project analyzes data on products sold online. They monitor specific products, focusing on prices and review scores.

My goal for this project was to automate their manual work so that a new CSV feed with all the collected product data is available daily.

Python was the most suitable programming language for the job. This was my first professional Python project, which made it somewhat more challenging, but I still delivered a working product.

Python is a suitable and widely used language for building web scrapers. It is easy to use and supports useful data processing libraries.

Architecture and technologies

As mentioned above, the web scraper is implemented in Python. The script takes an input CSV feed, listing the products to be analyzed, as an argument. The output CSV feed is available for download via a secured HTTP endpoint served by Caddy 2.
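
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the client's actual code: the column names (`product_id`, `url`, `price`, `review_score`), the second command-line argument for the output path, and the `placeholder_scraper` function are all assumptions made for the example, and the shop-specific scraping logic is left out.

```python
import argparse
import csv
from typing import Callable, Dict, List

# Hypothetical output layout; the real feed columns are not described here.
OUTPUT_FIELDS = ["product_id", "url", "price", "review_score"]


def read_products(path: str) -> List[Dict[str, str]]:
    """Read the input feed, one product per row (assumed columns: product_id, url)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def scrape_all(
    products: List[Dict[str, str]],
    scrape_one: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Call scrape_one for every product URL and build the output rows."""
    rows = []
    for product in products:
        details = scrape_one(product["url"])
        rows.append({
            "product_id": product["product_id"],
            "url": product["url"],
            "price": details.get("price", ""),
            "review_score": details.get("review_score", ""),
        })
    return rows


def write_feed(path: str, rows: List[Dict[str, str]]) -> None:
    """Write the collected product data as a CSV feed with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=OUTPUT_FIELDS)
        writer.writeheader()
        writer.writerows(rows)


def placeholder_scraper(url: str) -> Dict[str, str]:
    """Stand-in for the real scraping logic; returns empty values."""
    return {"price": "", "review_score": ""}


def main(argv=None) -> None:
    """Parse the feed paths from the command line and run one scrape."""
    parser = argparse.ArgumentParser(description="Collect product data into a CSV feed.")
    parser.add_argument("input_csv", help="CSV feed with the products to analyze")
    parser.add_argument("output_csv", help="where to write the resulting CSV feed")
    args = parser.parse_args(argv)
    products = read_products(args.input_csv)
    write_feed(args.output_csv, scrape_all(products, placeholder_scraper))
```

Calling `main(["products.csv", "feed.csv"])`, or adding the usual `if __name__ == "__main__": main()` guard and running the file from the command line, would read the input feed and write the output feed. Passing the scraping function in as a parameter keeps the CSV handling testable without any network access.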

The entire system runs on an Ubuntu server. With the help of cron jobs on the server, the Python script is automatically executed at set times.

High-level overview of the architecture and some technologies.

For this project, I regularly had meetings with the client to present and explain interim results.

Development process

I worked iteratively throughout this project. The interim meetings allowed me to incorporate changing requirements and feedback.

Since the deployment of the first version, I have continuously implemented updates and performed maintenance.

Testing

As always, I unit tested all code. I used Python's built-in unittest and unittest.mock libraries for this.
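
As an illustration of this approach, here is a small sketch of a test that replaces the network call with a mock. The functions `fetch_html`, `extract_price`, and `scrape_price`, as well as the `data-price` attribute and the example URL, are hypothetical and only serve the example; they are not taken from the actual project.

```python
import unittest
import urllib.request
from unittest import mock


def fetch_html(url: str) -> str:
    """Download a page; this is the network call the test replaces."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_price(html: str) -> str:
    """Hypothetical parser: return the value of the first data-price attribute."""
    marker = 'data-price="'
    start = html.index(marker) + len(marker)
    return html[start:html.index('"', start)]


def scrape_price(url: str) -> str:
    """Fetch a product page and extract its price."""
    return extract_price(fetch_html(url))


class ScrapePriceTest(unittest.TestCase):
    def test_extract_price_reads_attribute(self):
        html = '<span data-price="19.95"></span>'
        self.assertEqual(extract_price(html), "19.95")

    @mock.patch("urllib.request.urlopen")
    def test_scrape_price_without_network(self, fake_urlopen):
        # The mocked urlopen is used as a context manager, so configure
        # the object that its __enter__ method hands back.
        response = fake_urlopen.return_value.__enter__.return_value
        response.read.return_value = b'<span data-price="19.95"></span>'

        self.assertEqual(scrape_price("https://shop.example/item"), "19.95")
        fake_urlopen.assert_called_once_with("https://shop.example/item")
```

Saved in a test module, this can be run with `python -m unittest`. Mocking `urllib.request.urlopen` keeps the test fast and independent of the availability or layout of a real webshop.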

The entire Python application is unit tested.

Need an experienced software engineer?

I help companies with software solutions (SaaS, automation and more) — from backend to frontend. Feel free to contact me to see how I can contribute to your project.

  • Robust backend: Java/Kotlin, SQL, Spring Framework
  • User-friendly frontend: Next.js, React, TypeScript, ES6
  • Rapid development: Continuous integration & deployment, Jenkins
  • Efficient collaboration: Agile, Scrum, Jira, Git, Bitbucket, GitHub
  • Freelance software developer from Arnhem, The Netherlands