Starting with a very practical problem, let's say we are building a scraping tool that scrapes this website:
https://example.com/blog
The URL contains a list of blog articles such as:
https://example.com/blog/how-to-use-make
https://example.com/blog/what-is-repeater
https://example.com/blog/best-way-to-learn-automation
The first URL (https://example.com/blog) contains only the first 10 articles; the next 10 articles can be found on the paginated second page, which is:
https://example.com/blog?page=2
So, the way we need to build our scraping tool is: [1] create a URL with a page number, [2] request that URL and scrape it, and [3] store the results.
But how do we [1] create a URL with a page number? Are we going to manually create ?page=1, ?page=2, ?page=3... ? We could technically do that by hand, but what if there are over 1,000 pages? Let's not even imagine how time-consuming that would be.
This is where Repeater comes in.
Repeater helps us increment numbers in a certain pattern by taking three inputs: Initial value, Repeats, and Step.
In the screenshot above, Repeater works like this: let's say we set the Initial value field to 2, the Repeats field to 5, and the Step field to 2. We would get the following values:
2, 4, 6, 8, 10
Do you see the pattern?
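If it helps to think of it as code, Repeater behaves like a counted loop. Here is a rough Python sketch of that behavior (an illustration only, not how Make implements it):

```python
# A rough illustration of Repeater's three inputs, not Make's internals.
initial_value = 2
repeats = 5
step = 2

for n in range(repeats):
    i = initial_value + n * step
    print(i)  # prints 2, 4, 6, 8, 10
```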
Now, let's go back to our scraping demo and build a scenario using Repeater.
In our scraping demo, we'll simply need to increment the number one by one starting at 1.
So, we have this base URL:
https://example.com/blog
We want to create a paginated version of the URL every time Repeater repeats itself. So for the first repeat, we want:
https://example.com/blog?page=1
And for the second repeat, we want:
https://example.com/blog?page=2
Then:
https://example.com/blog?page=3
https://example.com/blog?page=4
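In code terms, each repeat simply plugs the current value of i into the page query parameter. A quick Python sketch of the URLs we expect (again, just an illustration):

```python
# Illustration: building the paginated URL from the repeat counter i.
base_url = "https://example.com/blog"

for i in range(1, 5):
    print(f"{base_url}?page={i}")  # ?page=1, ?page=2, ?page=3, ?page=4
```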
Now, let's build that out!
First things first: let's set up Repeater. Depending on what you are building, Repeater can come at the very beginning of your scenario or somewhere in the middle. For this scraping demo, we'll start with Repeater.
I set Initial value to 1 because we are starting page=n at 1. If, say, you had already scraped the first 100 pages, you could set this number to 101 instead. As explained above, the Repeats field specifies the number of times Repeater repeats the flow that comes after it. In this demo, let's assume we have 10 pages to scrape, so we set the field to 10.
Once that's done, let's go ahead and connect HTTP > Make a request module to Repeater.
The only thing you need to set up in the HTTP > Make a request module is the URL field: type in the base URL and map Repeater's i variable where the page number goes, so the field reads https://example.com/blog?page= followed by the mapped i.
What's happening here is that every time Repeater repeats, i increments. It starts at 1 because we set the Initial value field to 1, and Repeater runs itself and everything after it 10 times because we set the Repeats field to 10. Let's try running the scenario.
Now we have what we wanted:
https://example.com/blog?page=1
https://example.com/blog?page=2
https://example.com/blog?page=3
https://example.com/blog?page=4
https://example.com/blog?page=5
https://example.com/blog?page=6
https://example.com/blog?page=7
https://example.com/blog?page=8
https://example.com/blog?page=9
https://example.com/blog?page=10
From here, you can set up another HTTP > Make a request module to scrape each URL, then store the content in Google Sheets, Airtable, Notion, etc. If bare HTTP requests get blocked by the website, you might want to use a scraping API such as ScrapingBee.
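If you ever want to prototype the same flow outside Make, here is a minimal Python sketch of it, assuming the requests library; the parsing and storage steps are left as comments because they depend on your stack:

```python
# A minimal sketch of the whole scenario in plain Python, assuming the
# `requests` library. A scraping API such as ScrapingBee would replace
# the plain GET if the site blocks bare HTTP requests.
import requests

base_url = "https://example.com/blog"

for page in range(1, 11):  # the Repeater: Initial value 1, Repeats 10
    url = f"{base_url}?page={page}"
    response = requests.get(url, timeout=10)  # the HTTP module
    response.raise_for_status()
    html = response.text
    # ...parse `html` here and append the results to Google Sheets,
    # Airtable, Notion, or wherever you store them.
    print(url, len(html))
```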
So, that's a wrap!
I published the scenario we used in this demo as a template. You can make a copy of it here.
One more thing worth knowing: Iterator is always used against an array, a data structure where each item inside it is treated equally. I'll cover Iterator in another post, but in short: even though Repeater can do the same thing, I recommend always using Iterator when you perform an action against an array, and keeping Repeater solely for incrementing numbers.
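To make the distinction concrete, here is a tiny Python analogy (illustration only; Make's modules are not Python):

```python
# Repeater's job: walk a sequence of numbers.
for i in range(1, 6):
    print(i)

# Iterator's job: walk the items of an array.
articles = ["how-to-use-make", "what-is-repeater", "best-way-to-learn-automation"]
for article in articles:
    print(article)
```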
Sometimes you don't want Repeater to repeat everything after it, but only some parts of your scenario. There are two ways to achieve this.
The first way is using Filter. As in the screenshot below, you place a Filter after Repeater to block it from proceeding, and only when i reaches a specified number does the flow continue into the rest of the scenario.
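In loop terms, the Filter acts like a condition inside the loop body. A rough Python analogy (the condition and messages are just for illustration):

```python
# Rough analogy: the Filter is an `if` inside the loop.
repeats = 10

for i in range(1, repeats + 1):
    print(f"repeat {i}: modules before the Filter run every time")
    if i == repeats:  # the Filter condition: only pass once i reaches 10
        print("modules after the Filter run only on the last repeat")
```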
The second way takes advantage of how Repeater treats separate nodes: a new node, a new rule. A Repeater on one node does not have access to other nodes. Using this trait, we can build a structure like the one in the following screenshot.