Proxies. Where would we be without them? You already know how important proxies are, and you also know how expensive they can get. You purchase proxies for GSA Search Engine Ranker, Scrapebox, and a variety of other applications, and before you know it, you are spending upwards of $1000 on proxies every month! Surely there must be a better way. This is where GSA Proxy Scraper comes to the rescue.
GSA Proxy Scraper Features
On May 11, 2015, GSA released their new proxy scraping software – GSA Proxy Scraper. Now, seven years after its initial release, it has significantly improved, and in a nutshell, here is what you can expect from this tool:
- Harvest tens of thousands of proxies in just a few clicks – the software only requires you to click a few times after which you can sit back and enjoy the proxy scraping show.
- Thousands of pre-defined proxy providers – the software will also continue to seek out proxies until told otherwise.
- Port scanning functionality – GSA Proxy Scraper can quickly and easily scan IP and port ranges, which lets you find proxies in places where you would not otherwise be able to find them.
- Smart tagging functionality – allows you to create custom tags for your proxies, for example, proxies that work for YouTube or Facebook, and then filter the scraped proxies by these tags.
- Very intuitive and easy to use tool – the user interface is extremely simplified (you will see it all in a minute), and it takes no more than a few hours to get used to.
- Strong exporting and filtering functionality – GSA Proxy Scraper is capable of sending proxies by email, saving them to a target location, uploading them via FTP, or uploading them to a URL. This GSA tool also allows for the scheduling of exports, so you can, for example, configure it to send you working Google proxies every morning at 6 a.m.
- Powerful testing capabilities – GSA Proxy Scraper can test proxies against Web 2.0 sites, social accounts, email providers, search engines, and custom URLs provided by the user.
- Supports the most popular proxy types – web, connect, socks4, and socks5.
- Great compatibility with top internet marketing tools – GSA Proxy Scraper can export proxies in formats which are ready for use by tools such as GSA SER, Scrapebox, GScraper, NohandsSEO, Ultimate Demon, you name it.
- Has a built-in internal proxy server – this allows you to use GSA Proxy Scraper much like a personal VPN, giving your browser a fresh new IP after each refresh. However, always keep in mind that you do not know who runs a scraped proxy or whether someone is watching what you are doing through it.
- Lifetime license and updates – GSA Proxy Scraper has a one-time fee after which you have it for life including the constant updates rolled out by the GSA developers.
GSA Proxy Scraper Tutorial
Basically, we will split this GSA Proxy Scraper tutorial into 5 sections:
- Main Menu – the “Add”, “Test”, “Export”, “Remove”, “Tools”, “Settings”, “Help”, and “Quit” buttons.
- Proxies table – the table below the “Main Menu” where all of the proxies found by GSA PS will show up.
- Proxies info bar – this bar shows info about the proxies in the “Proxies” table i.e. how many are public, how many are anonymous, etc.
- Log – shows you what the software is doing i.e. parsing proxy providers, testing proxies, etc.
- Statistics bar – displays system information about GSA Proxy Scraper.
Alright. Now that we’ve split this GSA tool into separate sections, let’s look at each of them in more detail, starting with the “Main Menu”.
The Main Menu
As you can see, there are eight buttons on the main menu. Each of them is equipped with a unique set of capabilities. Let’s start from the right to the left because I want to get to the “Settings” button first before moving on to the other buttons.
The Quit Button
Can you guess what this one is supposed to do? Yes, you are correct. It does nothing more than close GSA Proxy Scraper. I’m not sure why it’s there, given that the pretty red “x” button is directly above it, but it’s there for whatever reason.
The Help Button
When you click this one, you will see the following menu show up:
- Manual – clicking this will take you to the official GSA Proxy Scraper manual.
- Homepage – takes you to the homepage of GSA PS.
- Forum – opens the GSA forums.
- View Version History – opens a text file which shows the changelog of the software.
- Create Bugreport – allows you to create a bug report and send it straight to the GSA developers.
- Check for Updates – checks for new updates.
The Settings Button
First and foremost, there is a small arrow to the right of the button for your convenience. You’ll be able to quickly turn on or off the automatic proxy scraping functionality of GSA Proxy Scraper once you click on it – we’ll go over that in a minute.
Whenever you click on the “Settings” button, a popup window with the following options will appear:
As you can see, to the left, we have 5 different tabs. By default, the “Settings” tab is selected.
The first option is “Activate internal proxy server”. This is the default setting. Allow me to illustrate what it does with an example. Say you’ve used GSA Proxy Scraper to scrape 100 Google proxies, and you’d like to use them in a GSA SER campaign. To accomplish this, simply select “Activate internal proxy server” and then enter the proxy address “127.0.0.1:8080” into Search Engine Ranker. That’s all there is to it. GSA SER will now use all of the proxies that were scraped by GSA PS.
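The same idea works for your own scripts, not just GSA SER. Here is a minimal Python sketch, assuming the internal proxy server is listening on 127.0.0.1:8080 as in the example above. The function names are my own and nothing here comes from GSA PS itself.

```python
import urllib.request

# Address of the internal proxy server, as used in the example above.
INTERNAL_PROXY = "127.0.0.1:8080"


def build_proxy_map(address: str = INTERNAL_PROXY) -> dict:
    """Return the scheme -> proxy URL mapping that urllib expects."""
    return {"http": f"http://{address}", "https": f"http://{address}"}


def build_opener(address: str = INTERNAL_PROXY) -> urllib.request.OpenerDirector:
    """Create an opener that routes every request through the given proxy."""
    handler = urllib.request.ProxyHandler(build_proxy_map(address))
    return urllib.request.build_opener(handler)


def fetch_status(url: str, address: str = INTERNAL_PROXY, timeout: float = 30.0) -> int:
    """Fetch a URL through the proxy and return the HTTP status code."""
    with build_opener(address).open(url, timeout=timeout) as response:
        return response.status
```

Calling `fetch_status` with any URL you want to check would then send the request through whichever scraped proxy the internal server picks.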
The rest of the options on the “Settings” tab:
- Set Proxy in Browser – you can set GSA Proxy Scraper’s internal server to work with your browser meaning that every new request created by your browser will use a random proxy from the ones scraped in GSA PS.
- Use proxies with tags – this allows you to tell GSA Proxy Scraper to allow other tools that are connected to the software to use only proxies with certain tags i.e. working for Google or Bing for example.
- With authorization – allows you to add login credentials to your internal proxy server, so that tools connecting to it would have to provide the username and password you specify.
- Add a Fake IP for Transparent Proxies in Headers – tries to add another fake IP in the HTTP headers for transparent proxies in order to hide your real IP from remote web servers.
- Use a Random User Agent – you can tell GSA Proxy Scraper to send requests that appear to come from different browsers, and you can also edit said user agents via the “EDIT” link to the right.
- Use a new proxy each ‘x’ seconds – you can configure GSA PS to switch up proxies at a specified time interval. If left unchecked, the software will pick a random proxy every time.
- Proxy Tags – basically, these are just labels you create for your proxies so you can easily group and filter them later on. There are built-in ones, but you can edit them if you want to.
- Maximum Threads – the maximum number of threads the software will use to scrape and test proxies.
- Connect Timeout – this determines how fast you want each proxy to be. A lower connect timeout means that slower proxies will be filtered out.
- Other Timeout – this is the timeout when the software is being connected to the proxy and is receiving data from it. Again, a lower timeout will filter out slower proxies.
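To make the “Use a new proxy each ‘x’ seconds” option a bit more concrete, here is a rough Python sketch of the behavior described above: keep the same proxy for a fixed interval, or pick a random one on every request when no interval is set. This is only my illustration of the idea, not how GSA PS implements it, and the class name and parameters are hypothetical.

```python
import random
import time


class ProxyRotator:
    """Hand out proxies, switching either per request or every N seconds."""

    def __init__(self, proxies, interval_seconds=None, clock=time.monotonic, rng=None):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self.proxies = list(proxies)
        # None mirrors the unchecked option: a random proxy for every request.
        self.interval_seconds = interval_seconds
        self.clock = clock
        self.rng = rng or random.Random()
        self._current = None
        self._picked_at = None

    def get(self):
        """Return the proxy to use for the next request."""
        now = self.clock()
        needs_new_proxy = (
            self._current is None
            or self.interval_seconds is None
            or now - self._picked_at >= self.interval_seconds
        )
        if needs_new_proxy:
            self._current = self.rng.choice(self.proxies)
            self._picked_at = now
        return self._current
```

The clock is passed in as a parameter so the rotation can be checked without actually waiting for the interval to pass.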
The Provider Tab
This is where all of the proxy providers (sources) are located. GSA Proxy Scraper comes pre-loaded with a large number of built-in proxy providers (over 800 at the time of writing, with newer updates adding more and more), and you can also add your own:
The first thing on this tab is the “Proxy Providers” table which, as you see, has 4 columns:
- Name – the URL from where GSA PS is extracting the proxies.
- Quality – this score basically shows how good a certain proxy source is in terms of providing working proxies. The higher the quality number the better the proxy source. The maximum value is 100.
- Last Working – the number of proxies that were working versus the number of proxies found.
- Description – a simple note regarding proxy providers. This field is just a note and can be whatever you like it to be.
Below the “Proxy Providers” table, you can see the following 2 checkboxes:
- Parse extracted links from search-engine-tests – this will basically use the proxy as a search query on Google, Bing, Yahoo, and other search engines and will save all the URLs shown in the SERPs. After that, it will try and extract additional proxies from said URLs.
- Use Search Engines to locate proxy lists – GSA Proxy Scraper will search on Google, Bing, Yahoo, etc, to find new proxy lists using already found IPs (the “Proxy” checkbox to the right) and/or custom search queries (the “Other” checkbox to the right).
Finally, you have the buttons at the end:
- Import – allows you to import proxy providers from other programs.
- Add – allows you to add a proxy provider of your own in a number of ways:
- Enter Manually
- Enter URL
- Paste from Clipboard
- Edit – allows you to edit a proxy provider’s information and configuration.
- Delete – allows you to delete a proxy provider. Don’t worry, you will be asked to confirm the action first.
- Clear Cache – GSA Proxy Scraper keeps information about proxy providers every time you scrape for proxies. So if a certain proxy provider turns out to give too many non-working proxies, it will be ignored on subsequent proxy searches. However, that provider might have better proxies by now, and you’d want to get your hands on them. Clearing the cache will allow GSA Proxy Scraper to go through every proxy provider again and test out all of its proxies.
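In case you are wondering what “extracting proxies” from a provider URL or from pasted text boils down to, here is a small Python sketch that pulls host:port pairs out of a chunk of text with a regular expression. The pattern and the function name are my own assumptions for illustration, and GSA PS certainly does something more sophisticated internally.

```python
import re

# Matches IPv4-looking addresses followed by a port, e.g. 192.0.2.10:8080.
PROXY_PATTERN = re.compile(r"\b((?:\d{1,3}\.){3}\d{1,3}):(\d{1,5})\b")


def extract_proxies(page_text):
    """Return unique, valid-looking host:port strings in order of appearance."""
    found = []
    seen = set()
    for host, port in PROXY_PATTERN.findall(page_text):
        octets_ok = all(0 <= int(part) <= 255 for part in host.split("."))
        port_ok = 1 <= int(port) <= 65535
        candidate = f"{host}:{port}"
        if octets_ok and port_ok and candidate not in seen:
            seen.add(candidate)
            found.append(candidate)
    return found
```

Anything that comes out of a function like this is only a candidate. It still has to be tested before you can call it a working proxy.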
The Automatic Export Tab
This is an extremely useful feature of GSA Proxy Scraper because it lets you schedule your proxy exports and receive them automatically, with no manual work on your end:
It consists of a table listing all of your automatic exports with the four self-explanatory columns you see in the image above, and then three buttons below it that do what they say.
- Add – this lets you set up scheduled exports of your proxies. For every export type, you can set a time interval, a custom name, and filters which will apply to the proxies (for the cases where you want only certain proxies to be exported, for example, only Google proxies). You can choose from 4 different types of proxy exporting:
- FTP Upload – you can auto-export via FTP.
- Send Email – you can send an email at a specified time interval.
- Save to File – you can save to a certain local destination automatically.
- Web Upload – you can upload to some URLs on the Internet.
- Edit – you can edit the automatic proxy export you have selected from the table above.
- Delete – you can delete a proxy export process. You will be first asked to confirm.
The Automatic Search Tab
This tab allows you to schedule automatic searches, which means that GSA Proxy Scraper will scrape for new proxies at the interval you specify: for example, every hour.
I believe this functionality is enabled by default, so make sure to double-check it when you first install this proxy scraping tool. All of the settings on this page should be self-explanatory, but just in case, here they are:
- Interval ‘x’ minutes – the number of minutes after which GSA Proxy Scraper will fire up and start searching for new proxies.
- Only if less than ‘x’ working proxies are present – the software will start looking for new proxies only if the current number of proxies is less than the specified amount.
- Stop if more than ‘x’ new working proxies are found – if the specified amount of new working proxies is found, GSA PS will stop the scraping and testing process.
- Test proxies against – you can choose what the proxies should be tested against either from the pre-defined templates (by default Anonymous and Google search are selected), or you can add your own custom tests via the “Add” button, or you can edit existing ones via the “Edit” button. You can also remove testing templates and you can test them as well to see if they work as expected.
- Re-Test also present working proxies – this will force GSA Proxy Scraper to re-test the already scraped proxies.
- Only test TAGS – you can filter the proxies which are to be re-tested via tags.
- Re-Test also present none working proxies – if you want, you can give failed proxies a second chance.
- Automatically remove proxies – enable this if you want GSA Proxy Scraper to remove proxies automatically which meet certain criteria:
- When down for more than ‘x’ minutes – if certain proxies are dead for more than the specified number of minutes, they will be automatically removed.
- Store previously working proxies in a file before removing – this is just in case you want to test these proxies again later.
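The interplay between the interval and the two thresholds above can be sketched in a few lines of Python. This is just how I read the options, with hypothetical names, and it is not taken from the software itself.

```python
from dataclasses import dataclass


@dataclass
class AutoSearchSettings:
    interval_minutes: int  # "Interval 'x' minutes"
    min_working: int       # "Only if less than 'x' working proxies are present"
    stop_after_new: int    # "Stop if more than 'x' new working proxies are found"


def should_start_search(settings, minutes_since_last_search, working_now):
    """Start only when the interval has elapsed and the working pool is low."""
    interval_elapsed = minutes_since_last_search >= settings.interval_minutes
    pool_is_low = working_now < settings.min_working
    return interval_elapsed and pool_is_low


def should_stop_search(settings, new_working_found):
    """Stop scraping and testing once enough new working proxies turned up."""
    return new_working_found > settings.stop_after_new
```

With an interval of 60 minutes and a threshold of 100 working proxies, for example, a search would kick off after an hour only if fewer than 100 working proxies are left.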
The Filter Tab
Now that GSA Proxy Scraper has discovered a large number of proxies for you, you’ll want to know how to filter them out:
Proxies can be pre-filtered before they are added to the “Proxies” table, and this tab allows you to do just that:
- Do not accept anonymous (no elite) proxies – GSA PS will not keep anonymous proxies.
- Do not accept transparent proxies – GSA Proxy Scraper will not keep transparent proxies.
- Skip suspicious proxies – suspicious proxies are proxies which might be spying on your activities and will be skipped.
- Accept only if tagged as – keep only proxies that match a certain tag.
- Accept only the following Ports – useful if you only want proxies on a certain set of ports.
- Skip the following ports – useful if you want to avoid using proxies on a certain set of ports.
- Skip duplicate IPs – useful if you want only unique IPs.
- Accept only the following Types – you can also pre-filter proxies by type:
- Accept only the following Regions – and finally, you can pre-filter proxies by location.
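If it helps to think of these checkboxes as code, here is a hedged Python sketch of a pre-filter that applies most of the rules above to a list of proxy records. The `Proxy` record and all parameter names are my own invention for illustration, and I left out the region and suspicious-proxy filters because they would need data I can only guess at.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proxy:
    host: str
    port: int
    kind: str                      # "web", "connect", "socks4" or "socks5"
    anonymity: str                 # "transparent", "anonymous" or "elite"
    tags: frozenset = frozenset()  # e.g. frozenset({"Google"})


def prefilter(proxies, *, skip_anonymous=False, skip_transparent=False,
              required_tag=None, allowed_ports=None, skipped_ports=(),
              allowed_kinds=None, skip_duplicate_ips=False):
    """Keep only the proxies that pass every enabled filter."""
    kept = []
    seen_hosts = set()
    for proxy in proxies:
        if skip_anonymous and proxy.anonymity == "anonymous":
            continue
        if skip_transparent and proxy.anonymity == "transparent":
            continue
        if required_tag is not None and required_tag not in proxy.tags:
            continue
        if allowed_ports is not None and proxy.port not in allowed_ports:
            continue
        if proxy.port in skipped_ports:
            continue
        if allowed_kinds is not None and proxy.kind not in allowed_kinds:
            continue
        if skip_duplicate_ips and proxy.host in seen_hosts:
            continue
        seen_hosts.add(proxy.host)
        kept.append(proxy)
    return kept
```

Every filter is off by default, so calling `prefilter` with no options simply returns all of the proxies it was given.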
And with that, we put an end to the “Settings” window and its 5 tabs.
The Tools Button
Next on our list from the “Main Menu” is the “Tools” button. Clicking the button will show you the following menu:
- Statistics – the statistics functionality allows you to see stats about the proxies scraped by GSA PS such as:
- Proxy Scanner – this is the proxy scanning tool we talked about earlier and it can be quite useful. What you can do is give it a range of IPs and ports and it will scan them and try to find new proxies. Simple as that.
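Conceptually, scanning a range of IPs and ports looks something like the Python sketch below. It is only a bare-bones illustration with function names I made up, and it says nothing about how the GSA PS scanner actually works. Note that an open port only tells you something is listening there. Whether it is a usable proxy still has to be tested separately.

```python
import ipaddress
import socket


def iter_candidates(first_ip, last_ip, ports):
    """Yield every (host, port) pair in an inclusive IPv4 range."""
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    for value in range(start, end + 1):
        host = str(ipaddress.IPv4Address(value))
        for port in ports:
            yield host, port


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A scan is then just a loop over `iter_candidates` that keeps the pairs for which `is_port_open` returns True, ideally spread over many threads since most connection attempts will simply time out.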
The Remove Button
This allows you to remove:
- Highlighted – removes the proxies you selected from the “Proxies” table of GSA PS.
- Not working – removes all non-working proxies.
- Low Speed – removes proxies with a speed lower than the one you enter.
- From Source – removes proxies from the proxy providers you select.
The Export Button
Allows you to manually export your proxies:
- CSV (Excel)
- Text (host:port:login:password)
- Text (login:password@host:port)
Of course, you will choose depending on the tool you will be importing the proxies into.
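To show the difference between the export formats, here is a short Python sketch that writes the same proxy in each of them. The helper names and the CSV column headers are my own assumptions, and I am also assuming that login and password are simply left out when a proxy has no credentials.

```python
import csv
import io


def as_colon_format(host, port, login="", password=""):
    """Format a proxy as host:port:login:password."""
    parts = [host, str(port)]
    if login:
        parts += [login, password]
    return ":".join(parts)


def as_at_format(host, port, login="", password=""):
    """Format a proxy as login:password@host:port."""
    prefix = f"{login}:{password}@" if login else ""
    return f"{prefix}{host}:{port}"


def as_csv(rows):
    """Render (host, port, login, password) rows as CSV text with a header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["host", "port", "login", "password"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Which one you need depends entirely on what the receiving tool expects, so check its import dialog before exporting.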
The Test Button
Gives you the power to test the proxies found by GSA Proxy Scraper:
- Not Working
The Add Button
At the end of the “Main Menu”, we have the “Add” button which basically allows you to add proxies to GSA PS. You have several options:
- From File
- From Clipboard
- From Proxy Buddy
- Proxy Scanner – allows you to add proxies using GSA Proxy Scraper’s proxy scanning tool.
- Search Online – allows you to add proxies using GSA PS’ proxy providers. You can either choose all or a sub-set.
- Use Search Engines
- Parse Proxy-Search Links – these are links which are acquired by GSA Proxy Scraper by using proxies as search queries on search engines.
- Previously Removed – in case you wrongfully removed some proxies.
And with that, we end our journey through the “Main Menu” of GSA Proxy Scraper. Now comes the fun part.
The Proxies Table
This is the point at which the magic happens. All of the proxies discovered by GSA Proxy Scraper will be displayed in this section. In order to demonstrate what the table looks like when it is populated with proxies, I simply ran the software for 10 seconds:
Now, usually, I would break down the meaning of each column, but it’s more than obvious here. What you can’t see from the image above is the context menu which pops up when you right-click anywhere on the “Proxies” table:
- Test Highlighted – tests the proxies you have selected against the test templates you choose.
- Delete Highlighted – deletes the proxies you have selected.
- Select by TAG – selects proxies from the table which match the tags you choose.
- Set TAG – you can manually set the TAG of the proxies you have selected.
- Set as Browser Proxy – you can set the selected proxy to be used in a browser of your choice.
- Copy Highlighted – you can copy the proxies you have selected directly to your clipboard in the following formats:
- CSV (Excel)
- Text (host:port:login:password)
- Text (login:password@host:port)
- Export Highlighted – does the same thing as the “Export” button from the “Main Menu”, but exports only the proxies you have selected.
- Proxy Scan selected IP Ranges – makes use of the proxy scanning tool to scan selected IP ranges.
- Collect Proxy Information – gathers and shows detailed info about the proxy you have selected.
- Copy Source URL – copies the proxy provider URL of the selected proxy.
- Open Source URL – opens the proxy provider URL of the selected proxy.
And that’s pretty much all there is to the “Proxies” table of GSA Proxy Scraper. Just a nice and easy-to-use proxy management interface. Really user friendly.
The Proxies Info Bar
Specifically, this is the small and thin bar that appears immediately below the “Proxies” table and immediately above the “Log” in GSA Proxy Scraper. Simply put, it displays the total number of proxies that the software has discovered, as well as their type (Socks4, Socks5, and so on) and their TAG (Google, Anonymous, etc).
That’s all there is to it. When scraping for proxies, this is extremely useful because you will almost certainly only be scraping for a specific type of proxy and you will want to know when you have reached your desired amount in order to stop the scraping process.
The Log
Simply put, the log shows you what GSA Proxy Scraper is doing – parsing a proxy provider, testing proxies, the number of proxies extracted from a proxy source, when a proxy testing process finished, the amount of time an extraction process took, and pretty much everything else that has to do with the proxy scraping and testing process.
The Statistics Bar
The statistics bar can be found at the very bottom of the GSA Proxy Scraper window. It simply displays three statistics:
- Threads – the number of threads the software is running.
- Mem – the amount of memory that GSA PS is using.
- CPU – the percentage of CPU that GSA Proxy Scraper is utilizing.
That’s all there is to it. We have now gone through the entire functionality of this proxy scraping tool. Next, let’s look at a real-world example, because, let’s face it, you want to see this baby in action before making a decision. We’re going to start from the beginning.
What Is GSA Proxy Scraper Capable Of?
One of the first things I’m going to do is completely clear the cache of my GSA Proxy Scraper application. In order to find as many proxies as possible, I would like it to search through all of the proxy providers. Now, keep in mind that if you’ve downloaded the trial version, it gives you access only to a limited amount of the pre-defined GSA PS proxy providers, so you will get even better results with the full version of the software. Here is the real life scenario.
I set the number of threads to 500 because the machine on which GSA Proxy Scraper is installed is powerful enough to handle it, and the software itself doesn’t require a lot of resources in the first place. Time also matters here, both to me and to these proxies, since public proxies tend not to stay alive for long, so the sooner the scraping and testing process is completed, the better.
Now, what I’m looking for are Google proxies because I want to see how long I can scrape on Google using Scrapebox while using the proxies that GSA PS has discovered for me. I’m looking for some target URLs for our GSA Search Engine Ranker instances, so let’s see how it goes.
Just about 27 minutes and 42 seconds after I fired up GSA Proxy Scraper, I am looking at a total of 996 proxies:
Here’s a break down of the proxies:
- 263 web
- 654 connect
- 68 socks4
- 11 socks5
As for the tags, we have the following stats:
- 124 transparent
- 34 anonymous
- 830 elite
I just wanted to point out that the process took significantly longer than usual because I had cleared the cache of GSA PS. After allowing the software to cycle through all of the proxy providers a couple of times, it will become significantly faster.
Let’s get down to business. Out of the 996 proxies tested, 62 passed the Google test performed by GSA Proxy Scraper, which isn’t too shabby at all. Let’s see what Scrapebox has to say about these proxies. Even though the software tells me that there are 62 Google proxies, I still want to test all 996 in Scrapebox to confirm this. Who knows, right?
So, I export all proxies to a text file:
Okay then. It’s time to open Scrapebox. As soon as it starts up, I import the proxies from the previously mentioned file and test them to see which ones can be used by SB to search on Google. Scrapebox’s Google test finds that 55 of the 996 candidates pass.
I need blog comment targets on articles related to link building, so I simply scrape some keywords in Scrapebox using the seed keyword “link building”, merge them with the blog comment footprints from GSA SER, and voila! In the end, I have a total of 4932 keywords.
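If you have never merged footprints with keywords before, the operation is essentially a cross product: every footprint gets paired with every keyword. Here is a minimal Python sketch of that idea. The two footprints in the usage note are made-up examples rather than the actual GSA SER footprints, and the exact way Scrapebox joins the two parts may differ.

```python
from itertools import product


def merge_footprints(footprints, keywords):
    """Pair every footprint with every keyword, skipping duplicate queries."""
    queries = []
    seen = set()
    for footprint, keyword in product(footprints, keywords):
        query = f"{footprint.strip()} {keyword.strip()}"
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries
```

For instance, merging the hypothetical footprints `"leave a comment"` and `"post a reply"` with the keywords `link building` and `link building tips` gives four search queries, one for each combination.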
And now we’re ready to begin the process of harvesting the crops. Let’s see how these 55 proxies perform in the real world. The harvesting process began at a rate of approximately 50 URLs/s, which is actually quite fast. It maintained its speed for about 10 minutes before dropping to about 20 URLs/s, which is still quite good. The following is a screenshot of the harvesting process after 30 minutes of harvesting time:
The proxies are still able to process 18 URLs per second. That’s still a rate of 64,800 URLs per hour. Isn’t that pretty good? I let Scrapebox run until it had gone through all of the keywords and there were no more URLs to be found for them. Here is a screenshot taken 60 minutes into the game:
As you can see, the proxies are still performing admirably, with 17 URLs/s being served. After another three hours had passed, I returned from the gym to find Scrapebox still going strong:
Even though the rate has decreased to 15 URLs/s, this is still quite impressive in my opinion. Finally, here is a screenshot of Scrapebox following the completion of the harvesting process:
Everything was completed in less than 5 hours. I was curious to see how many of the proxies would still pass the Google test after more than 5 hours. Out of the 55 proxies that passed the Google test at the start of this GSA Proxy Scraper test, 19 are still operational after more than 5 hours. If you ask me, that’s pretty darn good work.
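For anyone who wants to double-check the numbers from this experiment, here is the arithmetic as a tiny Python snippet. The helper names are mine, and the inputs are just the figures quoted above.

```python
def urls_per_hour(urls_per_second):
    """Convert a harvesting rate from URLs per second to URLs per hour."""
    return urls_per_second * 3600


def percentage(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(part / whole * 100, 1)


# Harvesting rate after 30 minutes: 18 URLs/s.
rate_after_30_minutes = urls_per_hour(18)  # 64800

# Share of the 996 scraped proxies that passed each Google test.
passed_in_gsa_ps = percentage(62, 996)     # 6.2
passed_in_scrapebox = percentage(55, 996)  # 5.5

# Share of the 55 Google proxies still working after more than 5 hours.
still_alive = percentage(19, 55)           # 34.5
```

So roughly one in three of the Google-passed proxies survived the whole run.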
And with that, our little GSA Proxy Scraper experiment comes to a close.
Honest Review of the GSA Proxy Scraper
To begin, let me respond to the question that brought you here in the first place: can GSA Proxy Scraper completely replace all of your monthly proxy subscriptions, or at the very least some of them? The answer will be highly dependent on your particular circumstances, but allow me to break it down for you:
- If you are a new marketer who doesn’t have enough capital yet, GSA Proxy Scraper is the solution for you. You will get the proxies you need very fast.
- If you are a more advanced marketer, but you are still looking to save a buck or two on monthly proxy expenses, GSA PS is the software for you.
There are many more, but these two were the first to come to mind. Let me now explain how we use it so that you can form your own conclusions. First and foremost, I’d like to state that we do not use GSA PS to fulfill our proxy requirements for GSA Search Engine Ranker. Call me old-fashioned, but I prefer knowing that access to the proxies we use in GSA SER is restricted.
We need a lot of new verified URLs because we run several GSA SER instances. If those lists aren’t updated on a regular basis, Search Engine Ranker won’t be able to perform at its best. Scraping for target URLs 24 hours a day, seven days a week is necessary to keep the lists up to date. For example, BuyProxies now charges $80 per month for 50 private proxies. Some may not mind that, but I was taught to watch my expenses from a young age.
50 private proxies are sufficient to scrape target URLs on a daily basis using Scrapebox, GScraper, or whatever, and that’s exactly what we were doing until GSA released their proxy scraper. I have it set up to scrape proxies every morning and email them to me. When I wake up, all I have to do is open Scrapebox and import the proxies from the file, test them all, and keep only the Google proxies. With just these proxies, I receive around a million new target URLs every day.
And all I had to pay was a one-time fee, which was the same price as BuyProxies’ monthly fee for 50 private proxies. I’d just like to point out that the BuyProxies proxies will work much better in terms of URLs/s, but this is money that could be put to better use elsewhere. Obtain what you require rather than what you desire.
Don’t get me wrong: BuyProxies proxies are fantastic, and we use them extensively for our GSA SER instances, but they can be replaced in some minor cases. Which cases, exactly? Here are a few examples:
- Scraping with Scrapebox.
- Scraping with GScraper.
- Some software like a social bot that requires a couple of proxies for account creation.
- Email account creators.
And pretty much any software that needs proxies in a limited capacity. Keep in mind that in our previous experiment, I only scraped proxies for about 30 minutes with GSA PS. I didn’t let it run all the way to the end; if I had, I would have ended up with a lot more proxies. Then there’s the proxy scanning feature, which, if used correctly, will provide you with some very nice proxies that no one else is likely to use.
While GSA Proxy Scraper does not completely replace our proxy needs, it does help us save money on a monthly basis. You’ve seen an example of this software’s capabilities, so you can decide for yourself whether it can help you save money as well.
The fact that it allows me to completely automate the proxy scraping and exporting process is one of my favorite features of this tool. That’s fantastic. But, in any case, GSA Proxy Scraper is an incredible tool that assists us greatly and saves us a significant amount of money, and I personally see nothing but positive developments in its future. Now, let the proxy scraping games begin.