While the debate over its legality rages on, web scraping, a classic business logic abuse threat, is only getting bigger. Companies commonly use it to keep pace with the competition. Scraping bots collect huge amounts of information, such as pricing data or product descriptions, and must do so continuously and at scale for the data to be valuable. Technically, scraping bots are similar to web crawlers. Web crawlers, however, are typically good bots that help increase SEO rankings, while scraping bots almost always create negative consequences for the websites being scraped. Understanding and measuring the economic impact of these consequences is a critical step in managing scraping bots.
The Scraping Marketplace
There are many publicly available, affordable web scraping services, which indicates that there is a real market for them. Some of the common providers are shown in the screenshots below.
Figure: Web scraping service providers
There are also several full-time and freelance job postings from e-commerce companies looking for web developers with scraping experience. The second listing shown below states directly that the company wants to find products that can be resold on various Amazon marketplaces for a profit of no less than 30%. In some cases, the listings also mention that the target website is protected by a commercial bot mitigation service, a clear indication of how important web scraping is to such companies. Developers offering scraping services use CAPTCHA-solving services through APIs and can automate fake account creation to get past the basic bot mitigation techniques on most websites.
Figure: Example postings for roles with web scraping responsibilities
Quantifying the Business Impact of Scraping
Scraping hurts your revenue in more ways than you know. Quantifying the business impact of scraping is a complex exercise for most companies. Several factors need to be considered, including lost search engine ranking, wasted infrastructure costs, and theft of intellectual property or trade secrets.
The research report from Aberdeen Research, The Business Impact of Website Scraping, finds that, “The median annual business impact of website scraping is as much as 80% of overall e-commerce website profitability.” For the Media sector, the research estimates the annual business impact of website scraping is between 3.0% and 14.8% of annual website revenue, with a median of 7.9%. The report takes into account all of the factors that an online business should consider when calculating the revenue impact on its own business. It includes in-depth analysis for several verticals, including e-commerce, travel and media. Using mathematical models from the report, you can understand the real impact of web scraping beyond the loss of competitive advantage.
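To make the Media sector percentages concrete, here is a minimal Python sketch that applies the quoted range (3.0%, 7.9% median, 14.8%) to an annual website revenue figure. The function name `estimate_scraping_impact` and the $10 million revenue are hypothetical, chosen only for illustration. This is simple arithmetic on the quoted figures, not the report's own mathematical model.

```python
# Media-sector impact range quoted above, as a share of annual website revenue.
MEDIA_IMPACT_LOW = 0.030
MEDIA_IMPACT_MEDIAN = 0.079
MEDIA_IMPACT_HIGH = 0.148


def estimate_scraping_impact(annual_website_revenue: float) -> dict[str, float]:
    """Apply the quoted low, median, and high percentages to a revenue figure.

    Hypothetical helper for illustration; it is not taken from the report.
    """
    return {
        "low": annual_website_revenue * MEDIA_IMPACT_LOW,
        "median": annual_website_revenue * MEDIA_IMPACT_MEDIAN,
        "high": annual_website_revenue * MEDIA_IMPACT_HIGH,
    }


if __name__ == "__main__":
    revenue = 10_000_000  # assumed annual website revenue in dollars
    for label, amount in estimate_scraping_impact(revenue).items():
        print(f"{label:>6}: ${amount:,.0f}")
```

Under these assumptions, a media site with $10 million in annual website revenue would see an estimated impact of roughly $300,000 to $1.48 million a year, with a median of about $790,000.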
Read the report to get insights and takeaways including:
- Estimates for the impact of website scraping on the e-commerce sector as a function of annual website revenue.
- Quantitative analysis on how to measure the business impact of website scraping for your online business.
- Understanding of the risk of web scraping and what steps to take to manage bots to an acceptable level.
Find out how brands like Crunchbase are winning against scraping bots. Read the case study.