CAP enhances Software as a Service – Bot Detection

This white paper is one of a series outlining how CAP is used to enhance a Software as a Service application.

Keywords:

Cloud, SaaS, Bots, CDN, Scrapers

The context:

A large retailer outsources its e-commerce application to a third-party provider. The relationship is long-standing; however, the retailer grew increasingly frustrated with the cost and time involved in making any changes to the user experience, and with the lack of certain features. Adding to the frustration was the fact that every request was entirely at the mercy of the third-party provider.

The solution:

CAP was installed in the Amazon cloud between the end users and the third-party provider to help overcome some of these problems. By taking advantage of the CAP Agent’s built-in adaptive static content caching and its ability to act inline in real time with minimal performance impact, the retailer regained control of its brand and reputation.

Deployment diagram:

Why CAP:

No other offering on the market provides a CDN-like capability that also enhances and enriches content on the fly at massive scale while addressing urgent security shortcomings. The retailer needs this capability to retain some measure of control over third-party software it is otherwise unable to change.

The story:

The third-party provider asked the retailer to pay for increased infrastructure capacity to cover peak traffic periods. Beyond the cost implication, the retailer did not know the origin of all the traffic hitting the site and was therefore unable to manage it effectively.

Using CAP, the retailer quickly put together a plan to resolve the issue:

The first step was to manage allowed traffic to the retailer’s site. To achieve this, a number of geo-location, IP monitoring and IP blocking rules were implemented. See the white paper on Ecom Fraud Risk.
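
The following is a minimal sketch, in Python, of how such rules might be expressed. The block list, the allowed-country set and the country_of() helper are illustrative assumptions rather than CAP configuration; a lookup of the kind behind country_of() is sketched under “Rules blocks used” at the end of this paper.

    import ipaddress

    # Illustrative policy inputs; a real deployment would load these from
    # rule configuration rather than hard-coding them.
    BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # known-bad ranges
    ALLOWED_COUNTRIES = {"GB", "IE"}                             # retailer's markets

    def is_allowed(client_ip, country_of):
        """Apply the IP blocking and geo-location rules to one request source."""
        addr = ipaddress.ip_address(client_ip)
        if any(addr in net for net in BLOCKED_NETWORKS):
            return False  # IP blocking rule
        return country_of(client_ip) in ALLOWED_COUNTRIES  # geo-location rule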

Secondly, bot detection and monitoring rules were put in place. CAP was used to inject an invisible 1 x 1 pixel image, wrapped in a link, that only a bot parsing the page markup would realistically discover; crawlers such as Googlebot and MSNBot search for, identify and follow such links. It could therefore be assumed that any visitor arriving at that URL was a bot and not a real user. By recording user-agent strings, a report was populated with traffic statistics such as country of origin. Although the geo-location and IP monitoring rules were in place for normal user access, unwanted bot traffic was not detected by the retailer’s existing web application security tools and was still reaching the site.
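
The sketch below illustrates the honeypot technique just described, assuming a hypothetical trap URL (/t.gif). The injection step corresponds to the String Replacer block and the logging step to the Http Request Tracker block listed at the end of this paper.

    import logging

    # Invisible to human visitors; discoverable by anything parsing the markup.
    TRAP_HTML = ('<a href="/t.gif" style="display:none">'
                 '<img src="/t.gif" width="1" height="1" alt=""></a>')

    def inject_trap(html):
        """String-replacement step: insert the trap link just before </body>."""
        return html.replace("</body>", TRAP_HTML + "</body>")

    def track_request(path, client_ip, user_agent):
        """Request-tracking step: any hit on the trap URL is recorded as a bot."""
        if path == "/t.gif":
            logging.info("bot detected: ip=%s ua=%s", client_ip, user_agent)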

Example dashboard report:

The limitations:

This approach resulted in a bot-traffic monitoring and reporting tool. Additional rules would be needed to automate and manage bot traffic at a more granular level.
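
As an illustration only, one such rule might be a per-bot policy applied once a request has been classified as bot traffic. The policy table, the peak-hours flag and the action names below are assumptions, not CAP configuration.

    # Hypothetical per-bot policy; unlisted bots fall through to "block".
    WANTED_BOTS = {"googlebot": "allow", "msnbot": "allow"}

    def bot_action(user_agent, peak_hours):
        """Decide what to do with a request already classified as a bot."""
        ua = user_agent.lower()
        for name, action in WANTED_BOTS.items():
            if name in ua:
                # Even wanted crawlers can be deferred at peak trading times.
                return "reroute" if peak_hours else action
        return "block"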

Business benefits:

Isolating the bot traffic allowed the retailer to manage genuine user traffic and wanted bot or search traffic to the site. Bot traffic is, of course, essential for search engine optimization on e-commerce stores. This solution, however, allowed the retailer to isolate and block, or reroute, unwanted bot traffic, particularly at peak traffic times or when stock databases were being updated. As a result, only genuine customers were allowed onto the site at peak times, and the requested infrastructure scale-up was no longer required, dramatically reducing the performance overhead on the platform.

Rules blocks used:

Http Server execute (web page modification)

String Replacer

Http Request Tracker (get user agent information)

Maxmind Geo Info (IP lookup)
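
As an illustration of the kind of lookup the Maxmind Geo Info block performs, the sketch below uses MaxMind’s geoip2 Python library with a GeoLite2 database file. Both the library choice and the database path are assumptions about the environment; the block’s internals are not documented in this paper.

    import geoip2.database
    import geoip2.errors

    reader = geoip2.database.Reader("GeoLite2-Country.mmdb")  # assumed local DB file

    def country_of(client_ip):
        """Return the ISO country code for an IP, or '??' if not in the database."""
        try:
            return reader.country(client_ip).country.iso_code or "??"
        except geoip2.errors.AddressNotFoundError:
            return "??"

    print(country_of("8.8.8.8"))  # e.g. "US"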

Last updated