What is Data Skewing? (Symptoms, How to Prevent)

What is Data Skewing?

In a skewing attack, attackers fabricate or manipulate (skew) data in order to influence an organization's decisions in their favor. Skewing attacks fall into two types −

  • Machine Learning Data Poisoning Attacks − An attacker alters the training data used by a machine learning algorithm, causing the resulting model to make incorrect predictions.

  • Web Analytics Skewing − Attackers manipulate analytics data from systems such as Google Analytics or Adobe Analytics by deploying bots that make a large number of automated requests. The goal is to make it appear that visitors to a website complete particular actions more frequently than they actually do.

How Does Web Analytics Skewing Work?

A typical web analytics skewing attack goes like this −

  • Attackers use bots to make automated HTTP requests that increase the number of visits to specific pages. These are often pages with transactional value, such as an eCommerce product page.

  • The web analytics system records a large number of hits, and the website owners assume that there is a lot of interest in this item.

  • In some cases, the skewing bot also tries to perform conversion actions such as filling out forms or making purchases. This requires a more advanced bot framework, comparable to what scalping bots use.

Falsified analytics data may lead the website owner to make a commercial decision, such as promoting the product more prominently on the website or including it in advertising campaigns. If the attackers are affiliates of the product promoted on the targeted page, that decision benefits them.

What are the Consequences of Skewing?

Data is used to make critical business decisions in areas such as security incident classification, judging the success or failure of a website redesign, marketing, and even product pricing. If the data is incorrect, the decisions based on it will be incorrect as well, which harms the business.

Here are some examples of bad business decisions that skewing can cause −

  • Misclassifying a spam email or a repeated login attempt as valid.

  • Choosing the wrong design in an A/B test, which in a large eCommerce company can result in significant financial losses.

  • Making an inaccurate automated judgment, such as assigning an incorrect credit rating to a person.

  • Lowering the cost of Pay Per Click advertising for major advertisers, for example, by incorrectly calculating an ad's quality score.

  • Overcompensating an affiliate or partner based on product page hits or conversion activities.
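To make the A/B test case concrete, here is a minimal sketch in Python of how a modest amount of bot traffic can flip the apparent winner. All numbers and names (`conversion_rate`, `pick_winner`, the variant labels) are hypothetical and chosen only for illustration.

```python
def conversion_rate(conversions: int, sessions: int) -> float:
    """Conversions divided by sessions (0.0 when there are no sessions)."""
    return conversions / sessions if sessions else 0.0


def pick_winner(variants: dict[str, tuple[int, int]]) -> str:
    """Return the variant name with the highest conversion rate."""
    return max(variants, key=lambda name: conversion_rate(*variants[name]))


# Hypothetical genuine traffic, as (conversions, sessions) per variant.
genuine = {"A": (50, 1000), "B": (40, 1000)}

# Hypothetical bot traffic aimed at variant B: 100 sessions, 30 fake conversions.
bot_conversions, bot_sessions = 30, 100
b_conversions, b_sessions = genuine["B"]
observed = dict(genuine)
observed["B"] = (b_conversions + bot_conversions, b_sessions + bot_sessions)

print(pick_winner(genuine))   # A: 5.0% beats 4.0%
print(pick_winner(observed))  # B: 70/1100 (about 6.4%) now beats 5.0%
```

In this sketch the bots add only about 5% to the total session count, yet that is enough to change which design looks better.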

Symptoms of an Attempt at Skewing

Keep an eye out for the following anomalies in your website traffic or application usage, and investigate them to determine whether they are related to skewing −

  • Unusual peaks in traffic

  • An unusual increase in activity from certain user groups

  • An atypically high number of pages or amount of time spent per session

  • An unusually high bounce rate

  • Unusual user behavior within an application

  • Unusual use of a product or website feature that compromises security or costs money
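One simple way to flag unusual traffic peaks is to compare each day's count against the median of the series. The sketch below uses a robust z-score based on the median absolute deviation (MAD); this is one common choice, not the only one. The function name `find_spikes`, the threshold of 3.5, and the sample counts are assumptions for illustration.

```python
from statistics import median


def find_spikes(daily_counts: list[int], threshold: float = 3.5) -> list[int]:
    """Return the indexes of days whose count is unusually high.

    Uses a robust z-score: 0.6745 * (x - median) / MAD. The 0.6745 factor
    makes the score roughly comparable to a standard z-score.
    """
    med = median(daily_counts)
    mad = median(abs(x - med) for x in daily_counts)
    if mad == 0:
        # No variation around the median, so the score is undefined.
        return []
    return [
        i
        for i, x in enumerate(daily_counts)
        if 0.6745 * (x - med) / mad > threshold
    ]


# Hypothetical daily page-view counts for one product page.
page_views = [1020, 980, 1005, 990, 1010, 4200, 1000]
print(find_spikes(page_views))  # [5]
```

A flagged day is only a prompt for investigation. Legitimate events such as a marketing campaign can produce the same pattern.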

Preventing Skewing Attacks

To help prevent skewing on a website, use the following recommended practices −

  • Block or challenge outdated browsers. While experienced attackers may present contemporary browsers and user agents in their HTTP headers, many "script kiddies" deploy bots based on obsolete browsers. You can ban these obsolete browser versions entirely, or require them to solve a difficult CAPTCHA, without much risk of disrupting genuine users.

  • Block known harmful hosts and proxies. Compile a list of known malicious hosts and proxy networks. Blocking access from these sources may deter attackers from attempting to skew data from your website, API, or mobile apps. Keep in mind that more powerful anonymization techniques, such as residential proxies, are available to attackers.

  • Protect every access point that is vulnerable to bots. Consider the ways bots might connect to your systems over the Internet besides your website. APIs, mobile applications, and any other public-facing endpoint should all be protected. When you find a bot and ban it, be sure to apply the ban across all endpoints.

  • Examine traffic sources. Review analytics or model training data regularly, drill down into it, and look for segments with distinctive characteristics. If you find one, investigate whether it contains data created by bots. Investigate spikes in usage as well: if usage of your website or application suddenly increases, look into which functionality was affected.

If you can trace the whole surge back to a single traffic source, user group, or feature, you are likely dealing with a skewing attack.
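Tracing a surge back to its origin can be sketched as a comparison between a baseline period and the surge period, broken down by traffic source. The function `surge_share_by_source` and the source labels and counts below are hypothetical; a real analysis would read them from your analytics export.

```python
from collections import Counter


def surge_share_by_source(baseline: Counter, surge: Counter) -> dict[str, float]:
    """Return each source's fraction of the extra traffic seen in the surge.

    Sources whose traffic did not grow contribute 0.
    """
    extra = {src: max(surge[src] - baseline[src], 0) for src in surge}
    total_extra = sum(extra.values())
    if total_extra == 0:
        return {}
    return {src: n / total_extra for src, n in extra.items()}


# Hypothetical session counts per traffic source.
baseline = Counter({"organic": 500, "email": 200, "referral:example.net": 20})
surge = Counter({"organic": 520, "email": 190, "referral:example.net": 3020})

shares = surge_share_by_source(baseline, surge)
dominant = max(shares, key=shares.get)
print(dominant, round(shares[dominant], 3))  # referral:example.net 0.993
```

When one source accounts for nearly all of the increase, as in this made-up data, that source is the first place to look for bot activity.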

Once you have detected a skewing attack, take the following steps to mitigate it −

  • In your web analytics, filter out harmful sources.

  • In your web analytics, exclude harmful IP addresses.

  • Examine firewall logs for harmful bot traffic linked to the anomalous analytics data, then configure your firewall to block it.
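If you export raw hits from your analytics system, excluding harmful addresses can be sketched with Python's standard `ipaddress` module. The blocklist entries (taken from IP ranges reserved for documentation), the record layout with an `ip` key, and the helper names `is_blocked` and `filter_hits` are all assumptions for illustration. Hosted analytics products provide their own filter settings, which this sketch does not model.

```python
from ipaddress import ip_address, ip_network

# Hypothetical blocklist, using address ranges reserved for documentation.
BLOCKED_NETWORKS = [
    ip_network("203.0.113.0/24"),
    ip_network("198.51.100.7/32"),
]


def is_blocked(ip: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    addr = ip_address(ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


def filter_hits(hits: list[dict]) -> list[dict]:
    """Keep only the hits that did not come from a blocked address."""
    return [hit for hit in hits if not is_blocked(hit["ip"])]


# Hypothetical exported hits.
hits = [
    {"ip": "203.0.113.45", "page": "/product/42"},
    {"ip": "192.0.2.10", "page": "/product/42"},
    {"ip": "198.51.100.7", "page": "/checkout"},
]
clean_hits = filter_hits(hits)
print([hit["ip"] for hit in clean_hits])  # ['192.0.2.10']
```

Filtering the stored data keeps reports accurate, while the firewall rule stops new requests from the same sources from being recorded in the first place.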