# Weighting Methodology


Weighting is a process by which data is adjusted to reflect the known population profile.

It is used to balance out any significant variance between the actual (sample) profile and the target profile.

Weighting is generally done on demographic questions, and the target profile is mostly taken from census data.

Weighting is processed via one of two techniques, depending on the number of variables in the weighting scheme:

• Cell

• RIM/Raking/IPF Weighting

Weighting is done when the responses show that particular groups (for example, younger people or those living in a particular area) are underrepresented in the sample. If it is not carried out, the results will not properly reflect the views of the population being considered.

Please review the full documentation below for a more in-depth explanation.


## Standard Cell Weighting vs. Rim/Rake Weighting

## Cell Weighting

Cell weighting is the most standard weighting scheme. The weights are computed so that the sample totals conform to the target totals on a cell-by-cell basis. When weighting by multiple demographics, this means the distribution of the target population must be known for every target cell.

*Please note: the target grand total (1,500) differs from the sample grand total (1,000) in this example. This causes no issue; the same procedure applies if the total must be kept the same as the sample.*

In the above example, the weighting is done across two different demographics, A and B, where A has 4 categories and B has 3 categories. For cell weighting, the weights are computed for each cell of the distribution by dividing the target size by the sample size. The outcome is given below.
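The division of target size by sample size can be sketched in a few lines of Python. The source's example table is not reproduced here, so the cell counts below are hypothetical: they were chosen only to give a 4×3 layout with a sample total of 1,000 and a target total of 1,500. The function name `cell_weights` is likewise an illustrative assumption, not part of any tool described in this document.

```python
# Hypothetical sample and target counts per (A, B) cell; illustrative only.
sample = {
    ("A1", "B1"): 20,  ("A1", "B2"): 40,  ("A1", "B3"): 40,
    ("A2", "B1"): 100, ("A2", "B2"): 300, ("A2", "B3"): 100,
    ("A3", "B1"): 50,  ("A3", "B2"): 50,  ("A3", "B3"): 100,
    ("A4", "B1"): 30,  ("A4", "B2"): 70,  ("A4", "B3"): 100,
}
target = {
    ("A1", "B1"): 35,  ("A1", "B2"): 50,  ("A1", "B3"): 90,
    ("A2", "B1"): 150, ("A2", "B2"): 250, ("A2", "B3"): 150,
    ("A3", "B1"): 100, ("A3", "B2"): 65,  ("A3", "B3"): 265,
    ("A4", "B1"): 80,  ("A4", "B2"): 50,  ("A4", "B3"): 215,
}


def cell_weights(sample, target):
    """Return one weight per cell: target size divided by sample size."""
    weights = {}
    for cell, n in sample.items():
        if n == 0:
            # An empty sample cell cannot be weighted up to a non-zero target.
            raise ValueError(f"empty sample cell {cell}")
        weights[cell] = target[cell] / n
    return weights


weights = cell_weights(sample, target)
print(weights[("A1", "B1")])  # 35 / 20 = 1.75

# Every respondent in a cell receives that cell's weight, so the
# weighted sample total equals the target grand total.
weighted_total = sum(weights[cell] * n for cell, n in sample.items())
```

Note that an empty sample cell has no valid weight, which is one reason cell weighting becomes fragile as the number of cells grows.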

## Disadvantages of Cell Weighting

The disadvantage of cell weighting is that it can produce highly variable weighting adjustments, which inflates the standard deviations of the survey data. A practical disadvantage is that the entire joint target distribution of all the weighting variables must be known. For example, if a US survey is weighted on three demographic variables, age (5 levels), gender (2 levels), and marital status (4 levels), there are 5×2×4=40 cells. If the US state of residence is added, this becomes 40×50=2,000 cells. Cell weighting therefore becomes extremely difficult as more weighting variables are added.

Rim weighting overcomes these disadvantages. For rim weighting, only the marginal distribution of the target needs to be known; the complete joint distribution of the weighting factors is not necessary. The weights computed under this scheme also tend to be less extreme, reducing the likelihood that weight capping is required. Rim weighting is described further below.
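To make the contrast concrete, the short Python sketch below counts how many target values each approach needs for the age, gender, marital status, and state example. The variable names are illustrative assumptions. Cell weighting needs one target per joint cell (the product of the level counts), while rim weighting needs one target per category of each variable (the sum of the level counts).

```python
from math import prod

# Number of levels per weighting variable, as in the example above.
levels = {"age": 5, "gender": 2, "marital_status": 4}

joint_cells = prod(levels.values())      # 5 * 2 * 4 = 40
marginal_targets = sum(levels.values())  # 5 + 2 + 4 = 11

# Adding US state of residence (50 levels).
levels["state"] = 50
joint_cells_with_state = prod(levels.values())      # 40 * 50 = 2000
marginal_targets_with_state = sum(levels.values())  # 11 + 50 = 61

print(joint_cells, marginal_targets)
print(joint_cells_with_state, marginal_targets_with_state)
```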

## Rim/Rake Weighting

An iterative proportional fitting procedure estimates the individual weights. The first iteration computes weights to match the first dimension (weighting variable) totals, the second iteration matches the second dimension totals, and so on. These steps are repeated across all the dimensions until convergence is achieved within an acceptable margin of error. If we look at our previous example, we see that the marginal distributions of the weighting variables A and B are as below:

In the first iteration, we find the weight factors for A (first dimension). The row of A1 is multiplied by 175/100, the row of A2 is multiplied by 550/500, A3 is multiplied by 430/200, A4 is multiplied by 345/200. The outcome is given below:

Obviously, when weighting only using variable A, the counts for B will not match. So, in the next step, the weights for B (second dimension) are computed. The column of B1 is multiplied by 365/356.75, the column of B2 is multiplied by 415/504, and the column of B3 is multiplied by 720/639.25.

After this step, the counts for A no longer match. This is where the iterative procedure comes in: the weighting is recalculated for A, then again for B, and so on. Once the counts have converged within an acceptable margin of error, the final weights are assigned to the respondents.
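The alternating row and column adjustments can be sketched in Python as follows. This is a minimal two-dimensional illustration, not the proprietary implementation. The marginal targets for A (175, 550, 430, 345) and B (365, 415, 720) and the sample row totals (100, 500, 200, 200) come from the text; the individual cell counts are hypothetical, so the intermediate column totals will differ from the figures quoted above. The function name `rake` and the parameters `tol` and `max_iter` are assumptions made for the example.

```python
def rake(sample, row_targets, col_targets, tol=1e-6, max_iter=1000):
    """Rake a 2-D table of sample counts to row and column target totals.

    Returns the weighted table and the per-cell weights
    (weighted count divided by sample count).
    """
    table = [list(row) for row in sample]  # start from unweighted counts
    for _ in range(max_iter):
        # Match the first dimension: scale each row to its target total.
        for i, row in enumerate(table):
            factor = row_targets[i] / sum(row)
            table[i] = [value * factor for value in row]
        # Match the second dimension: scale each column to its target total.
        for j, col_target in enumerate(col_targets):
            factor = col_target / sum(row[j] for row in table)
            for row in table:
                row[j] *= factor
        # Columns now match exactly; stop once the rows also match.
        row_error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_error < tol:
            break
    else:
        raise RuntimeError("raking did not converge")
    weights = [
        [w / s for w, s in zip(weighted_row, sample_row)]
        for weighted_row, sample_row in zip(table, sample)
    ]
    return table, weights


# Hypothetical cell counts; row totals are 100, 500, 200, 200.
sample = [
    [20, 40, 40],
    [100, 300, 100],
    [50, 50, 100],
    [30, 70, 100],
]
row_targets = [175, 550, 430, 345]  # targets for A1..A4
col_targets = [365, 415, 720]       # targets for B1..B3

weighted, weights = rake(sample, row_targets, col_targets)
```

Every respondent in a given cell receives that cell's entry in `weights`. The sketch assumes no empty rows or columns in the sample and that the row and column targets share the same grand total (here 1,500).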

The final weights from this procedure are given below:

**b³’s Proprietary Tool**

The proprietary tool by b³ follows the same rim weighting procedure described above. After the weights are built, they are extracted along with the respondent IDs and then merged with the original dataset for further processing, such as tabulation in Wincross/UNCLE.

## Weight Capping

If the computed weights become too high or too low, they are sometimes capped so that they do not cross certain limits. In our current practice, the lower cap is 0.2 and the upper cap is 5.0.
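A minimal Python sketch of capping with the limits stated above (0.2 and 5.0) follows. The function name `cap_weights` is an illustrative assumption.

```python
def cap_weights(weights, lower=0.2, upper=5.0):
    """Force each weight into the range [lower, upper]."""
    return [min(max(w, lower), upper) for w in weights]


capped = cap_weights([0.05, 0.8, 1.0, 3.4, 7.3])
print(capped)  # [0.2, 0.8, 1.0, 3.4, 5.0]
```

Capping changes the weighted totals, so the targets may no longer be matched exactly once extreme weights have been pulled back into range.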

## Number of Acceptable Dimensions for Weighting

## Using Quotas with Weighting

If a survey has pre-determined quotas on some variables and is to be weighted on other variables, the quota variables can be included alongside the other weighting variables during the procedure. Since the iterative procedure adjusts the data to match the target proportions for all weighting variables, it can maintain the quota proportions within an acceptable margin of error.

## Sample Balance Weighting

In cases where sample sizes across segments or geographies need to be balanced, population weights are generated.

Every record/respondent in the data set is first given a weight (Px) based on the formula below.

This factor is inserted into RIM weighting as another variable. This ensures that not only are the demographics balanced, but the sample sizes also line up with requirements.

The most common method for balancing sample across segments/geographies is to use a measure of central tendency (the average). This keeps weight factors within an acceptable level and avoids extreme adjustments.
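The Px formula itself is not reproduced in this document, so the Python sketch below rests on one plausible reading of the averaging approach: each segment is balanced to the average segment size, giving Px = (average segment sample size) / (that segment's sample size). This is an assumption made for illustration, as are the segment names and the function name `balance_weights`.

```python
def balance_weights(segment_sizes):
    """Assumed formula: Px = mean segment size / size of the segment."""
    mean_size = sum(segment_sizes.values()) / len(segment_sizes)
    return {segment: mean_size / n for segment, n in segment_sizes.items()}


# Hypothetical sample sizes per segment.
segment_sizes = {"North": 400, "South": 250, "West": 100}
px = balance_weights(segment_sizes)
print(px)  # {'North': 0.625, 'South': 1.0, 'West': 2.5}

# After applying Px, every segment has the same weighted size (250).
balanced = {seg: px[seg] * n for seg, n in segment_sizes.items()}
```

Under this reading, balancing to the average keeps the starting weights closer to 1 than balancing every segment up to the largest or down to the smallest.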

## Weighting Efficiency Score (WES)

The weight efficiency score is a metric used to determine the efficacy of the weighting algorithm. It is inversely correlated with the variance between actual and target proportions. It is a numeric score between 0 and 100.

• Ri – raw / unweighted case i

• Wi – weighted case i

Weight efficiency score would be calculated using

The higher the efficiency score, the smaller the variance between target and actual proportions, and the smaller the resulting weight factors. A higher score also indicates that the majority of weight factors are likely to fall in the optimum weighting range (0.8 to 1.2).

Weighting solutions that generate a lower efficiency score produce heavy weight factors; they are generally not ideal and should be revisited. A score of less than 50 is a good indicator of extreme skews in proportions and indicates that the algorithm needs to be revisited.
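The exact formula used for the score is not reproduced in this document. As an illustration only, the Python sketch below uses one widely used definition, Kish's weighting efficiency, 100 × (Σw)² / (n × Σw²), where w is the final weight of each respondent and n is the number of respondents. Whether this matches the formula intended here is an assumption, and the function name `weighting_efficiency` is hypothetical.

```python
def weighting_efficiency(weights):
    """Kish weighting efficiency on a 0-100 scale (assumed definition)."""
    n = len(weights)
    total = sum(weights)
    sum_of_squares = sum(w * w for w in weights)
    return 100.0 * total * total / (n * sum_of_squares)


print(weighting_efficiency([1.0, 1.0, 1.0, 1.0]))  # 100.0 (no adjustment)
print(weighting_efficiency([0.5, 1.5]))            # 80.0
```

Under this definition the score is 100 when all weights are equal and falls as the weights become more spread out, consistent with the behaviour described above.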

## Weighting Report

The weighting report shows the variance between actual, target, and weighted proportions, along with the weight efficiency score and the weight spread (minimum weight to maximum weight).

This report demonstrates the effectiveness of the weighting solution and serves as proof that the weighting scheme achieved the targets.
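The proportions and weight spread in such a report can be sketched in Python for a single weighting variable as follows. The layout, the function name `weighting_report`, and the sample data are illustrative assumptions, not the format of the actual report; the efficiency score is left out to keep the sketch short.

```python
def weighting_report(categories, weights, targets):
    """Compare actual, target, and weighted proportions for one variable.

    categories: category label of each respondent
    weights:    final weight of each respondent
    targets:    target proportion for each category
    """
    n = len(categories)
    total_weight = sum(weights)
    rows = []
    for category, target in targets.items():
        actual = sum(1 for c in categories if c == category) / n
        weighted = sum(
            w for c, w in zip(categories, weights) if c == category
        ) / total_weight
        rows.append({
            "category": category,
            "actual": actual,
            "target": target,
            "weighted": weighted,
        })
    return {
        "rows": rows,
        "min_weight": min(weights),
        "max_weight": max(weights),
    }


# Hypothetical data: three male and one female respondent, 50/50 target.
categories = ["M", "M", "M", "F"]
weights = [2 / 3, 2 / 3, 2 / 3, 2.0]
report = weighting_report(categories, weights, {"M": 0.5, "F": 0.5})
for row in report["rows"]:
    print(row)
print(report["min_weight"], report["max_weight"])
```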
