SmartData Collective

2011 Census: Comparing Apples and Oranges

Mario Pineda
Last updated: September 24, 2012 2:31 pm

Anyone following Canadian news could hardly have missed the release of the results from the 2011 national census by Statistics Canada this week. The findings of this census have been (and still are) reported extensively by the Canadian media, e.g. here, here, here, here and (of course) on Twitter.

This is the third instalment of data that Statistics Canada has released from its 2011 census, this time portraying the changes in Canadian families and living arrangements. Earlier this year the results on population and dwelling counts (February) and on age and sex (May) were released, and the final instalment, on language, will follow in October.

As the numbers from this census are reported, discussed and, in particular, compared to the results of previous censuses (there is a census every five years), there is a remarkable lack of acknowledgement that this census is fundamentally different from previous ones (with the exception of here). This silence is remarkable because two years ago the statistical community, including Statistics Canada, was up in arms over methodological changes to the census introduced by the Canadian government. For all the hoopla then, very little noise is being raised now that the data from the 2011 census are being released. It almost appears that the concerns have been forgotten, swept under the rug, or relegated to the depths of technical documents.

Up to 2011 the Statistics Canada census included a mandatory long-form questionnaire. Ahead of the 2011 census (on June 17, 2010, to be exact) the government decided to abolish the long-form questionnaire in favour of a new voluntary National Household Survey (NHS). This sudden and unexpected announcement stunned the statistical community in Canada (and beyond) and caused an uproar of indignation, including from Statistics Canada, which had not been involved in making the decision. The backlash escalated to the point where the Chief Statistician of Canada, Munir Sheikh, resigned with the following statement:

I want to take this opportunity to comment on a technical statistical issue which has become the subject of media discussion. This relates to the question of whether a voluntary survey can become a substitute for a mandatory census. It can not.

Introducing a voluntary census is asking for trouble. The United States once attempted a similar experiment, but abandoned it after determining that data from voluntary surveys are unreliable, since marginalised groups are less likely to fill out the forms. Moreover, in order to keep the sample size constant despite a reduced response rate, the government decided to send out more forms, at an additional cost of $30m. Canadians ended up paying more money for less accurate information.

For a survey to give a true (i.e. unbiased) representation of the entire population, the individuals (or households) have to be sampled randomly. The importance of random sampling is one of the great insights of statistics, but it is non-trivial to implement. In the case of the 2011 census, making the response voluntary means that the resulting sample is no longer random, even if the individuals who received the questionnaire were chosen randomly. It is well known that response rates vary with income and educational level, so with a voluntary response some parts of, e.g., the income and educational spectrum will be misrepresented in the resulting data. For the 2011 census we have no way of quantifying, or even knowing, the bias in the samples. Are changes in variables over the last five years real changes or artefacts arising from the change in methodology?

What we do know is that all surveys are subject to non-response bias, even the mandatory long-form census with its 94% response rate. The risk of non-response bias quickly increases, however, as the response rate declines. This is because, in general, non-respondents tend to have characteristics that differ from those of the respondents, so the results end up not being representative of the true population. Given that the National Household Survey achieved a response rate of only 69%, there is clearly a substantial risk of non-response bias, and unfortunately we have no way of knowing which segment of the population is missing from the sample.
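To make the link between response rate and non-response bias concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption chosen for illustration, not census data: a hypothetical population with log-normal incomes, a willingness to respond that rises with income (the `slope` parameter), and an intercept tuned so that the overall response rate lands near 94% or 69%.

```python
import math
import random


def logistic(x):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(target_rate, slope=0.8, n=20_000, seed=2011):
    """Simulate a survey in which willingness to respond rises with income.

    Returns (true_mean, respondent_mean, achieved_rate).
    All parameter values are illustrative assumptions, not census figures.
    """
    rng = random.Random(seed)
    # Standardised log-income score for each household (hypothetical population).
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    income = [50_000 * math.exp(0.6 * zi) for zi in z]

    # Calibrate the intercept by bisection so the expected response rate
    # matches target_rate while the income gradient (slope) stays fixed.
    lo, hi = -20.0, 20.0
    for _ in range(50):
        mid = (lo + hi) / 2
        rate = sum(logistic(mid + slope * zi) for zi in z) / n
        if rate < target_rate:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2

    # Each household independently decides whether to respond.
    responded = [rng.random() < logistic(intercept + slope * zi) for zi in z]
    answers = [inc for inc, r in zip(income, responded) if r]

    true_mean = sum(income) / n
    respondent_mean = sum(answers) / len(answers)
    return true_mean, respondent_mean, len(answers) / n


for target in (0.94, 0.69):
    true_mean, resp_mean, achieved = simulate(target)
    bias_pct = 100 * (resp_mean - true_mean) / true_mean
    print(f"response rate {achieved:.0%}: "
          f"respondent mean overstates income by {bias_pct:.1f}%")
```

Under these assumptions the respondents' average income overstates the population average at both response rates, and the overstatement is larger at the lower rate. The simulation can report the bias only because it knows the incomes of the households that did not respond; an analyst working with real voluntary-survey data sees only `respondent_mean`, which is exactly why the bias cannot be quantified from the survey itself.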

As if this increased uncertainty about the quality of the 2011 census data were not enough, comparing the results of the current census to those of previous censuses (without acknowledging the methodological differences) is essentially a comparison between apples and oranges. In all fairness, the methodological differences are mentioned, buried in Statistics Canada's documentation of the 2011 census, but unless you are specifically looking for this information you are unlikely to find it.

So does this mean that we should not be comparing the 2011 census to previous years' censuses? Strictly speaking, no, we should not be comparing apples to oranges, particularly when the results are used to set monetary policies, to determine how the labour market is changing and to decide allocations to education and social services. Assuming that comparisons will be made (the temptation may simply be too great, even if Statistics Canada refrained from making them), it becomes even more important to ensure that the limitations and potential biases in the current survey are fully, i.e. publicly, acknowledged. No census is perfect, and although some of the glitches and limitations in the 2011 census have been publicized in the media (e.g. here), the change from mandatory to voluntary methodology, which affects all the data in the census, has received virtually no attention in the media or from Statistics Canada. In the broader scheme of things one can only hope that order will be restored, that political decisions based on scientific evidence will some day overturn this very unfortunate turn of events, and that the 2011 census will be remembered as an anomalous data point in the long and exceptional history of Canadian statistics.

This is from the blog of MPK Analytics (www.mpkanalytics.com), which is in the business of helping clients transform data into insight through the power of R.
