UK Statistical Disclosure Control Policy for 2011 Census Output

BY IAN WHITE, ONS

Background

The Registrars General (RsG) for England and Wales, for Scotland and for Northern Ireland have agreed, as part of their commitments to UK harmonisation for the 2011 Census, to aim for a common Statistical Disclosure Control (SDC) methodology for 2011 Census outputs. This will help facilitate the aim (as far as is possible) of harmonising the three Censuses where it is in the interests of users to do so.

Adoption of a common SDC methodology across the UK will be widely welcomed by census users within the local authority community, but will only be possible if there is an agreed SDC policy position across the three Census Offices; that is, an agreement about what constitutes a disclosive risk in a Census context and tolerable risk thresholds.

A statement setting out the SDC policy position, as agreed by the RsG, was announced on the National Statistics website in December and has been disseminated to Census users.

UK SDC Policy Position

The UK 2011 Census SDC policy position is based on the principle for protecting confidentiality set out in the National Statistics Code of Practice, which includes the guarantee that “no statistics will be produced that are likely to identify an individual unless specifically agreed with them”.

Because the key strength of the Census is its completeness of coverage and its ability to generate statistics about very small areas and groups of people (as is necessary to ensure that Government and other policies take account of the needs of local communities), it is impracticable to remove entirely the risk of disclosure without harming the utility of the data. With that in mind the RsG have concluded that the NS Code of Practice statement above can be satisfied in relation to Census outputs if no statistics are produced that allow the identification of an individual (or information about an individual) with a high degree of confidence. The RsG consider that as long as there has been systematic perturbation of the data, the Code of Practice guarantee would be met.

‘Attribute disclosure’ (that is, learning something new about an individual or a group of individuals), as opposed to ‘identification’, is considered to be the key disclosure risk, because identification alone reveals no new information to the user. ‘Attribute disclosure’, however, involves a user discovering something about an individual from the Census data that the user did not previously know.

In a Census context, where thousands of tables are generated from one database, the risk of attribute disclosure occurring can be addressed by introducing uncertainty about the true value of small cells. In order to meet the agreed interpretation of the Code of Practice, it has thus been agreed that small counts (that is, 0s, 1s, and 2s) could be included in publicly disseminated Census tables provided that: a) uncertainty has been systematically created as to whether or not the small cell is a true value; and b) creating that uncertainty does not significantly damage the data.
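As an illustration only of how uncertainty about small cells might be created, the sketch below adds -1 or +1 to a cell count with a fixed probability. It is not the method adopted by the Census Offices (no method had been chosen at the time of writing); the function names, the probability `p_change`, the seed and the example counts are all hypothetical.

```python
import random

def perturb_count(count, rng, p_change=0.3):
    """Return the count unchanged with probability 1 - p_change,
    otherwise move it by -1 or +1 (never below zero)."""
    if rng.random() >= p_change:
        return count
    # Clamping at zero keeps counts valid but slightly biases true zeros upwards.
    return max(0, count + rng.choice((-1, 1)))

def perturb_table(table, seed=2011, p_change=0.3):
    """Apply the same perturbation rule to every cell of a table."""
    rng = random.Random(seed)
    return {cell: perturb_count(n, rng, p_change) for cell, n in table.items()}

# Invented example counts, not Census data.
true_table = {"cell_a": 0, "cell_b": 1, "cell_c": 2, "cell_d": 17}
published = perturb_table(true_table)
```

Under this assumed rule, a published 1 could correspond to a true 0, 1 or 2, so a user cannot be confident that any small cell is a true value. Raising `p_change` gives more protection at the cost of more damage to the data.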

The exact threshold of uncertainty required has not been decided. The RsG will make this judgement at a later stage within the context of results from methodological research into the balance of protection afforded, and damage caused, by various SDC methods.

Different levels of disclosure control will be applied to Census outputs according to the mode of access. In general, the aim will be to make as much as possible of the Census tabular output publicly accessible. However, if tabular outputs are likely to be seriously compromised by SDC (for example, Origin/Destination flows at low geographical levels) then these could be released under other access arrangements (such as under licence or in a safe setting), where restrictions on access would allow less stringent levels of SDC to apply, in order to protect the utility of the data.

As a result of the Government’s decision to legislate for ONS independence, the current NS Code of Practice: Protocol on Data Access and Confidentiality will be replaced. However, the obligation to preserve the confidentiality of Census outputs is likely to be heavily informed by the current Code.

Implications of the Proposed SDC Policy Position for SDC Methodology

The decision to allow small cells in publicly disseminated tables means that no methods of SDC have been ruled out, and all methods will be evaluated. These would include pre-tabular approaches (where the perturbation takes place on a master database before tables are produced), post-tabular methods (where it is carried out on the individual tabulations), or a combination of both (as was adopted in 2001). The RsG have, however, expressed a preference for pre-tabular methods provided that there is no undue damage to the data.
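To make the pre-tabular idea concrete, the sketch below swaps the area codes of randomly chosen pairs of records on a toy "master database" and then produces every table from the swapped records. Record swapping is used here purely as an example of a pre-tabular technique; the record layout, function names and number of swaps are assumptions for illustration.

```python
import random
from collections import Counter

def swap_areas(records, n_swaps, seed=0):
    """Pre-tabular perturbation: exchange the area codes of random record pairs."""
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]  # copy so the originals are untouched
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(swapped)), 2)
        swapped[i]["area"], swapped[j]["area"] = swapped[j]["area"], swapped[i]["area"]
    return swapped

def tabulate(records, *keys):
    """Count records by the given variables."""
    return Counter(tuple(r[k] for k in keys) for r in records)

# Invented toy records, not Census data.
records = [
    {"area": "A", "sex": "F"}, {"area": "A", "sex": "M"},
    {"area": "B", "sex": "F"}, {"area": "B", "sex": "F"},
    {"area": "C", "sex": "M"}, {"area": "C", "sex": "M"},
]
swapped = swap_areas(records, n_swaps=2)
by_area = tabulate(swapped, "area")
by_area_sex = tabulate(swapped, "area", "sex")
```

Because both tables are drawn from the same perturbed records, the area by sex cells necessarily sum to the area totals. A post-tabular method, applied to each table separately, does not give that guarantee automatically.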

To ensure that the public and expert audiences alike are confident that confidentiality will be preserved by the measures taken to avoid disclosure, clear explanations will be given of the protection afforded by the SDC strategy, and of the other steps to protect confidentiality, that have been applied.

The choice of SDC methodology for 2011 Census outputs will be based on an evaluation of the risk and the utility of the various possible methods. Methods will be recommended that afford an acceptable level of protection and preserve the highest level of utility of outputs. Consistency and additivity across tabular output are key requirements for users, and these will be given a high priority in the assessment of the utility of SDC methods.
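The sketch below shows, under simplifying assumptions, how additivity and a crude utility measure might be checked. It rounds each cell and the table total independently to the nearest multiple of 3, which is only a stand-in for a post-tabular method; the helper names, the rounding rule and the distance measure are hypothetical and are not the measures used in the research programme.

```python
def round_to_base(n, base=3):
    """Round a non-negative count to the nearest multiple of base."""
    return (n + base // 2) // base * base

def total_absolute_distance(true_table, published_table):
    """Simple utility measure: summed absolute change across cells."""
    return sum(abs(true_table[c] - published_table[c]) for c in true_table)

def is_additive(cells, published_total):
    """Additivity holds when the published cells sum to the published total."""
    return sum(cells.values()) == published_total

# Invented example counts, not Census data.
true_cells = {"x": 1, "y": 1, "z": 4}
true_total = sum(true_cells.values())

rounded_cells = {c: round_to_base(n) for c, n in true_cells.items()}
rounded_total = round_to_base(true_total)  # rounded separately from the cells

damage = total_absolute_distance(true_cells, rounded_cells)
additive = is_additive(rounded_cells, rounded_total)
```

In this toy case the rounded cells no longer sum to the rounded total, which is the kind of inconsistency that users have identified as a problem. A fuller evaluation would set such utility measures against a measure of the disclosure risk remaining after protection.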

Next Steps

The principles outlined in the RsG’s statement provide a basis for both consultation with users of Census data and a two-year period of methodological research. The latter will assess both pre- and post-tabular SDC methods in terms of the protection they afford together with their impact on the integrity of the data (a risk/utility framework). Because of the interdependence between disclosure control of (predefined) Census tabular data and disclosure control for other types of Census outputs (such as microdata samples and flexible user-defined tabular outputs), SDC methods for all types of Census output will be assessed concurrently, and a key consideration in evaluating SDC methods for tabular data will be the potential impact on these other types of Census output.

Local Authority users will be updated and consulted during the research period. There will also be an independent review through the UK Census Design and Methodology Advisory Committee, members of which include Eileen Howes (GLA) and Jenny Boag (Falkirk Council).

The 2011 Census White Paper for England and Wales, and parallel statements relating to the Census in Scotland and Northern Ireland, are timetabled to be published in October 2008, and will formalise the agreed policy position of the RsG by the inclusion of an SDC policy statement. Recommended SDC methods for all types of 2011 Census outputs will be published in autumn 2008 for consultation, and finalised in spring 2009.

For further details, please contact Ian White at ian.white@ons.gsi.gov.uk
