UK Statistical Disclosure Control Policy for 2011 Census Output
BY IAN WHITE, ONS
Background
The Registrars General (RsG) for England and Wales,
for Scotland and for Northern Ireland have agreed,
as part of their commitments to UK harmonisation
for the 2011 Census, to aim for a common Statistical
Disclosure Control (SDC) methodology for 2011
Census outputs. This will support the aim of
harmonising the three Censuses, as far as possible,
where it is in the interests of users to do so.
Adoption of a common SDC methodology across the
UK will be widely welcomed by census users within
the local authority community, but will only be
possible if there is an agreed SDC policy position
across the three Census Offices; that is, an
agreement about what constitutes a disclosure risk in
a Census context and what risk thresholds are tolerable.
A statement setting out the SDC policy position, as
agreed by the RsG, was announced on the National
Statistics website in December and has been
disseminated to Census users.
UK SDC Policy Position
The UK 2011 Census SDC policy position is based on
the principle for protecting confidentiality set out in
the National Statistics Code of Practice, which
includes the guarantee that “no statistics will be
produced that are likely to identify an individual
unless specifically agreed with them”.
Because the key strength of the Census is its
completeness of coverage and its ability to generate
statistics about very small areas and groups of people
(as is necessary to ensure that Government and other
policies take account of the needs of local
communities), it is impracticable to remove entirely
the risk of disclosure without harming the utility of
the data. With that in mind the RsG have concluded
that the NS Code of Practice statement above can be
satisfied in relation to Census outputs if no statistics
are produced that allow the identification of an
individual (or information about an individual) with a
high degree of confidence. The RsG consider that as
long as there has been systematic perturbation of
the data, the Code of Practice guarantee would be
met.
It is considered that ‘attribute disclosure’ (that is,
learning something new about an individual or a group
of individuals), as opposed to ‘identification’, is
the key disclosure risk, because identification on its
own reveals nothing new to the user. ‘Attribute
disclosure’, however, involves a user discovering
something about an individual from the Census data
that was not previously known to them.
In a Census context, where thousands of tables are
generated from one database, the risk of attribute
disclosure occurring can be addressed by introducing
uncertainty about the true value of small cells.
In order to meet the agreed interpretation of the
Code of Practice, it has thus been agreed that small
counts (that is, 0s, 1s, and 2s) could be included in
publicly disseminated Census tables provided that:
a) uncertainty has been systematically created as to
whether or not the small cell is a true value; and
b) creating that uncertainty does not significantly
damage the data.
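As an illustration only, conditions (a) and (b) can be sketched in Python with a toy post-tabular perturbation. The function names, the change probability p_change and the step of plus or minus one are assumptions made for this example; they are not a method or threshold agreed by the RsG.

```python
import random

def perturb_cell(count, rng, p_change=0.3):
    """Hypothetical rule: leave the count alone with probability
    1 - p_change, otherwise move it up or down by one (never below zero)."""
    if rng.random() >= p_change:
        return count
    step = rng.choice((-1, 1))
    return max(0, count + step)

def perturb_table(table, rng, p_change=0.3):
    """Apply the same rule independently to every cell of a 2-D table."""
    return [[perturb_cell(c, rng, p_change) for c in row] for row in table]

rng = random.Random(2011)              # fixed seed so the sketch is repeatable
true_table = [[0, 1, 2], [5, 12, 40]]  # invented counts, not Census data
published = perturb_table(true_table, rng)
```

Because any cell may have been changed, a published 1 could correspond to a true 0, 1 or 2, which is the kind of uncertainty condition (a) describes. Keeping the step small is one way of limiting damage under condition (b), although clamping at zero biases zero cells slightly upwards in this toy version.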
The exact threshold of uncertainty required has not
been decided. The RsG will make this judgement at a
later stage within the context of results from
methodological research into the balance of
protection afforded, and damage caused, by various
SDC methods.
Different levels of disclosure control may be applied to
Census outputs according to the mode of access. In
general, the aim will be to make as much as possible
of the Census tabular output publicly accessible.
However, if tabular outputs are likely to be seriously
compromised by SDC (for example Origin/Destination
flows at low geographical levels) then these could be
released under other access arrangements (such as
under licence or in a safe setting), where restrictions
on access would allow less stringent levels of SDC to
apply, in order to protect the utility of the data.
As a result of the Government’s decision to legislate
for ONS independence the current NS Code of
Practice: Protocol on Data Access and Confidentiality
will be replaced. But the obligation to preserve the
confidentiality of Census outputs is likely to be heavily
informed by the current Code.
Implications of the Proposed SDC Policy Position
for SDC Methodology
The decision to allow small cells in publicly
disseminated tables means that no methods of SDC
have been ruled out, and all methods will be
evaluated. These include pre-tabular approaches
(where the perturbation takes place on a master
database before tables are produced), post-tabular
methods (where it is carried out on the individual
tabulations), and a combination of both (as was
adopted in 2001). The RsG have, however,
expressed a preference for pre-tabular methods
provided that there is no undue damage to the data.
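The pre-tabular idea described above can be sketched as follows. This is a hypothetical miniature in which the area code is swapped between randomly chosen pairs of records before any table is built; the record layout and the names swap_areas and tabulate are invented for illustration and do not describe the 2001 or 2011 procedure.

```python
import random
from collections import Counter

def swap_areas(records, rng, n_swaps):
    """Return a copy of the records with the 'area' value exchanged
    between n_swaps randomly chosen pairs (hypothetical pre-tabular step)."""
    recs = [dict(r) for r in records]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(recs)), 2)
        recs[i]["area"], recs[j]["area"] = recs[j]["area"], recs[i]["area"]
    return recs

def tabulate(records, *keys):
    """Count records by the given variables, as a table would."""
    return Counter(tuple(r[k] for k in keys) for r in records)

# Invented microdata: six people in two areas.
records = [
    {"area": "A", "age": "0-15"}, {"area": "A", "age": "16-64"},
    {"area": "A", "age": "65+"},  {"area": "B", "age": "16-64"},
    {"area": "B", "age": "16-64"}, {"area": "B", "age": "65+"},
]

swapped = swap_areas(records, random.Random(2011), n_swaps=2)
area_by_age = tabulate(swapped, "area", "age")  # every table comes from 'swapped'
```

Since all tables are produced from the one perturbed database, they agree with each other by construction. In this sketch the one-way counts by area and by age are unchanged, and only the area by age cross-tabulation can differ from the truth.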
To ensure that the public and expert audiences alike
are confident that confidentiality will be preserved by
the measures taken to avoid disclosure, clear
explanations will be given of the protection
afforded by the SDC strategy, and of other steps to
protect confidentiality, that have been applied.
The choice of SDC methodology for 2011 Census
outputs will be based on an evaluation of the risk
and the utility of the various possible methods.
Methods will be recommended that afford an
acceptable level of protection and preserve the
highest level of utility of outputs. Consistency and
additivity across tabular output are key requirements
for users, and these will be given a high priority in
the assessment of the utility of SDC methods.
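To make the additivity requirement concrete, the sketch below shows how rounding cells and totals independently can leave a table whose cells no longer sum to its published total, alongside one simple way of measuring damage. The rounding base of 5, the helper names and the choice of total absolute deviation as a utility measure are assumptions for illustration, not measures adopted in the evaluation.

```python
def round_to_base(count, base=5):
    """Round a non-negative count to the nearest multiple of base."""
    return base * ((count + base // 2) // base)

def is_additive(cells, total):
    """True if the cells sum exactly to the published total."""
    return sum(cells) == total

def total_absolute_deviation(true_cells, published_cells):
    """A crude utility measure: how far the published cells are from the truth."""
    return sum(abs(t - p) for t, p in zip(true_cells, published_cells))

true_cells = [2, 2, 2]            # invented counts
true_total = sum(true_cells)      # 6

# Post-tabular rounding applied separately to the cells and to the total.
rounded_cells = [round_to_base(c) for c in true_cells]  # [0, 0, 0]
rounded_total = round_to_base(true_total)               # 5

print(is_additive(true_cells, true_total))        # True
print(is_additive(rounded_cells, rounded_total))  # False
print(total_absolute_deviation(true_cells, rounded_cells))  # 6
```

Here every cell is protected, but the published cells sum to 0 against a published total of 5, which is the kind of inconsistency users want to avoid.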
Next Steps
The principles outlined in the RsG’s statement provide
a basis for both consultation with users of Census
data and a two-year period of methodological
research. The latter will assess both pre- and post-tabular
SDC methods in terms of the protection they
afford together with their impact on the integrity of
the data (a risk/utility framework). Because of the
interdependence between disclosure control of
(pre-defined) Census tabular data and disclosure
control for other types of Census outputs (such as
microdata samples and flexible user-defined tabular
outputs),
SDC methods for all types of Census output will be
assessed concurrently, and a key consideration in
evaluating SDC methods for tabular data will be the
potential impact on these other types of Census
output.
Local Authority users will be updated and consulted
during the research period. There will also be an
independent review through the UK Census Design
and Methodology Advisory Committee, members of
which include Eileen Howes (GLA) and Jenny Boag
(Falkirk Council).
The 2011 Census White Paper for England and
Wales, and parallel statements relating to the Census
in Scotland and Northern Ireland, are timetabled to
be published in October 2008, and will formalise the
agreed policy position of the RsG by the inclusion of
an SDC policy statement. Recommended SDC
methods for all types of 2011 Census outputs will be
published in autumn 2008 for consultation, and
finalised in spring 2009.
For further details, please contact Ian White at
ian.white@ons.gsi.gov.uk