Microdata: Programme for the International Assessment of Adult Competencies, Australia

Statistics about the competencies of Australians in the domains of literacy, numeracy and problem solving skills in technology-rich environments

Introduction

The 2011-2012 Programme for the International Assessment of Adult Competencies (PIAAC) is an international survey coordinated by the Organisation for Economic Co-operation and Development (OECD). The OECD published a comprehensive International Report on 8 October 2013. Access is provided free of charge to an online Data Explorer tool, which can be used to design and create output in the form of tables and graphs; security measures ensure the confidentiality of survey respondents. A Public Use Data File, containing unit record files for participating countries, is also provided free of charge and can be used to perform analysis. These outputs are available from the OECD website at www.oecd.org/site/piaac/. Australian data is included in the Data Explorer but not in the OECD Public Use Data File. However, the Basic CURF (International Comparison version) available in this release is in a format that allows Australian data to be analysed in conjunction with the international data available in the OECD Public Use Data File.

This product provides a range of information about the release of microdata from the Programme for the International Assessment of Adult Competencies, Australia, 2011-2012 (PIAAC) including details about the survey methodology, and how to use the CURFs. Data item lists and information on the conditions of use and the quality of the microdata as well as the definitions used are also provided.

Microdata are the most detailed information available from a survey and are generally the responses to individual questions on the questionnaire or data derived from two or more questions and are released with the approval of the Australian Statistician.

Comparability of time series

PIAAC was preceded by the Adult Literacy and Life Skills Survey (ALLS) 2006 and the Survey of Aspects of Literacy (SAL) 1996. Data previously released in the ALLS and SAL publications are not directly comparable with PIAAC data. The reason for this is that the literacy and numeracy scores previously published for ALLS and SAL were originally based on a model with a response probability (RP) value of 0.8, whereas PIAAC scores are based on a model with an RP value of 0.67. The latter value was used in PIAAC to achieve consistency with the OECD survey Programme for International Student Assessment (PISA) in the description of what it means to be performing at a particular level of proficiency. The new RP value does not affect the score that was calculated for a respondent; however, it does affect the interpretation of the score. The literacy and numeracy scores for ALLS and SAL have been remodelled to make them consistent with PIAAC. Nevertheless, caution is advised when performing time series comparisons, as analysis undertaken by the ABS and internationally has shown that in some cases the observed trend is difficult to reconcile with other known factors and is not fully explained by sampling variability. For more information see the explanatory notes of the publication Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0).

The prose and document literacy scales from ALLS and SAL have been combined to produce a single literacy scale comparable to the PIAAC literacy scale. The numeracy scores from ALLS have been recalculated using a model that incorporates the results of all countries that participated in ALLS. (The previous model was based only on countries which participated in the first round of ALLS.) This has resulted in some minor changes to the ALLS numeracy scores. SAL did not collect a numeracy domain which is comparable with ALLS and PIAAC.

Remodelled scores from ALLS and SAL comparable with PIAAC are now available as an updated Expanded and Basic CURF for ALLS and an updated Basic CURF for SAL. In analysing the remodelled scores for ALLS and SAL, users should refer to the new skill level descriptions provided in the appendix Scores and skill levels of the PIAAC publication (cat. no. 4228.0). The table below shows the comparability of skill domains across the three adult literacy surveys when using the remodelled scores.

Skill domain                                      PIAAC 2011-2012   ALLS 2006            SAL 1996
Literacy                                          Collected         Comparable (a)       Comparable (b)
Numeracy                                          Collected         Comparable (c)       Not available
Problem-solving in technology-rich environments   Collected         Not comparable (d)   Not collected

(a) Prose and Document Literacy from ALLS rescaled as Literacy to be comparable to PIAAC
(b) Prose and Document Literacy from SAL rescaled as Literacy to be comparable to PIAAC
(c) Numeracy rescaled to be comparable to PIAAC
(d) Problem-solving from ALLS is not comparable to PIAAC

Remodelled scores from ALLS and SAL are also included in data cubes available from the Data downloads section of the publication Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0).

For further information about comparing data from PIAAC with the previous surveys refer to the Comparability of Time Series section in the Explanatory Notes of the publication: Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0)

Available products

The following microdata products are available from this survey:

  • Basic CURF on CD-ROM. The Basic CD-ROM allows approved users interactive access in the user’s own environment (via a CD-ROM/DVD).
  • Basic CURF (International Comparison version) as an electronic file. The Basic CURF (International Comparison version) allows approved users interactive access in the user's own environment (via an electronic file) and is set up to enable the user to merge the file with the OECD's Public Use File to produce International comparisons.
  • Expanded CURF via the Remote Access Data Laboratory (RADL) and ABS Data Laboratory (ABSDL). Expanded CURFs allow more detail to be presented for some data items, for example, geography, industry and occupation.

Further information about these services, and other information to assist users in understanding and accessing microdata in general, is available from the Microdata Entry Page on the ABS web site.

Before applying for access to a CURF, users should read and familiarise themselves with the information contained in this product and the User Manual: Responsible Use of ABS CURFS.

Apply for access

To apply for access to the Basic or Expanded CURFs, register and apply via the Microdata Entry Page on the ABS website. To apply for access to the Basic CURF (International Comparison version), please contact microdata.access@abs.gov.au for further information.

Further information

Further information about the survey and the microdata products can be found in this product:

  • A detailed list of data items for the Basic CURFs and Expanded CURF is available in the Data downloads section.
  • The Quality Declaration, Abbreviations and Glossary relating to these products can be found in the corresponding sections.

Data available on request

Data obtained in the survey but not contained on the CURF may be available from the ABS, on request, as statistics in tabulated form.

Subject to confidentiality and sampling variability constraints, special tabulations can be produced incorporating data items, populations and geographic areas selected to meet individual requirements. These are available on request, on a fee for service basis. Contact the National Information and Referral Service on 1300 135 070 or client.services@abs.gov.au for further information.

Survey methodology

Scope and coverage

Scope

The statistics in the CURFs were compiled from data collected in the Programme for the International Assessment of Adult Competencies (PIAAC) survey, conducted throughout Australia from October 2011 to March 2012.

The scope of the survey is restricted to people aged 15 to 74 years who were usual residents of private dwellings and excludes:

  • diplomatic personnel of overseas governments
  • members of non-Australian defence forces (and their dependants) stationed in Australia
  • overseas residents who have not lived in Australia, or do not intend to do so, for a period of 12 months or more
  • people living in very remote areas
  • persons living in Collection Districts (CDs) which contained Discrete Indigenous Communities.

People living in CDs which contained Discrete Indigenous Communities were not enumerated for operational reasons.

Coverage

Households where all of the residents were less than 18 years of age were excluded from the survey because the initial screening questions needed to be answered by a responsible adult (who was aged 18 years or over).

If a child aged 15-17 years was selected, they could be interviewed with the consent of a parent or responsible adult.

Survey design

Multi-stage sampling techniques were used to select the sample for the survey. After sample loss, the sample included 11,532 households. After exclusions due to scope and coverage, the final sample comprised 8,600 respondents. Of these, 8,446 were fully responding or provided sufficient detail for scores to be determined. The remaining 154 respondents did not complete the survey due to literacy or language difficulties, and only their age and sex are included. In addition, three respondents did not complete the survey for other reasons and may be missing from some data items.

Data collection methodology

Information for this survey was collected face-to-face. Trained interviewers asked members of each household questions via Computer Assisted Interviewing (CAI) using the following methods:

  • An interview with Any Responsible Adult (ARA) to collect household details.
  • A Personal Interview (PI) with a randomly selected household member in scope to collect information for the Background Questionnaire (BQ). This questionnaire asked about education and training, employment, income and skill use in literacy, numeracy and ICT.
  • A self-enumerated exercise, conducted either 1) via a computer-delivered instrument on a laptop, 2) by paper booklet, or 3) by a mixture of both. Respondents who had experience using a computer, as determined from the BQ, first completed a laptop-based exercise that assessed whether they had the mouse skills needed to undertake the computer-based exercise. Respondents without the necessary skills were given a paper-based exercise.

Weighting, benchmarking and estimation

Weighting

Weighting is the process of adjusting results from the sample survey to infer results for the total in-scope population. To do this, a 'weight' is allocated to each enumerated person. The weight is a value which indicates how many persons in the population are represented by the sample person.

The first step in calculating weights for each person is to assign an initial weight which is equal to the inverse probability of being selected in the survey. For example, if the probability of a person being selected in the survey was one in 300, then the person would have an initial weight of 300 (that is, they represent 300 people).

Non-response adjustment

Non-response adjustments were made to the initial person-level weights with the aim of representing those people in the population that did not respond to PIAAC. Two adjustment factors were applied:

  • a literacy-related non-response adjustment, which was aimed at ensuring survey estimates represented those people in the population that had a literacy or language related problem and could not respond to the survey (these people cannot be represented by survey respondents because their reason for not completing the survey is directly related to the survey outcome, however they are part of the PIAAC target population.)
  • a non-literacy-related non-response adjustment, which was aimed at ensuring survey estimates represented those people in the population that did not have a literacy or language related problem but did not respond to the survey for some other reason.

Benchmarking

After the non-response adjustment, the weights were adjusted to align with independent estimates of the population, referred to as 'benchmarks', in designated categories of sex by age by state by area of usual residence. This process is known as calibration. Weights calibrated against population benchmarks ensure that the survey estimates conform to the independently estimated distributions of the population described by the benchmarks, rather than to the distribution within the sample itself. Calibration to population benchmarks helps to compensate for over- or under-enumeration of particular categories of persons which may occur due to either the random nature of sampling or non-response.

The survey was benchmarked to the in-scope estimated resident population (ERP).

Further analysis was undertaken to ascertain whether benchmark variables, in addition to geography, age and sex, should be incorporated into the weighting strategy. Analysis showed that including only these variables in the weighting approach did not adequately compensate for undercoverage in the PIAAC sample for variables such as highest educational attainment and labour force status, when compared to other ABS surveys. As these variables were considered to have possible association with adult literacy additional benchmarks were incorporated into the weighting process.

The benchmarks used in the calibration of final weights for PIAAC were:

  • state by highest educational attainment
  • state by sex by age by labour force status
  • state by part of state by age by sex.

The education and labour force benchmarks were obtained from other ABS survey data. These benchmarks are considered 'pseudo-benchmarks' as they are not demographic counts and they have a non-negligible level of sample error associated with them. The 2011 Survey of Education and Work (persons aged 16-64 years) was used to provide a pseudo-benchmark for educational attainment. The monthly Labour Force Survey (aggregated data from November 2011 to March 2012) provided the pseudo-benchmark for labour force status. The sample error associated with these pseudo-benchmarks was incorporated into the standard error estimation.

The process of weighting ensures that the survey estimates conform to person benchmarks by state, part of state, age and sex. These benchmarks are produced from estimates of the resident population derived independently of the survey. Therefore the PIAAC estimates do not (and are not intended to) match estimates for the total Australian resident population (which include persons and households living in non-private dwellings, such as hotels and boarding houses, and in very remote parts of Australia) obtained from other sources.

Estimation

Survey estimates of counts of persons are obtained by summing the weights of persons with the characteristic of interest.

Note that although the literacy-related non-respondent records (154 people) were given a weight, plausible values were not generated for this population.

Reliability of estimates

All sample surveys are subject to error which can be broadly categorised as either sampling error or non-sampling error.

Sampling error occurs because only a small proportion of the total population is used to produce estimates that represent the whole population. Sampling error can be reliably measured, as it is calculated based on the scientific methods used to design surveys. Non-sampling error can occur at any stage throughout the survey process. For example, persons selected for the survey may not respond (non-response); survey questions may not be clearly understood by the respondent; responses may be incorrectly recorded by interviewers; or there may be errors when coding or processing the survey data.

Sampling error

One measure of the likely difference between an estimate derived from a sample of persons and the value that would have been produced if all persons in scope of the survey had been included is the Standard Error (SE), which indicates the extent to which an estimate might have varied by chance because only a sample of persons was included. There are about two chances in three (67%) that the sample estimate will differ by less than one SE from the number that would have been obtained if all persons had been surveyed, and about 19 chances in 20 (95%) that the difference will be less than two SEs.

Another measure of the likely difference is the Relative Standard Error (RSE), which is obtained by expressing the SE as a percentage of the estimate:

\(RSE \%=\Big(\frac{SE}{Estimate}\Big)\times 100\)
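As an illustrative sketch of this formula (the figures are hypothetical, not drawn from the survey):

```python
def rse_percent(estimate: float, se: float) -> float:
    """Relative Standard Error: the SE expressed as a percentage of the estimate."""
    return (se / estimate) * 100

# A hypothetical estimate of 10,000 persons with an SE of 1,500 has an RSE of 15%.
print(rse_percent(10_000, 1_500))  # 15.0
```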

Generally, only estimates (numbers, percentages, means and medians) with RSEs less than 25% are considered sufficiently reliable for most purposes. In the publication Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0), estimates with RSEs between 25% and 50% are annotated to indicate they are subject to high sample variability and should be used with caution. In addition, estimates with RSEs greater than 50% have been annotated to indicate they are considered too unreliable for general use.

In addition to the main weight (as outlined earlier), each record on the CURFs also contains 60 'replicate weights'. The purpose of these replicate weights is to enable the calculation of the standard error of estimates produced. This method is known as the 60 group Jack-knife variance estimator.

The basic concept behind this replication approach is to select different sub-samples repeatedly (60 times) from the whole sample. For each of these sub-samples the statistic of interest is calculated. The variance of the full sample statistic is then estimated using the variability among the replicate statistics calculated from these sub-samples. As well as enabling variances of estimates to be calculated relatively simply, replicate weights also enable unit record analyses such as chi-square and logistic regression to be conducted which take into account the sample design.

Further information about RSEs and how they are calculated can be found in the 'Technical Note' section of the publication relevant to this microdata: Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0). RSEs for estimates in the tables of that publication are available in spreadsheet format, as attachments to the publication.

Non-sampling error

Non-sampling error may occur in any collection, whether it is based on a sample or a full count such as a census. One of the main sources of non-sampling error is non-response by persons selected in the survey. Non-response occurs when persons cannot or will not cooperate, or cannot be contacted. Non-response can affect the reliability of results and can introduce a bias. The magnitude of any bias depends upon the rate of non-response and the extent of the difference between the characteristics of those persons who responded to the survey and those that did not.

Every effort was made to reduce non-response and other non-sampling errors by careful design and testing of the questionnaire, training and supervision of interviewers, and undertaking extensive editing and quality control procedures at all stages of data processing.

File structure

Weights and estimation

As the survey was conducted on a sample of households in Australia, it is important to take account of the method of sample selection when deriving estimates. This is particularly important as a person's chance of selection in the survey varied depending on the state or territory in which they lived. Survey 'weights' are values which indicate how many population units are represented by the sample unit.

There is one weight provided: a person weight (FINPWT on the Basic and Expanded CURFs and SPFWT0 on the Basic CURF (International Comparison version)). This should be used when analysing the record level data.

Where estimates are derived, it is essential that they are calculated by adding the weights of persons in each category, and not just by counting the number of records falling into each category. If each person's 'weight' were to be ignored, then no account would be taken of a person's chance of selection in the survey or of different response rates across population groups, with the result that counts produced could be seriously biased. The application of weights ensures that the person estimates conform to an independently estimated distribution of the population by age, sex, state/territory, part of state, labour force status and highest educational attainment.
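The distinction between counting records and summing weights can be sketched as follows. The weights and responses below are invented for illustration; FINPWT is the person weight variable named above.

```python
import numpy as np

# Hypothetical extract: one row per respondent record.
finpwt   = np.array([250.0, 310.0, 180.0, 295.0])  # person weights (FINPWT)
employed = np.array([1, 0, 1, 1])                  # 1 = has the characteristic

record_count = int(employed.sum())            # 3 records -- NOT a valid estimate
estimate = float(np.sum(finpwt * employed))   # 725.0 persons represented
```

Summing the weights (725.0) accounts for each person's chance of selection; the raw record count (3) does not.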

Standard errors for estimates without plausible values

Each record also contains 60 replicate weights and, by using these weights, it is possible to calculate standard errors for weighted estimates produced from the microdata. This method is known as the 60 group Jack-knife variance estimator.

Under the Jack-knife method of replicate weighting, weights were derived as follows:

  • 60 replicate groups were formed with each group formed to mirror the overall sample (where units from a collection district all belong to the same replicate group and a unit can belong to only one replicate group)
  • one replicate group was dropped from the file and then the remaining records were weighted in the same manner as for the full sample
  • records in the group that were dropped received a weight of zero.

This process was repeated for each replicate group (i.e. a total of 60 times). Ultimately each record had 60 replicate weights attached to it with one of these being the zero weight (WRPWT01 - WRPWT60 on the Basic and Expanded CURFs and SPFWT1-60 on the Basic CURF (International Comparison version)).

Replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses such as chi-square and logistic regression to be conducted which take into account the sample design. Replicate weights for any variable of interest can be calculated from the 60 replicate groups, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate (based on the general weight) is then used to approximate the variance of the full sample.

To obtain the standard error (SE) of a weighted estimate y, the same estimate is calculated using each of the 60 replicate weights. The variability between these replicate estimates (denoted y(g) for group number g) is used to measure the standard error of the original weighted estimate y using the formula:

\(SE(y)=\sqrt{\frac{59}{60}\sum \limits^{60}_{g=1}(y(g)-y)^2}\)

Where:

\(g=\) the replicate group number
\(y(g)\) = the weighted estimate, having applied the weights for replicate group \(g\)
\(y=\) the weighted estimate from the sample
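A minimal sketch of this formula, assuming the 60 replicate weights are held as the columns of an array (the function and variable names are illustrative, not from the CURF documentation):

```python
import numpy as np

def jackknife_se(values, main_weight, replicate_weights):
    """60 group Jack-knife SE of a weighted total.

    values            : (n,) value or 0/1 indicator per respondent
    main_weight       : (n,) main person weight (e.g. FINPWT)
    replicate_weights : (n, 60) replicate weights (e.g. WRPWT01-WRPWT60)
    """
    y = np.sum(main_weight * values)      # full-sample weighted estimate y
    y_g = replicate_weights.T @ values    # the 60 replicate estimates y(g)
    return np.sqrt((59 / 60) * np.sum((y_g - y) ** 2))
```

The same function applies to any weighted total; proportions, differences and ratios follow the same pattern, with the statistic recomputed under each replicate weight.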

The 60 group Jack-knife method can be applied not just to estimates of the population total, but also where the estimate is a function of the population total, such as a proportion, difference or ratio. For more information on the 60 group Jack-knife method of SE estimation, see Research Paper: Weighting and Standard Error Estimation for ABS Household Surveys (Methodology Advisory Committee) (cat. no. 1352.0.55.029).

Use of the 60 group Jack-knife method for complex estimates, such as regression parameters from a statistical model, is not straightforward and may not be appropriate. The method as described does not apply to investigations where survey weights are not used, such as in unweighted statistical modelling.

Standard errors for estimates with plausible values

In order to minimise respondent burden, the three skill domains of literacy, numeracy and problem solving in technology-rich environments were not directly assessed in full for each respondent. PIAAC used a matrix-sampling design to assign the assessment exercises to individuals, so that a comprehensive picture of achievement in each skill domain across the country could be assembled from the components completed by each individual. PIAAC relied on Item Response Theory scaling to combine the individual responses to provide accurate estimates of achievement in the population. With this approach, however, aggregations of individuals' values can lead to biased estimates of population characteristics. To address this, the PIAAC scaling procedures also used a multiple imputation, or "plausible values", methodology to obtain data for all individuals, even though each individual responded to only part of the assessment item pool. By using all available data, ten "plausible values" were generated for each respondent for each of the three domains of literacy, numeracy and problem solving in technology-rich environments.

For each domain, proficiency is measured on a scale ranging from 0 to 500 points. Each person's score denotes a point at which they have a 67 per cent chance of successfully completing tasks with a similar level of difficulty. To facilitate analysis, these continuous values have been grouped into 6 skill levels for Literacy and Numeracy with 'Below Level 1' being the lowest measured level. The levels indicate specific sets of abilities, and therefore, the thresholds for the levels are not equidistant. As a result, the ranges of values in each level are not identical. The relatively small proportions of respondents who actually reached Level 5 often resulted in unreliable estimates of the number of people at this level. For this reason, whenever results are presented in the main report by proficiency level, Levels 4 and 5 are combined. Further information about the Plausible Values and definitions of the three domains can be found in Scores and skill levels of the publication Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0).

Each record contains the ten plausible values for each of the three domains.

For simple point estimates in any of the domains, it is sufficient to use one of the corresponding ten plausible values (e.g. PVLIT1 for the literacy domain), chosen at random to derive population estimates. If this method is chosen the standard error of the plausible score can be calculated using the formula for standard error for estimates without plausible values, as shown above.

However, a more robust estimate can be obtained by using all ten plausible values in combination. For example in order to report an estimate of the total number of people at Level 1 for literacy, first calculate the weighted estimate of the number of respondents at Level 1 for each of the ten plausible values for literacy (PVLIT1-PVLIT10) individually. Next sum the ten weighted estimates obtained. Then divide the result by ten to obtain the estimate of the total number of people at Level 1 for literacy. The process must then be repeated for each skill level.

Furthermore, when producing estimates cross-classified by other variables available on the file, the process must be performed for each skill level within each category of the classifying variable(s). For example, in order to report an estimate of the total number of males at Level 1 for literacy, first calculate the weighted estimate of the number of males at Level 1 for each of the ten plausible values for literacy (PVLIT1-PVLIT10) individually. Next, sum the ten weighted estimates obtained. Then divide the result by ten to obtain the estimate of the total number of males at Level 1 for literacy. The process must then be repeated for each skill level.
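The averaging procedure described above can be sketched as follows. Array names are illustrative; in practice the skill levels would be derived from the plausible value variables (PVLIT1-PVLIT10 for literacy).

```python
import numpy as np

def pv_level_estimate(pv_levels, weights, level):
    """Estimate of persons at `level`, averaged over the M plausible values.

    pv_levels : (n, M) skill level implied by each of the M plausible values
    weights   : (n,) person weight
    """
    M = pv_levels.shape[1]
    # weighted count at `level` for each plausible value separately
    per_pv = [np.sum(weights * (pv_levels[:, i] == level)) for i in range(M)]
    # sum the M weighted estimates, then divide by M
    return sum(per_pv) / M
```

Estimates by another variable (e.g. sex) are obtained by first restricting `pv_levels` and `weights` to the subpopulation of interest, then applying the same averaging.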

All values presented in the publication Programme for the International Assessment of Adult Competencies (PIAAC) (cat. no. 4228.0) are obtained by using all ten plausible values in combination, as described above.

Due to the use of multiple possible exercises and the application of plausible scoring methodology, the PIAAC plausible values also include significant imputation variability. The effect of the plausible values methodology on the estimation can be reliably estimated and is included in the calculated SEs. An accepted procedure for estimating the imputation variance using plausible values is to measure the variance of the plausible values (with an appropriate scaling factor) as follows:

\(var_{imp}(\hat\theta_{mean})=(1+\frac1M)\frac{\sum^M_{i=1}(\hat\theta_i - \hat\theta_{mean})^2}{M-1}\)

where:

\(\hat\theta_{mean}=\) the mean estimate of the plausible values

\(i=1-10\) respectively, for the plausible values \(\hat\theta_1\) to \(\hat\theta_{10}\)

\(M=\) the total number of plausible values used \((M=10 \space for \space PIAAC)\)

Together, the sampling variance and imputation variance can be added to provide a suitable measure of the total variance for the estimate as follows:

\(var(\hat\theta_{mean})=\Bigg[\frac1M\sum \limits^M_{i=1}\Bigg(\frac{59}{60}\sum \limits^{60}_{g=1}(\hat\theta_{i,(g)}-\hat\theta_i)^2\Bigg)\Bigg]+\Bigg[\Big(1+\frac1M\Big)\frac{\sum^M_{i=1}(\hat\theta_i - \hat\theta_{mean})^2}{M-1}\Bigg]\)

where:

\(\hat\theta_{mean}=\) the mean estimate of the plausible values

\(i=1-10\) respectively, for the plausible values \(\hat\theta_1\) to \(\hat\theta_{10}\)

\(g=\) the replicate group number

\(M=\) the total number of plausible values used \((M=10 \space for \space PIAAC)\)

The total SE can then be obtained as the square root of the total variance. This SE indicates the extent to which the estimate might have varied by chance because only a sample of persons was included, and because of the significant imputation used in the literacy scaling procedures.
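Putting the two components together, the total variance and SE calculation can be sketched as follows (function and variable names are illustrative; for PIAAC, M = 10 plausible values and 60 replicate groups):

```python
import numpy as np

def total_se(theta, theta_g):
    """Total SE combining sampling and imputation variance.

    theta   : (M,) full-sample estimate for each plausible value
    theta_g : (M, G) replicate estimates for each plausible value (G = 60)
    """
    theta = np.asarray(theta, dtype=float)
    theta_g = np.asarray(theta_g, dtype=float)
    M, G = theta_g.shape
    # sampling variance: mean over plausible values of the Jack-knife variance
    var_samp = np.mean(((G - 1) / G) * np.sum((theta_g - theta[:, None]) ** 2, axis=1))
    # imputation variance: scaled variance between the plausible-value estimates
    var_imp = (1 + 1 / M) * np.sum((theta - theta.mean()) ** 2) / (M - 1)
    return np.sqrt(var_samp + var_imp)
```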

The total Relative Standard Error (RSE), can then be obtained by expressing the total SE as a percentage of the estimate to which it relates:

\(RSE\%=\big(\frac{SE}{Estimate}\big)\times 100\)

Not applicable categories

Some data items included in the microdata include a 'Not applicable' category. The classification value of the 'Not applicable' category, where relevant, is shown in the data item lists in the Data downloads section. In order to comply with the scheme used by other countries participating in PIAAC, the following classification scheme was used to describe 'Not applicable' categories:

  • Valid skip - respondent was sequenced past the question as the question was not appropriate to them on the basis of information previously provided (note that this category was also assigned to missing values for part or full non-responding records)
  • Don't know - respondent didn't know the answer to the question
  • Refused - respondent refused to answer the question
  • Not stated or inferred - answer to the question could not be determined.

Populations

The population relevant to each data item is identified in the data item list and should be borne in mind when extracting and analysing data from the CURFs. The actual population count for each data item is equal to the total cumulative frequency minus the 'Valid skip' category.

Generally all populations, including very specific populations, can be 'filtered' using other relevant data items. For example, if the population of interest is 'Employed persons', the Labour Force status data item CD05 on the Basic CURF can be used by applying the filter CD05 = 1. For the same population of interest on the Expanded CURF, the data item LFSAUS can be used by applying the filter LFSAUS = 1.
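As a sketch, assuming the CURF has been loaded into a pandas DataFrame (the rows and weight values below are invented for illustration):

```python
import pandas as pd

# Invented example rows; CD05 = 1 denotes 'Employed' on the Basic CURF.
curf = pd.DataFrame({
    "ABSPID": [1, 2, 3, 4],
    "CD05":   [1, 2, 1, 3],
    "FINPWT": [250.0, 310.0, 180.0, 295.0],
})

employed = curf[curf["CD05"] == 1]      # filter to the population of interest
estimate = employed["FINPWT"].sum()     # weighted estimate of employed persons
```

Note that the estimate is the sum of the weights of the filtered records, not the number of records.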

Using the CURFs

About the CURFs

The data included in the PIAAC Basic and Expanded CURFs are released under the provisions of the Census and Statistics Act 1905. This legislation allows the Australian Statistician to release unit record data, or microdata, provided this is done "in a manner that is not likely to enable the identification of a particular person or organisation to which it relates."

The ABS ensures the confidentiality of the data by:

  • removing name, address and any other information that might uniquely identify any individual
  • changing a small number of values - particularly unusual values - and removing very unusual records
  • controlling the detail available for all records on the Basic and Expanded CURFs
  • perturbing or randomly adjusting income data
  • excluding some data items that were collected
  • controlling the modes of access to restrict access to more detailed data
  • placing restrictions on how the data are used, supported by both information in the User Manual: Responsible Use of ABS CURFs, the undertaking signed by the head of each organisation and the terms and conditions signed by each user.

As a result, data on the Basic and Expanded CURFs will not exactly match other previously published estimates. Any changes to the distribution of values are not significant and the statistical validity of aggregate data is not affected.

Identifiers

Each person has a unique random identifier - ABSPID on the Basic and Expanded CURFs and SEQID on the Basic CURF (International Comparison version).

Basic CURF file names

The PIAAC Basic CURF can be accessed on CD-ROM and is available in SAS, SPSS and STATA formats. The CURF comprises the following files:

Data files

  • PIAAC12B.csv contains the data for the CURF in comma delimited ASCII text format
  • PIAAC12B.sas7bdat contains the data for the CURF in SAS format
  • PIAAC12B.sav contains the data for the CURF in SPSS format
  • PIAAC12B.dta contains the data for the CURF in STATA format.

Information files

  • The Data item list contains all the data items, including details of categories and code values, that are available on the Basic CURF
  • The Formats file is a SAS library containing formats
  • The Frequency file contains data item code values and category labels with weighted person frequencies of each value. This file is in plain text format.
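The weighted person frequencies in the Frequency file can be reproduced from the data file by summing person weights within each category of a data item. A minimal sketch, with invented records and an illustrative weight column name (CD05 is the Basic CURF labour force status item):

```python
from collections import defaultdict

# Illustrative records only; the person-weight column name is an
# assumption, not the actual CURF variable name.
records = [
    {"CD05": 1, "weight": 1200.5},
    {"CD05": 1, "weight": 800.0},
    {"CD05": 2, "weight": 950.25},
]

# The weighted frequency of a category is the sum of the person
# weights of the records falling in that category.
weighted = defaultdict(float)
for r in records:
    weighted[r["CD05"]] += r["weight"]

print(weighted[1])  # weighted frequency of category 1 ('Employed')
```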

Basic CURF (International Comparison version) file names

The PIAAC Basic CURF (International Comparison version) can be accessed as an electronic file and is available in SAS and SPSS formats. The CURF comprises the following files:

Data files

  • PRGAUSP1.sas7bdat contains the data for the CURF in SAS format
  • PRGAUSP1.sav contains the data for the CURF in SPSS format.

Information files

  • The Data item list contains all the data items, including details of categories and code values, that are available on the Basic CURF (International Comparison version)
  • The Formats file is a SAS library containing formats.

Expanded CURF file names

The PIAAC Expanded CURF can be accessed on the RADL or ABSDL and is available in SAS, SPSS and STATA formats. The CURF comprises the following files:

Data files

  • PIAAC12E.csv contains the data for the CURF in comma delimited ASCII text format
  • PIAAC12E.sas7bdat contains the data for the CURF in SAS format
  • PIAAC12E.sav contains the data for the CURF in SPSS format
  • PIAAC12E.dta contains the data for the CURF in STATA format.

Information files

  • The Data item list contains all the data items, including details of categories and code values, that are available on the Expanded CURF
  • The Formats file is a SAS library containing formats.

Data item list

The Programme for the International Assessment of Adult Competencies (PIAAC) comprised a questionnaire and a set of assessments. The questionnaire contained 11 modules and approximately 330 questions, which resulted in approximately 380 data items.

Users intending to purchase a CURF should consult the data item list to ensure that the data they require, and the level of detail they need, are available in the product.

The PIAAC Basic and Expanded CURF files contain 8,600 confidentialised records for respondents aged 15 to 74 years. The PIAAC Basic CURF (International Comparison version) contains 7,430 confidentialised records for respondents aged 16 to 65 years. Subject to the limitations of the sample size and the data classifications used, it is possible to interrogate the CURFs, produce tabulations and undertake statistical analyses to individual specifications.

CURF Data

The Basic CURF file contains approximately 310 data items and the Expanded CURF file contains approximately 340 data items. The Basic CURF (International Comparison version) contains approximately 1,330 data items; however, not all data items contain data. This CURF contains empty data items where additional data may be available in the OECD's Public Use File (PUF) for other countries that participated in PIAAC. For a complete list of all data items included on the Basic and Expanded CURFs, including relevant population and classification details, refer to the Excel spreadsheets in the Data downloads section. The data item spreadsheets have 14 worksheets:

  • table of contents
  • population descriptions (Expanded CURF only)
  • data items on demography
  • data items on background information
  • data items on education
  • data items on work
  • data items on work characteristics
  • data items on skill use at work
  • data items on literacy, numeracy and information and communications technology (ICT) use at work
  • data items on literacy, numeracy and information and communications technology (ICT) use in everyday life
  • data items on self perception and wellbeing
  • data items on income
  • plausible values
  • identifiers and weights
  • data items on workflow (Basic CURFs only).

The populations used in the derivations of the data items are listed in the Population column of each data item.
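Analyses of proficiency scores use the plausible values listed above: a statistic is computed separately from each plausible value and the results are then averaged. The sketch below assumes the PIAAC naming convention of ten plausible values per domain (e.g. PVLIT1 to PVLIT10 for literacy) and uses invented records:

```python
# Illustrative records only: each carries ten literacy plausible values.
records = [
    {f"PVLIT{i}": 250.0 + i for i in range(1, 11)},
    {f"PVLIT{i}": 300.0 - i for i in range(1, 11)},
]

def mean_over_pvs(records, stem="PVLIT", n_pvs=10):
    """Compute a statistic (here the mean score) from each plausible
    value separately, then average the results across plausible values."""
    per_pv_means = []
    for i in range(1, n_pvs + 1):
        vals = [r[f"{stem}{i}"] for r in records]
        per_pv_means.append(sum(vals) / len(vals))
    return sum(per_pv_means) / len(per_pv_means)

print(mean_over_pvs(records))
```

A production analysis would also combine sampling variance (from the survey weights) with the imputation variance across plausible values; this sketch shows only the point estimate.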

Comparison between the Basic and Expanded CURFs

The differences between the Australian Basic and Expanded CURFs are listed in an Excel spreadsheet available from the Downloads tab. Note that the identifiers for some data items differ between the Basic and Expanded CURF due to the differences in the presentation of data items. Therefore, caution should be exercised if using both the Basic and Expanded CURF. The differences to the Basic CURF (International Comparison version) are not documented in the spreadsheet. A full list of data items on the Basic CURF (International Comparison version) can be found in a separate Excel spreadsheet available from the Downloads tab.

The key differences between data items in the Australian Basic and Expanded CURF are:

  • the Expanded CURF includes derived population items
  • the Expanded CURF includes additional items in areas such as demographic, education, employment and income
  • the Expanded CURF occupation and industry items use the ANZSCO and ANZSIC classifications, whereas the Basic CURF uses the international ISCO and ISIC classifications
  • the Expanded CURF includes the level of the plausible values
  • the Basic CURF includes workflow data items, which are not included on the Expanded CURF.

The 2011-2012 PIAAC Basic CURF is distributed on a single CD-ROM and the PIAAC Basic CURF (International Comparison version) is electronically distributed. The PIAAC Expanded CURF is distributed via RADL and ABSDL.


Conditions of use

User responsibilities

The Census and Statistics Act includes a legislative guarantee to respondents that their confidentiality will be protected. This is fundamental to the trust the Australian public has in the ABS, and that trust is in turn fundamental to the excellent quality of ABS information. Without that trust, survey respondents may be less forthcoming or truthful in answering our questionnaires. For more information, see 'Avoiding inadvertent disclosure' and 'Microdata' on our web page How the ABS keeps your information confidential.

CURF data

The release of the CURF data is authorised by Clause 7 of the Statistics Determination made under subsection 13(1) of the Census and Statistics Act 1905. The release of a CURF must satisfy the ABS legislative obligation to release information in a manner that is not likely to enable the identification of a particular person or organisation.

This legislation allows the Australian Statistician to approve release of unit record data. All CURFs released have been approved by the Statistician. Prior to being granted access to CURFs, each organisation's Responsible Officer must submit a CURF Undertaking to the ABS. The CURF Undertaking is required by legislation and states that, prior to CURFs being released to an organisation, a Responsible Officer must undertake to ensure that the organisation will abide by the conditions of use of CURFs. Individual users are bound by the undertaking signed by the Responsible Officer.

All CURF users are required to read and abide by the conditions and restrictions in the User Manual: Responsible Use of ABS CURFs. Any breach of the CURF Undertaking may result in withdrawal of service to individuals and/or organisations. Further information is contained in the Consequences of Failing to Comply web page.

Conditions of sale

All ABS products and services are provided subject to the ABS Conditions of Sale. Any queries relating to these Conditions of Sale should be referred to intermediary.management@abs.gov.au.

Price

Microdata access is priced according to the ABS Pricing Policy and Commonwealth Cost Recovery Guidelines. For details refer to ABS Pricing Policy on the ABS website. For microdata prices refer to the Microdata prices web page.

How to apply for access

Clients wishing to access a CURF should read the How to Apply for Microdata web page. Clients should familiarise themselves with the User Manual: Responsible Use of ABS CURFs and other related microdata information which are available via the Microdata Entry Page, before applying for access.

Australian universities

The ABS/Universities Australia Agreement provides participating universities with access to a range of ABS products and services. This includes access to microdata. For further information, university clients should refer to the ABS/Universities Australia Agreement web page.

Further information

The Microdata Entry page on the ABS website contains links to microdata related information to assist users to understand and access microdata. For further information users should email microdata.access@abs.gov.au or telephone (02) 6252 7714.

Data downloads

Data files

Previous releases

  Data series                             TableBuilder   Microdata Download   DataLab
  Adult Literacy and Life Skills, 2006                   Basic microdata      Detailed microdata
  Aspects of Literacy, 1996                              Basic microdata

History of changes


Glossary


Quality declaration

Institutional environment

Relevance

Timeliness

Accuracy

Coherence

Interpretability

Accessibility

Abbreviations


Previous catalogue number

This release previously used catalogue number 4228.0.30.001.
