Synthetic Health Data Governance Framework (SHDGF)

Appendix 2

Glossary

Clear, concise definitions for key terms used throughout the Framework.

Use while reading

Keep this open while moving through Steps 1–6 and the other appendices.

Governance-friendly

Definitions reflect the Framework’s usage (not legal advice).

A

Accountable decision-maker

Usually the Data Sponsor (Executive Director level) or their delegate, or the Data Custodian. For complex data use and sharing proposals it may be a Chief Executive or a Deputy Secretary.

Aggregated data

Data produced by grouping information into categories, typically with a combined count within each category.
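
As an illustration only, the grouping described above can be sketched in a few lines of Python. The records, the category fields (age band and sex) and the function name `aggregate` are hypothetical and are not part of the Framework.

```python
from collections import Counter

# Hypothetical unit records for five fictitious people: (age_band, sex).
unit_records = [
    ("30-39", "F"),
    ("30-39", "F"),
    ("30-39", "M"),
    ("40-49", "F"),
    ("40-49", "F"),
]

def aggregate(records):
    """Group records into categories and return a combined count per category."""
    return dict(Counter(records))

aggregated = aggregate(unit_records)

# Print one line per category with its count.
for category, count in sorted(aggregated.items()):
    print(category, count)
```

The aggregated result holds one count per category rather than one row per person, which is the difference between aggregated data and unit record data.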

API

Application Programming Interface.

APPs

Australian Privacy Principles, found in the Privacy Act.

Attribute disclosure

When new facts can be learned or inferred about an individual from a dataset.

C

Collection

A “collection” occurs when information comes into the possession or control of an organisation.

Confidentiality Undertaking

A document containing undertakings made by a data recipient regarding the handling of shared data; may be required before data sharing.

D

Data

Facts, statistics, instructions, concepts or other information able to be communicated, analysed or processed. May include or exclude personal, health or special category information.

Data asset or dataset

A body of information managed as a single unit, recognised as valuable and enabling an organisation to perform its functions.

Data breach

Loss of, unauthorised access to, or unauthorised disclosure of personal information. A breach is “notifiable” if it is likely to result in serious harm.

Data Custodian

Makes decisions about the management of, access to and release of a data asset, including its quality and its registration or cataloguing.

Data fidelity

A measure of the accuracy, completeness, reliability and consistency of data in representing real-world subjects.

Data masking

Modifying, obscuring or replacing original data for security or confidentiality purposes.
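
One simple form of masking, shown here as a minimal sketch, replaces all but the last few characters of an identifier with a mask character. The function name `mask_identifier`, its parameters and the sample value are assumptions made for illustration; the Framework does not prescribe a particular masking technique.

```python
def mask_identifier(value, visible=4, mask_char="*"):
    """Obscure all but the last `visible` characters of a string."""
    if visible <= 0:
        # Nothing should remain readable.
        return mask_char * len(value)
    # Number of leading characters to hide (never negative).
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]

# A made-up ten-digit identifier, not a real record number.
masked = mask_identifier("1234567890")
print(masked)
```

Masking of this kind hides the original value from casual view but does not, on its own, establish that the data is de-identified.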

Data Owner

The person or organisation responsible for creating the data and exercising authority over it. May delegate authority to a Data Custodian.

Data Provider

The organisation holding and controlling the source health data involved in a synthetic health data request.

Data Requestor

The organisation requesting generation of synthetic health data from source data held by one or more organisations.

Data Sponsor

Undertakes data ownership on behalf of an organisation and ensures appropriate governance; may have authority to approve data sharing.

Data Steward

Manages a data asset day to day on behalf of the Data Sponsor, ensuring data quality and standards, and supporting Custodians and Sponsors.

Data utility

A measure of the usefulness or value of data in achieving a goal in a particular context.

De-identified data

Data where a person’s identity is no longer apparent or reasonably ascertainable after applying de-identification techniques.

Disclosure

Provision of personal information to another party outside an organisation.

DSA

Data Sharing Agreement.

DUA

Data Use Agreement.

Dummy data

Placeholder or substitute data fabricated to mimic the structure of real data for testing; not meaningful for analysis.

F

Fake data

Artificially generated data; includes dummy data, mock data and synthetic health data.

Five Safes

A framework for assessing and managing privacy risks when sharing data within a controlled setting.

H

Health consumer

An individual who uses or will use health services. The term also covers their family and carers.

Health information

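Personal information about a person’s health or disability, the health services provided or to be provided to them, their expressed wishes about future health services, organ or tissue donation, genetic information, or healthcare identifiers, as well as other personal information collected in the course of providing a health service.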

HREC

Human Research Ethics Committee.

I

Identity disclosure

Occurs when data is re-identified and a person’s identity can be linked to a record.

Information

See “data”.

Insights

Information derived from data once processed or analysed.

M

Membership disclosure

Occurs if it can be determined whether an individual’s data was in the source dataset used to generate a synthetic dataset.

Mock data

Simulated or fictitious data not created from real data, but may replicate structure or format.

N

NHMRC

National Health and Medical Research Council.

O

OAIC

Office of the Australian Information Commissioner.

Output

Results generated from data use, such as analyses, insights, reports or derived information.

P

Personal information

Information about a person who is reasonably identifiable. It includes information or opinion, whether true or false and whether recorded or not, and may relate to living or deceased persons.

Perturbation

Modifying data by making small changes (e.g., adding noise) to obscure original values while preserving statistical properties.
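
A minimal sketch of noise-based perturbation follows, assuming zero-mean Gaussian noise and a fixed random seed for repeatability. The function name `perturb`, the `scale` and `seed` parameters and the sample values are hypothetical; choosing a noise distribution and scale for real data is a separate risk and utility decision that this sketch does not address.

```python
import random
import statistics

def perturb(values, scale=1.0, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `scale` to each value."""
    rng = random.Random(seed)  # seeded so the example is repeatable
    return [v + rng.gauss(0.0, scale) for v in values]

# Made-up measurements, not real patient data.
original = [120.0, 135.0, 128.0, 142.0, 131.0, 125.0, 138.0, 129.0]
noisy = perturb(original, scale=2.0)

# Individual values change, while the mean stays close to the original.
print(round(statistics.mean(original), 1), round(statistics.mean(noisy), 1))
```

Because the noise has a mean of zero, summary statistics such as the mean are approximately preserved, while each individual value differs from its original.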

PIA

Privacy Impact Assessment.

Privacy Act

Privacy Act 1988 (Cth).

R

Real data

Data relating to actual people, places or events.

Redaction

Permanently removing or concealing data for confidentiality.

S

Secondary purpose / secondary use

Using personal information for a purpose other than the primary purpose of collection.

Sensitive information

A subset of personal information including racial origin, beliefs, sexual orientation, criminal record, health and genetic information, and some biometric data.

Sharing

The provision of data by a Data Provider to a Data Requestor or End User.

Source data

The original data held by the Data Provider from which synthetic health data will be generated.

Statistical disclosure risk

Risk that an individual’s identity or new information about them can be revealed, including attribute and identity disclosure.

Statistical properties

Characteristics of a dataset that can be measured, analysed or interpreted.

Synthetic health data

Artificially generated data that mimics the structure and statistical properties of real health data, and uses real health data as input.

Synthetic health data request

Any request to generate or access synthetic health data for a project or multiple potential purposes.

U

Unit record data

Data at the level of a single observation relating to an individual or entity.

Use

The use of personal information by a person within an organisation.

V

Very low risk

A level of re-identification risk that is so low as to make identification highly impractical.