A NSW Government website
Metadata.NSW (beta)

Concept help - Data Set

A Data Set describes a record of data, including any location or time boundaries for the data, that has been captured and is available for use under a specific licence. A Data Set may be included in a Data Catalog, and can reference multiple Distributions that record different parts or formats of the data that are available to download.

A dataset in DCAT is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset does not have to be available as a downloadable file. For example, a dataset that is available via an API can be defined as an instance of dcat:Dataset and the API can be defined as an instance of dcat:Distribution. DCAT itself does not define properties specific to API descriptions. These are considered out of the scope of this version of the vocabulary. Nevertheless, they can be defined in a profile of the DCAT vocabulary.
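The dataset and API relationship described above can be sketched as a JSON-LD style record. This is a minimal illustration only: the class and property names (dcat:Dataset, dcat:Distribution, dcat:distribution, dcat:accessURL, dct:title) come from the DCAT and Dublin Core vocabularies, while every example.org URL, the titles and the function name are hypothetical and are not taken from Metadata.NSW.

```python
import json

# Namespace IRIs for the DCAT and Dublin Core Terms vocabularies.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"


def build_dataset_record():
    """Return a JSON-LD style dict for a dataset that is exposed through an API.

    All identifiers and titles are hypothetical placeholders.
    """
    api_distribution = {
        "@id": "https://example.org/dataset/enrolments/api",
        "@type": "dcat:Distribution",
        "dct:title": "Enrolments API (hypothetical)",
        # accessURL points at the service that gives access to the data.
        "dcat:accessURL": {"@id": "https://example.org/api/enrolments"},
    }
    return {
        "@context": {"dcat": DCAT, "dct": DCT},
        "@id": "https://example.org/dataset/enrolments",
        "@type": "dcat:Dataset",
        "dct:title": "Enrolments (hypothetical)",
        # A dataset can reference one or more distributions.
        "dcat:distribution": [api_distribution],
    }


record = build_dataset_record()
print(json.dumps(record, indent=2))
```

A downloadable file would be described the same way, as a further entry in the dcat:distribution list.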

Fields available on this metadata type

Field ISO definition
Name The primary name used for human identification purposes.
Definition Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. (3.2.39)
Is Federated
Is Not Federable
Version Unique version identifier of this metadata item.
References Significant documents that contributed to the development of the metadata item which were not the direct source for the metadata content.
Origin The source (e.g. document, project, discipline or model) for the item (8.1.2.2.3.5)
Comments Descriptive comments about the metadata item (8.1.2.2.3.4)
Deleted The date after which the item has been soft deleted and is no longer visible in the registry
License Information about the license document under which the dataset is made available.
Rights Information about rights held in and over the dataset.
Release Date Date of formal publication of the dataset.
Modification Date Most recent date on which the dataset was changed, updated or modified.
Frequency The frequency at which the dataset is published.
Spatial Coverage Spatial or geographic coverage of the dataset.
Temporal Coverage The temporal or time period that the dataset covers.
Catalog An entity responsible for making the dataset available.
Landing Page A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information
Contact Point Relevant contact information for the Dataset.
Conforming Specification An established standard to which the described resource conforms.
Item Base

Custom Fields

Field Short definition Long definition
Asset Type [Core] The type of data asset being described

Information may be held as different content types. Information assets should be identified by type to enable assets to be assessed and planned for, as different content types will differ in business impact and value. 

Select the type that best suits the asset.  
 

  • Data Product - ready-to-use, enriched or curated datasets packaged up to deliver actionable insights or business value. For example, a curated data layer.
  • Transactional – information that has been created as part of a business operation or transaction.  
  • Open data - data made publicly available on the internet for anyone to access, use and share. This includes datasets published on data.nsw. Generally, open data is aggregate data that is proactively published and complies with the NSW Government open data policy.
  • Collections & Surveys - data collected from a sample of individuals for the purpose of research and analysis. Also called analytical data.  
  • Reference data - data objects relevant to transactions, consisting of sets of values, statuses or classification schema. For example, regions and their related codes.
Asset Status This field indicates the status of the underlying data asset.

This does not refer to the status of the data asset registration within the IAR. The meaning of each status is as follows:
 

  • In Development - The asset is currently in development and has not yet been released for use.
  • Active - The asset has been released and is actively being used and/or maintained for its intended purpose.
  • Retired - The asset is no longer in use or maintained, and has been officially retired from its active status.
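A system that consumes this field could model the three statuses as an enumeration. The sketch below is illustrative only: the class name, the function names and the lenient matching of labels are assumptions, not part of Metadata.NSW.

```python
from enum import Enum


class AssetStatus(Enum):
    """The three Asset Status values listed above."""
    IN_DEVELOPMENT = "In Development"
    ACTIVE = "Active"
    RETIRED = "Retired"


def parse_asset_status(label: str) -> AssetStatus:
    """Map a free-text label to an AssetStatus.

    Matching ignores case and extra whitespace (an assumed convenience).
    """
    cleaned = " ".join(label.split()).lower()
    for status in AssetStatus:
        if status.value.lower() == cleaned:
            return status
    raise ValueError(f"Unknown asset status: {label!r}")


def is_released_for_use(status: AssetStatus) -> bool:
    """Only an Active asset has been released and is still in use."""
    return status is AssetStatus.ACTIVE


print(parse_asset_status("  active "))
print(is_released_for_use(AssetStatus.IN_DEVELOPMENT))
```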
Security Classification This field is used to indicate the security classification of the dataset, helping to identify the level of access and protection required. Please select the appropriate classification label to ensure that users understand how the data in the dataset should be handled in accordance with NSW Government policies.

The 'Security Classification' field records the NSW Government's classification label that has been deemed appropriate for the data in this dataset. These labels include the NSW Government's Dissemination Limiting Markers (DLM).

This classification is important for users to ensure that sensitive information is protected and shared in compliance with NSW Government policies. When selecting a classification, consider the content of the data and its potential impact if disclosed.

For more information on determining an appropriate classification, see the NSW Government Information Classification, Labelling and Handling Guidelines - Sensitive Information.

Data Sensitivity Type This field indicates the sensitivity of the data concerning health and personal information. Please select the appropriate option to specify whether the asset contains health information, personal information, both, or neither.

This is used to categorise the sensitivity of the data in relation to health and personal information. It helps ensure that appropriate handling and protection measures are applied according to the nature of the data.

What is personal information?

The Privacy and Personal Information Protection Act 1998 (PPIPA) provides for the protection of personal information held by government agencies. 

Personal Information is information or an opinion (including information or an opinion forming part of a database and whether or not recorded in a material form) about a person whose identity is apparent or can reasonably be ascertained from the information or opinion. It is not restricted to information that clearly identifies a person but may include information which leads to the identification of an individual when considered in association with other available information. It covers information held in paper or electronic records and extends to images, body samples and biometric data such as fingerprints. 

There are a number of exceptions to the definition of personal information. Those most relevant to information held by the department are information about: 
 

  • an individual who has been dead for more than 30 years, 

  • an individual that is contained in a publicly available publication, 

  • an individual that is contained in a public interest disclosure or that has been collected in the course of investigation arising out of a public interest disclosure, or 

  • an individual’s suitability for appointment or employment as a public sector official. 

What is health information?

The Health Records and Information Privacy Act 2002 (HRIPA) regulates the collection and handling of health information by public sector agencies and private organisations. 

Health information is personal information that is information or an opinion about an individual’s physical or mental health or disability or the provision of a health service to an individual. It includes personal information about an individual collected in connection with the donation of body parts and genetic information collected in providing a health service that is or could be predictive of the health of that individual or a genetic relative of the individual. It includes healthcare identifiers. 

The HRIPA defines personal information on the same terms as the PPIPA with substantially the same exceptions as that Act. Health information, therefore, does not include information about: 
 

  • an individual who has been dead for more than 30 years, 

  • an individual that is contained in a publicly available publication, 

  • an individual that is contained in a public interest disclosure or that has been collected in the course of investigation arising out of a public interest disclosure, or 

  • an individual’s suitability for appointment or employment as a public sector official. 

Data Owner The role title of the position accountable for the asset within the organisation.
  • This is not the person's name.
  • There should only be one owner assigned.

 

Data Steward The role title of the person nominated by the Data Owner to be responsible for the operational management of the asset.

The Data Steward has detailed and expert knowledge of their data and provides advice on appropriate use and interpretation. Where there is more than one data steward, include only the primary data steward role title in this field.  Do not include the person's name.

Subject Matter Expert 1 Provide the position or name of the Subject Matter Expert. Usually advised by the Steward.
Subject Matter Expert 2 Use this field where there is more than one subject matter expert for the asset.
Subject Matter Expert 3 Use this field where there is more than one subject matter expert for the asset.
External Custodian Where the data is sourced from an external agency, list the agency that owns the data and its preferred contact details.
Purpose A descriptive summary of the intentions with which the asset was developed.

A descriptive summary of the intentions with which the data asset was developed and the uses proposed for it. This is why the data was collected in the first place, including what the data is used for from a business perspective.

Please do not repeat the information in the description.

See below for suggested text for this field.

The purpose of this asset is to [Write a brief summary of the reason why this data was collected. Include any reporting requirements, strategic outcomes, operational reasons, enabled functions etc. Include any links to related documentation.] 
Data Quality Statement A link to the data quality statement

A link to the data quality statement.  

This field is created by the Department of Education.

Data Notes Use this field to provide additional context and insights about the dataset. Include any limitations, considerations, or relevant information that can help users understand the dataset's utility and constraints.

This field is intended for capturing essential context and considerations regarding the dataset that may not be covered in other fields. Use this space to outline any limitations or restrictions on the use and analysis of the data, as well as any important details that users should be aware of to fully understand its applicability.

The information could include:
 

  • Specific conditions under which the data is valid.
  • Known issues or biases in the data that could affect analysis.
  • Recommendations on the appropriate use cases or analyses for the dataset.
  • Any legal or ethical considerations that may limit the use of the data.

Providing comprehensive notes in this field will help users make informed decisions about the dataset's relevance and suitability for their purposes, ultimately enhancing the data's value and usability.

See below for examples:
 

  •  The number of preschool students and children in early intervention classes are not included in the full-time equivalent (FTE) total.
  • Figures, except for total number of schools, are consistent with ABS Schools Australia (cat 4221.0) counting rules, and ratios are calculated using FTE students and teachers.
  • Data is from the census of both government and non-government students, undertaken in August each year.
Data Collection Details This field is used to record how the data is collected. Include information that will help the user understand how the data was collected, including the methods, systems, or processes used to gather the data.

Use this field to capture information about how the dataset is collected. If the collection details are available elsewhere, use this field to provide a link to these details.

These details may include specific methods used (e.g., surveys, interviews, annual collections), the systems employed for data collection (e.g., databases, software applications), and the processes followed to collect the data.

Providing detailed information on data collection is important for understanding the dataset's context and quality. When filling out this field, be as descriptive as possible to give users a clear understanding of the data collection approach and its implications for analysis and use.

Data Sharing Details Capture information about any agreements related to the sharing of data within the Department or with external organisations/agencies (both incoming and outgoing data). Please specify any details regarding the conditions under which the data is shared with external parties, and any links or references to the data sharing agreement.

Use this field to document any agreements or arrangements governing the sharing of the data with external organisations or agencies. This includes any contracts, data sharing agreements, ministerial directives, or memoranda of understanding (MoUs) that outline the terms for sharing the data with external parties.

In this field, you should provide:
 

  • A brief description of each relevant agreement or arrangement.
  • The external parties involved in the sharing of the data.
  • Any specific conditions, restrictions, or obligations associated with the external sharing.
  • Links to the full agreements or relevant documentation, if available.

If the data is shared externally, the suggested text is:
 

This data is shared externally with [describe who the data is shared with].

We share the data for the below purpose:

  • [Describe the purpose and details of why the data is shared]

The agreements/Acts/legislation that govern this sharing are:

  • [Refer to the agreement and provide a link/reference if available].


If the data is only shared internally, the suggested text is:
 

This data is only shared internally with [describe who the data is shared with].

We share the data for the below purpose:

  •  [Describe why the data is shared internally].

[Provide links/references to any agreements if available].

 


If the data is not shared internally or externally, the suggested text is:
 

This data is not shared internally or externally.
Legal Authority The applicable legal authorities under which the organisation is permitted to collect, create, receive, use, or disclose the data. Please specify the relevant legislation, policies, or agreements that govern the handling of this data.

This 'Legal Authority' field captures all legal mandates pertaining to the collection, creation, receipt, use, or disclosure of the data asset. 

See below for suggested text for this field.
 

  • [Choose from the applicable legal authorities from the following bullet points, and remove non-applicable information]
  • The data is collected with consent. The relevant collection notices and consent forms are here:
    [...provide links to collection notices and consent forms.]
  • Consent is not required under the relevant privacy legislation, as the data collection is permitted under another Act or Law:
    […provide a reference to the relevant law and provision.]
    Note that in some instances consent is not required if another Act or law applies. (See s 25(b) of the (NSW) Privacy and Personal Information Protection Act 1998 (PPIPA) or Sch 1, s4(4)(c) NSW Health Records and Information Privacy Act 2002 (HRIPA)).
  • The data is collected under a statutory research exemption and consent is not required in accordance with s 27B of the PPIPA or Sch 1, s 10(1)(f) HRIPA.  

  • The employee data is collected for operational purposes per the employee contract.

  • [If none of the above options apply, provide a reference to the relevant legal authority. This might be another exemption under privacy legislation but cannot be a contract or agreement.]

[If applicable, include any additional useful information from the collection notice or about the consent.] 

Permitted Primary Purpose This field is used to capture the permitted primary purpose for which the personal information was collected.

This relates to ss 17 & 18 of the (NSW) PPIPA and Sch 1, ss 10 & 11 of the HRIPA. (Limits on use and disclosure of personal information).

See below for the suggested format for this field.
 

[Describe permitted primary purpose here. This can usually be found in the collection notice and consent form or under legislation. ]

Examples of how the Department uses this data:

[Give 1-2 examples of how the Department uses this data (if any), making sure that the use aligns to the purposes in the Collection Notice.]

Permitted Secondary Purpose A secondary purpose is directly related to the purpose for which personal information was collected. Where appropriate, use the suggested text below to populate this field.

If the data does not contain health information or personal information, include the below text:
 

The Privacy Commissioner has advised that a directly related purpose “is a purpose that is very closely related to the purpose for collection” and “would be the type of situation that people would quite reasonably expect to occur with their personal information”.

The Privacy Commissioner has specifically suggested that “quality assurance activities such as monitoring, evaluating, auditing” could be directly related secondary purposes.



If the data does contain health information or personal information, include the below text:
 

This data set contains health information and sensitive personal information which limits the permitted secondary purposes for use and disclosure by the department, as outlined below.
 

  • Health information can only be used or disclosed for the purpose for which it was collected, or for a directly related purpose that a person would expect. Otherwise, you would generally need their consent (see HPP 10). 
    Remove this bullet point if the data doesn't contain health information.
  • Agencies cannot disclose sensitive personal information without a person’s explicit consent, for example, information about ethnic or racial origin, political opinions, religious or philosophical beliefs, sexual activities or trade union membership. They can only disclose sensitive information without consent in order to deal with a serious and imminent threat to any person’s health or safety (see IPP 12). 
    Remove this bullet point if the data doesn't contain sensitive personal information.

 

Data Access Details Use this field to provide information on how users can access the data, including necessary links, request procedures, and general access rules.

This field provides users with information on how to access the data, including any existing publications or reports that may meet their needs prior to making a data request.

See below for suggested text for this field.
 

The data is accessible for authorised users via:

  • SCOUT 
    • [Provide the SCOUT report name and a link].
    • To access SCOUT reports, submit a request via the SCOUT access link.
  • [List any other available publications, reports, dashboards or derived assets that may be accessible to a user]

[Remove the above section if data is only accessible by submitting a request form.]

Custom data can be requested through the Data Services request form. Data Services will assess your request, facilitate authorisation, and process your request as appropriate.

[If any other processes are used that have not been listed, please add them to this section].

The following general access and usage rules apply for requested data.
 

  • All requests must be authorised by the Data Owner/agreed delegate.
  • All external research requests must follow the department’s processes where a SERAP is required.
  • Internal linking of data (using identifiers such as name or SRN) with deidentified outputs may be permissible if certain privacy measures are put in place, such as the separation principle. An assessment will be required.
  • If data is deidentified, it is not subject to privacy legislation, however care must be taken to ensure that the data is not re-identifiable. 
  • The identity of students/individuals may be reasonably ascertainable depending on the level of granularity and combination of variables. For example, if school/network level data is proposed to be provided, cell suppression or other deidentification techniques may need to be applied. For further guidance please refer to the IPC Fact Sheet on de-identification and the IPC Fact Sheet on ‘Reasonably Ascertainable Identity’.
  • [Outline any other rules that apply to accessing this data. Consider privacy and ways to work within the law using methods such as the separation principle or de-identification. The IPC Fact Sheets above also provide useful advice on this matter.]
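The cell suppression mentioned in the rules above can be illustrated with a short sketch. It is not the department's method: the threshold of 5, the decision to leave zero counts visible, and all names and figures are assumptions made for illustration. The appropriate threshold and treatment should come from the relevant guidance and the required assessment.

```python
from typing import Dict, Optional

# Assumed threshold for illustration only; not taken from any NSW policy.
SUPPRESSION_THRESHOLD = 5


def suppress_small_cells(
    counts: Dict[str, int],
    threshold: int = SUPPRESSION_THRESHOLD,
) -> Dict[str, Optional[int]]:
    """Replace counts between 1 and threshold - 1 with None (suppressed).

    Zero counts are left as-is here, which is itself a policy choice.
    """
    return {
        group: (None if 0 < count < threshold else count)
        for group, count in counts.items()
    }


# Hypothetical school-level counts.
school_counts = {"School A": 120, "School B": 3, "School C": 0}
print(suppress_small_cells(school_counts))
```

This covers primary suppression only. If row or column totals are published alongside the cells, a suppressed value may still be recoverable by subtraction, so further suppression or another de-identification technique may be needed.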

 

 

Publications and Outputs Use this field to record the process or conditions required for any release of publications or outputs from using the data.

“Publication” refers to any method that will distribute data or information gained from the data outside the immediate project/work team. Methods may include graphics, tables, presentations, reports, academic journals, and cabinet submissions by any media.

See below for suggested text for this field.
 

The following conditions apply for any publications or outputs using this data. Note that specific conditions will be considered prior to release of the requested data.
 

  • [Choose from the applicable conditions from the following bullet points]
  • For use and disclosure of personal and health information, see s 11 of the HRIP Act and ss 17, 18 and 19 of the PPIP Act.
  • Publications must not identify any person or school.
  • All artefacts created using this data must cite the source of the data.
  • Information is not to be published in a publicly available publication.
  • Publications using the data must be reviewed by the Data Owner prior to release.
  • No attempt will be made to rank or compare schools in accordance with section 18A of the Education Act 1990 (NSW).
  • Contractors may only submit reports to the department per the contract conditions.
  • No external publications will be developed using this data.
  • Data should not be provided to another party outside the requester’s immediate work unit/team.
  • If planning to commission a third party to conduct data analysis:
    • Consult with the Data Owner on the planned work
    • Include appropriate contractual obligations regarding privacy compliance within the services contract
  • Other conditions:
    [Detail any other conditions applicable to this asset.]

 

Asset Documentation This field contains links or references to file locations that provide documentation related to administration and management of the dataset.

This may include TRIM folder links, Confluence links, approval records, briefs, metadata documentation or any other relevant materials.

Providing this documentation allows internal users to easily find and locate related information about the asset, reducing the risk of documentation being lost and providing additional context.

Connected Systems Use this field to record the systems that collect, access, and use the data. Include any relevant systems that interact with or rely on this dataset.

This field documents the systems that collect, access, and use the data. This information is important for understanding the data ecosystem and how different systems interact with the dataset.

Providing information about connected systems helps users understand the context in which the data is used and the potential implications for data management and sharing.

Keyword Word(s) or tags that describe the data asset's content.

These word(s) or terms describe the topic(s) covered by the data asset. It answers the question “what is this data asset about?” and supports data discovery. When selecting keywords, consider what search terms your users may choose when searching for the data asset.

Where multiple keywords apply, separate the terms with a comma ‘,’.
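A consumer of this field would typically split the comma-separated value into individual terms. The sketch below shows one way to do that. The function name is hypothetical, and trimming whitespace, dropping empty entries and removing case-insensitive duplicates are assumptions rather than documented registry behaviour.

```python
from typing import List


def split_keywords(raw: str) -> List[str]:
    """Split a comma-separated Keyword value into a clean list of terms.

    Whitespace is trimmed, empty entries are dropped, and duplicates are
    removed case-insensitively, keeping the first spelling seen.
    """
    seen = set()
    keywords = []
    for part in raw.split(","):
        term = part.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            keywords.append(term)
    return keywords


print(split_keywords("enrolments, schools ,, Schools,attendance"))
```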

Alert Message Use this field to highlight any important information or updates that users need to be aware of. This message should generally be displayed at the top of the item page.

This alert serves as a key communication tool to inform users about important information, updates, or notices related to the item. It can be used to convey critical messages that users need to be aware of.

Possible reasons to use the alert include:
 

  • The item is in draft form and not yet finalised.
  • The dataset is due to be retired and will no longer be available.
  • The dataset is currently under development and cannot be used.
  • The dataset is historical and will no longer be updated, but has been made available for historical purposes.
Admin Only This field is only to be used by administrators; the information is not to be shared with the public. This may include metadata working files or TRIM links to relevant information. This field is added by the Department of Education.
Data Custodian (DoE FIELD TO BE POSSIBLY DECOMMISSIONED) [Core] The custodian(s) of the data asset.

Official Definition

A representation of a dataset in a catalog. Data Catalog Vocabulary (DCAT): 5.3 Class: Dataset