Concept help - Data Set
A Data Set describes a record of data, including any location or time boundaries for the data, that has been captured and is available for use under a specific licence. A Data Set may be included in a Data Catalog, and can reference multiple Distributions that record different parts or formats of the data that are available to download.
In DCAT, a dataset is defined as a "collection of data, published or curated by a single agent, and available for access or download in one or more formats". A dataset does not have to be available as a downloadable file. For example, a dataset that is available via an API can be defined as an instance of dcat:Dataset and the API can be defined as an instance of dcat:Distribution. DCAT itself does not define properties specific to API descriptions; these are considered out of scope for this version of the vocabulary. Nevertheless, they can be defined in a profile of the DCAT vocabulary.
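To make the relationship between a dataset and its distributions concrete, the following is a minimal sketch of a DCAT-style description written as a JSON-LD structure in Python. The identifiers, titles and URLs are hypothetical and exist only for illustration; the property names (dcat:distribution, dcat:downloadURL, dcat:accessURL, dct:title, dct:format, dct:license) come from the DCAT and Dublin Core Terms vocabularies. It is not how this registry stores or exports a Data Set.

```python
import json

# All identifiers and URLs below are hypothetical examples.
dataset = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "https://example.org/dataset/school-enrolments",
    "@type": "dcat:Dataset",
    "dct:title": "School enrolments",
    "dct:license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
    "dcat:distribution": [
        {
            # A downloadable file is one kind of distribution.
            "@type": "dcat:Distribution",
            "dct:title": "CSV download",
            "dct:format": "CSV",
            "dcat:downloadURL": {"@id": "https://example.org/files/enrolments.csv"},
        },
        {
            # An API can also be described as a distribution of the same dataset.
            "@type": "dcat:Distribution",
            "dct:title": "Enrolments API",
            "dcat:accessURL": {"@id": "https://example.org/api/enrolments"},
        },
    ],
}


def distribution_urls(ds):
    """Return the download or access URL of each distribution, in order."""
    urls = []
    for dist in ds.get("dcat:distribution", []):
        link = dist.get("dcat:downloadURL") or dist.get("dcat:accessURL")
        if link:
            urls.append(link["@id"])
    return urls


if __name__ == "__main__":
    print(json.dumps(dataset, indent=2))
    print(distribution_urls(dataset))
```

The dataset carries the descriptive fields (title, licence), while each distribution only records how one form of the data can be reached.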
Fields available on this metadata type
Field | ISO definition |
---|---|
Name | The primary name used for human identification purposes. |
Definition | Representation of a concept by a descriptive statement which serves to differentiate it from related concepts. (3.2.39) |
Is Federated | |
Is Not Federable | |
Version | Unique version identifier of this metadata item. |
References | Significant documents that contributed to the development of the metadata item which were not the direct source for the metadata content. |
Origin | The source (e.g. document, project, discipline or model) for the item (8.1.2.2.3.5) |
Comments | Descriptive comments about the metadata item (8.1.2.2.3.4) |
Deleted | The date after which the item has been soft deleted and is no longer visible in the registry |
License | Information about the license document under which the dataset is made available. |
Rights | Information about rights held in and over the dataset. |
Release Date | Date of formal publication of the dataset. |
Modification Date | Most recent date on which the dataset was changed, updated or modified. |
Frequency | The frequency at which the dataset is published. |
Spatial Coverage | Spatial or geographic coverage of the dataset. |
Temporal Coverage | The temporal or time period that the dataset covers. |
Catalog | An entity responsible for making the dataset available. |
Landing Page | A Web page that can be navigated to in a Web browser to gain access to the dataset, its distributions and/or additional information |
Contact Point | Relevant contact information for the Dataset. |
Conforming Specification | An established standard to which the described resource conforms. |
Item Base | |
Custom Fields
Field | Short definition | Long definition |
---|---|---|
Security Classification | [Core] Security classification of information |
[Core] Security classification of information
The security classification applied to the asset as specified by the Australian Government Protective Security Policy Framework (PSPF)
The Australian Government uses 3 security classifications:
* PROTECTED
* SECRET
* TOP SECRET
All other information from business operations and services is OFFICIAL or, where it is sensitive, OFFICIAL: Sensitive. [NB: the old UNCLASSIFIED classification was renamed OFFICIAL (see PSPF v2018.1, Sep 2018)]
The originator of the data asset is responsible for applying the relevant Security Classification.
This attribute relates to Sensitive Data and Access Rights.
|
Sensitive Data | [Additional] The type of sensitivity of the data asset, where applicable. |
[Additional] The type of sensitivity of the data asset, where applicable.
If Security Classification, as specified by the Australian Government Protective Security Policy Framework (PSPF), has value “OFFICIAL: Sensitive”, provide type of sensitivity. Where multiple sensitivity types exist within the data asset, provide the most restrictive dissemination limiting marker (DLM).
This attribute relates to Security Classification and Access Rights.
|
Data Custodian | [Core] The custodian(s) of the data asset. | |
Contact Point | [Core] Key data roles. The relevant contact information from which information for the asset can be obtained |
[Core] Key data roles. The relevant contact information from which information for the asset can be obtained:
Directorate, Asset owner position, Asset steward position, Current Owner, Current Steward, Subject matter expert, External Custodian
|
Keyword | [Core] Word(s) or tags that describe the data asset subject matter. |
[Core] Word(s) or tags that describe the data asset subject matter.
These word(s) or terms describe the topic(s) covered by the data asset. It answers the question “what is this data asset about?” and supports data discovery. When selecting keywords, consider what search terms your users may choose when searching for the data asset.
It is recommended to include at least one term from the Australian Governments’ Interactive Functions Thesaurus (AGIFT) that covers words and terms related to Australian Government agencies’ core business functions and activities. Also include words such as Indigenous, Disability or Gender if appropriate to better support the Government’s priority data activities.
Where multiple keywords apply, separate the terms with a comma ‘,’.
|
Resource Type | [Core] The type of data asset being described |
[Core] The type of data asset being described
This attribute specifies the type of data asset. The most common types of data asset applicable are listed below with their definitions. (This attribute could be supplemented by attribute Format.)
* collection: an aggregation of items. The term collection means that the resource is described as a group; its parts may be separately described and navigated.
* dataset: structured information encoded in lists, tables, databases, etc., which will normally be in a format available for direct machine processing. For example - spreadsheets, databases, GIS data, midi data. Note that unstructured numbers and words would be considered as text.
* image: the content is primarily symbolic visual representation other than text. For example - images and photographs of physical objects, paintings, prints, drawings, other images and graphics, animations and moving pictures, film, diagrams, maps, musical notation. Note that image may include both electronic and physical representations.
* interactive resource: a resource which requires interaction from the user to be understood, executed, or experienced. For example - forms on web pages, applets, multimedia learning objects, virtual reality.
* model: an abstraction of the real thing, i.e. some generalisation and interpretation. Models could be considered a symbolic representation. Examples include performance models, cost models, mechanical models, etc.
* service: a system that provides one or more functions of value to the end-user. Examples include: a photocopying service, a banking service, an authentication service, interlibrary loans, a Z39.50 or Web server.
* software: a computer program in source or compiled form which may be available for installation non-transiently on another machine. For software which exists only to create an interactive environment, use interactive instead.
* sound: a resource whose content is primarily audio or intended to be realised in audio. For example - music, speech, recorded sounds. This category includes musical notation, including score, which is unrealised in sound.
|
Purpose | [Additional] A descriptive summary of the intentions with which the asset was developed. |
[Additional] A descriptive summary of the intentions with which the data asset was developed and for which it is proposed to be used. (This field supplements the attribute Description.)
|
Legal Authority | [Additional] All legal mandates under which the data asset was collected, created, received, used or disclosed. |
[Additional] All legal mandates under which the data asset was collected, created, received, used or disclosed.
Legal mandates could include Memorandum of Understanding; Legislation; Machinery of Government; Government policies or acts; etc. It could include the authority, e.g. (Australian Government) Federal Register of Legislation or Data Availability and Transparency Act 2022.
Where multiple legal mandates exist, separate their URLs with a comma ‘,’.
This information may be sourced through the agency’s legal department.
|
Admin Only | This field is only to be used by administrators; the information is not to be shared with the public. This may include metadata working files and TRIM links to relevant information. This field is added by the Department of Education. |
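Several of the custom fields above carry simple rules: Security Classification takes one of a fixed set of values, Sensitive Data is expected when the classification is OFFICIAL: Sensitive, and Keyword and Legal Authority hold comma-separated values. The sketch below shows one way those rules could be checked before a record is entered. The function names and the dictionary layout are hypothetical, and the registry itself may apply different or additional checks.

```python
# Values taken from the Security Classification description above.
SECURITY_CLASSIFICATIONS = (
    "OFFICIAL",
    "OFFICIAL: Sensitive",
    "PROTECTED",
    "SECRET",
    "TOP SECRET",
)


def split_terms(value):
    """Split a comma-separated field into trimmed, non-empty terms."""
    return [term.strip() for term in value.split(",") if term.strip()]


def check_custom_fields(record):
    """Return a list of problems found in a record (hypothetical dict layout)."""
    problems = []
    classification = record.get("Security Classification")
    if classification not in SECURITY_CLASSIFICATIONS:
        problems.append("Security Classification is missing or not a recognised value")
    # Sensitive Data applies when the classification is OFFICIAL: Sensitive.
    if classification == "OFFICIAL: Sensitive" and not record.get("Sensitive Data"):
        problems.append("Sensitive Data should state the type of sensitivity")
    return problems


if __name__ == "__main__":
    record = {
        "Security Classification": "OFFICIAL: Sensitive",
        "Keyword": "education, Indigenous, enrolments",
    }
    print(check_custom_fields(record))
    print(split_terms(record["Keyword"]))
```

The same `split_terms` helper can be applied to Legal Authority, where multiple URLs are separated with a comma.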
Official Definition
A representation of a dataset in a catalog. Data Catalog Vocabulary (DCAT): 5.3 Class: Dataset