Open data refers to sharing research outputs (software and datasets; excludes personal data, and some government data) that are "available under an open (data) license that permits anyone freely to access, reuse and redistribute." See the Open data handbook.
The NZ government has an open data declaration that promotes transparency in data to support more informed decision making. Overseas, more than 160 research-intensive universities, through eight university networks, signed the Sorbonne Declaration on Research Data Rights in January 2020. The declaration sets out the needs and benefits of research data being open by default.
Best practices in open data include:
More and more publishers are supporting or requiring the underlying data or data sets researchers create and use in their published work to be made accessible. You can find some examples of journals and publishers with these data policies here:
To make the most of your dataset, you may be able to make it discoverable and usable by other researchers. Dataset citations are becoming another way of showing your academic impact.
There are a number of repositories where you might be able to store your datasets depending on the size, confidentiality, and field of research your data set relates to. When choosing where to store your data you should consider things such as:
Repository Name | Fees/Cost | Size Limits |
---|---|---|
Figshare | Free for up to 20GB private and unlimited public space; additional fees apply for larger datasets. | Files larger than 5GB require you to contact support in order to upload (up to 250GB) |
Zenodo | Free, but large file uploading and storage will prompt a discussion of donations toward sustainability. | 50GB per dataset, larger files can be discussed |
Mendeley Data | Free | 10GB per dataset |
Dryad | $120 USD for first 50 GB, and $50 USD for each additional 10 GB | Files larger than 300GB will require you to contact support in order to upload |
GitHub (Code storage) | Free to use for public and open source projects. | Recommended size limit of less than 5GB; files exceeding 100MB are not accepted |
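
The Dryad pricing in the table above can be worked through as a small calculation. This is a rough sketch based only on the figures quoted here. The function name `estimate_dryad_fee_usd` is hypothetical, and rounding a partly used 10 GB block up to a whole block is an assumption, so check the repository's current fee schedule before budgeting.

```python
import math


def estimate_dryad_fee_usd(size_gb: float) -> int:
    """Estimate a deposit fee from the figures in the table above.

    Assumes $120 USD covers the first 50 GB and each additional
    10 GB (or part of it, which is an assumption) costs $50 USD.
    """
    if size_gb <= 0:
        raise ValueError("size_gb must be positive")

    base_fee_usd, base_allowance_gb = 120, 50
    extra_fee_usd, extra_block_gb = 50, 10

    # Only the portion above the base allowance attracts extra charges.
    extra_gb = max(0.0, size_gb - base_allowance_gb)
    extra_blocks = math.ceil(extra_gb / extra_block_gb)
    return base_fee_usd + extra_blocks * extra_fee_usd


if __name__ == "__main__":
    for size in (20, 50, 75):
        print(f"{size} GB: ${estimate_dryad_fee_usd(size)} USD")
```

Under these assumptions, a 75 GB dataset is charged the base fee plus three additional blocks.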
For a list of subject-specific data repositories, check Data repositories.
It is important to describe your data properly when you store it in a repository. The more information you give to describe the dataset and its potential, the more discoverable the data will be to other researchers. This information is called metadata, and it is best practice to adhere to a set of metadata standards to make your work discoverable. The Digital Curation Centre (DCC) provides a list of general research data metadata standards and discipline-specific metadata standards, which you can review.
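As an illustration of what a descriptive record might look like before you enter it into a repository form, here is a minimal sketch. The field names, the `REQUIRED_FIELDS` list, the `missing_fields` helper and the example values are all hypothetical placeholders and are not taken from any particular metadata standard; use the fields your chosen standard and repository actually require.

```python
import json

# Hypothetical minimum set of fields; a real metadata standard
# will define its own required elements.
REQUIRED_FIELDS = ["title", "creators", "description", "keywords", "license"]


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values for illustration only.
example_record = {
    "title": "Example dataset title",
    "creators": ["A. Researcher"],
    "description": "What the data is, how it was collected, and its limits.",
    "keywords": ["example", "placeholder"],
    "license": "CC BY 4.0",
    "related_publication": "",  # optional link to the published article
}

if __name__ == "__main__":
    print(json.dumps(example_record, indent=2))
    print("Missing fields:", missing_fields(example_record))
```

Checking a draft record this way before depositing can help catch empty fields that would make the dataset harder to find.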
The information you provide about your data should provide context, enable proper reuse, communicate restrictions or limitations, support research replication and validation, provide links to research publications and, finally, attract disciplinary content management systems, aggregators, publishers and search engines (e.g. Google). Here are some things to consider when entering your dataset into a repository:
Register and share study design and methods before undertaking research, e.g. at clinicaltrials.gov/ (includes NZ studies).