Open data refers to sharing research outputs (such as software and datasets, but excluding personal data and some government data) that are "available under an open (data) license that permits anyone freely to access, reuse and redistribute." See the Open data handbook.
The NZ government has an open data declaration that aims to provide transparency in data to support more informed decision making.
Best practices in open data include:
More and more publishers support or require that the underlying data or datasets researchers create and use in their published work be made accessible. You can find some examples of journals and publishers with these data policies here:
To make the most of your dataset, you may be able to make it discoverable and usable by other researchers. Citations of datasets are becoming another way of showing your own academic impact.
There are a number of repositories where you might be able to store your datasets depending on the size, confidentiality, and field of research your dataset relates to. When choosing where to store your data you should consider things such as:
|Repository Name||Fees/Cost||Size Limits|
|Figshare||100GB free, additional fees apply for larger datasets||1 TB per dataset|
|Zenodo||Free||50GB per dataset, larger files can be discussed|
|Mendeley Data||Free||10GB per dataset|
|Dryad||$120 USD for first 20 GB, and $50 USD for each additional 10 GB||Files larger than 1GB will require you to contact support in order to upload|
|GitHub (Code storage)||Free to use for public and open source projects.||Repository limit of 1GB, limit of files exceeding 100MB in size|
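As a rough aid when comparing a dataset against the size limits in the table, the sketch below totals the size of a local folder and lists the repositories whose per-dataset limit it fits within. The function names and the `DATASET_LIMITS_GB` dictionary are hypothetical helpers written for this illustration. The figures are copied from the table and may be out of date, so check each repository's current terms. Dryad and GitHub are left out because the table expresses their limits differently (fee tiers and per-file limits).

```python
from pathlib import Path

# Per-dataset limits in GB, copied from the table above (assumed current;
# verify with each repository before relying on them).
DATASET_LIMITS_GB = {
    "Figshare": 1000,      # 1 TB per dataset
    "Zenodo": 50,
    "Mendeley Data": 10,
}


def dataset_size_gb(folder):
    """Return the total size of all files under folder, in GB (10**9 bytes)."""
    total_bytes = sum(
        p.stat().st_size for p in Path(folder).rglob("*") if p.is_file()
    )
    return total_bytes / 10**9


def repositories_that_fit(size_gb, limits=DATASET_LIMITS_GB):
    """Return the repositories whose per-dataset limit is at least size_gb."""
    return [name for name, limit in limits.items() if size_gb <= limit]
```

For example, `repositories_that_fit(dataset_size_gb("my_dataset"))` would report which of the three repositories could hold the folder `my_dataset` (a placeholder path). Size is only one of the considerations; confidentiality and field of research still need to be weighed separately.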
It is important to properly describe your data when you store it on a repository. The more information you give to adequately describe the dataset and its potential, the more discoverable the data will be to other researchers. This information is what we call metadata, and it is best practice to adhere to a set of metadata standards to make your work discoverable. The Digital Curation Centre (DCC) provides a list of general research data metadata standards and discipline-specific metadata standards which you can review.
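To make the idea of a metadata record concrete, here is a minimal sketch of one way to draft a description of a dataset and check that nothing essential is missing before upload. The field names, the `REQUIRED_FIELDS` list and the `missing_fields` helper are assumptions made for this illustration and do not come from any particular standard. In practice, use the fields that your chosen repository or discipline-specific standard requires.

```python
import json

# Hypothetical field names, loosely modelled on common descriptive metadata.
# Replace them with the schema your repository or discipline standard uses.
REQUIRED_FIELDS = ("title", "creators", "description", "license", "keywords")


def missing_fields(record):
    """Return the required fields that are absent or empty in record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Placeholder values throughout; none of this describes a real dataset.
record = {
    "title": "Example survey responses",
    "creators": ["A. Researcher"],
    "description": "Anonymised responses collected for an example study.",
    "license": "CC-BY-4.0",
    "keywords": ["example", "survey"],
    "related_publication": "",  # optional link to the paper, if there is one
}

print(missing_fields(record))        # lists any required fields left empty
print(json.dumps(record, indent=2))  # the record as JSON, ready to review
```

Keeping the record in a plain structured form like this makes it easier to copy the same description into a repository's deposit form, or to convert it later to whichever metadata standard applies.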
The information you provide about your data should give context, enable proper reuse, communicate restrictions or limitations, support research replication and validation, link to research publications, and finally attract disciplinary content management systems, aggregators, publishers and search engines (e.g. Google). Here are some things to consider when entering your dataset into a repository: