Research data management

Document, organise and describe

Organise and describe your research material so that you and your research collaborators can locate data efficiently when needed and understand the processes followed during data collection and analysis. Include documentation about the data within headers and filenames or in a structured document. This contributes to efficient, long-term access and enables collaboration.

This cartoon video (4 mins 40 secs) from the New York University Health Sciences Library shows what can happen when research data is not described adequately and how this can impact future research.

What is involved in organising and describing the research?

  • Name and organise files. Early in the research, develop a consistent way of naming the files you create and store, so that you and your colleagues can identify files and manage version control. As you revise and update files, name them to indicate date and version and to group them in a hierarchy, eg YYYYMMDD_filename_versionnumber. Keep filenames short, as long names can be rejected by some computer operating systems.
  • File structures. Create levels or hierarchies of subfolders to break stored data into manageable groups and to enable its discovery and re-use. See this example from the UK Data Archive.
  • File formats. The format of files affects the use and reuse of data in the future. Formats recommended for accessibility are non-proprietary, open, documented standards commonly used by research communities (such as ASCII, Unicode), unencrypted and uncompressed. For more detail on different data formats see the UK Data Archive File Formats Table.
  • Data conversion. Data may need to be converted to a more durable or accessible format for storage, archiving and sharing.
  • Record decisions. Create a readme or structured text document to record decisions made about managing, organising, storing and sharing research data to maintain good understanding and practices for yourself and collaborative researchers.
  • Document versions. Develop a strategy to manage different versions and copies of files, especially when they are shared with others. For example, decide how many versions to keep, and record the status of each file, such as draft, revised or final.
  • Describe your research data. Use a metadata schema so that the description of your data can be read by computers.
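
The date, name and version pattern suggested in the list above can be applied consistently with a small helper. The following is a minimal sketch only: the function name `build_filename`, the `v02` style of version suffix and the default `csv` extension are illustrative assumptions, not a prescribed convention.

```python
import re
from datetime import date


def build_filename(name, version, when, ext="csv"):
    """Build a filename in the form YYYYMMDD_name_vNN.ext (hypothetical helper)."""
    # Lowercase the description and replace runs of other characters with
    # underscores, so the name stays short and portable across systems.
    slug = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    # Zero-pad the version number so that files sort in version order.
    return f"{when:%Y%m%d}_{slug}_v{version:02d}.{ext}"


example = build_filename("Survey responses", 2, date(2024, 3, 15))
print(example)
```

Putting the date first in YYYYMMDD form means that an alphabetical listing of a folder is also a chronological one.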

What is metadata?

Metadata refers to documentation about your data, ie the elements someone else would need in order to find, understand, re-use and cite your data. Depending on the type of data, metadata can include the title; the name of the creator or data collector; the geographic location of data collection (eg for photos or transcript data); parameters or units of measurement; equipment used; and access conditions. See the ANDS Metadata Guide for more detail.

Why use metadata standards?

Using a metadata standard means you can apply standardised elements and don’t have to come up with your own. You can select from a range of existing metadata schemes, and the Digital Curation Centre (DCC) provides many examples of discipline-based or general research metadata schemes. Dublin Core is a widely used scheme.
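
As an illustration of what a standardised description can look like, the sketch below writes a few Dublin Core elements (title, creator, date, coverage, rights) as XML using Python's standard library. The function name `dublin_core_record`, the wrapping `metadata` element and all of the sample values are assumptions made for the example; a repository or discipline may require a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Return an XML string with one dc: element per entry (hypothetical helper)."""
    # The "metadata" wrapper element is an illustrative choice only.
    root = ET.Element("metadata")
    for element, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are invented for illustration.
record = dublin_core_record({
    "title": "Interview transcripts, pilot study",
    "creator": "Example Researcher",
    "date": "2024-03-15",
    "coverage": "Wellington, New Zealand",
    "rights": "Access restricted to project members",
})
print(record)
```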

If you create your own metadata scheme, document it in a table or text file (eg a ReadMe file) to accompany the data. Cornell University provides a useful Guide to writing “readme” style metadata.
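
A readme of this kind can be as simple as a list of headings and short notes. The sketch below assembles one as plain text; the function name `render_readme`, the headings and the sample notes are all hypothetical, and the Cornell guide mentioned above describes a fuller set of recommended content.

```python
def render_readme(title, entries):
    """Return readme text: an underlined title, then one 'Heading: note' line per entry."""
    lines = [title, "=" * len(title), ""]
    for heading, note in entries.items():
        lines.append(f"{heading}: {note}")
    return "\n".join(lines) + "\n"


# Headings and notes below are invented for illustration.
readme_text = render_readme("Pilot study data", {
    "File naming": "YYYYMMDD_name_vNN, see the data folder for examples",
    "Units": "Temperature in degrees Celsius, distance in metres",
    "Versions kept": "Latest draft plus every file marked final",
})
print(readme_text)
```

Saving the result as a text file alongside the data keeps the description readable without any special software.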

What resources/tools are available?

An alphabetical list of metadata tools is available from the Digital Curation Centre, and the DCC metadata standards page includes examples of tools within disciplines. For more information and advice, contact the Library.