Examples of data management plans

These examples of data management plans (DMPs) were provided by University of Minnesota researchers. They feature different elements: one is concise and the other is detailed; one uses secondary data, while the other collects primary data. Both lay out explicit plans for how the data are handled throughout the life cycle of the project.

School of Public Health featuring data use agreements and secondary data analysis

All data to be used in the proposed study will be obtained from XXXXXX; only completely de-identified data will be obtained. No new data collection is planned. The pre-analysis data must be requested directly from the XXX. Below is the contact information provided with the funding opportunity announcement (PAR_XXX).

Types of data: Appendix # contains the specific variable list that will be used in the proposed study. The data specification, including the size, file format, number of files, data dictionary, and codebook, will be documented upon receipt of the data from the XXX. Any variables newly created during data management and analysis will be added to the data specification.

Data use for others: The post-analysis data may be useful for researchers who plan to study WTC-related injuries, personal economic status, and quality-of-life change. The Injury Exposure Index created in this project will also be useful for causal analysis of the relationship between WTC exposure and injuries among WTC general responders.

Data limitations for secondary use: While the data involve human subjects, only completely de-identified data will be available and used in the proposed study. Secondary data use is not expected to be limited, given the permission to use the data obtained from the XXX through the data use agreement (Appendix #).

Data preparation for transformations, preservation and sharing: The pre-analysis data will be delivered in Stata format. The post-analysis data will also be stored in Stata format. If requested, the data can be converted to other formats, including comma-separated values (CSV), Excel, SAS, R, and SPSS.
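
Such conversions can be scripted. Below is a minimal sketch in Python using the pandas library; the file name is hypothetical, not from the actual project:

```python
# Minimal sketch: convert a Stata (.dta) file to CSV and Excel formats.
# The file name "wtcinj_analytic.dta" is hypothetical.
import pandas as pd

df = pd.read_stata("wtcinj_analytic.dta")         # read the Stata-format data
df.to_csv("wtcinj_analytic.csv", index=False)     # comma-separated values
df.to_excel("wtcinj_analytic.xlsx", index=False)  # Excel (requires openpyxl)
```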

Metadata documentation: The Data Use Log will document all data-related activities. The proposed study investigators will have access to a highly secured network drive controlled by the University of Minnesota that requires logging of any data use. For specific data management activities, Stata's "log" function will record all activities and store them in the relevant designated folders. A standard file naming convention will be used with the format "WTCINJ_[six letter of data indication]_mmddyy_[initial of personnel]".
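
As an illustration only, a small script could check file names against this convention before files are saved; the exact character rules below (six uppercase letters, two- or three-letter initials) are assumptions, not project specifications:

```python
# Minimal sketch: validate file names against the convention
# WTCINJ_[six-letter data indication]_mmddyy_[personnel initials].
# The character rules are assumed for illustration.
import re

PATTERN = re.compile(r"^WTCINJ_[A-Z]{6}_\d{6}_[A-Z]{2,3}$")

def follows_convention(name: str) -> bool:
    """Return True if the file name matches the naming convention."""
    return PATTERN.match(name) is not None

print(follows_convention("WTCINJ_INJURY_041524_JD"))  # True
print(follows_convention("injury_data_final_v2"))     # False
```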

Data sharing agreement: Data sharing will require two steps of permission: 1) a data use agreement from the XXXXXX for pre-analysis data use, and 2) a data use agreement from the Principal Investigator, Dr. XXX XXX ([email protected], 612-xxx-xxxx), for post-analysis data use.

Data repository/sharing/archiving: A long-term data sharing and preservation plan will be used to store and make publicly accessible the data beyond the life of the project. The data will be deposited into the Data Repository for the University of Minnesota (DRUM), http://hdl.handle.net/11299/166578. This University Libraries-hosted institutional data repository is an open access platform for dissemination and archiving of university research data. Data files in DRUM are written to an Isilon storage system with two copies, one local to each of the two geographically separated University of Minnesota Data Centers. The local Isilon cluster stores the data in such a way that the data can survive the loss of any two disks or any one node of the cluster. Within two hours of the initial write, data replication to the second Isilon cluster commences. The second cluster employs the same protections as the local cluster, and both verify with a checksum procedure that data have not been altered on write. In addition, DRUM provides long-term preservation of digital data files for at least 10 years using services such as migration (limited format types), secure backup, and bit-level checksums, and maintains persistent DOIs for data sets, facilitating data citations. In accordance with DRUM policies, the de-identified data will be accompanied by the appropriate documentation, metadata, and code to facilitate reuse and provide the potential for interoperability with similar data sets.

Expected timeline: Preparation for data sharing will begin with the completion of planned publications, with an anticipated data release date six months prior.


College of Education and Human Development featuring quantitative and qualitative data

Types of data to be collected and shared

The following quantitative and qualitative data (for which we have participant consent to share in de-identified form) will be collected as part of the project and will be available for sharing in raw or aggregate form. Specifically, any individual-level data will be de-identified before sharing. Demographic data may be shared only at an aggregated level as needed to maintain confidentiality.

Student-level data, including

  • Pre- and posttest data from proximal and distal writing measures
  • Demographic data (age, sex, race/ethnicity, free or reduced price lunch status, home language, special education and English language learning services status)

Teacher-level data, including

  • Pre/post knowledge and skills data (collected via secure survey tools such as Qualtrics)
  • Teacher efficacy data (collected via secure survey tools such as Qualtrics)
  • Fidelity data (teachers’ accuracy of implementation of Data-Based Instruction; DBI)
  • Teacher logs of time spent on DBI activities
  • Demographic data (age, sex, race/ethnicity, degrees earned, teaching certification, years and nature of teaching experience)
  • Qualitative field notes from classroom observations and transcribed teacher responses to semi-structured follow-up interview questions.
  • Coded qualitative data
  • Audio and video files from teacher observations and interviews (participants will sign a release form indicating that they understand that sharing of these files may reveal their identity)

Procedures for managing and for maintaining the confidentiality of the data to be shared

The following procedures will be used to maintain data confidentiality (for managing the confidentiality of qualitative data, we will follow additional guidelines).

  • When participants give consent and are enrolled in the study, each will be assigned a unique (random) study identification number. This ID number will be associated with all participant data that are collected, entered, and analyzed for the study.
  • All paper data will be stored in locked file cabinets in locked lab/storage space accessible only to research staff at the performance sites. Whenever possible, paper data will be labeled only with the participant's study ID. Any direct identifiers will be redacted from paper data as soon as they are processed for data entry.
  • All electronic data will be stripped of participant names and other identifiable information such as addresses and emails.
  • During the active project period (while data are being collected, coded, and analyzed), data from students and teachers will be entered remotely from the two performance sites into the University of Minnesota's secure online file-sharing system, BOX (box.umn.edu). Participants' names and any other direct identifiers will not be entered into this system; rather, study ID numbers will be associated with the data entered into BOX.
  • Data will be downloaded from BOX for analysis onto password protected computers and saved only on secure University servers. A log (saved in BOX) will be maintained to track when, at which site, and by whom data are entered as well as downloaded for analysis (including what data are downloaded and for what specific purpose).

Roles and responsibilities of project or institutional staff in the management and retention of research data

Key personnel on the project (PIs XXXXX and XXXXX; Co-Investigator XXXXX) will be the data stewards while the data are “active” (i.e., during data collection, coding, analysis, and publication phases of the project), and will be responsible for documenting and managing the data throughout this time. Additional project personnel (cost analyst, project coordinators, and graduate research assistants at each site) will receive human subjects and data management training at their institutions, and will also be responsible for adhering to the data management plan described above.

Project PIs will develop study-specific protocols and will train all project staff who handle data to follow these protocols. Protocols will include guidelines for managing confidentiality of data (described above), as well as protocols for naming, organizing, and sharing files and entering and downloading data. For example, we will establish file naming conventions and hierarchies for file and folder organization, as well as conventions for versioning files. We will also develop a directory that lists all types of data and where they are stored and entered. As described above, we will create a log to track data entry and downloads for analysis. We will designate one project staff member (e.g., UMN project coordinator) to ensure that these protocols are followed and documentation is maintained. This person will work closely with Co-Investigator XXXXX, who will oversee primary data analysis activities.
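
As one concrete illustration of such a log (a minimal sketch with hypothetical field names, not the project's actual protocol), each data entry or download event could be appended as a row to a shared CSV file:

```python
# Minimal sketch: append one row per data entry/download event to a shared log.
# Field names and values are illustrative only.
import csv
from datetime import datetime, timezone

LOG_FILE = "data_access_log.csv"
FIELDS = ["timestamp", "site", "person", "action", "files", "purpose"]

def log_event(site, person, action, files, purpose):
    """Record who touched which data, where, when, and for what purpose."""
    with open(LOG_FILE, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if f.tell() == 0:  # brand-new file: write the header row first
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "site": site, "person": person, "action": action,
            "files": files, "purpose": purpose,
        })

log_event("UMN", "project_coordinator", "download",
          "teacher_logs_wave2.csv", "fidelity analysis")
```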

At the end of the grant and publication processes, the data will be archived and shared (see Access below) and the University of Minnesota Libraries will serve as the steward of the de-identified, archived dataset from that point forward.

Expected schedule for data access

The complete dataset is expected to be accessible after the study and all related publications are completed, and will remain accessible for at least 10 years after the data are made available publicly. The PIs and Co-Investigator acknowledge that each annual report must contain information about data accessibility, and that the timeframe of data accessibility will be reviewed as part of the annual progress reviews and revised as necessary for each publication.

Format of the final dataset

The format of the final dataset to be available for public access is as follows: De-identified raw paper data (e.g., student pre/posttest data) will be scanned into PDF files. Raw data collected electronically (e.g., via survey tools, field notes) will be available in MS Excel spreadsheets or PDF files. Raw data from audio/video files will be in .wav format. Audio/video materials and field notes from observations/interviews will also be transcribed, coded onto paper forms, and scanned into PDF files. The final database will be a .csv file that can be exported to MS Excel, SAS, SPSS, or ASCII files.

Dataset documentation to be provided

The final data file to be shared will include (a) raw item-level data (where applicable to recreate analyses) with appropriate variable and value labels, (b) all computed variables created during setup and scoring, and (c) all scale scores for the demographic, behavioral, and assessment data. These data will be the de-identified and individual- or aggregate-level data used for the final and published analyses.

Dataset documentation will consist of electronic codebooks documenting the following information: (a) a description of the research questions, methodology, and sample, (b) a description of each specific data source (e.g., measures, observation protocols), and (c) a description of the raw data and derived variables, including variable lists and definitions.

To aid in final dataset documentation, throughout the project, we will maintain a log of when, where, and how data were collected, decisions related to methods, coding, and analysis, statistical analyses, software and instruments used, where data and corresponding documentation are stored, and future research ideas and plans.

Method of data access

Final peer-reviewed publications resulting from the study/grant will be accompanied by the dataset used at the time of publication, during and after the grant period. A long-term data sharing and preservation plan will be used to store and make publicly accessible the data beyond the life of the project. The data will be deposited into the Data Repository for the University of Minnesota (DRUM), http://hdl.handle.net/11299/166578. This University Libraries-hosted institutional data repository is an open access platform for dissemination and archiving of university research data. Data files in DRUM are written to an Isilon storage system with two copies, one local to each of the two geographically separated University of Minnesota Data Centers. The local Isilon cluster stores the data in such a way that the data can survive the loss of any two disks or any one node of the cluster. Within two hours of the initial write, data replication to the second Isilon cluster commences. The second cluster employs the same protections as the local cluster, and both verify with a checksum procedure that data have not been altered on write. In addition, DRUM provides long-term preservation of digital data files for at least 10 years using services such as migration (limited format types), secure backup, and bit-level checksums, and maintains persistent DOIs for datasets, facilitating data citations. In accordance with DRUM policies, the de-identified data will be accompanied by the appropriate documentation, metadata, and code to facilitate reuse and provide the potential for interoperability with similar datasets.

The main benefit of DRUM is that whatever is shared through this repository is public; however, a completely open system is not optimal if any of the data could be identifying (e.g., certain types of demographic data). We will work with the University of MN Library System to determine if DRUM is the best option. Another option available to the University of MN, ICPSR (https://www.icpsr.umich.edu/icpsrweb/), would allow us to share data at different levels. Through ICPSR, data are available to researchers at member institutions of ICPSR rather than publicly. ICPSR allows for various mediated forms of sharing, where people interested in obtaining less de-identified, individual-level data would sign data use agreements before receiving the data, or would need to use special software to access it directly from ICPSR rather than downloading it, for security purposes. ICPSR is a good option for sensitive or other kinds of data that are difficult to de-identify, but it is not as open as DRUM. We expect that data for this project will be de-identifiable to a level that allows us to use DRUM, but we will consider ICPSR as an option if needed.

Data agreement

No specific data sharing agreement will be needed if we use DRUM; however, DRUM does have a general end-user access policy (conservancy.umn.edu/pages/drum/policies/#end-user-access-policy). If we go with a less open system such as ICPSR, we will work with ICPSR and the Un-funded Research Agreements (UFRA) coordinator at the University of Minnesota to develop the necessary data sharing agreements.

Circumstances preventing data sharing

The data for this study fall under multiple confidentiality requirements, including multiple IRB confidentiality requirements and FERPA. If it is not possible to meet all of the requirements of these agencies, the data will not be shared.

For example, at the two sites where data will be collected, both universities (University of Minnesota and University of Missouri) and school districts have specific requirements for data confidentiality that will be described in consent forms. Participants will be informed of procedures used to maintain data confidentiality and that only de-identified data will be shared publicly. Some demographic data may not be sharable at the individual level and thus would only be provided in aggregate form.

When we collect audio/video data, participants will sign a release form that provides options to have the data shared with project personnel only and/or shared more broadly. We will not share audio/video data from people who do not consent to share it, and we will not publicly share any data that could identify an individual (these parameters will be specified in our IRB-approved informed consent forms). De-identification is also required for FERPA data. The level of de-identification needed to meet these requirements is extensive, so it may not be possible to share all raw data exactly as collected in order to protect participants' privacy and maintain data confidentiality.


Data Management in Research: A Comprehensive Guide

Research data management (RDM) is a term that describes the organization, storage, preservation, and sharing of data collected and used in a research project. It involves the day-to-day management of research data throughout the life of a research project, such as using consistent file naming conventions. Examples of data management plans (DMPs) can be found at universities such as the University of Minnesota. These plans can be concise or detailed, depending on the type of data used (secondary or primary).

Research data can include video, sound, or text, as long as it is used for systematic analysis. For example, a collection of video interviews used to identify facial gestures and expressions in a study of emotional responses to stimuli would be considered research data. Making such data accessible to others in the same discipline, even those who are not on the team, can open up enormous opportunities for advancing research.

Qualitative data is subjective and exists only in relation to the observer (McLeod, 201). The Stanford Digital Repository (SDR) provides digital preservation, hosting, and access services that allow researchers to preserve, manage, and share research data in a secure environment for citation, access, and long-term reuse. Similarly, DRUM provides long-term preservation of digital data files for at least 10 years using services such as migration (limited format types), secure backups, and bit-level checksums, and maintains persistent DOIs for data sets. The DMPTool includes data management plan templates along with a wealth of information and assistance to guide you through the process of creating a ready-to-use DMP for your specific research project and funding agency.

Some demographic data may not be shareable at the individual level and would therefore be provided only in aggregate form. In accordance with DRUM policies, de-identified data will be accompanied by appropriate documentation, metadata, and code to facilitate reuse and provide the potential for interoperability with similar data sets. It is important to remember that you can generate data at any point in your research, but if you don't document it properly it will become useless. For example, observational data should be recorded immediately to avoid data loss, while reference data is not as time-sensitive.

Research data management describes a way to organize and store the data a research project accumulates as efficiently as possible. Researchers' willingness to manage and share their data has been evolving under increasing pressure from government mandates, such as those of the National Institutes of Health, and from the data-sharing policies of major publishers, which now require researchers to share their data, and the processes used to collect it, as a condition of continued funding or publication. Librarians have begun to provide a range of services in this area: teaching data management to researchers, working with individual researchers to improve their data management practices, creating thematic data management guides, and helping to support funding agencies' and publishers' data requirements. Some operating systems also support embedding metadata directly in files, such as Microsoft Document Properties.


Bernard Becker Medical Library

Data Management and Sharing


Developing a comprehensive data management strategy can be hard to implement, but it will save everyone involved in the research a lot of time over the course of the project. Check out the data management and storage platforms available at the School of Medicine, and then dive into the guidance around good data management practices in the sections below. Overwhelmed? Becker Library is here to help. Submit a consultation request for assistance working through the data management considerations for your research. Data Interviews can also assist investigators with developing a workflow for good research data management and sharing (DMS) practices and with writing a DMS plan, which is mandated by many funders, including the NIH.

On this page:

  • Data Security Policies
  • FAIR Principles, Metadata, and Data Curation
  • Data Management Planning
  • Data Collection
  • Data Analysis
  • Additional Resources
  • Poor Data Management Examples

Does this data management approach sound familiar? 

"I will store all data on at least one, and possibly up to 50, hard drives in my lab. The directory structure will be custom, not self-explanatory, and in no way documented or described. Students working with the data will be encouraged to make their own copies and modify them as they please, in order to ensure that no one can ever figure out what the actual real raw data is. Backups will rarely, if ever, be done" (Brown 2010).

Brown's comical data management practices summarize some typical issues with storing data without data management in mind. Data lacking organization, quality control, and security can quickly become an incomprehensible and unusable data dumpster. The best time to formulate a data management plan is before you start collecting the research data. Having a management plan in place before the study begins ensures the data will be organized, well documented, and stored securely during the project and long after it is completed. Waiting until the end of a project often results in lost data, missing notes, or sometimes a lack of proper permissions to even analyze and publish a particular dataset.

Data Management and Storage Platforms

The following list represents the data management and storage platforms that Washington University in St. Louis offers to the School of Medicine research community.

REDCap (Research Electronic Data Capture) is a secure, web-based application for building and managing online databases and surveys. REDCap was developed at Vanderbilt University and is made freely available to other institutions; the Washington University instance of REDCap is managed by the Institute for Informatics, Data Science and Biostatistics.

  • Join the WUSTL REDCap User Community Teams Group
  • Schedule a REDCap Consultation
  • Full List of WUSTL REDCap Training and Support
  • View recordings of REDCap workshops

LabArchives (WashU ELN)

LabArchives is a cloud-based electronic lab notebook (ELN) that makes organizing, storing, and sharing lab data fast, simple and accessible on all digital platforms. The professional (research) and classroom editions of LabArchives, as well as the Scheduler and Inventory modules, are available to everyone in the WashU community at no cost, courtesy of the Vice Chancellor for Research and the Chief Information Officer. Check out the LabArchives ELN information page and FAQs for additional details. The LabArchives HELP page is also available for learning about all LabArchives products and how to use them.

If you have any technical issues or access issues with WashU LabArchives, please contact the LabArchives support team via email ([email protected]).

Additional WashU data storage platforms

  • Research Infrastructure Services (RIS) data storage platform
  • RIS data storage platform and WUSTL Box comparison table
  • File Shares and Storage Options (Printable File Storage Options Chart, PDF)
  • Secure Storage and Communication Services

At the beginning of your research project, it is important to review all applicable policies regarding data security from Washington University in St. Louis. Below is the list of WashU data security policies currently in place.

  • HIPAA Privacy Office Policies
  • Information Security Policies, Standards and Guidelines
  • Information Classification Policy
  • Data Classification
  • Managing User Access Policy
  • IRB Security Review

NIH encourages data management and sharing practices to be consistent with the FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. The FAIR principles emphasize machine actionability (i.e., the capacity of computational systems to find, access, interoperate with, and reuse data with no or minimal human intervention), because humans increasingly rely on computational support to deal with the growing volume, complexity, and creation speed of data.

To learn more, visit the GO FAIR initiative.

A key element of making data findable is metadata. Metadata and other documentation (e.g., data dictionaries, README files, study protocols) associated with a dataset allow users to understand how the data were collected and how to interpret them.

Data Curation is the processing of data to ensure long-term accessibility, sharing, and preservation. The Data Curation Network (DCN) developed a standardized set of C-U-R-A-T-E-D steps and the CURATED checklists to aid individuals in the curation process.

Before an idea can bear fruit, there must be a description of the methods or processes that will generate the data for analysis. This will include: 1) an experimental protocol or description of the experimental workflow, 2) a definition of the variables to be measured, 3) information about the study population or model organism, 4) a description of the biological materials used in the study (reagents, biospecimens, etc.), and 5) the equipment and software used to collect the data. The processes and methods include all research activities a researcher performs, as well as all reagents, research participants, equipment, and software necessary to produce the data.

This stage is the time to decide how best to organize the data, what quality control criteria and standards will be used when collecting the data, and what workflows and parameters need to be documented.

  • Folder/File structure
  • File naming conventions
  • Data dictionary
  • Storage plan
  • Backup plan
  • Study participants (sex, age, ethnicity, behaviors, etc. for human participants)
  • Equipment (technology, manufacturer, settings, software, etc.)
  • Materials (reagents, chemicals, specimens, etc.)

Furthermore, it is important at this stage to identify discipline standards for data collection. The use of data standards and common data elements (CDEs) makes the data interoperable with other existing datasets and makes it easier for others in the field to properly understand your data.

There are several ways that a dataset can be collected and generated, including:

  • Quantitative measurements collected from humans, animals, cells, or other biological organisms
  • Observations, such as a study of human behavior
  • Simulated data produced by a computational model, where outputs are calculated for different inputs, or different assumptions about how the system functions
  • New data generated from existing datasets, such as using subsets of U.S. Census data to better understand the health status of a particular geographic region (secondary data)
  • Data collected through interviews, focus groups, or surveys (qualitative data collection)

Good data management practices lay out file naming conventions, file structure, variable naming conventions, and storage and backup plans and schedules. However, establishing these conventions and plans is not enough to ensure that they will be adhered to.

Good intentions for data management at the beginning of a project can easily be neglected in the busy, high-pressure environment of a research lab unless workflows are established that make adherence an integrated part of the research process.

During the collection process, make sure researchers understand the importance of data management and have sufficient time to adhere to file naming conventions, file structures, storage plans, and backup plans, to collect data according to discipline standards, and to document variable names.

Processing Data: Raw data often require some form of processing, by a computer or by a human, to put them into a form that can be analyzed. There are many ways that data can be processed for analysis, for example:

  • cleaning the data, such as assigning a consistent value to non-responses in a survey
  • extracting data from an image, such as measuring the size of a tumor in an MRI
  • filtering signals, such as removing noise in an EKG measurement to enhance the signal
  • normalizing, transforming, or formatting data to better suit the analysis process

Whatever processing the data undergoes, the end result is that the data are in a form ready to be analyzed.
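
For instance, cleaning a non-response placeholder might look like the minimal sketch below, which uses the pandas library and invented survey values (the placeholder code -9 is an assumption, not a universal standard):

```python
# Minimal sketch: recode a non-response placeholder (-9, assumed) as missing.
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "participant_id": [1, 2, 3, 4],
    "age": [34, -9, 29, 41],           # -9 marks a skipped survey question
    "stress_score": [12, 18, -9, 7],
})

clean = raw.replace(-9, np.nan)        # assign one consistent missing value
print(clean)
print(clean.describe())                # summaries now skip the missing cells
```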

Data Analysis and Research Findings: Data analysis is the process of applying quantitative, statistical, and visualization approaches to a dataset to test a hypothesis, evaluate findings, or glean new insight into a phenomenon. Some examples of data analysis include (a small sketch in Python follows the list):

  • Statistical analysis using a programming language such as R or Python
  • Using a statistical program such as STATA, SAS, SPSS or GraphPad Prism
  • Creating visualizations such as charts, graphs, and maps
  • Coding interviews or focus groups using a qualitative research method
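
The sketch below shows what the first bullet might look like in practice: a simple two-group comparison in Python, with invented scores and group labels:

```python
# Minimal sketch: compare two groups with an independent-samples t-test.
# The scores and group labels are invented for illustration.
from scipy import stats

control = [72, 75, 68, 71, 74, 69, 73]
treated = [78, 82, 75, 80, 77, 79, 81]

t_stat, p_value = stats.ttest_ind(treated, control)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```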

Before data is processed and analyzed, it is essential to design and implement a plan for storing the analysis files. Not only should the analysis process be documented, but documentation should also include information about any software used in the analysis, including the version and operating system. For collaborative data analysis, virtual research environments can be set up to ensure that all collaborators have access to the data as well as the necessary tools to analyze the data.

It is crucial to establish a clear connection between analyzed and raw data files using file naming conventions and documentation. This will allow researchers to easily identify and, if necessary, reanalyze the raw data files behind published tables, figures, and plots.

The following checklist can help to create a workflow for managing processed and analyzed data:

  • Design and implement a plan for storing analysis files
  • Determine where the analysis files will live in relation to the raw data
  • Document the analysis process
  • Document software used for analysis

For more information on data management, view the slides from Becker Library’s Data Management Planning and Practice workshop.

RDMkit

The ELIXIR Research Data Management Kit (RDMkit) has been designed as a guide for life scientists to better manage their research data following the FAIR Principles. The contents are generated and maintained by the ELIXIR community as part of the ELIXIR-CONVERGE project and are based on the various steps of the data lifecycle (Plan, Collect, Process, Analyze, Preserve, Share, and Reuse). It provides a combination of tools and resources for research data management based on role, domain, and tasks.

  • Data Management and Sharing Policies
  • Data Management and Sharing Plans
  • Human Participant Data Considerations
  • Data Sharing
  • Data Access and Reuse


CAREER FEATURE
13 March 2018

Data management made simple

Quirin Schiermeier


When Marjorie Etique learnt that she had to create a data-management plan for her next research project, she was not sure exactly what to do.


Nature 555, 403-405 (2018)

doi: https://doi.org/10.1038/d41586-018-03071-1

See Editorial: Everyone needs a data-management plan



2.3 Data management and analysis

Learning Objectives

Learners will be able to…

  • Define and construct a data analysis plan
  • Define key quantitative data management terms—variable name, data dictionary, and observations/cases
  • Differentiate between univariate and bivariate quantitative analysis
  • Explain when we might use quantitative bivariate analysis in social work research
  • Identify how your qualitative research question, research aim, and type of data may influence your choice of analytic methods
  • Outline the steps you will take in preparation for conducting qualitative data analysis

After you have your raw data, whether this is secondary data or data you collected yourself, you will need to analyze it. While the specific steps to follow in quantitative or qualitative data analysis are beyond the scope of this chapter, we are going to address some basic concepts in this section to help you create a data analysis plan. A data analysis plan is an ordered outline that includes your research question, a description of the data you are going to use to answer it, and the exact step-by-step analyses that you plan to run to answer your research question. If you look back at Table 2.1, you will see that creating a data analysis plan is a part of the study design process. The data analysis plan flows from the research question, is integral to the study design, and should be well conceptualized prior to beginning data collection. In this section, we will walk through the basics of quantitative and qualitative data analysis to help you understand the fundamentals of creating a data analysis plan.

Quantitative Data: Management

When considering what data you might want to collect as part of your project, there are two important considerations that can create dilemmas for researchers. You might only get one chance to interact with your participants, so you must think comprehensively in your planning phase about what information you need and collect as much relevant data as possible. At the same time, though, especially when collecting sensitive information, you need to consider how onerous the data collection is for participants and whether you really need them to share that information. Just because something is interesting to us doesn’t mean it’s related enough to our research question to chase it down. Work with your research team and/or faculty early in your project to talk through these issues before you get to this point. And if you’re using secondary data, make sure you have access to all the information you need in that data before you use it.

Once you’ve collected your quantitative data, you need to make sure it is well-organized in a database in a way that’s actually usable. “Database” can be kind of a scary word, but really, it can be as simple as an Excel spreadsheet or a data file in whatever program you’re using to analyze your data.  You may want to avoid Excel and use a formal database such as Microsoft Access or MySQL if you’ve got a large or complicated data set. But if your data set is smaller and you plan to keep your analyses simple, you can definitely get away with Excel. A typical data set is organized with variables as columns and observations/cases as rows. For example, let’s say we did a survey on ice cream preferences and collected the following information in Table 2.3:
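
In code, a mini data set shaped this way might look like the following sketch; the variables and values are invented for illustration:

```python
# Minimal sketch: a tidy data set, variables as columns, cases as rows.
# All variables and values are invented.
import pandas as pd

ice_cream = pd.DataFrame({
    "participant_id":  [1, 2, 3, 4],
    "age":             [25, 31, 19, 42],
    "favorite_flavor": ["chocolate", "vanilla", "strawberry", "chocolate"],
    "scoops_per_week": [3, 1, 4, 2],
})
print(ice_cream)  # each row is one observation/case
```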

There are a few key data management terms to understand:

  • Variable name: Just what it sounds like—the name of your variable. Make sure this is something useful, short and, if you’re using something other than Excel, all one word. Most statistical programs will automatically rename variables for you if they aren’t one word, but the names can be a little ridiculous and long.
  • Observations/cases: The rows in your data set. In social work, these are often your study participants (people), but can be anything from census tracts to black bears to trains. When we talk about sample size, we’re talking about the number of observations/cases. In our mini data set, each person is an observation/case.
  • Data dictionary (also called a code book or metadata): This is the document where you list your variable names, what the variables actually measure or represent, what each of the values of the variable mean if the meaning isn’t obvious (i.e., if there are numbers assigned to gender), the level of measurement, and anything special to know about the variables (for instance, the source if you mashed two data sets together). If you’re using secondary data, the researchers sharing the data should make the data dictionary available.

Let’s take that mini data set we’ve got up above and we’ll show you what your data dictionary might look like in Table 2.4.
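
To make this concrete, a data dictionary for the invented ice cream sketch above might record entries like the following; these entries are illustrative only:

```python
# Minimal sketch: a data dictionary for the invented ice cream data set.
# A real dictionary would also note value codes and data sources.
data_dictionary = {
    "participant_id":  {"measures": "unique study ID",               "level": "nominal"},
    "age":             {"measures": "age in years",                  "level": "ratio"},
    "favorite_flavor": {"measures": "self-reported favorite flavor", "level": "nominal"},
    "scoops_per_week": {"measures": "typical weekly servings",       "level": "ratio"},
}
for name, meta in data_dictionary.items():
    print(f"{name}: {meta['measures']} ({meta['level']})")
```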

Quantitative Data: Univariate Analysis

As part of planning for your research, you should come up with a data analysis plan. Remember, a data analysis plan is an ordered outline that includes your research question, a description of the data you are going to use to answer it, and the exact step-by-step analyses that you plan to run to answer your research question. A basic data analysis plan might look something like what you see in Table 2.5. Don’t panic if you don’t yet understand some of the statistical terms in the plan; we’re going to delve into some of them in this section, and others will be covered in more depth in your statistics courses. Note here also that this is what operationalizing your variables and moving through your research with them looks like on a basic level. We will cover operationalization in more depth in Chapter 10.

An important point to remember is that you should never get stuck on using a particular statistical method because you or one of your co-researchers thinks it’s cool or it’s the hot thing in your field right now. You should certainly go into your data analysis plan with ideas, but in the end, you need to let your research question guide what statistical tests you plan to use. Be prepared to be flexible if your plan doesn’t pan out because the data is behaving in unexpected ways.

You’ll notice that the first step in the quantitative data analysis plan is univariate and descriptive statistics. Univariate data analysis is a quantitative method in which a variable is examined individually to determine its distribution, or the way the scores are distributed across the levels, or values, of that variable. When we talk about levels, what we are talking about are the possible values of the variable—like a participant’s age, income, or gender. (Note that this is different from levels of measurement, which will be discussed in Chapter 11, but the level of measurement of your variables absolutely affects what kinds of analyses you can do with them.) Univariate analysis is non-relational, which just means that we’re not looking into how our variables relate to each other. Instead, we’re looking at variables in isolation to try to understand them better. For this reason, univariate analysis is used for descriptive research questions.

So when do you use univariate data analysis? Always! It should be the first thing you do with your quantitative data, whether you are planning to move on to more sophisticated statistical analyses or are conducting a study to describe a new phenomenon. You need to understand what the values of each variable look like—what if one of your variables has a lot of missing data because participants didn’t answer that question on your survey? What if there isn’t much variation in the gender of your sample? These are things you’ll learn through univariate analysis.
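
Continuing the invented ice cream sketch from earlier, univariate analysis could be as simple as the following; each line describes one variable in isolation:

```python
# Minimal sketch: univariate description, one variable at a time.
# The data are the invented ice cream values from the earlier sketch.
import pandas as pd

ice_cream = pd.DataFrame({
    "age": [25, 31, 19, 42],
    "favorite_flavor": ["chocolate", "vanilla", "strawberry", "chocolate"],
})

print(ice_cream["favorite_flavor"].value_counts())  # distribution of a category
print(ice_cream["age"].describe())                  # count, mean, spread, quartiles
print(ice_cream["age"].isna().sum())                # how many values are missing
```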

Quantitative Data: Bivariate Analysis

Did you know that ice cream causes shark attacks? It’s true! When ice cream sales go up in the summer, so does the rate of shark attacks. So you’d better put down that ice cream cone, unless you want to make yourself look more delicious to a shark.


Ok, so it’s quite obviously not true that ice cream causes shark attacks. But if you looked at these two variables and how they’re related, you’d notice that during times of the year with high ice cream sales, there are also the most shark attacks. This is a classic example of the difference between correlation and causation. Despite the fact that the conclusion we drew about causation was wrong, it’s nonetheless true that these two variables appear related, and researchers figured that out through the use of bivariate analysis.

Bivariate analysis consists of a group of statistical techniques that examine the association between two variables. We could look at how anti-depressant medications and appetite are related, whether there is a relation between having a pet and emotional well-being, or if a policy-maker’s level of education is related to how they vote on bills related to environmental issues.
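
To see how such an association might be computed, here is a minimal sketch with entirely invented monthly numbers; the correlation is strong even though neither variable causes the other:

```python
# Minimal sketch: correlation between two variables (invented monthly data).
# A strong correlation here does not imply causation; summer drives both.
import numpy as np

ice_cream_sales = np.array([20, 25, 40, 60, 85, 100, 110, 105, 70, 45, 30, 22])
shark_attacks   = np.array([ 1,  1,  2,  4,  6,   9,  11,  10,  5,  3,  1,  1])

r = np.corrcoef(ice_cream_sales, shark_attacks)[0, 1]
print(f"Pearson r = {r:.2f}")  # strongly positive, yet not causal
```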

Bivariate analysis forms the foundation of multivariate analysis, which we don’t get to in this book. All you really need to know here is that there are steps beyond bivariate analysis, which you’ve undoubtedly seen in scholarly literature already! But before we can move forward with multivariate analysis, we need to understand the associations between the variables in our study.

Throughout your PhD program, you will learn much more about quantitative data analysis techniques, including more sophisticated multivariate analysis methods. Hopefully this section has provided you with some initial insights into how data is analyzed, and the importance of creating a data analysis plan prior to collecting data. Next, we will discuss some basic strategies for creating a qualitative data analysis plan.

Resources for Quantitative Data Analysis

While you are affiliated with a university, it is likely that you will have access to some kind of commercial statistics software. Examples in the previous section use SPSS, the most common package our authoring team has seen in social work education. Like its competitors SAS and STATA, SPSS is expensive, and your license to the software must be renewed every year (like a subscription). Even if you are able to install commercial statistics software on your computer, once your license expires, your program will no longer work. We believe that forcing students to learn software they will never use is wasteful and contributes to the (accurate, in many cases) perception from students that research class is unrelated to real-world practice. SPSS is more accessible due to its graphical user interface and does not require researchers to learn basic computer programming, but it is prohibitively costly if a student wanted to use it to measure practice data in their agency post-graduation.

Instead, we suggest getting familiar with JASP Statistics, a free and open-source alternative to SPSS developed and supported by the University of Amsterdam. It has a user interface similar to SPSS's and should be similarly easy to learn. Moreover, usability upgrades over SPSS, like generating APA-formatted tables, make it a compelling option. While a great many of our students will rely on statistical analyses of their programs and practices in reports to funders, it is unlikely that any will use SPSS. Browse JASP's how-to guide or consult the textbook Learning Statistics with JASP: A Tutorial for Psychology Students and Other Beginners, written by Danielle J. Navarro, David R. Foxcroft, and Thomas J. Faulkenberry.

Another open-source statistics software package is R (a.k.a. The R Project for Statistical Computing). R uses a command-line interface, so you will need some coding knowledge in order to use it. Luckily, R is the most commonly used statistics software in the world, and the community of support and guides for using R is omnipresent online. For beginning researchers, consult the textbook Learning Statistics with R: A tutorial for psychology students and other beginners by Danielle J. Navarro.

While statistics software is sometimes needed to perform advanced statistical tests, most univariate and bivariate tests can be performed in spreadsheet software like Microsoft Excel, Google Sheets, or the free and open-source LibreOffice Calc. Microsoft includes the Analysis ToolPak, an Excel add-on, for performing complex data analysis. For more information on using spreadsheet software to perform statistics, see the open textbook Collaborative Statistics Using Spreadsheets by Susan Dean, Irene Mary Duranczyk, Barbara Illowsky, Suzanne Loch, and Janet Stottlemyer.

Statistical analysis is performed in just about every discipline, and as a result, there are a lot of openly licensed, free resources to assist you with your data analysis. We have endeavored to provide you the basics in the past few chapters, but ultimately, you will likely need additional support in completing quantitative data analysis from an instructor, textbook, or other resource. Browse the Open Textbook Library for statistics resources or look for video tutorials from reputable instructors, like this video textbook on statistics by Bryan Koenig.

Qualitative Data: Management

Qualitative research often involves human participants, and qualitative data can include recordings or transcripts of their words, photographs or images, or diaries and documents. The personal nature of qualitative data poses the challenge that sensitive information about individuals, communities, and places may be recognizable. If you choose this methodology for your research, you should familiarize yourself with the policies, procedures, and rules that ensure the safety and security of data during documentation and dissemination.

In any research involving primary data, a researcher is not only entrusted with the responsibility of upholding privacy of their participants but also accountable to them, making confidentiality and human subjects’ protection front and center of qualitative data management. Data such as audiotapes, videotapes, transcripts, notes, and other records should be stored and secured in locations where only authorized persons have access to them.

Sometimes in qualitative research, you will learn intimate details about people’s lives. Often, qualitative data contain personal identifiers. A helpful practice to ensure participants’ confidentiality is to replace personal information in transcripts with pseudonyms or descriptive language (e.g., “[the participant’s sister]” instead of the sister’s name). Once audio and video recordings have been accurately transcribed and personal identifiers removed, the original recordings should be destroyed.
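
A minimal sketch of this practice, with an invented transcript and identifier list, might look like the following:

```python
# Minimal sketch: replace known identifiers in a transcript with pseudonyms
# or descriptive labels. The transcript and identifiers are invented.
replacements = {
    "Maria": "[Participant 07]",
    "St. Luke's": "[the participant's clinic]",
    "Tacoma": "[a mid-sized city]",
}

transcript = "Maria said her clinic, St. Luke's, is the best in Tacoma."
for identifier, pseudonym in replacements.items():
    transcript = transcript.replace(identifier, pseudonym)
print(transcript)
```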

Qualitative Data: Analysis

There are many different types of qualitative data, including transcripts of interviews and focus groups, observational data, documents and other artifacts, and more. Your qualitative data analysis plan should be anchored in the type of data collected and the purpose of your study. Qualitative research can serve a range of purposes. Below is a brief list of general purposes we might consider when using a qualitative approach.

  • Are you trying to understand how a particular group is affected by an issue?
  • Are you trying to uncover how people arrive at a decision in a given situation?
  • Are you trying to examine different points of view on the impact of a recent event?
  • Are you trying to summarize how people understand or make sense of a condition?
  • Are you trying to describe the needs of your target population?

If you don’t see the general aim of your research question reflected in one of these areas, don’t fret! This is only a small sampling of what you might be trying to accomplish with your qualitative study. Whatever your aim, you need to have a plan for what you will do once you have collected your data.

Iterative or Linear

Some qualitative research is linear, meaning it follows more of a traditionally quantitative process: create a plan, gather data, and analyze data; each step is completed before we proceed to the next. You can think of this like how information is presented in this book. We discuss each topic, one after another.

However, qualitative research is often iterative, or evolving in cycles. An iterative approach means that once we begin collecting data, we also begin analyzing it as it comes in. This early and ongoing analysis of our (incomplete) data then informs our continued planning, data gathering, and future analysis. Coming back to this book: while it may be written linearly, we hope that you engage with it iteratively as you design and conduct your own research. By this we mean that you will revisit previous sections to understand how they fit together, remaining in a continuous process of building and revising how you think about the concepts you are learning.

As you may have guessed, there are benefits and challenges to both linear and iterative approaches. A linear approach is much more straightforward, with each step fairly well defined. However, that very definition and rigidity also present challenges: a linear approach assumes that we know what we need to ask or look for at the very beginning of data collection, which is often not the case. Figure 2.1 contrasts the two approaches.

Figure 2.1. Comparison of linear and iterative approaches: the linear approach is a straight sequence of steps ("create a plan," "gather data," "analyze data"), while the iterative approach arranges "planning," "data gathering," and "analyzing the data" in a cycle.

With iterative research, we have more flexibility to adapt our approach as we learn new things. We still need to keep our approach systematic and organized, however, so that our work doesn't become a free-for-all. As we adapt, we do not want to stray too far from the original premise of our study. It's also important to remember that with an iterative approach we may risk ethical concerns if our work extends beyond the original boundaries of our informed consent and institutional review board (IRB) agreement (see Chapter 3 for more on IRBs). If you feel you need to modify your original research plan in a significant way as you learn more about the topic, you can submit an addendum to modify your original application. Make sure to keep detailed notes of the decisions you are making and what informs those choices. This supports transparency and your credibility throughout the research process.

Acquainting yourself with your data

As you begin your analysis, you need to get to know your data. This often means reading through your data prior to any attempt at breaking it apart and labeling it. You might read through it a couple of times, in fact. This helps give you a more comprehensive feel for each piece of data and for the data as a whole before you start to break it down into smaller units or deconstruct it. This is especially important if others assisted in the data collection process: we often gather data as part of a team, and everyone involved in the analysis needs to be very familiar with all of the data.

Capturing your emerging understanding of the data

As you review, you will start to develop and evolve your understanding of what the data mean. Coding is the part of the qualitative data analysis process where we begin to interpret and assign meaning to the data. It represents one of the first steps in filtering the data through our own subjective lens as researchers. Your understanding of the data should be dynamic and flexible, but you want a way to capture that understanding as it evolves. You may include this as part of your qualitative codebook, where you track the main ideas that are emerging and what they mean. Table 2.6 is an example of how your thinking about a code might change and how you can go about capturing it.
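Table 2.6 is not reproduced in this text, so as a stand-in, the sketch below shows one way to track a code whose definition has evolved, using a plain CSV codebook. The code name, definitions, and quote are all invented for illustration:

```python
# Minimal sketch of a qualitative codebook entry tracked as plain data.
# Field names and the sample code are hypothetical.
import csv

codebook = [
    {
        "code": "barriers_to_services",
        "initial_definition": "Any mention of difficulty accessing help.",
        "revised_definition": "Structural obstacles (cost, transport, "
                              "waitlists); interpersonal obstacles are "
                              "now coded separately.",
        "example_quote": "The bus only runs twice a day, so I missed intake.",
    },
]

with open("codebook.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=codebook[0].keys())
    writer.writeheader()
    writer.writerows(codebook)
```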

There are a variety of different approaches to qualitative analysis, including thematic analysis, content analysis, grounded theory, phenomenology, photovoice, and more. The specific steps you will take to code your qualitative data, and to generate themes from these codes, will vary based on the analytic strategy you are employing. In designing your qualitative study, you would identify an analytical approach as you plan out your project. The one you select would depend on the type of data you have and what you want to accomplish with it. In Chapter 19, we will go into more detail about various types of qualitative data analysis. Each qualitative approach has specific techniques and methods that take substantial study and practice to master.

Key Takeaways

  • Getting organized at the beginning of your project with a data analysis plan will help keep you on track. Data analysis plans should include your research question, a description of your data, and a step-by-step outline of what you’re going to do with it. [chapter 14.1]
  • Be flexible with your data analysis plan—sometimes data surprises us and we have to adjust the statistical tests we are using. [chapter 14.1]
  • Always make a data dictionary or, if using secondary data, get a copy of the data dictionary so you (or someone else) can understand the basics of your data. [chapter 14.1]
  • Bivariate analysis is a group of statistical techniques that examine the relationship between two variables. [chapter 15.1]
  • You need to conduct bivariate analyses before you can begin to draw conclusions from your data, including in future multivariate analyses. [chapter 15.1]
  • There are a lot of high quality and free online resources to learn and perform statistical analysis.
  • Qualitative research analysis requires preparation and careful planning. You will need to take time to familiarize yourself with the data in a general sense before you begin analyzing. [chapter 19.3]
  • The specific steps you will take to code your qualitative data and generate final themes will depend on the qualitative analytic approach you select.

TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS)

  • Make a data analysis plan for your project. Remember this should include your research question, a description of the data you will use, and a step-by-step outline of what you’re going to do with your data once you have it, including statistical tests (non-relational and relational) that you plan to use. You can do this exercise whether you’re using quantitative or qualitative data! The same principles apply.
  • Make a data dictionary for the data you are proposing to collect as part of your study. You can use the example above as a template.

TRACK 2 (IF YOU AREN’T CREATING A RESEARCH PROPOSAL FOR THIS CLASS)

You are researching the impact of your city’s recent harm reduction interventions for intravenous drug users (e.g., sterile injection kits, monitored use, overdose prevention, naloxone provision, etc.).

  • Make a draft quantitative data analysis plan for your project. Remember this should include your research question, a description of the data you will use, and a step-by-step outline of what you’re going to do with your data once you have it, including statistical tests (non-relational and relational) that you plan to use. It’s okay if you don’t yet have a complete idea of the types of statistical analyses you might use.


Chapter Outline

  • Practical and ethical considerations (14 minute read)
  • Raw data (10 minute read)
  • Creating a data analysis plan (?? minute read)
  • Critical considerations (3 minute read)

Content warning: Examples in this chapter discuss substance use disorders, mental health disorders and therapies, obesity, poverty, gun violence, gang violence, school discipline, racism and hate groups, domestic violence, trauma and triggers, incarceration, child neglect and abuse, bullying, self-harm and suicide, racial discrimination in housing, burnout in helping professions, and sex trafficking of indigenous women.

2.1 Practical and ethical considerations

Learners will be able to...

  • Identify potential stakeholders and gatekeepers
  • Differentiate between raw data and the results of scientific studies
  • Evaluate whether you can feasibly complete your project

Pre-awareness check (Knowledge)

Similar to practice settings, research involves ethical considerations that must be addressed to ensure the safety of participants. What ethical considerations were relevant to your practice experience that may have impacted the delivery of services?

As a PhD student, you will have many opportunities to conduct research. You may be asked to be a part of a research team led by the faculty at your institution. You will also conduct your own research for your dissertation. As you will learn, research can take many forms. For example, you may want to focus qualitatively on individuals’ lived experiences, or perhaps you will quantitatively assess the impact of interventions on research subjects. You may work with large, already-existing datasets, or you may create your own data. Though social work research can vary widely from project to project, researchers typically follow the same general process, even if their specific research questions and methodologies differ. Table 2.1 outlines the major components of the research process covered in this textbook, and indicates the chapters where you will find more information on each subject. You will notice that your research paradigm is an organizing framework that guides each component of the research process.

Table 2.1 Components of the Research Process

Feasibility

Feasibility refers to whether you can practically conduct the study you plan to do, given the resources and ethical obligations you have. In this chapter, we will review some important practical and ethical considerations researchers should start thinking about from the beginning of a research project. These considerations apply to all research, but it is important to also consider the context of research and researchers when thinking about feasibility.

For example, as a doctoral student, you likely have a unique set of circumstances that inspire and constrain your research. Some students have the ability to engage in independent studies where they can gain skills and expertise in specialized research methods to prepare them for a research-intensive career. Others may have reasons, such as a limited amount of funding or family concerns, that encourage them to complete their dissertation research as quickly as possible. These circumstances relate to the feasibility of a research project. Regardless of the potential societal importance of a 10-year longitudinal study, it’s not feasible for a student to conduct it in time to graduate! Your dissertation chair, doctoral program director, and other faculty mentors can help you navigate the many decisions you will face as a doctoral student about conducting independent research or joining research projects.

The context and role of the researcher continue to affect feasibility even after a doctoral student graduates. Many will go on to become tenure-track faculty who must meet research expectations to earn tenure. Some funders expect faculty members to have a track record of successful projects before trusting them to lead expensive or long-term studies. Realistically, these expectations will influence what research is feasible for a junior faculty member to conduct. Just as for doctoral students, mentorship is incredibly valuable for junior faculty making informed decisions about what research to conduct. Senior faculty, associate deans of research, chairs, and deans can help junior faculty decide which projects to pursue so they meet the expectations placed on them without losing sight of the reasons they became researchers in the first place.

As you read about other feasibility considerations such as gaining access, consent, and collecting data, consider the ways in which context and roles also influence feasibility.

Access, consent, and ethical obligations

One of the most important feasibility issues is gaining access to your target population. For example, let's say you wanted to better understand middle-school students who engage in self-harm behaviors. That is a topic of social importance, but what challenges might you face in accessing this population? Let's say you proposed to identify students from a local middle school and interview them about self-harm. Methodologically, that sounds great, since you are getting data from those with the most knowledge about the topic: the students themselves. But practically, it sounds challenging. Think about the ethical obligations a social work practitioner has to adolescents who are engaging in self-harm (e.g., competence, respect). In research, we are similarly concerned with the balance of benefits and harms in what you propose to do, as well as the openness and honesty with which you share your project publicly.


Gatekeepers

If you were the principal at your local middle school, would you allow researchers to interview kids in your schools about self-harm? What if the results of the study showed that self-harm was a big problem that your school was not addressing? What if the researcher's interviews themselves caused an increase in self-harming behaviors among the children? The principal in this situation is a gatekeeper. Gatekeepers are the individuals or organizations who control access to the population you want to study. The school board would also likely need to give consent for the research to take place at their institution. Gatekeepers must weigh these ethical questions, because they have a responsibility to protect the safety of the people at their organization, just as you have an ethical obligation to protect the people in your research study.

For vulnerable populations, it can be a challenge to get consent from gatekeepers to conduct your research project. As a result, researchers often conduct research projects in places where they have established trust with gatekeepers. When the population (children who self-harm) is too vulnerable, researchers may collect data from people who have secondary knowledge about the topic. For example, the principal may be more willing to let you talk to teachers or staff rather than children.

Stakeholders

In some cases, researchers and gatekeepers partner on a research project. When this happens, the gatekeepers become stakeholders. Stakeholders are individuals or groups who have an interest in the outcome of the study you conduct. As you think about your project, consider whether there are formal advisory groups or boards (like a school board) or advocacy organizations who already serve or work with your target population. Approach them as experts and ask for their review of your study to see if there are any perspectives or details you missed that would make your project stronger.

There are many advantages to partnering with stakeholders to complete a research project together. Continuing with our example on self-harm in schools, in order to obtain access to interview children at a middle school, you will have to consider other stakeholders' goals. School administrators also want to help students struggling with self-harm, so they may want to use the results to form new programs. But they may also need to avoid scandal and panic if the results show high levels of self-harm. Most likely, they want to provide support to students without making the problem worse. By bringing in school administrators as stakeholders, you can better understand what the school is currently doing to address the issue and get an informed perspective on your project's questions. Negotiating the boundaries of a stakeholder relationship requires strong meso-level practice skills.

Of course, partnering with administrators probably sounds quite a bit easier than bringing on board the next group of stakeholders—parents. It's not ethical to ask children to participate in a study without their parents' consent. We will review the parameters of parental and child consent in Chapter 5. Parents may be understandably skeptical of a researcher who wants to talk to their child about self-harm, and they may fear potential harm to the child and family from your study. Would you let a researcher you didn't know interview your children about a very sensitive issue?

Social work research must often satisfy multiple stakeholders. This is especially true if a researcher receives a grant to support the project, as the funder has goals it wants to accomplish by funding the research project. Your university is also a stakeholder in your project. When you conduct research, it reflects on your school. If you discover something of great importance, your school looks good. If you harm someone, they may be liable. Your university likely has opportunities for you to share your research with the campus community, and may have incentives or grant programs for researchers. Your school also provides you with support and access to resources like the library and data analysis software.

Target population

So far, we've talked about access in terms of gatekeepers and stakeholders. Let's assume all of those people agree that your study should proceed. But what about the people in the target population? They are the most important stakeholder of all! Think about the children in our proposed study on self-harm. How open do you think they would be to talking to you about such a sensitive issue? Would they consent to talk to you at all?

Maybe you are thinking about simply asking clients on your caseload. As we talked about before, leveraging existing relationships created through field work can help with accessing your target population. However, these relationships introduce other ethical issues for researchers. Asking clients on your caseload or at your agency to participate in your project creates a dual relationship between you and your client. What if you learn something in the research project that you want to share with your clinical team? More importantly, would your client feel uncomfortable declining to participate in your study? Social workers have power over clients, and any dual relationship would require strict supervision in the rare case it was allowed.

Resources and scope

Let's assume everyone consented to your project and you have adequately addressed any ethical issues with gatekeepers, stakeholders, and your target population. That means everything is ready to go, right? Not quite yet. As a researcher, you will need to carry out the study you propose to do. Depending on how big or how small your proposed project is, you’ll need a little or a lot of resources.

One thing that all projects need is raw data. Raw data can come in many forms. Very often in social science research, raw data includes the responses to a survey or transcripts of interviews and focus groups, but raw data can also include experimental results, diary entries, art, or other data points that social scientists use in analyzing the world. Primary data is data you have collected yourself. Sometimes, social work researchers do not collect raw data of their own, but instead use secondary data analysis to analyze raw data that has been shared by other researchers. Secondary data is data someone else has collected that you have permission to use in your research. For example, you could use data from a local probation program to determine if a shoplifting prevention group was reducing the rate at which people were re-offending. You would need data on who participated in the program and their criminal history six months after the end of their probation period. This is secondary data you could use to determine whether the shoplifting prevention group had any effect on an individual's likelihood of re-offending. Whether a researcher should use secondary data or collect their own raw data is an important choice, which we will discuss in greater detail in section 2.2. Collecting raw data or obtaining secondary data can be time-consuming or expensive, but without raw data there can be no research project.


Time is an important resource to consider when designing research projects. Make sure that your proposal won't require you to spend more time than you have to collect and analyze data. Think realistically about the timeline for your research project. If you propose to interview fifty mental health professionals in their offices in your community about your topic, make sure you can dedicate fifty hours to conduct those interviews, account for travel time, and think about how long it will take to transcribe and analyze those interviews.

  • What is reasonable for you to do in your timeframe?
  • How many hours each week can the research team dedicate to this project?

One thing that can delay a research project is receiving approval from the institutional review board (IRB), the research ethics committee at your university. If your study involves human subjects, you may have to formally propose your study to the IRB and get their approval before gathering your data. A well-prepared study is likely to gain IRB approval with minimal revisions needed, but the process can take weeks to complete and must be done before data collection can begin. We will address the ethical obligations of researchers in greater detail in Chapter 5.

Most research projects cost some amount of money. Potential expenses include wages for members of the research team, incentives for research participants, travel expenses, and licensing costs for standardized instruments. Most researchers seek grant funding to support the research. Grant applications can be time consuming to write and grant funding can be competitive to receive.

Knowledge, competence, and skills

For social work researchers, the social work value of competence is key in their research ethics.

Clearly, researchers need to be skilled in working with their target population in order to conduct ethical research.  Some research addresses this challenge by collecting data from competent practitioners or administrators who have second-hand knowledge of target populations based on professional relationships. Members of the research team delivering an intervention also need to have training and skills in the intervention. For example, if a research study examines the effectiveness of dialectical behavioral therapy (DBT) in a particular context, the person delivering the DBT must be certified in DBT.  Another idea to keep in mind is the level of data collection and analysis skills needed to complete the project.  Some assessments require training to administer. Analyses may be complex or require statistical consultation or advanced training.

In summary, here are a few questions you should ask yourself about your project to make sure it's feasible. While we present them early on in the research process (we're only in Chapter 2), these are certainly questions you should ask yourself throughout the proposal writing process. We will revisit feasibility again in Chapter 9 when we work on finalizing your research question.

  • Do you have access to the data you need or can you collect the data you need?
  • Will you be able to get consent from stakeholders, gatekeepers, and your target population?
  • Does your project pose risk to individuals through direct harm, dual relationships, or breaches in confidentiality?
  • Are you competent enough to complete the study?
  • Do you have the resources and time needed to carry out the project?

Key Takeaways

  • People will have to say “yes” to your research project. Evaluate whether your project might have gatekeepers or potential stakeholders. They may control access to data or potential participants.
  • Researchers need raw data such as survey responses, interview transcripts, or client charts. Your research project must involve more than looking at the analyses conducted by other researchers, as the literature review is only the first step of a research project.
  • Make sure you have enough resources (time, money, and knowledge) to complete your research project.

Post-awareness check (Emotion)

What factors have created your passion toward assisting your target population? How can this connection enhance your ability to receive a “yes” from potential participants? What are the anticipated challenges to receiving a “yes” from potential participants?

TRACK 1 (IF YOU ARE CREATING A RESEARCH PROPOSAL FOR THIS CLASS)

Think about how you might answer your question by collecting your own data.

  • Identify any gatekeepers and stakeholders you might need to contact.
  • How can you increase the likelihood you will get access to the people or records you need for your study?

Describe the resources you will need for your project.

  • Do you have concerns about feasibility?

TRACK 2 (IF YOU AREN'T CREATING A RESEARCH PROPOSAL FOR THIS CLASS)

You are researching the impact of your city's recent harm reduction interventions for intravenous drug users (e.g., sterile injection kits, monitored use, overdose prevention, naloxone provision, etc.).

  • Thinking about the services related to this issue in your own city, identify any gatekeepers and stakeholders you might need to contact.
  • How might you approach these gatekeepers and stakeholders? How would you explain your study?

2.2 Raw data

Learners will be able to...

  • Identify potential sources of available data
  • Weigh the challenges and benefits of collecting your own data

In our previous section, we addressed some of the challenges researchers face in collecting and analyzing raw data. Just as a reminder, raw data are unprocessed, unanalyzed data that researchers analyze using social science research methods. They are not just the statistics or qualitative themes in journal articles; they are the actual data from which those statistical outputs or themes are derived (e.g., interview transcripts or survey responses).

There are two approaches to getting raw data. First, students can analyze data that are publicly available or from agency records. Using secondary data like this can make projects more feasible, but you may not find existing data that are useful for answering your working question. For that reason, many students gather their own raw data. As we discussed in the previous section, potential harms that come from addressing sensitive topics mean that surveys and interviews of practitioners or other less-vulnerable populations may be the most feasible and ethical way to approach data collection.

Using secondary data

Within the agency setting, there are two main sources of raw data. One option is to examine client charts. For example, if you wanted to know if substance use was related to parental reunification for youth in foster care, you could look at client files and compare how long it took for families with differing levels of substance use to be reunified. You will have to negotiate with the agency the degree to which your analysis can be public. Agencies may be okay with you using client files for a class project but less comfortable with you presenting your findings at a city council meeting. When analyzing data from your agency, you will have to manage a stakeholder relationship.

Another great example of agency-based raw data comes from program evaluations. If you are working with a grant funded agency, administrators and clinicians are likely producing data for grant reporting. The agency may consent to have you look at the raw data and run your own analysis. Larger agencies may also conduct internal research—for example, surveying employees or clients about new initiatives. These, too, can be good sources of available data. Generally, if the agency has already collected the data, you can ask to use them. Again, it is important to be clear on the boundaries and expectations of the agency. And don't be angry if they say no!

Some agencies, usually government agencies, publish their data in formal reports. You could look at the websites of county or state agencies to see if there are any publicly available data relevant to your research topic. As an example, perhaps there are annual reports from the state department of education that show how seclusion and restraint are disproportionately applied to Black children with disabilities, as students found in Virginia. In another example, one student matched public data from her city's map of criminal incidents with historically redlined neighborhoods. For this project, she is using publicly available data from Mapping Inequality, which digitized historical records of redlined housing communities, and the Roanoke, VA crime mapping webpage. By matching historical data on housing redlining with current crime records, she is testing whether redlining still impacts crime to this day.

Not all public data are easily accessible, though. The student in the previous example was lucky that scholars had digitized the records of how Virginia cities were redlined by race. Sources of historical data are often located in physical archives, rather than digital archives. If your project uses historical data in an archive, it would require you to physically go to the archive in order to review the data. Unless you have a travel budget, you may be limited to the archival data in your local libraries and government offices. Similarly, government data may have to be requested from an agency, which can take time. If the data are particularly sensitive or if the department would have to dedicate a lot of time to your request, you may have to file a Freedom of Information Act request. This process can be time-consuming, and in some cases, it will add financial cost to your study.

Another source of secondary data is data shared by researchers as part of the publication and review process. There is a growing trend in research to publicly share data so others can verify results and attempt to replicate studies. In more recent articles, you may notice links to data provided by the researcher. Often, these have been de-identified by eliminating information that could lead to violations of confidentiality. You can browse the data repositories in Table 2.1 to find raw data to analyze. Make sure you pick a data set with thorough, easy-to-understand documentation. You may also want to use Google's Dataset Search, which indexes many of these repositories and others in a very intuitive and easy-to-use way.

Ultimately, you will have to weigh the strengths and limitations of using secondary data on your own. Engel and Schutt (2016, p. 327) [1] propose six questions to ask before using secondary data:

  • What were the agency’s or researcher’s goals in collecting the data?
  • What data were collected, and what were they intended to measure?
  • When was the information collected?
  • What methods were used for data collection? Who was responsible for data collection, and what were their qualifications? Are they available to answer questions about the data?
  • How is the information organized (by date, individual, family, event, etc.)? Are identifiers used to indicate different types of data available?
  • What is known about the success of the data collection effort? How are missing data indicated and treated? What kind of documentation is available? How consistent are the data with data available from other sources?

In this section, we've talked about data as though it is always collected by scientists and professionals. But that's definitely not the case! Think more broadly about sources of data that are already out there in the world. Perhaps you want to examine the different topics mentioned in the past 10 State of the Union addresses by the President. Or maybe you want to examine whether the websites and public information of local health and mental health agencies use gender-inclusive language. People share their experiences through blogs, social media posts, videos, and performances, among countless other sources of data. When you think broadly about data, you'll be surprised how much you can answer with available data.

Collecting your own raw data

The primary benefit of collecting your own data is that it allows you to collect and analyze the specific data you are looking for, rather than relying on what other people have shared. You can make sure the right questions are asked to the right people. Your early research projects may be smaller in scope. This isn't necessarily a limitation. Early projects are often the first step in a long research trajectory in which the same topic is studied in increasing detail and sophistication over time.

Student researchers often propose to survey or interview practitioners. The focus of these projects should be the practice of social work: such studies uncover how practitioners understand what they do. Surveys of practitioners often test whether responses to questions are related to each other. For example, you could propose to examine whether someone's length of time in practice is related to the type of therapy they use or their level of burnout. Interviews or focus groups can also illuminate areas of practice. One student proposed to conduct focus groups with individuals in different helping professions in order to understand how they viewed the process of leaving an abusive partner. She suspected that people from different disciplines would make unique assumptions about the survivor's choices.

It's worth remembering here that you need to have access to practitioners, as we discussed in the previous section. Resourceful researchers will look at publicly available databases of practitioners, draw from agency and personal contacts, or post in public forums like Facebook groups. Consent from gatekeepers is important, and as we described earlier, you and your agency may be interested in collaborating on a project. Bringing your agency on board as a stakeholder in your project may allow you access to company email lists or time at staff meetings as well as access to practitioners. One student partnered with her internship placement at a local hospital to measure the burnout that nurses experienced in their department. Her project helped the agency identify which departments may need additional support.

Another possible way you could collect data is by partnering with your agency on evaluating an existing program. Perhaps they want you to evaluate the early stage of a program to see if it's going as planned and if any changes need to be made. Maybe there is an aspect of the program they haven't measured but would like to, and you can fill that gap for them. Collaborating with agency partners in this way can be a challenge, as you must negotiate roles, get stakeholder buy-in, and manage the conflicting time schedules of field work and research work. At the same time, it allows you to make your work immediately relevant to your specific practice and client population.

In summary, many early projects fall into one of the following categories. These aren't your only options! But they may be helpful in thinking about what research projects can look like.

  • Analyzing charts or program evaluations at an agency
  • Analyzing existing data from an agency, government body, or other public source
  • Analyzing popular media or cultural artifacts
  • Surveying or interviewing practitioners, administrators, or other less-vulnerable groups
  • Conducting a program evaluation in collaboration with an agency

Key Takeaways

  • All research projects require analyzing raw data.
  • Research projects often analyze available data from agencies, government, or public sources. Doing so allows researchers to avoid the process of recruiting people to participate in their study. This makes projects more feasible but limits what you can study to the data that are already available to you.
  • Think through the potential harm of discussing sensitive topics when surveying or interviewing clients and other vulnerable populations. Since many social work topics are sensitive, researchers often collect data from less-vulnerable populations such as practitioners and administrators.

Post-awareness check (Environment)

In what environment are you most comfortable collecting data (phone calls, face-to-face recruitment, etc.)? Consider a preferred method of data collection that aligns with both your personality and your target population.

  • Describe the difference between raw data and the results of research articles.
  • Consider browsing around the data repositories in Table 2.1.
  • Identify a common type of project (e.g., surveys of practitioners) and how conducting a similar project might help you answer your working question.
  • What kind of raw data might you collect yourself for your study?

2.3 Creating a data analysis plan

Learners will be able to...

  • Define and construct a data analysis plan.
  • Define key quantitative data management terms—variable name, data dictionary, primary and secondary data, observations/cases.
  • Differentiate between univariate and bivariate quantitative analysis.
  • Explain when we might use quantitative bivariate analysis in social work research.
  • Identify how your qualitative research question, research aim, and type of data may influence your choice of analytic methods.
  • Outline the steps you will take in preparation for conducting qualitative data analysis.

After you have your raw data, whether secondary data or data you collected yourself, you will need to analyze it. While the specific steps of quantitative or qualitative data analysis are beyond the scope of this chapter, we are going to address some basic concepts in this section to help you create a data analysis plan. A data analysis plan is an ordered outline that includes your research question, a description of the data you are going to use to answer it, and the exact step-by-step analyses that you plan to run to answer your research question. If you look back at Table 2.1, you will see that creating a data analysis plan is a part of the study design process. The data analysis plan flows from the research question, is integral to the study design, and should be well conceptualized prior to beginning data collection. In this section, we will walk through the basics of quantitative and qualitative data analysis to help you understand the fundamentals of creating a data analysis plan.

When considering what data you might want to collect as part of your project, there are two important considerations that can create dilemmas for researchers. You might only get one chance to interact with your participants, so you must think comprehensively in your planning phase about what information you need and collect as much relevant data as possible. At the same time, though, especially when collecting sensitive information, you need to consider how onerous the data collection is for participants and whether you really need them to share that information. Just because something is interesting to us doesn't mean it's related enough to our research question to chase it down. Work with your research team and/or faculty early in your project to talk through these issues before you get to this point. And if you're using secondary data, make sure you have access to all the information you need in that data before you use it.

Once you've collected your quantitative data, you need to make sure it is well-organized in a database in a way that's actually usable. "Database" can be kind of a scary word, but really, it can be as simple as an Excel spreadsheet or a data file in whatever program you're using to analyze your data. You may want to avoid Excel and use a formal database such as Microsoft Access or MySQL if you've got a large or complicated data set, but if your data set is smaller and you plan to keep your analyses simple, you can definitely get away with Excel. A typical data set is organized with variables as columns and observations/cases as rows. For example, let's say we did a survey on ice cream preferences and collected the information in Table 2.3:

  • Variable name: Just what it sounds like: the name of your variable. Make sure this is something useful and short, and, if you're using something other than Excel, all one word. Most statistical programs will automatically rename variables for you if they aren't one word, but the names can be a little ridiculous and long.
  • Observations/cases: The rows in your data set. In social work, these are often your study participants (people), but can be anything from census tracts to black bears to trains. When we talk about sample size, we're talking about the number of observations/cases. In our mini data set, each person is an observation/case.
  • Data dictionary (sometimes called a code book or metadata): This is the document where you list your variable names, what the variables actually measure or represent, what each of the values of the variable means if the meaning isn't obvious (e.g., if numbers are assigned to gender), the level of measurement, and anything special to know about the variables (for instance, the source if you mashed two data sets together). If you're using secondary data, the researchers sharing the data should make the data dictionary available.

Let's take that mini data set we've got up above and we'll show you what your data dictionary might look like in Table 2.4.
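Since Tables 2.3 and 2.4 are not reproduced in this text, here is a hedged sketch of what a mini ice cream data set and its data dictionary might look like in code; every variable name and value below is invented for illustration:

```python
# Hedged sketch of a mini data set and its data dictionary.
# All variable names and values are invented; assumes pandas is installed.
import pandas as pd

# Observations/cases are rows; variables are columns.
data = pd.DataFrame({
    "id": [1, 2, 3],
    "age": [24, 31, 57],
    "flavor": ["chocolate", "vanilla", "strawberry"],  # favorite flavor
    "scoops": [2, 1, 3],                               # scoops per week
})

# A data dictionary records what each variable measures and how.
data_dictionary = {
    "id": {"measures": "participant identifier", "level": "nominal"},
    "age": {"measures": "age in years", "level": "ratio"},
    "flavor": {"measures": "favorite ice cream flavor", "level": "nominal"},
    "scoops": {"measures": "scoops eaten in a typical week", "level": "ratio"},
}
```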

As part of planning for your research, you should come up with a data analysis plan. Remember, a data analysis plan is an ordered outline that includes your research question, a description of the data you are going to use to answer it, and the exact step-by-step analyses that you plan to run to answer your research question. A basic data analysis plan might look something like what you see in Table 2.5. Don't panic if you don't yet understand some of the statistical terms in the plan; we're going to delve into some of them in this section, and others will be covered in more depth in your statistics courses. Note here also that this is what operationalizing your variables and moving through your research with them looks like on a basic level. We will cover operationalization in more depth in Chapter 11.

An important point to remember is that you should never get stuck on using a particular statistical method because you or one of your co-researchers thinks it's cool or it's the hot thing in your field right now. You should certainly go into your data analysis plan with ideas, but in the end, you need to let your research question guide what statistical tests you plan to use. Be prepared to be flexible if your plan doesn't pan out because the data is behaving in unexpected ways.

You'll notice that the first step in the quantitative data analysis plan is univariate and descriptive statistics. Univariate data analysis is a quantitative method in which a variable is examined individually to determine its distribution, or the way the scores are distributed across the levels, or values, of that variable. When we talk about levels, what we are talking about are the possible values of the variable, like a participant's age, income, or gender. (Note that this is different from levels of measurement, which will be discussed in Chapter 11, but the level of measurement of your variables absolutely affects what kinds of analyses you can do with them.) Univariate analysis is non-relational, which just means that we're not looking into how our variables relate to each other. Instead, we're looking at variables in isolation to try to understand them better. For this reason, univariate analysis is used for descriptive research questions.

So when do you use univariate data analysis? Always! It should be the first thing you do with your quantitative data, whether you are planning to move on to more sophisticated statistical analyses or are conducting a study to describe a new phenomenon. You need to understand what the values of each variable look like—what if one of your variables has a lot of missing data because participants didn't answer that question on your survey? What if there isn't much variation in the gender of your sample? These are things you'll learn through univariate analysis.
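Continuing the invented data set from the sketch above, this univariate "first look" can be a handful of one-liners; each call examines a single variable in isolation:

```python
# Univariate, non-relational checks: one variable at a time.
# Assumes the `data` frame defined in the earlier sketch.
print(data["age"].describe())         # count, mean, and spread of a numeric variable
print(data["flavor"].value_counts())  # frequency distribution of a categorical variable
print(data.isna().sum())              # missing values per variable
```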

Did you know that ice cream causes shark attacks? It's true! When ice cream sales go up in the summer, so does the rate of shark attacks. So you'd better put down that ice cream cone, unless you want to make yourself look more delicious to a shark.


Ok, so it's quite obviously not true that ice cream causes shark attacks. But if you looked at these two variables and how they're related, you'd notice that during times of the year with high ice cream sales, there are also the most shark attacks. Despite the fact that the conclusion we drew about the relationship was wrong, it's nonetheless true that these two variables appear related, and researchers figured that out through the use of bivariate analysis. (You will learn about correlation versus causation in Chapter 8.)

Bivariate analysis consists of a group of statistical techniques that examine the association between two variables. We could look at how anti-depressant medications and appetite are related, whether there is a relation between having a pet and emotional well-being, or if a policy-maker's level of education is related to how they vote on bills related to environmental issues.
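To make the shark example concrete, here is a sketch with invented monthly figures; the correlation comes out strongly positive even though neither variable causes the other:

```python
# Bivariate sketch: association between two invented monthly series.
import pandas as pd

months = pd.DataFrame({
    "ice_cream_sales": [200, 240, 310, 420, 510, 480],
    "shark_attacks":   [1, 2, 4, 6, 8, 7],
})

# Pearson correlation: close to +1 here, but association is not
# causation -- both variables simply rise in summer.
print(months["ice_cream_sales"].corr(months["shark_attacks"]))
```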

Bivariate analysis forms the foundation of multivariate analysis, which we don't get to in this book. All you really need to know here is that there are steps beyond bivariate analysis, which you've undoubtedly seen in scholarly literature already! But before we can move forward with multivariate analysis, we need to understand the associations between the variables in our study.

Throughout your PhD program, you will learn more about quantitative data analysis techniques. Hopefully this section has provided you with some initial insight into how data are analyzed and into the importance of creating a data analysis plan prior to collecting data. The same planning principles apply to qualitative data analysis, which was covered earlier under Qualitative Data: Management and Qualitative Data: Analysis.


2.4 Critical considerations

Learners will be able to...

  • Critique the traditional role of researchers and identify how action research addresses these issues

So far in this chapter, we have presented the steps of research projects as follows:

  • Find a topic that is important to you and read about it.
  • Pose a question that is important to the literature and to your community.
  • Propose to use specific research methods and data analysis techniques to answer your question.
  • Carry out your project and report the results.

These were depicted in more detail in Table 2.1 earlier in this chapter. There are important limitations to this approach. This section examines those problems and how to address them.

Whose knowledge is privileged?

First, let's critically examine your role as the researcher. Following along with the steps in a research project, you start studying the literature on your topic, find a place where you can add to scientific knowledge, and conduct your study. But why are you the person who gets to decide what is important? Just as clients are the experts on their lives, members of your target population are the experts on their lives. What does it mean for a group of people to be researched on, rather than researched with? How can we better respect the knowledge and self-determination of community members?

data management and analysis in research example

A different way of approaching your research project is to start by talking with members of the target population and those who are knowledgeable about that community. Perhaps there is a community-led organization you can partner with on a research project. The researcher's role in this case would be more similar to a consultant, someone with specialized knowledge about research who can help communities study problems they consider to be important. The social worker is a co-investigator, and community members are equal partners in the research project. Each has a type of knowledge—scientific expertise vs. lived experience—that should inform the research process.

This community focus highlights something important: such projects are localized. They can dedicate themselves to issues at a single agency or within a service area. With a local scope, researchers can bring about change in their own community. This is the purpose behind action research.

Action research

Action research is research that is conducted for the purpose of creating social change. When engaging in action research, scholars collaborate with community stakeholders to conduct research that will be relevant to the community. Social workers who engage in action research don't just go it alone; instead, they collaborate with the people who are affected by the research at each stage in the process. Stakeholders, particularly those with the least power, should be consulted on the purpose of the research project, research questions, design, and reporting of results.

Action research also distinguishes itself from other research in that its purpose is to create change on an individual and community level. Kristin Esterberg puts it quite eloquently when she says, “At heart, all action researchers are concerned that research not simply contribute to knowledge but also lead to positive changes in people’s lives” (2002, p. 137). [2] Action research has multiple origins across the globe, including Kurt Lewin’s psychological experiments in the US and Paulo Freire’s literacy and education programs (Adelman, 1993; Reason, 1994). [3] Over the years, action research has become increasingly popular among scholars who wish for their work to have tangible outcomes that benefit the groups they study.

A traditional scientist might look at the literature or use their practice wisdom to formulate a question for quantitative or qualitative research, as we suggested earlier in this chapter. An action researcher, on the other hand, would consult with people in the target population and community to see what they believe the most pressing issues are and what their proposed solutions may be. In this way, action research flips traditional research on its head. Scientists are not the experts on the research topic. Instead, they are more like consultants who provide the tools and resources necessary for a target population to achieve their goals and to address social problems using social science research.

According to Healy (2001), [4] the assumptions of participatory-action research are that (a) oppression is caused by macro-level structures such as patriarchy and capitalism; (b) research should expose and confront the powerful; (c) researcher and participant relationships should be equal, with equitable distribution of research tasks and roles; and (d) research should result in consciousness-raising and collective action. Consistent with social work values, action research supports the self-determination of oppressed groups and privileges their voice and understanding through the conceptualization, design, data collection, data analysis, and dissemination processes of research. We will return to similar ideas in Part 4 of the textbook when we discuss qualitative research methods, though action research can certainly be used with quantitative research methods, as well.

  • Traditionally, researchers did not consult target populations and communities prior to formulating a research question. Action research proposes a more community-engaged model in which researchers act as consultants who help communities research topics of import to them.

Post-awareness check (Knowledge)

Based on what you know of your target population, what are a few ways to receive their “buy-in” to participate in your proposed research study?

  • Apply the key concepts of action research to your project. How might you incorporate the perspectives and expertise of community members in your project?

Glossary

  • Level of measurement: The level that describes how data for variables are recorded. The level of measurement defines the type of operations that can be conducted with your data. There are four levels: nominal, ordinal, interval, and ratio.

  • Non-relational statistics: Data analysis that doesn't examine how variables relate to each other.

  • Relational statistics: A group of statistical techniques that examines the relationship between two variables.

  • Linear research process: A research process where you create a plan, you gather your data, and you analyze your data, with each step completed before you proceed to the next.

  • Iterative research process: An approach in which, after planning and once we begin collecting data, we begin analyzing the data as it comes in. This early analysis of our (incomplete) data then informs our planning, ongoing data gathering, and future analysis as the project progresses.

  • Coding: Part of the qualitative data analysis process where we begin to interpret and assign meaning to the data.

  • Codebook: A document that we use to keep track of and define the codes that we have identified (or are using) in our qualitative data analysis.

Doctoral Research Methods in Social Work Copyright © by Mavs Open Press. All Rights Reserved.



Data Analysis Techniques in Research – Methods, Tools & Examples




Data analysis techniques in research are essential because they allow researchers to derive meaningful insights from data sets to support their hypotheses or research objectives.

Data Analysis Techniques in Research: While various groups, institutions, and professionals may have diverse approaches to data analysis, a universal definition captures its essence. Data analysis involves refining, transforming, and interpreting raw data to derive actionable insights that guide informed decision-making for businesses.


A straightforward illustration of data analysis emerges when we make everyday decisions, basing our choices on past experiences or predictions of potential outcomes.

If you want to learn more about this topic and acquire valuable skills that will set you apart in today’s data-driven world, we highly recommend enrolling in the Data Analytics Course by Physics Wallah . And as a special offer for our readers, use the coupon code “READER” to get a discount on this course.


What is Data Analysis?

Data analysis is the systematic process of inspecting, cleaning, transforming, and interpreting data with the objective of discovering valuable insights and drawing meaningful conclusions. This process involves several steps:

  • Inspecting : Initial examination of data to understand its structure, quality, and completeness.
  • Cleaning : Removing errors, inconsistencies, or irrelevant information to ensure accurate analysis.
  • Transforming : Converting data into a format suitable for analysis, such as normalization or aggregation.
  • Interpreting : Analyzing the transformed data to identify patterns, trends, and relationships.
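
To make these four steps concrete, here is a minimal sketch in Python with pandas; the file name and the score and group column names are hypothetical.

```python
# A minimal inspect/clean/transform/interpret sketch with pandas.
# "survey_scores.csv", "score", and "group" are hypothetical names.
import pandas as pd

df = pd.read_csv("survey_scores.csv")

# Inspect: structure, quality, and completeness
print(df.info())
print(df.isna().sum())

# Clean: drop duplicates and rows missing the score
df = df.drop_duplicates().dropna(subset=["score"])

# Transform: normalize scores to a 0-1 range
df["score_norm"] = (df["score"] - df["score"].min()) / (df["score"].max() - df["score"].min())

# Interpret: compare group averages to look for patterns
print(df.groupby("group")["score_norm"].mean())
```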

Types of Data Analysis Techniques in Research

Data analysis techniques in research are categorized into qualitative and quantitative methods, each with its specific approaches and tools. These techniques are instrumental in extracting meaningful insights, patterns, and relationships from data to support informed decision-making, validate hypotheses, and derive actionable recommendations. Below is an in-depth exploration of the various types of data analysis techniques commonly employed in research:

1) Qualitative Analysis:

Definition: Qualitative analysis focuses on understanding non-numerical data, such as opinions, concepts, or experiences, to derive insights into human behavior, attitudes, and perceptions.

  • Content Analysis: Examines textual data, such as interview transcripts, articles, or open-ended survey responses, to identify themes, patterns, or trends.
  • Narrative Analysis: Analyzes personal stories or narratives to understand individuals’ experiences, emotions, or perspectives.
  • Ethnographic Studies: Involves observing and analyzing cultural practices, behaviors, and norms within specific communities or settings.
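
As a toy illustration of the counting side of content analysis, the sketch below tallies how many hypothetical transcripts mention each preliminary code; real qualitative coding is far more interpretive than simple keyword matching.

```python
# A toy keyword-counting sketch for content analysis.
# The transcripts and code list are hypothetical examples.
from collections import Counter

transcripts = [
    "I felt supported by my case worker and my family",
    "Housing was the biggest barrier, then transportation",
    "My family helped, but housing costs kept us stressed",
]
codes = ["family", "housing", "support", "transportation"]

counts = Counter()
for text in transcripts:
    lowered = text.lower()
    for code in codes:
        if code in lowered:
            counts[code] += 1  # this transcript touches this code

print(counts.most_common())
```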

2) Quantitative Analysis:

Quantitative analysis emphasizes numerical data and employs statistical methods to explore relationships, patterns, and trends. It encompasses several approaches:

Descriptive Analysis:

  • Frequency Distribution: Represents the number of occurrences of distinct values within a dataset.
  • Central Tendency: Measures such as mean, median, and mode provide insights into the central values of a dataset.
  • Dispersion: Techniques like variance and standard deviation indicate the spread or variability of data.
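
A minimal sketch of these descriptive measures using Python's standard library (the scores are made up):

```python
# Descriptive statistics on a hypothetical list of exam scores.
import statistics
from collections import Counter

scores = [72, 85, 85, 90, 64, 78, 85, 92, 70, 81]

print("mean:", statistics.mean(scores))          # central tendency
print("median:", statistics.median(scores))
print("mode:", statistics.mode(scores))
print("variance:", statistics.variance(scores))  # dispersion
print("std dev:", statistics.stdev(scores))
print("frequency:", Counter(scores))             # frequency distribution
```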

Diagnostic Analysis:

  • Regression Analysis: Assesses the relationship between dependent and independent variables, enabling prediction or understanding causality.
  • ANOVA (Analysis of Variance): Examines differences between groups to identify significant variations or effects.
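
The sketch below shows both techniques with SciPy on made-up data: a simple linear regression and a one-way ANOVA across three groups.

```python
# Diagnostic analysis with SciPy; all numbers are hypothetical.
from scipy import stats

# Regression: hours studied vs. exam score
hours = [1, 2, 3, 4, 5, 6, 7, 8]
score = [52, 55, 61, 64, 70, 74, 79, 83]
reg = stats.linregress(hours, score)
print(f"slope={reg.slope:.2f}, r^2={reg.rvalue**2:.3f}, p={reg.pvalue:.4f}")

# One-way ANOVA: do three teaching methods differ in mean score?
group_a = [70, 72, 68, 75]
group_b = [80, 78, 83, 79]
group_c = [66, 64, 70, 65]
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F={f_stat:.2f}, p={p_value:.4f}")
```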

Predictive Analysis:

  • Time Series Forecasting: Uses historical data points to predict future trends or outcomes.
  • Machine Learning Algorithms: Techniques like decision trees, random forests, and neural networks predict outcomes based on patterns in data.
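
As a deliberately naive example of time series forecasting, the sketch below uses a 3-month moving average of hypothetical sales as a one-step-ahead forecast; production forecasting would use richer models such as ARIMA or machine learning.

```python
# Naive moving-average forecast with pandas; the sales figures are hypothetical.
import pandas as pd

monthly_sales = pd.Series(
    [100, 104, 110, 108, 115, 121, 125, 124, 130, 138, 141, 147],
    index=pd.date_range("2023-01-01", periods=12, freq="MS"),
)

# Use the mean of the last 3 months as next month's forecast
forecast = monthly_sales.rolling(window=3).mean().iloc[-1]
print(f"Naive next-month forecast: {forecast:.1f}")
```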

Prescriptive Analysis:

  • Optimization Models: Utilizes linear programming, integer programming, or other optimization techniques to identify the best solutions or strategies.
  • Simulation: Mimics real-world scenarios to evaluate various strategies or decisions and determine optimal outcomes.
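
A minimal sketch of an optimization model, assuming a hypothetical two-product mix with made-up profit and resource numbers, solved as a linear program with SciPy:

```python
# Linear programming with scipy.optimize.linprog; all numbers are hypothetical.
from scipy.optimize import linprog

# Maximize profit 3x + 5y by minimizing its negation
c = [-3, -5]
# Constraints: 2x + y <= 100 (machine hours), x + 3y <= 90 (labor hours)
A_ub = [[2, 1], [1, 3]]
b_ub = [100, 90]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print("optimal plan:", res.x, "profit:", -res.fun)
```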

Specific Techniques:

  • Monte Carlo Simulation: Models probabilistic outcomes to assess risk and uncertainty.
  • Factor Analysis: Reduces the dimensionality of data by identifying underlying factors or components.
  • Cohort Analysis: Studies specific groups or cohorts over time to understand trends, behaviors, or patterns within these groups.
  • Cluster Analysis: Classifies objects or individuals into homogeneous groups or clusters based on similarities or attributes.
  • Sentiment Analysis: Uses natural language processing and machine learning techniques to determine sentiment, emotions, or opinions from textual data.
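
Of these, Monte Carlo simulation is the easiest to sketch in a few lines. The example below estimates the chance that a two-task project overruns its deadline when each task's duration is uncertain; every number is invented for illustration.

```python
# Monte Carlo simulation with NumPy; durations and deadline are hypothetical.
import numpy as np

rng = np.random.default_rng(seed=42)
n = 100_000

task_a = rng.normal(loc=10, scale=2, size=n)  # days
task_b = rng.normal(loc=15, scale=3, size=n)  # days
total = task_a + task_b

deadline = 28
print("P(project late) ~", np.mean(total > deadline))
```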


Data Analysis Techniques in Research Examples

To provide a clearer understanding of how data analysis techniques are applied in research, let’s consider a hypothetical research study focused on evaluating the impact of online learning platforms on students’ academic performance.

Research Objective:

Determine if students using online learning platforms achieve higher academic performance compared to those relying solely on traditional classroom instruction.

Data Collection:

  • Quantitative Data: Academic scores (grades) of students using online platforms and those using traditional classroom methods.
  • Qualitative Data: Feedback from students regarding their learning experiences, challenges faced, and preferences.

Data Analysis Techniques Applied:

1) Descriptive Analysis:

  • Calculate the mean, median, and mode of academic scores for both groups.
  • Create frequency distributions to represent the distribution of grades in each group.

2) Diagnostic Analysis:

  • Conduct an Analysis of Variance (ANOVA) to determine if there’s a statistically significant difference in academic scores between the two groups.
  • Perform Regression Analysis to assess the relationship between the time spent on online platforms and academic performance.

3) Predictive Analysis:

  • Utilize Time Series Forecasting to predict future academic performance trends based on historical data.
  • Implement Machine Learning algorithms to develop a predictive model that identifies factors contributing to academic success on online platforms.

4) Prescriptive Analysis:

  • Apply Optimization Models to identify the optimal combination of online learning resources (e.g., video lectures, interactive quizzes) that maximize academic performance.
  • Use Simulation Techniques to evaluate different scenarios, such as varying student engagement levels with online resources, to determine the most effective strategies for improving learning outcomes.

5) Specific Techniques:

  • Conduct Factor Analysis on qualitative feedback to identify common themes or factors influencing students’ perceptions and experiences with online learning.
  • Perform Cluster Analysis to segment students based on their engagement levels, preferences, or academic outcomes, enabling targeted interventions or personalized learning strategies.
  • Apply Sentiment Analysis on textual feedback to categorize students’ sentiments as positive, negative, or neutral regarding online learning experiences.

By applying a combination of qualitative and quantitative data analysis techniques, this research example aims to provide comprehensive insights into the effectiveness of online learning platforms.


Data Analysis Techniques in Quantitative Research

Quantitative research involves collecting numerical data to examine relationships, test hypotheses, and make predictions. Various data analysis techniques are employed to interpret and draw conclusions from quantitative data. Here are some key data analysis techniques commonly used in quantitative research:

1) Descriptive Statistics:

  • Description: Descriptive statistics are used to summarize and describe the main aspects of a dataset, such as central tendency (mean, median, mode), variability (range, variance, standard deviation), and distribution (skewness, kurtosis).
  • Applications: Summarizing data, identifying patterns, and providing initial insights into the dataset.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. This technique includes hypothesis testing, confidence intervals, t-tests, chi-square tests, analysis of variance (ANOVA), regression analysis, and correlation analysis.
  • Applications: Testing hypotheses, making predictions, and generalizing findings from a sample to a larger population.
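
A minimal sketch of one inferential test, a two-sample t-test on made-up samples:

```python
# Two-sample t-test with SciPy; the samples are hypothetical.
from scipy import stats

online = [78, 82, 88, 75, 90, 85, 80, 86]
classroom = [72, 74, 79, 70, 76, 73, 77, 71]

t_stat, p_value = stats.ttest_ind(online, classroom)
print(f"t={t_stat:.2f}, p={p_value:.4f}")
if p_value < 0.05:
    print("Reject the null hypothesis: the group means differ at the 5% level.")
```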

3) Regression Analysis:

  • Description: Regression analysis is a statistical technique used to model and examine the relationship between a dependent variable and one or more independent variables. Linear regression, multiple regression, logistic regression, and nonlinear regression are common types of regression analysis.
  • Applications: Predicting outcomes, identifying relationships between variables, and understanding the impact of independent variables on the dependent variable.

4) Correlation Analysis:

  • Description: Correlation analysis is used to measure and assess the strength and direction of the relationship between two or more variables. The Pearson correlation coefficient, Spearman rank correlation coefficient, and Kendall’s tau are commonly used measures of correlation.
  • Applications: Identifying associations between variables and assessing the degree and nature of the relationship.
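
A minimal sketch computing two of these coefficients with SciPy on invented data:

```python
# Pearson and Spearman correlation with SciPy; the data are hypothetical.
from scipy import stats

study_hours = [2, 4, 5, 7, 8, 10]
exam_score = [55, 60, 66, 72, 78, 85]

pearson_r, p1 = stats.pearsonr(study_hours, exam_score)
spearman_rho, p2 = stats.spearmanr(study_hours, exam_score)
print(f"Pearson r={pearson_r:.3f} (p={p1:.4f})")
print(f"Spearman rho={spearman_rho:.3f} (p={p2:.4f})")
```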

5) Factor Analysis:

  • Description: Factor analysis is a multivariate statistical technique used to identify and analyze underlying relationships or factors among a set of observed variables. It helps in reducing the dimensionality of data and identifying latent variables or constructs.
  • Applications: Identifying underlying factors or constructs, simplifying data structures, and understanding the underlying relationships among variables.

6) Time Series Analysis:

  • Description: Time series analysis involves analyzing data collected or recorded over a specific period at regular intervals to identify patterns, trends, and seasonality. Techniques such as moving averages, exponential smoothing, autoregressive integrated moving average (ARIMA), and Fourier analysis are used.
  • Applications: Forecasting future trends, analyzing seasonal patterns, and understanding time-dependent relationships in data.

7) ANOVA (Analysis of Variance):

  • Description: Analysis of variance (ANOVA) is a statistical technique used to analyze and compare the means of two or more groups or treatments to determine if they are statistically different from each other. One-way ANOVA, two-way ANOVA, and MANOVA (Multivariate Analysis of Variance) are common types of ANOVA.
  • Applications: Comparing group means, testing hypotheses, and determining the effects of categorical independent variables on a continuous dependent variable.

8) Chi-Square Tests:

  • Description: Chi-square tests are non-parametric statistical tests used to assess the association between categorical variables in a contingency table. The Chi-square test of independence, goodness-of-fit test, and test of homogeneity are common chi-square tests.
  • Applications: Testing relationships between categorical variables, assessing goodness-of-fit, and evaluating independence.
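
A minimal sketch of a chi-square test of independence on a hypothetical 2x2 contingency table:

```python
# Chi-square test of independence with SciPy; the table is hypothetical.
from scipy.stats import chi2_contingency

# Rows: online vs. classroom; columns: passed vs. failed
table = [[45, 5],
         [35, 15]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.2f}, p={p:.4f}, dof={dof}")
```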

These quantitative data analysis techniques provide researchers with valuable tools and methods to analyze, interpret, and derive meaningful insights from numerical data. The selection of a specific technique often depends on the research objectives, the nature of the data, and the underlying assumptions of the statistical methods being used.


Data Analysis Methods

Data analysis methods refer to the techniques and procedures used to analyze, interpret, and draw conclusions from data. These methods are essential for transforming raw data into meaningful insights, facilitating decision-making processes, and driving strategies across various fields. Here are some common data analysis methods:

1) Descriptive Statistics:

  • Description: Descriptive statistics summarize and organize data to provide a clear and concise overview of the dataset. Measures such as mean, median, mode, range, variance, and standard deviation are commonly used.

2) Inferential Statistics:

  • Description: Inferential statistics involve making predictions or inferences about a population based on a sample of data. Techniques such as hypothesis testing, confidence intervals, and regression analysis are used.

3) Exploratory Data Analysis (EDA):

  • Description: EDA techniques involve visually exploring and analyzing data to discover patterns, relationships, anomalies, and insights. Methods such as scatter plots, histograms, box plots, and correlation matrices are utilized.
  • Applications: Identifying trends, patterns, outliers, and relationships within the dataset.

4) Predictive Analytics:

  • Description: Predictive analytics use statistical algorithms and machine learning techniques to analyze historical data and make predictions about future events or outcomes. Techniques such as regression analysis, time series forecasting, and machine learning algorithms (e.g., decision trees, random forests, neural networks) are employed.
  • Applications: Forecasting future trends, predicting outcomes, and identifying potential risks or opportunities.

5) Prescriptive Analytics:

  • Description: Prescriptive analytics involve analyzing data to recommend actions or strategies that optimize specific objectives or outcomes. Optimization techniques, simulation models, and decision-making algorithms are utilized.
  • Applications: Recommending optimal strategies, decision-making support, and resource allocation.

6) Qualitative Data Analysis:

  • Description: Qualitative data analysis involves analyzing non-numerical data, such as text, images, videos, or audio, to identify themes, patterns, and insights. Methods such as content analysis, thematic analysis, and narrative analysis are used.
  • Applications: Understanding human behavior, attitudes, perceptions, and experiences.

7) Big Data Analytics:

  • Description: Big data analytics methods are designed to analyze large volumes of structured and unstructured data to extract valuable insights. Technologies such as Hadoop, Spark, and NoSQL databases are used to process and analyze big data.
  • Applications: Analyzing large datasets, identifying trends, patterns, and insights from big data sources.

8) Text Analytics:

  • Description: Text analytics methods involve analyzing textual data, such as customer reviews, social media posts, emails, and documents, to extract meaningful information and insights. Techniques such as sentiment analysis, text mining, and natural language processing (NLP) are used.
  • Applications: Analyzing customer feedback, monitoring brand reputation, and extracting insights from textual data sources.

These data analysis methods are instrumental in transforming data into actionable insights, informing decision-making processes, and driving organizational success across various sectors, including business, healthcare, finance, marketing, and research. The selection of a specific method often depends on the nature of the data, the research objectives, and the analytical requirements of the project or organization.


Data Analysis Tools

Data analysis tools are essential instruments that facilitate the process of examining, cleaning, transforming, and modeling data to uncover useful information, make informed decisions, and drive strategies. Here are some prominent data analysis tools widely used across various industries:

1) Microsoft Excel:

  • Description: A spreadsheet software that offers basic to advanced data analysis features, including pivot tables, data visualization tools, and statistical functions.
  • Applications: Data cleaning, basic statistical analysis, visualization, and reporting.

2) R Programming Language:

  • Description: An open-source programming language specifically designed for statistical computing and data visualization.
  • Applications: Advanced statistical analysis, data manipulation, visualization, and machine learning.

3) Python (with Libraries like Pandas, NumPy, Matplotlib, and Seaborn):

  • Description: A versatile programming language with libraries that support data manipulation, analysis, and visualization.
  • Applications: Data cleaning, statistical analysis, machine learning, and data visualization.

4) SPSS (Statistical Package for the Social Sciences):

  • Description: A comprehensive statistical software suite used for data analysis, data mining, and predictive analytics.
  • Applications: Descriptive statistics, hypothesis testing, regression analysis, and advanced analytics.

5) SAS (Statistical Analysis System):

  • Description: A software suite used for advanced analytics, multivariate analysis, and predictive modeling.
  • Applications: Data management, statistical analysis, predictive modeling, and business intelligence.

6) Tableau:

  • Description: A data visualization tool that allows users to create interactive and shareable dashboards and reports.
  • Applications: Data visualization, business intelligence, and interactive dashboard creation.

7) Power BI:

  • Description: A business analytics tool developed by Microsoft that provides interactive visualizations and business intelligence capabilities.
  • Applications: Data visualization, business intelligence, reporting, and dashboard creation.

8) SQL (Structured Query Language) Databases (e.g., MySQL, PostgreSQL, Microsoft SQL Server):

  • Description: Database management systems that support data storage, retrieval, and manipulation using SQL queries.
  • Applications: Data retrieval, data cleaning, data transformation, and database management.

9) Apache Spark:

  • Description: A fast and general-purpose distributed computing system designed for big data processing and analytics.
  • Applications: Big data processing, machine learning, data streaming, and real-time analytics.

10) IBM SPSS Modeler:

  • Description: A data mining software application used for building predictive models and conducting advanced analytics.
  • Applications: Predictive modeling, data mining, statistical analysis, and decision optimization.

These tools serve various purposes and cater to different data analysis needs, from basic statistical analysis and data visualization to advanced analytics, machine learning, and big data processing. The choice of a specific tool often depends on the nature of the data, the complexity of the analysis, and the specific requirements of the project or organization.


Importance of Data Analysis in Research

The importance of data analysis in research cannot be overstated; it serves as the backbone of any scientific investigation or study. Here are several key reasons why data analysis is crucial in the research process:

  • Data analysis helps ensure that the results obtained are valid and reliable. By systematically examining the data, researchers can identify any inconsistencies or anomalies that may affect the credibility of the findings.
  • Effective data analysis provides researchers with the necessary information to make informed decisions. By interpreting the collected data, researchers can draw conclusions, make predictions, or formulate recommendations based on evidence rather than intuition or guesswork.
  • Data analysis allows researchers to identify patterns, trends, and relationships within the data. This can lead to a deeper understanding of the research topic, enabling researchers to uncover insights that may not be immediately apparent.
  • In empirical research, data analysis plays a critical role in testing hypotheses. Researchers collect data to either support or refute their hypotheses, and data analysis provides the tools and techniques to evaluate these hypotheses rigorously.
  • Transparent and well-executed data analysis enhances the credibility of research findings. By clearly documenting the data analysis methods and procedures, researchers allow others to replicate the study, thereby contributing to the reproducibility of research findings.
  • In fields such as business or healthcare, data analysis helps organizations allocate resources more efficiently. By analyzing data on consumer behavior, market trends, or patient outcomes, organizations can make strategic decisions about resource allocation, budgeting, and planning.
  • In public policy and social sciences, data analysis is instrumental in developing and evaluating policies and interventions. By analyzing data on social, economic, or environmental factors, policymakers can assess the effectiveness of existing policies and inform the development of new ones.
  • Data analysis allows for continuous improvement in research methods and practices. By analyzing past research projects, identifying areas for improvement, and implementing changes based on data-driven insights, researchers can refine their approaches and enhance the quality of future research endeavors.

However, it is important to remember that mastering these techniques requires practice and continuous learning. That’s why we highly recommend the Data Analytics Course by Physics Wallah. Not only does it cover all the fundamentals of data analysis, but it also provides hands-on experience with various tools such as Excel, Python, and Tableau. Plus, if you use the “READER” coupon code at checkout, you can get a special discount on the course.


Data Analysis Techniques in Research FAQs

What are the 5 techniques for data analysis?

The five techniques for data analysis include: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, Prescriptive Analysis, and Qualitative Analysis.

What are techniques of data analysis in research?

Techniques of data analysis in research encompass both qualitative and quantitative methods. These techniques involve processes like summarizing raw data, investigating causes of events, forecasting future outcomes, offering recommendations based on predictions, and examining non-numerical data to understand concepts or experiences.

What are the 3 methods of data analysis?

The three primary methods of data analysis are: Qualitative Analysis, Quantitative Analysis, and Mixed-Methods Analysis.

What are the four types of data analysis techniques?

The four types of data analysis techniques are: Descriptive Analysis, Diagnostic Analysis, Predictive Analysis, and Prescriptive Analysis.



Research data management: What it is + benefits with examples


With the proliferation of data and the need for agile and fast insights, research teams, researchers, and organizations worldwide need quicker access to the correct data, which is possible with research data management.

Research is conducted for various reasons, including academic research, pricing research, brand tracking, monitoring, competitive research, ad testing, longitudinal tracking, product and service upgrades, customer satisfaction, etc. The data generated during the research process is diverse and vast.


Accessing the correct data in the proper format enables the democratization of insights, reduces silos in research, and eliminates tribal knowledge, particularly with mature insights management tools such as InsightsHub. A recent Statista report stated that the global revenue from the market research industry exceeded $74.6 billion in 2021, and that number is only expected to grow.

With data at this scale, it is imperative to have systems in place to make the most of the data in the shortest possible time, and that’s where research data management comes in.


What is research data management?

Research data management (or RDM) is the practice of actively organizing, storing, and preserving data during the market research process. RDM covers the data lifecycle from planning through people, process, and technology, to long-term monitoring and access to the data. It is a continuous cycle that runs for the entire life of the data.

Research data comes in many forms and types, especially across the various types of research that include qualitative and quantitative studies. This means the data can also be of multiple scales and types. RDM helps to classify, categorize, and store this information in a way that's easy to understand, reference, and draw inferences from.

Data management in research follows the fundamentals of the data life cycle; its critical steps are listed below:

  • Plan: The plan includes involving stakeholders, defining processes, picking the tools, defining data owners, and deciding how the data is shared.
  • Create: Researchers and research teams create the data using the data collection techniques defined by their projects and then put this data together in structured formats with relevant tags and meta-descriptions.
  • Process: This raw data is then converted into digital data in the organization’s structure. The information is cleaned, scrubbed, and structured to reduce the time to insights.
  • Analyze: A critical component of RDM is analyzing research data to derive actionable insights from the data that has been collected. The results can then be structured into consumable insights.
  • Preserve: The raw and analyzed data is then preserved in formats defined in the earlier process to maintain the quality of information.
  • Share: Insights are distributed to the right stakeholders under role-based access control so that they can be acted upon to meet business and research goals.
  • Reuse: With the correct metadata, tagging, and categorization, it is possible to reuse research data to draw correlations, increase ROI, and reduce the time needed for research studies.

All the above steps aid in innovative research data management and are critical to market research and insights management success.


Research data management benefits

Following good research data management practices has multiple benefits. Some of the most important ones, however, are:

Maintain the sanctity of data and increase accountability

An essential benefit of RDM is that it maintains the sanctity of the collected data and increases accountability across the board for all stakeholders. There is absolute transparency in how the information is collected, stored, tracked, shared, and more, along with the additional benefit of following data storage compliance requirements and regulations. Defined processes also lead to less ambiguity about stakeholders, data owners, and how the data is to be monitored.

Eliminate tribal knowledge

Since there is an expectation that data is to be managed in a specific manner, everyone follows the same process. This eliminates tribal knowledge when people either leave organizations or new members come into the fold. It also ensures that stakeholders and researchers from cross-functional teams can lean in on past data to make inferences.

Democratize insights

Insights are powerful when they're accessible to the right teams at the right time. With research data management, there is an assurance that, even with role-based access, a larger pool of members has access to the data regardless of the research design and type. There is greater visibility into the tools used, the audience reached, and the granular, analyzed data, which in turn helps to democratize insights.

Enable longitudinal monitoring and quick turnaround studies

No matter what type of research is being conducted, RDM allows users to either draw comparisons from past studies or use the past data to validate or disprove hypotheses. With easy access to data, there is also the ability to conduct longitudinal studies or quick turnaround studies by leaning on past, structured data.

Avoid duplication of efforts and research

Brands and organizations use a market research platform to conduct research studies. With a data management plan in place, you can avoid redoing the same or similar research, and past studies remain valid and usable across geographic borders. It also helps reduce duplication of effort, as you do not have to start from scratch.

Reduce time and increase the ROI of research

With easy access to structured data and insights, the time to insights is reduced because duplication across research projects is eliminated. There's also the ability to draw inferences from past data and from data across demographics and regions. There's scope to do more with less. All of the above helps increase the ROI of research, as less effort is spent while the output is higher, which aids continuous discovery.

Research data management examples

As seen above, research data management is becoming an integral part of how organizations and research teams derive the most from their research processes.

To illustrate this better with an example, consider a retail giant with a presence in many countries. To stay ahead of the competition, retain customers, and constantly co-create with them, the brand continuously uses multiple research techniques and methods.

This research helps the brand understand brand value, consumer behavior, pricing sensitivity, product upgrades, customer satisfaction, etc. By implementing a solid RDM strategy, the brand can lean on past and existing studies, both qualitative and quantitative, to draw inferences about pricing preferences across markets, seasonal launches, what works for different needs, and how the brand is perceived versus its competitors. There is also the ability to look at historical data to manage inventory or budget for marketing spending.

When done well, a good research data management strategy and the right knowledge discovery tools can work wonders for brands and organizations alike. 

Get the most out of your research data management

With QuestionPro, you have access to the most mature market research platform and tool that helps you collect and analyze the insights that matter the most. By leveraging InsightsHub, the unified hub for data management, you can leverage the consolidated platform to organize, explore, search, and discover your research data in one organized repository.




What is data management?

Data management is the practice of ingesting, processing, securing and storing an organization’s data, where it is then utilized for strategic decision-making to improve business outcomes.

Over the last decade, developments within hybrid cloud, artificial intelligence, the Internet of Things (IoT), and edge computing have led to the exponential growth of big data, creating even more complexity for enterprises to manage. As a result, a data management discipline within an organization has become an increasing priority as this growth has created significant challenges, such as data silos, security risks, and general bottlenecks to decision-making.

Teams address these challenges head on with a number of data management solutions, which aim to clean, unify, and secure data. This, in turn, allows leaders to glean insights through dashboards and other data visualization tools, enabling informed business decisions. It also empowers data science teams to investigate more complex questions, allowing them to leverage more advanced analytical capabilities, such as machine learning, for proof-of-concept projects. If they’re successful at delivering and improving business outcomes, they can partner with relevant teams to scale those learnings across their organization through automation practices.

While data management refers to a whole discipline, master data management is more specific in its scope as it focuses on transactional data—i.e. sales records. Sales data typically includes customer, seller, and product information. This type of data enables businesses to determine their most successful products and markets and their highest valued customers. Since master data is inclusive of personally identifiable information (PII), it also conforms to stricter regulations, such as GDPR.


The scope of a data management discipline is quite broad, and a strong data management strategy typically implements the following components to streamline strategy and operations throughout an organization:

Data processing: Within this stage of the data management lifecycle, raw data is ingested from a range of data sources, such as web APIs, mobile apps, Internet of Things (IoT) devices, forms, surveys, and more. It is then usually processed or loaded via data integration techniques, such as extract, transform, load (ETL) or extract, load, transform (ELT). While ETL has historically been the standard method to integrate and organize data across different datasets, ELT has been growing in popularity with the emergence of cloud data platforms and the increasing demand for real-time data. Independently of the data integration technique used, the data is usually filtered, merged, or aggregated during the data processing stage to meet the requirements for its intended purpose, which can range from a business intelligence dashboard to a predictive machine learning algorithm.
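
A minimal ETL sketch in Python, assuming a hypothetical raw_orders.csv source and a local SQLite analytics database:

```python
# Extract-transform-load sketch with pandas and sqlite3.
# "raw_orders.csv", its columns, and "analytics.db" are hypothetical names.
import sqlite3
import pandas as pd

# Extract: ingest raw data from a CSV source
df = pd.read_csv("raw_orders.csv")

# Transform: clean and aggregate to the shape the dashboard needs
df = df.dropna(subset=["order_id", "amount"])
daily = df.groupby("order_date", as_index=False)["amount"].sum()

# Load: write the result into the analytics database
with sqlite3.connect("analytics.db") as conn:
    daily.to_sql("daily_orders", conn, if_exists="replace", index=False)
```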

Data storage: While data can be stored before or after data processing, the type of data and its purpose will usually dictate the storage repository that is leveraged. For example, data warehousing requires a defined schema to meet specific data analytics requirements for data outputs, such as dashboards, data visualizations, and other business intelligence tasks. These data requirements are usually directed and documented by business users in partnership with data engineers, who will ultimately execute against the defined data model. The underlying structure of a data warehouse is typically organized as a relational system (i.e. in a structured data format), sourcing data from transactional databases. However, other storage systems, such as data lakes, incorporate data from both relational and non-relational systems, becoming a sandbox for innovative data projects. Data lakes benefit data scientists in particular, as they allow them to incorporate both structured and unstructured data into their data science projects.

Data governance: Data governance is a set of standards and business processes which ensure that data assets are leveraged effectively within an organization. This generally includes processes around data quality, data access, usability, and data security. For instance, data governance councils tend to align on taxonomies to ensure that metadata is added consistently across various data sources. This taxonomy should also be further documented via a data catalog to make data more accessible to users, facilitating data democratization across organizations. Data governance teams also help to define roles and responsibilities to ensure that data access is provided appropriately; this is particularly important for maintaining data privacy.

Data security: Data security sets guardrails in place to protect digital information from unauthorized access, corruption, or theft. As digital technology becomes an increasing part of our lives, more scrutiny is placed upon the security practices of modern businesses to ensure that customer data is protected from cybercriminals and disaster recovery incidents. While data loss can be devastating to any business, data breaches in particular can have costly consequences from both a financial and brand standpoint. Data security teams can better secure their data by leveraging encryption and data masking within their data security strategy.

While data processing, data storage, data governance, and data security are all part of data management, the success of any of these components hinges on a company’s data architecture or technology stack. A company’s data infrastructure creates a pipeline for data to be acquired, processed, stored, and accessed, and this is done by integrating these systems together. Data services and APIs pull together data from legacy systems, data lakes, data warehouses, SQL databases, and apps, providing a holistic view into business performance.

Each of these components in the data management space is undergoing a vast amount of change right now. For example, the shift from on-premises systems to cloud platforms is one of the most disruptive trends in the space. Unlike on-premises deployments, cloud storage providers allow users to spin up large clusters as needed, requiring payment only for the storage specified. This means that if you need additional compute power to run a job in a few hours instead of a few days, you can easily do this on a cloud platform by purchasing additional compute nodes.

This shift to cloud data platforms is also facilitating the adoption of streaming data processing. Tools like Apache Kafka allow for more real-time data processing, enabling consumers to subscribe to topics and receive data in a matter of seconds. However, batch processing still has its advantages, as it is more efficient at processing large volumes of data. Because batch processing abides by a set schedule, such as daily, weekly, or monthly, it is ideal for business performance dashboards, which typically do not require real-time data.

Change only continues to accelerate in this space. More recently, data fabrics have emerged to assist with the complexity of managing these data systems. Data fabrics leverage intelligent and automated systems to facilitate end-to-end integration of various data pipelines and cloud environments. As new technology like this develops, we can expect that business leaders will gain a more holistic view of business performance as it will integrate data across functions. The unification of data across human resources, marketing, sales, supply chain, et cetera can only give leaders a better understanding of their customer.

Organizations experience a number of benefits when launching and maintaining data management initiatives: 

Reduced data silos: Most, if not all, companies experience data silos within their organization. Different data management tools and frameworks, such as data fabrics and data lakes, help to eliminate data silos and dependencies on data owners. For instance, data fabrics assist in revealing potential integrations across disparate datasets across functions, such as human resources, marketing, sales, et cetera. Data lakes, on the other hand, ingest raw data from those same functions, removing dependencies and eliminating single owners to a given dataset. 

Improved compliance and security: Governance councils assist in placing guardrails to protect businesses from fines and negative publicity that can occur due to noncompliance with government regulations and policies. Missteps here can be costly from both a brand and financial perspective.

Enhanced customer experience: While this benefit will not be immediately seen, successful proof of concepts can improve the overall user experience, enabling teams to better understand and personalize the customer journey through more holistic analyses.

Scalability: Data management can help businesses scale but this largely depends on the technology and processes in place. For example, cloud platforms allow for more flexibility, enabling data owners to scale up or scale down compute power as needed. Additionally, governance councils can help to ensure that defined taxonomies are adopted as a company grows in size. 


Case Western Reserve University

Research Data Lifecycle Guide

Developing a Data Management Plan

This section breaks down different topics required for the planning and preparation of data used in research at Case Western Reserve University. In this phase you should understand the research being conducted; the types of data and the methods used to collect them; the methods used to prepare and analyze the data; and the budget and resources required. You should also have a sound understanding of how you will manage data activities during your research project.

Many federal sponsors of Case Western Reserve-funded research have required data sharing plans in research proposals since 2003. As of Jan. 25, 2023, the National Institutes of Health has revised its data management and sharing requirements.

This website is designed to provide basic information and best practices to seasoned and new investigators as well as detailed guidance for adhering to the revised NIH policy.  

Basics of Research Data Management

What is research data management?

Research data management (RDM) comprises a set of best practices that include file organization, documentation, storage, backup, security, preservation, and sharing, which affords researchers the ability to more quickly, efficiently, and accurately find, access, and understand their own or others' research data.

Why should you care about research data management?

RDM practices, if applied consistently and as early in a project as possible, can save you considerable time and effort later, when specific data are needed, when others need to make sense of your data, or when you decide to share or otherwise upload your data to a digital repository. Adopting RDM practices will also help you more easily comply with the data management plan (DMP) required for obtaining grants from many funding agencies and institutions.

Does data need to be retained after a project is completed?

Research data must be retained in sufficient detail and for an adequate period of time to enable appropriate responses to questions about accuracy, authenticity, primacy and compliance with laws and regulations governing the conduct of the research. External funding agencies will each have different requirements regarding storage, retention, and availability of research data. Please carefully review your award or agreement for the disposition of data requirements and data retention policies.

A good data management plan begins by understanding the requirements of the sponsor funding your research. As a principal investigator (PI), it is your responsibility to be knowledgeable of sponsor requirements. The Data Management Plan Tool (DMPTool) has been designed to help PIs adhere to sponsor requirements efficiently and effectively. It is strongly recommended that you take advantage of the DMPTool.

CWRU has an institutional account with DMPTool that enables users to access all of its resources via your Single Sign On credentials. CWRU's DMPTool account is supported by members of the Digital Scholarship team with the Freedman Center for Digital Scholarship. Please use the RDM Intake Request form to schedule a consultation if you would like support or guidance regarding developing a Data Management Plan.

Some basic steps to get started:

  • Sign into the DMPTool site to start creating a DMP for managing and sharing your data.
  • On the DMPTool site, you can find the most up-to-date templates for creating a DMP for a long list of funders, including the NIH, NEH, NSF, and more.
  • Explore sample DMPs to see examples of successful plans.

Be sure that your DMP is addressing any and all federal and/or funder requirements and associated DMP templates that may apply to your project. It is strongly recommended that investigators submitting proposals to the NIH utilize this tool. 

The NIH is mandating Data Management and Sharing Plans for all proposals submitted after Jan. 25, 2023. Guidance for completing an NIH Data Management Plan has its own dedicated content to provide investigators detailed guidance on developing these plans for inclusion in proposals.

A Data Management Plan can help create and maintain reliable data and promote project success. DMPs, when carefully constructed and reliably adhered to, help guide elements of your research and data organization.

A DMP can help you:

Document your process and data.

  • Maintain a file with information on researchers and collaborators and their roles, sponsors/funding sources, methods/techniques/protocols/standards used, instrumentation, software (w/versions), references used, any applicable restrictions on its distribution or use.
  • Establish how you will document file changes, name changes, dates of changes, etc. Where will you record these changes? Try to keep this sort of information in a plain text file located in the same folder as the files to which it pertains.
  • How are derived data products created? A DMP encourages consistent description of data processing performed, software (including version number) used, and analyses applied to data.
  • Establish regular forms or templates for data collection. This helps reduce gaps in your data and promotes consistency throughout the project.

Explain your data

  • From the outset, consider why your data were collected, what the known and expected conditions may be for collection, and information such as time and place, resolution, and standards of data collected.
  • What attributes, fields, or parameters will be studied and included in your data files? Identify and describe these in each file that employs them.
  • For an overview of data dictionaries, see the USGS page here: https://www.usgs.gov/products/data-and-tools/data-management/data-dictionaries

DMP Requirements

Why are you being asked to include a data management plan (DMP) in your grant application? For grants awarded by US governmental agencies, two federal memos from the US Office of Science and Technology Policy (OSTP), issued in 2013 and 2015, respectively, have prompted this requirement. These memos mandate public access to federally- (and, thus, taxpayer-) funded research results, reflecting a commitment by the government to greater accountability and transparency. While "results" generally refers to the publications and reports produced from a research project, it is increasingly used to refer to the resulting data as well.

Federal research-funding agencies  have responded to the OSTP memos by issuing their own guidelines and requirements for grant applicants (see below), specifying whether and how research data in particular are to be managed in order to be publicly and properly accessible.

  • NSF—National Science Foundation "Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled 'Data Management Plan'. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results." Note: Additional requirements may apply per Directorate, Office, Division, Program, or other NSF unit.
  • NIH—National Institutes of Health "To facilitate data sharing, investigators submitting a research application requesting $500,000 or more of direct costs in any single year to NIH on or after October 1, 2003 are expected to include a plan for sharing final research data for research purposes, or state why data sharing is not possible."
  • NASA—National Aeronautics and Space Administration "The purpose of a Data Management Plan (DMP) is to address the management of data from Earth science missions, from the time of their data collection/observation, to their entry into permanent archives."
  • DOD—Department of Defense "A Data Management Plan (DMP) describing the scientific data expected to be created or gathered in the course of a research project must be submitted to DTIC at the start of each research effort. It is important that DoD researchers document plans for preserving data at the outset, keeping in mind the potential utility of the data for future research or to support transition to operational or other environments. Otherwise, the data is lost as researchers move on to other efforts. The essential descriptive elements of the DMP are listed in section 3 of DoDI 3200.12, although the format of the plan may be adjusted to conform to standards established by the relevant scientific discipline or one that meets the requirements of the responsible Component"
  • Department of Education "The purpose of this document is to describe the implementation of this policy on public access to data and to provide guidance to applicants for preparing the Data Management Plan (DMP) that must outline data sharing and be submitted with the grant application. The DMP should describe a plan to provide discoverable and citable dataset(s) with sufficient documentation to support responsible use by other researchers, and should address four interrelated concerns—access, permissions, documentation, and resources—which must be considered in the earliest stages of planning for the grant."
  • " Office of Scientific and Technical Information (OSTI) Provides access to free, publicly-available research sponsored by the Department of Energy (DOE), including technical reports, bibliographic citations, journal articles, conference papers, books, multimedia, software, and data.

Data Management Best Practices

As you plan to collect data for research, keep in mind the following best practices. 

Keep Your Data Accessible to You

  • Store your temporary working files somewhere easily accessible, like on a local hard drive or shared server.
  • While cloud storage is a convenient solution for storage and sharing, there are often concerns about data privacy and preservation. Be sure to only put data in the cloud that you are comfortable with and that your funding and/or departmental requirements allow.
  • For long-term storage, data should be put into preservation systems that are well-managed. UTech provides several long-term data storage options for cloud and campus.
  • Don't keep your original data on a thumb drive or portable hard drive, as it can be easily lost or stolen.
  • Think about file formats that have a long life and that are readable by many programs. Formats like plain text (ASCII/.txt), .csv, and .pdf are great for long-term preservation (see the conversion sketch after this list).
  • A DMP is not a replacement for good data management practices, but it can set you on the right path if it is consistently followed. Consistently revisit your plan to ensure you are following it and adhering to funder requirements.
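
As a minimal sketch of such a conversion, assuming pandas is installed and using placeholder filenames, a proprietary-format file can be rewritten in plain-text formats:

```python
import pandas as pd

# Read a proprietary-format file (here, Stata) and write preservation-friendly
# copies; the filenames are placeholders for your own data.
df = pd.read_stata("analysis_data.dta")
df.to_csv("analysis_data.csv", index=False)            # comma-separated values
df.to_csv("analysis_data.txt", sep="\t", index=False)  # tab-delimited text
```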

Preservation

  • Know the difference between storing and preserving your data. True preservation is the ongoing process of making sure your data are secure and accessible for future generations. Many sponsors have preferred or recommended data repositories. The DMP tool can help you identify these preferred repositories. 
  • Identify data with long-term value. Preserve the raw data and any intermediate/derived products that are expensive to reproduce or can be directly used for analysis. Preserve any scripted code that was used to clean and transform the data.
  • Whenever converting your data from one format to another, keep a copy of the original file and format to avoid loss or corruption of your important files.
  • Online platforms like OSF can help your group organize, version, share, and preserve your data if the sponsor hasn’t specified a particular platform.
  • Adhere to federal sponsor requirements on utilizing accepted data repositories (NIH dbGaP, NIH SRA, NIH CRDC, etc.) for preservation. 

Backup, Backup, Backup

  • The general rule is to keep 3 copies of your data: 2 copies onsite, 1 offsite.
  • Back up your data regularly and frequently - automate the process if possible. This may mean weekly duplication of your working files to a separate drive, syncing your folders to a cloud service like Box, or dedicating a block of time every week to ensure you've copied everything to another location.
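
A minimal sketch of such automation (the weekly_backup helper and both paths are hypothetical; scheduling it with cron or Task Scheduler is left to you):

```python
import shutil
from datetime import date
from pathlib import Path

def weekly_backup(src: str, backup_root: str) -> Path:
    """Copy the working directory into a dated backup folder."""
    dest = Path(backup_root) / f"backup_{date.today().isoformat()}"
    shutil.copytree(src, dest)  # raises an error if the dated folder already exists
    return dest

# Example: duplicate this week's working files to a second drive
weekly_backup("project/working", "/mnt/backup_drive/project")
```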

Organization

  • Establish a consistent, descriptive filing system that is intelligible to future researchers and does not rely on your own inside knowledge of your research.
  • A descriptive directory and file-naming structure should guide users through the contents to help them find whatever they are looking for.

Naming Conventions

  • Use consistent, descriptive filenames that reliably indicate the contents of the file.
  • If your discipline requires or recommends particular naming conventions, use them!
  • Do not use spaces between words. Use either camelCase or underscores to separate words.
  • Include LastnameFirstname descriptors where appropriate.
  • Avoid ambiguous date formats like MM-DD-YYYY; a YYYY-MM-DD date is unambiguous and sorts chronologically.
  • Do not append vague descriptors like "latest" or "final" to your file versions. Instead, append the version's date or a consistently iterated version number.
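
One hypothetical way to apply these conventions in code (the make_filename helper and its pattern are illustrative, not a prescribed standard):

```python
import re
from datetime import date

def make_filename(project: str, description: str, version: int, ext: str) -> str:
    """Build a descriptive, space-free filename with an ISO date and version number."""
    clean = lambda s: re.sub(r"\s+", "_", s.strip())
    return f"{clean(project)}_{clean(description)}_{date.today():%Y%m%d}_v{version:02d}.{ext}"

print(make_filename("riverStudy", "water samples", 3, "csv"))
# e.g. riverStudy_water_samples_20240315_v03.csv
```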

Clean Your Data

  • Mistakes happen, and often researchers don't notice at first. If you are manually entering data, be sure to double-check the entries for consistency and duplication. Often having a fresh set of eyes will help to catch errors before they become problems.
  • Tabular data can often be error checked by sorting the fields alphanumerically to catch simple typos, extra spaces, or otherwise extreme outliers. Be sure to save your data before sorting it to ensure you do not disrupt the records!
  • Programs like OpenRefine are useful for checking for consistency in coding for records and variables, catching missing values, transforming data, and much more.
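
A short pandas sketch along these lines (survey_raw.csv is a placeholder path) can flag duplicates, missing values, stray whitespace, and extreme numeric outliers:

```python
import pandas as pd

df = pd.read_csv("survey_raw.csv")  # placeholder path

# Simple consistency checks before analysis
print(df.duplicated().sum(), "duplicate rows")
print(df.isna().sum())  # missing values per column

# Strip stray whitespace from text columns, then list the distinct codes
text_cols = df.select_dtypes(include="object").columns
df[text_cols] = df[text_cols].apply(lambda s: s.str.strip())
for col in text_cols:
    print(col, sorted(df[col].dropna().unique()))

# Flag extreme numeric outliers (more than 3 standard deviations from the mean)
for col in df.select_dtypes(include="number").columns:
    z = (df[col] - df[col].mean()) / df[col].std()
    print(col, "outliers:", df.loc[z.abs() > 3, col].tolist())
```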

What should you do if you need assistance implementing RDM practices?

Whether it's because you need discipline-specific metadata standards for your data, help with securing sensitive data, or assistance writing a data management plan for a grant, help is available to you at CWRU. In addition to consulting the resources featured in this guide, you are encouraged to contact your department's liaison librarian.

If you are planning to submit a research proposal and need assistance with budgeting for data storage and/or applications used to capture, manage, and/or process data, UTech provides information and assistance, including resource boilerplates that list what centralized resources are available.

More specific guidance for including a budget for Data Management and Sharing is included in this document: Budgeting for Data Management and Sharing.

Custody of Research Data

The PI is the custodian of research data, unless otherwise agreed in writing and the agreement is on file with the University, and is responsible for the collection, management, and retention of research data. The PI should adopt an orderly system of data organization and should communicate the chosen system to all members of a research group and to the appropriate administrative personnel, where applicable. Particularly for long-term research projects, the PI should establish and maintain procedures for the protection and management of essential records.

CWRU Custody of Research Data Policy  

Data Sharing

Many funding agencies require data to be shared for the purposes of reproducibility and other important scientific goals. It is important to plan for the timely release and sharing of final research data for use by other researchers. The final release of data should be included as a key deliverable of the DMP. The discipline-specific database, data repository, data enclave, or archive used to disseminate the data should also be documented as needed.

The NIH is mandating Data Management and Sharing Plans for all proposals submitted after Jan. 25, 2023. Guidance for completing an NIH Data Management and Sharing Plan has its own dedicated content to provide investigators detailed guidance on development of these plans for inclusion in proposals.


Research data management in academic institutions: A scoping review

Laure Perrier

1 Gerstein Science Information Centre, University of Toronto, Toronto, Ontario, Canada

Erik Blondal

2 Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada

A. Patricia Ayala

Dylanne Dearborn

3 Gibson D. Lewis Health Science Library, UNT Health Science Center, Fort Worth, Texas, United States of America

David Lightfoot

4 St. Michael’s Hospital Library, St. Michael’s Hospital, Toronto, Ontario, Canada

5 Faculty of Information, University of Toronto, Toronto, Ontario, Canada

Mindy Thuna

6 Engineering & Computer Science Library, University of Toronto, Toronto, Ontario, Canada

Leanne Trimble

7 Map and Data Library, University of Toronto, Toronto, Ontario, Canada

Heather MacDonald

8 MacOdrum Library, Carleton University, Ottawa, Ontario, Canada

  • Conceptualization: LP.
  • Data curation: LP EB HM.
  • Formal analysis: LP EB.
  • Investigation: LP EB HM APA DD TK DL MT LT RR.
  • Methodology: LP.
  • Project administration: LP.
  • Supervision: LP.
  • Validation: LP HM.
  • Writing – original draft: LP.
  • Writing – review & editing: LP EB HM APA DD DL TK MT LT RR.

Associated Data

Dataset is available from the Zenodo Repository, DOI: 10.5281/zenodo.557043.

The purpose of this study is to describe the volume, topics, and methodological nature of the existing research literature on research data management in academic institutions.

Materials and methods

We conducted a scoping review by searching forty literature databases encompassing a broad range of disciplines from inception to April 2016. We included all study types and extracted data on study design, discipline, data collection tools, and phase of the research data lifecycle.

We included 301 articles plus 10 companion reports after screening 13,002 titles and abstracts and 654 full-text articles. Most articles (85%) were published from 2010 onwards and conducted within the sciences (86%). More than three-quarters of the articles (78%) reported methods that included interviews, cross-sectional, or case studies. Most articles (68%) included the Giving Access to Data phase of the UK Data Archive Research Data Lifecycle that examines activities such as sharing data. When studies were grouped into five dominant groupings (Stakeholder, Data, Library, Tool/Device, and Publication), data quality emerged as an integral element.

Most studies relied on self-reports (interviews, surveys) or accounts from an observer (case studies) and we found few studies that collected empirical evidence on activities amongst data producers, particularly those examining the impact of research data management interventions. As well, fewer studies examined research data management at the early phases of research projects. The quality of all research outputs needs attention, from the application of best practices in research data management studies, to data producers depositing data in repositories for long-term use.

Introduction

Increased connectivity has accelerated progress in global research and estimates indicate scientific output is doubling approximately every ten years [1]. A rise in research activity results in an increase in research data output. However, data generated from research that is not prepared and stored for long-term access is at risk of being lost forever. Vines and colleagues report that the availability of data related to studies declines rapidly with the age of a study and determined that the odds of a data set being reported as available decreased by 17% per year after publication [2]. At the same time, research funding agencies and scholarly journals are progressively moving towards directives that require data management plans and demand data sharing [3–6]. The current research ecosystem is complex and highlights the need for focused attention on the stewardship of research data [1, 7].

Academic institutions are multifaceted organizations that exist within the research ecosystem. Researchers practicing within universities and higher education institutions must comply with funding agency requirements when they are the recipients of research grants. For some disciplines, such as genomics and astronomy, preserving and sharing data is the norm [8–9], yet best practices stipulate that research be reproducible and transparent, which indicates that effective data management is pertinent to all disciplines.

Interest in research data management in the global community is on the rise. Recent activity has included the Bill & Melinda Gates Foundation moving their open access/open data policy, considered to be exceptionally strong, into force at the beginning of 2017 [10]. Researchers working towards a solution to the Zika virus organized themselves to publish all epidemiological and clinical data as soon as it was gathered and analyzed [11]. Fecher and colleagues [12] conducted a systematic review focusing on data sharing to support the development of a conceptual framework; however, it lacked rigorous methods, such as the use of a comprehensive search strategy [13]. Another review on data sharing was conducted by Bull and colleagues [14] that examined stakeholders’ perspectives on ethical best practices but focused specifically on low- and middle-income settings. In this scoping review, we aim to assess the research literature that examines research data management as it relates to academic institutions. It is a time of increasing activity in the area of research data management [15] and higher learning institutions need to be ready to address this change, as well as provide support for their faculty and researchers. Identifying the current state of the literature so there is a clear understanding of the evidence in the area will provide guidance in planning strategies for services and support, as well as outlining essential areas for future research endeavors in research data management. The purpose of this study is to describe the volume, topics, and methodological nature of the existing research literature on research data management in academic institutions.

We conducted a scoping review using guidance from Arksey and O’Malley [16] and the Joanna Briggs Manual for Scoping Reviews [17]. A scoping review protocol was prepared and revised based on input from the research team, which included methodologists and librarians specializing in data management. It is available upon request from the corresponding author. Although traditionally applied to systematic reviews, the PRISMA Statement was used for reporting [18].

Data sources and literature search

We searched 40 electronic literature databases from inception until April 3–4, 2016. Since research data management is relevant to all disciplines, we did not restrict our search to literature databases in the sciences. This was done in order to gain an understanding of the breadth of research available and provide context for the science research literature on the topic of research data management. The search was peer-reviewed by an experienced librarian (HM) using the Peer Review of Electronic Search Strategies checklist and modified as necessary [19]. The full literature search for MEDLINE is available in the S1 File. Additional database literature searches are available from the corresponding author. Searches were performed with no year or language restrictions. We also searched conference proceedings and gray literature. The gray literature discovery process involved identifying and searching the websites of relevant organizations (such as the Association of Research Libraries, the Joint Information Systems Committee, and the Digital Curation Centre). Finally, we scanned the references of included studies to identify other potentially relevant articles. The results were imported into Covidence (covidence.org) for the review team to screen the records.

Study selection

All study designs were considered, including qualitative and quantitative methods such as focus groups, interviews, cross-sectional studies, and randomized controlled trials. Eligible studies included academic institutions and reported on research data management involving areas such as infrastructure, services, and policy. We included studies from all disciplines within academic institutions with no restrictions on geographical location. Studies reporting results that accepted participants outside of academic institutions were included if 50% or more of the total sample represented respondents from academic institutions. For studies that examined entities other than human subjects, the study was included if the outcomes were pertinent to the broader research community, including academia. For example, if a sample of journal articles were retrieved to examine the data sharing statements but each study was not explicitly linked to a research sector, it was accepted into our review since the outcomes are significant to the entire research community and academia was not explicitly excluded. We excluded commentaries, editorials, or papers providing descriptions of processes that lacked a research component.

We define an academic institution as a higher education degree-granting organization dedicated to education and research. Research data management is defined as the storage, access, and preservation of data produced from a given investigation [20]. This includes issues such as creating data management plans, matters related to sharing data, delivery of services and tools, and infrastructure considerations, typically related to researchers, planners, librarians, and administrators.

A two-stage process was used to assess articles. Two investigators independently reviewed the retrieved titles and abstracts to identify those that met the inclusion criteria. The study selection process was pilot tested on a sample of records from the literature search. In the second stage, full-text articles of all records identified as relevant were retrieved and independently assessed by two investigators to determine if they met the inclusion criteria. Discrepancies were addressed by having a third reviewer resolve disagreements.

Data abstraction and analysis

After a training exercise, two investigators independently read each article and extracted relevant data in duplicate. Extracted data included study design, study details (such as purpose, methodology), participant characteristics, discipline, and data collection tools used to gather information for the study. In addition, articles were aligned with the research data lifecycle proposed by the United Kingdom Data Archive [21]. Although represented in a simple diagram, this framework incorporates a comprehensive set of activities (creating data, processing data, analyzing data, preserving data, giving access to data, re-using data) and actions associated with research data management, clarifying the longer lifespan that data has outside of the research project it was created within (see S2 File). Differences in abstraction were resolved by a third reviewer. Companion reports were identified by matching the authors, timeframe for the study, and intervention. Those that were identified were used for supplementary material only. Risk of bias of individual studies was not assessed because our aim was to examine the extent, range, and nature of research activity, as is consistent with the proposed scoping review methodology [16–17].

We summarized the results descriptively with the use of validated guidelines for narrative synthesis [22–25]. Following guidance from Rodgers and colleagues [22], data extraction tables were examined to determine the presence of dominant groups or clusters of characteristics by which the subsequent analysis could be organized. Two team members independently evaluated the abstracted data from the included articles in order to identify key characteristics and themes. Disagreement was resolved through discussion. Due to the heterogeneity of the data, articles and themes were summarized as frequencies and proportions.

Literature search

The literature search identified a total of 15,228 articles. After reviewing titles and abstracts, we retrieved 654 potentially relevant full-text articles. A total of 301 articles were identified for inclusion in the study along with 10 companion documents (Fig 1). The full list of citations for the included studies can be found in the S3 File. The five literature databases that identified the most included studies were MEDLINE (81 articles or 21.60%), Compendex (60 articles or 16.00%), INSPEC (55 articles or 14.67%), Library and Information Science Abstracts (52 articles or 13.87%), and BIOSIS Previews (47 articles or 12.53%). The full list of electronic databases is available in the S4 File, which also includes the number of included studies traced back to their original literature database.

[Fig 1: flow diagram of study selection (pone.0178261.g001.jpg)]

Characteristics of included articles

Most of the 301 articles were published from 2010 onwards (256 or 85.04%) with 15% published prior to that time (Table 1). Almost half (45.85%) identified North America (Canada, United States, or Mexico) as the region where studies were conducted; however, close to one fifth of articles (18.60%) did not report where the study was conducted. Most of the articles (78.51%) reported methods that included cross-sectional (129 or 35.54%), interviews (86 or 23.69%), or case studies (70 or 19.28%), with 42 articles (out of 301) describing two or more methods. Articles were almost even for reporting qualitative evidence (44.85%) and quantitative evidence (43.85%), with mixed methods representing a smaller proportion (11.29%). Reliance was put on authors in reporting characteristics of studies and no interpretations were made with regards to how attributes of the studies were reported. As a result, some information may appear to have overlap in the reporting of disciplines. For example, health science, medicine, and biomedicine are reported separately as disciplines/subject areas. Authors identified 35 distinct disciplines in the articles with just under ten percent (8.64%) not reporting a discipline and the largest group (105 or 34.88%) being multidisciplinary. The two disciplines reported most often were medicine and information science/library science (31 or 10.30% each). Studies were reported in 116 journals, 43 conference papers, 26 gray literature documents (e.g., reports), two book chapters, and one PhD dissertation. Almost one-third of the articles (99 or 32.89%) did not use a data collection tool (e.g., when a case study was reported) and a small number (22 or 7.31%) based their data collection tools on instruments previously reported in the literature. Most data collection tools were either developed by authors (97 or 32.23%) or no description was provided about their development (83 or 27.57%). No validated data collection tools were reported. We identified articles that offered no information on the sample size or participant characteristics [26–29], as well as those that reported on the number of participants that completed the study but failed to describe how many were recruited [30–31].

Table 1 notes: (a) Percentages may not total 100 because of rounding. (b) Geographic region refers to where data originated, e.g., if telephone interviews were conducted with participants in France, Mexico, and Chile, the region would be listed as Multi-continent. (c) Categories are not mutually exclusive, i.e., multiple study designs of two or more are reported in 42 articles. (d) No attempt was made to create groupings, e.g., to collapse Chemistry and Science into one group.

Research data lifecycle framework

Two hundred and seven (31.13%) articles aligned with the Giving Access to Data phase of the Research Data Lifecycle [20] (Table 2), which includes the components of distributing data, sharing data, controlling access, establishing copyright, and promoting data. The Preserving Data phase contained the next largest set of articles with 178 (26.77%). In contrast, Analysing Data and Processing Data were the two phases with the fewest articles, containing 28 (4.21%) and 49 (7.37%) respectively. Most articles (87 or 28.9%) were aligned with two phases of the Research Data Lifecycle, followed by an almost even match of 73 (24.25%) aligning with three phases and 70 (23.26%) with one phase. Twenty-nine (9.63%) were not aligned with any phase of the Research Data Lifecycle; these included articles such as those that described education and training for librarians, or identified skill sets needed to work in research data management.

Table 2 notes: Source: UK Data Archive, Research data lifecycle, available at http://data-archive.ac.uk/create-manage/life-cycle. (b) Articles can be listed in more than one phase of the Research Data Lifecycle.

Key characteristics of articles

Five dominant groupings were identified for the 301 articles (Table 3). Each of these dominant groups was further categorized into subgroupings of articles to provide more granularity. The top three study types and the top three discipline/subject areas are reported for each of the dominant groups. Half of the articles (151 or 50.17%) concentrated on stakeholders (Stakeholder Group), e.g., activities of researchers, publishers, participants/patients, funding agencies; 57 (18.94%) were data-focused (Data Group), e.g., investigating quality or integrity of data in repositories, development or refinement of metadata; 42 (13.95%) centered on library-related activities (Library Group), e.g., identifying skills or training for librarians working in data management; 27 (8.97%) described specific tools/applications/repositories (Tool/Device Group), e.g., introducing an electronic notebook into a laboratory; and 24 (7.97%) articles focused on the activities of publishing (Publication Group), e.g., examining data policies. The Stakeholder Group contained the largest subgroup of articles, which was labelled ‘Researcher’ (119 or 39.53%).

Table 3 note: (b) Articles can be listed in more than one grouping.

We identified 301 articles and 10 companion documents that focus on research data management in academic institutions published between 1995 and 2016. Tracing articles back to their original literature database indicates that 86% of the studies accepted into our review were from the applied science or basic science literature indicating high activity for research in this area among the sciences. The number of published articles has risen dramatically since 2010 with 85% of articles published post-2009, signaling the increased importance and interest in this area of research. However, the limited use of study designs, deficiency in standardized or validated data collection tools, and lack of transparency in reporting demonstrate the need for attention to rigor. As well, there are limited studies that examine the impact of research data management activities (e.g., the implementation of services, training, or tools).

Few of the study designs employed in the 301 articles collected empirical evidence on activities amongst data producers such as examining changes in behavior (e.g., movement from data withholding to data sharing) or identifying changes in endeavors (e.g., strategies to increase data quality in repositories). Close to 80% of the articles rely on self-reports (e.g., participating in interviews, filling out surveys) or accounts from an observer (e.g., describing events in a case study). Case studies made up almost one-fifth of the articles examined. This group of articles ranged from question-and-answer journalistic-style reports [32] to articles that offered structured descriptions of activities and processes [33]. Although study quality was not formally assessed, this range of offerings provided challenges with data abstraction, in particular with the journalistic-style accounts. If papers provided clear reporting that included declaring a purpose and describing well-defined outcomes, these articles could supply valuable input to knowledge syntheses such as a realist review [34–35] despite being ranked lower in the hierarchy of evidence [36]. One exception was Hruby and colleagues [37], who included a retrospective analysis in their case report that examined the impact of introducing a centralized research data repository for datasets within a urology department at Columbia University. This study offered readers a fuller understanding of the impact of a research data management intervention by providing evidence that detailed a change. Results described a reduction in the time required to complete studies, and an increase in publication quantity and quality (i.e., an increase in the average journal impact factor of papers published). There is opportunity for those wishing to conduct studies that provide empirical evidence for data producers and those interested in data reuse; however, for those wishing to conduct case studies, the development of reporting guidelines may be of benefit.

Using the Research Data Lifecycle framework provides the opportunity to understand where researchers are focusing their efforts in studying research data management. Most studies fell within the Giving Access to Data phase of the framework, which includes activities such as sharing data and controlling access to data, and the Preserving Data phase, which focuses on activities such as documenting and archiving data. This aligns with the global trend of funding agencies moving towards requirements for open access and open data [15], which includes activities such as creating metadata/documentation and sharing data in public repositories when possible. Fewer studies fell within phases that occurred at the beginning of the Research Data Lifecycle, which includes activities such as writing data management plans and the preparation of data for preservation. Research in these early phases that include planning and setting up processes for handling data as it is being created may provide insight into how these activities impact later phases of the Research Data Lifecycle, in particular with regards to data quality.

Data quality was examined in several of the Groups described in Table 3. Within the Data Group, ‘data quality and integrity’ comprised the biggest subgroup of articles. Two other subgroups in the Data Group, ‘classification systems’ and ‘repositories’, provided articles that touched on issues related to data quality as well. These issues included refining metadata and improving functionalities in repositories that enabled scholarly use and reuse of materials. Willoughby and colleagues illustrated some of the challenges related to data quality when reporting on researchers in chemistry, biology, and physics [38]. They found that when filling out metadata for a repository, researchers used a ‘minimum required’ approach. The biggest inhibitor to adding useful metadata was the ‘blank canvas’ effect, where the users may have been willing to add metadata but did not know how. The authors concluded that simply providing a mechanism to add metadata was not sufficient. Data quality, or the lack thereof, was also identified in the Publication Group, with ‘data availability, accessibility, and reuse’ and ‘data policies’ subgroups listing articles that tracked the completeness of deposited data sets, and offered assessments on the guidance offered by journals on their data sharing policies. Piwowar and Chapman analyzed whether data sharing frequency was associated with funder and publisher requirements [39]. They found that NIH (National Institutes of Health) funding had little impact on data sharing despite policies that required this. Data sharing was significantly associated with the impact factor of a journal (not a journal’s data sharing policy) and the experience of the first/last authors. Studies that investigate processes to improve the quality of data deposited in repositories, or strategies to increase compliance with journal or funder data sharing policies that support depositing high-quality and useable data, could potentially provide tangible guidance to investigators interested in effective data reuse.

We found a number of articles with important information not reported. This included the geographic region in which the study was conducted (56 or 18.6%) and the discipline or subject area being examined (26 or 8.64%). Data abstraction identified studies that provided no information on participant populations (such as sample size or characteristics of the participants) as well as studies that reported the number of participants who completed the study, but failed to report the number recruited. Lack of transparency and poor documentation of research is highlighted in the recent Lancet series on ‘research waste’ that calls attention to avoiding the misuse of valuable resources and the inadequate emphasis on the reproducibility of research [40]. Those conducting research in data management must recognize the importance of research integrity being reflected in all research outputs, which includes both publications and data.

We identified a sizable body of literature that describes research data management related to academic institutions, with the majority of studies conducted in the applied or basic sciences. Our results should promote further research in several areas. One area includes shifting the focus of studies towards collecting empirical evidence that demonstrates the impact of interventions related to research data management. Another area that requires further attention is researching activities that demonstrate concrete improvements to the quality and usefulness of data in repositories for reuse, as well as examining the facilitators and barriers for researchers to participate in this activity. In particular, there is a gap in research that examines activities in the early phases of research projects to determine the impact of interventions at this stage. Finally, researchers investigating research data management must follow best practices in research reporting and ensure the high quality of their own research outputs, which includes both publications and datasets.


Acknowledgments

We thank Mikaela Gray for retrieving articles, tracking papers back to their original literature databases, and assisting with references. We also thank Lily Yuxi Ren for retrieving conference proceedings and searching the gray literature. We acknowledge Matt Gertler for screening abstracts.

Funding Statement

The authors received no specific funding for this work.

Data Availability

The dataset is available from the Zenodo Repository, DOI: 10.5281/zenodo.557043.

Clinical Trials Data Management and Analysis

Medical research is a process of gathering data that will expand scientists' understanding of human health. Researchers take steps to ensure the accuracy of all their data, then use sophisticated methods to analyze their findings to ensure the accuracy of their conclusions.


Putting together a medical research study is no simple task. Scientists must create a detailed study design that lays out every aspect of the research. This careful process allows the research team to have a clear plan for gathering and using all the data for their research.

Clinical trial data management is one of the biggest priorities in research. Scientists must ensure that study participants understand what the team is trying to learn and what information they will need in the process. Researchers must accurately record data and analyze it carefully to ensure their findings are correct. This requires multiple data management techniques.

How Researchers Use Biostatistics to Analyze Data

Biostatistics is a branch of statistics that deals with data about living organisms. Statisticians analyze data related to humans, animals, and plants to identify patterns. These patterns can be used to answer important scientific questions. Biostatistics can also be used to predict health risks in certain groups and give doctors an idea of how different health conditions progress under different circumstances.

In research, biostatistics and clinical data analysis are essential parts of study design and analysis at all phases. Researchers can also use biostatistics when designing a new study. Data from past studies can be used for modeling, pre-study simulations, and feasibility analysis. Preclinical modeling can be used during clinical trials to adapt the study design if needed. Once the study is complete, biostatistical analysis is a key component in writing reports about the study findings.
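
As a rough illustration of the pre-study simulation idea (the sample size, effect size, and standard deviation below are invented for the example, not drawn from any real study), a Monte Carlo loop can estimate the statistical power of a planned two-arm trial:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=42)

# Assumed design parameters, e.g. informed by past studies
n_per_arm, true_effect, sd = 50, 5.0, 12.0

n_sims, significant = 2000, 0
for _ in range(n_sims):
    control = rng.normal(0.0, sd, n_per_arm)
    treated = rng.normal(true_effect, sd, n_per_arm)
    _, p = stats.ttest_ind(treated, control)  # two-sample t-test per simulated trial
    significant += p < 0.05

print(f"Estimated power with {n_per_arm} participants per arm: {significant / n_sims:.2f}")
```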

How Researchers Protect Data Integrity During Clinical Trials

Good science requires good data and good clinical data management. Researchers need to ensure that the information they collect is correct and obtained appropriately. One key step in ensuring data integrity is getting informed consent from study participants. Participants must knowingly agree to all research activities, with sufficient understanding of the potential consequences. Data acquired without appropriate informed consent casts doubt on the study's overall validity.

Once researchers have proper informed consent, all data gathered in the study must be valid, complete, and well-documented. The goal of data gathering and storage is to ensure that the collected information can be used to reconstruct the study and confirm the findings.

Data protection is another aspect of data management. Researchers need to protect participant privacy. They need to have adequate security tools to make sure all data is protected and participants' identities are concealed.

Quality assurance is the process of checking data and data collection methods throughout the trial. This can be done by the research team or quality management experts at the facility where the research is taking place. Government regulators may also conduct audits to ensure all the methods and findings are appropriate and follow best practices.

How Researchers Use Real-World Evidence in Clinical Studies

Real-world evidence is data collected outside the parameters of a study. It can offer valuable insight into how certain behaviors, health conditions, or treatments affect people daily without the rules or oversight imposed by a clinical trial. Real-world data can also offer insights into how particular conditions or treatments affect people over different periods.

Researchers can collect real-world data from various sources, including electronic health records (EHRs), registries (e.g., cancer registries or veterans health condition registries), insurance data, patient-reported data like surveys, or data from mobile health applications and wearable devices.

Real-world evidence can help with designing new clinical trials. The data can be used in biostatistical models of the trial's potential outcomes. In some cases, real-world data can replace the placebo arm of a study. Researchers can administer the treatment to all participants and compare study outcomes to real-world data instead of monitoring a control group of participants who receive the placebo instead of treatment.

Continuing Data Collection After Clinical Trials

Data gathering and clinical data analysis continue after a successful study is completed and a new treatment has been approved. Clinical trials are one way of determining a treatment's effectiveness, but they cannot predict all the possible outcomes when people start using the treatment outside the study.

Both researchers and regulatory agencies collect information about how the treatment affects people. This allows researchers to understand the long-term effects of new treatments. It also ensures that regulators identify any unexpected safety issues and take steps to protect people from harm.

Post-market data gathering and management may involve patient reporting systems where people can send information about the adverse effects of treatments to the FDA. Researchers may arrange post-study follow-ups with participants to gather more information about their outcomes over the long term. Researchers may also use real-world data from insurance companies or healthcare systems to better understand how the treatment affects people. This post-market data is another set of biostatistics that can be used to assess current and future research.

Learn More About Clinical Trials

Clinical trials and the medical advances they provide wouldn't be possible without the help of people who participate in clinical research. If you want to learn how you can be a part of medical research studies, connect with Studypages and sign up for our free Pulse Newsletter. Our platform helps you join real clinical studies and take part in the advancement of healthcare research.

Data Analysis in Research: Types & Methods

Data analysis is a crucial step in the research process, transforming raw data into meaningful insights that drive informed decisions and advance knowledge. This article explores the various types and methods of data analysis in research, providing a comprehensive guide for researchers across disciplines.


Overview of Data Analysis in Research

Data analysis in research is the systematic use of statistical and analytical tools to describe, summarize, and draw conclusions from datasets. This process involves organizing, analyzing, modeling, and transforming data to identify trends, establish connections, and inform decision-making. The main goals include describing data through visualization and statistics, making inferences about a broader population, predicting future events using historical data, and providing data-driven recommendations. The stages of data analysis involve collecting relevant data, preprocessing to clean and format it, conducting exploratory data analysis to identify patterns, building and testing models, interpreting results, and effectively reporting findings.

  • Main Goals : Describe data, make inferences, predict future events, and provide data-driven recommendations.
  • Stages of Data Analysis : Data collection, preprocessing, exploratory data analysis, model building and testing, interpretation, and reporting.

Types of Data Analysis

1. Descriptive Analysis

Descriptive analysis focuses on summarizing and describing the features of a dataset. It provides a snapshot of the data, highlighting central tendencies, dispersion, and overall patterns.

  • Central Tendency Measures : Mean, median, and mode are used to identify the central point of the dataset.
  • Dispersion Measures : Range, variance, and standard deviation help in understanding the spread of the data.
  • Frequency Distribution : This shows how often each value in a dataset occurs.
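
A minimal Python sketch of these measures, using the standard library and an invented data set:

```python
import statistics
from collections import Counter

data = [12, 15, 15, 18, 21, 22, 22, 22, 30]

print("mean:    ", statistics.mean(data))
print("median:  ", statistics.median(data))
print("mode:    ", statistics.mode(data))
print("range:   ", max(data) - min(data))
print("variance:", statistics.variance(data))  # sample variance
print("std dev: ", statistics.stdev(data))     # sample standard deviation

# Frequency distribution: how often each value occurs
print(Counter(data))
```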

2. Inferential Analysis

Inferential analysis allows researchers to make predictions or inferences about a population based on a sample of data. It is used to test hypotheses and determine the relationships between variables.

  • Hypothesis Testing : Techniques like t-tests, chi-square tests, and ANOVA are used to test assumptions about a population.
  • Regression Analysis : This method examines the relationship between dependent and independent variables.
  • Confidence Intervals : These provide a range of values within which the true population parameter is expected to lie.
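
For example, a two-sample t-test and a 95% confidence interval with SciPy (the two groups below are invented):

```python
import numpy as np
from scipy import stats

group_a = [23, 25, 28, 30, 31, 35]
group_b = [20, 22, 24, 25, 27, 28]

# Hypothesis test: do the two samples have different means?
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# 95% confidence interval for the mean of group_a
mean, sem = np.mean(group_a), stats.sem(group_a)
low, high = stats.t.interval(0.95, len(group_a) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```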

3. Exploratory Data Analysis (EDA)

EDA is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. It helps in discovering patterns, spotting anomalies, and checking assumptions with the help of graphical representations.

  • Visual Techniques : Histograms, box plots, scatter plots, and bar charts are commonly used in EDA.
  • Summary Statistics : Basic statistical measures are used to describe the dataset.
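
A small matplotlib/pandas sketch of these techniques, using invented values with one deliberate outlier:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "height_cm": [160, 165, 170, 172, 175, 180, 185, 210],  # 210 is an outlier
    "weight_kg": [55, 60, 65, 68, 72, 80, 90, 120],
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(df["height_cm"])                      # distribution shape
axes[1].boxplot(df["height_cm"])                   # spread and outliers
axes[2].scatter(df["height_cm"], df["weight_kg"])  # relationship between variables

print(df.describe())  # summary statistics
plt.show()
```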

4. Predictive Analysis

Predictive analysis uses statistical techniques and machine learning algorithms to predict future outcomes based on historical data.

  • Machine Learning Models : Algorithms like linear regression, decision trees, and neural networks are employed to make predictions.
  • Time Series Analysis : This method analyzes data points collected or recorded at specific time intervals to forecast future trends.
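
As a minimal illustration with scikit-learn (the spend and sales figures are invented), a regression model fit on historical data can forecast an unseen value:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Historical data: advertising spend (x) and resulting sales (y)
x = np.array([[10], [20], [30], [40], [50]])
y = np.array([25, 44, 68, 81, 105])

model = LinearRegression().fit(x, y)
forecast = model.predict([[60]])  # predict sales at a spend not yet observed
print(f"Predicted sales at spend 60: {forecast[0]:.1f}")
print(f"R^2 on the training data: {model.score(x, y):.3f}")
```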

5. Causal Analysis

Causal analysis aims to identify cause-and-effect relationships between variables. It helps in understanding the impact of one variable on another.

  • Experiments : Controlled experiments are designed to test the causality.
  • Quasi-Experimental Designs : These are used when controlled experiments are not feasible.

6. Mechanistic Analysis

Mechanistic analysis seeks to understand the underlying mechanisms or processes that drive observed phenomena. It is common in fields like biology and engineering.

Methods of Data Analysis

1. Quantitative Methods

Quantitative methods involve numerical data and statistical analysis to uncover patterns, relationships, and trends.

  • Statistical Analysis : Includes various statistical tests and measures.
  • Mathematical Modeling : Uses mathematical equations to represent relationships among variables.
  • Simulation : Computer-based models simulate real-world processes to predict outcomes.

2. Qualitative Methods

Qualitative methods focus on non-numerical data, such as text, images, and audio, to understand concepts, opinions, or experiences.

  • Content Analysis : Systematic coding and categorizing of textual information.
  • Thematic Analysis : Identifying themes and patterns within qualitative data.
  • Narrative Analysis : Examining the stories or accounts shared by participants.
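
A toy sketch of content analysis (the responses and the keyword-to-theme codebook are invented, and real coding schemes are far richer) that counts how often each coded theme appears:

```python
from collections import Counter

# Interview excerpts and a simple coding scheme mapping keywords to themes
responses = [
    "I worry about sharing my data before publication",
    "Storage costs are a barrier for our lab",
    "Sharing data helped us find new collaborators",
]
codebook = {"sharing": "data sharing", "storage": "infrastructure",
            "worry": "concerns", "barrier": "concerns"}

theme_counts = Counter(theme for text in responses
                       for word, theme in codebook.items() if word in text.lower())
print(theme_counts)  # frequency of each coded theme across responses
```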

3. Mixed Methods

Mixed methods combine both quantitative and qualitative approaches to provide a more comprehensive analysis.

  • Sequential Explanatory Design : Quantitative data is collected and analyzed first, followed by qualitative data to explain the quantitative results.
  • Concurrent Triangulation Design : Both qualitative and quantitative data are collected simultaneously but analyzed separately to compare results.

4. Data Mining

Data mining involves exploring large datasets to discover patterns and relationships.

  • Clustering : Grouping data points with similar characteristics.
  • Association Rule Learning : Identifying interesting relations between variables in large databases.
  • Classification : Assigning items to predefined categories based on their attributes.
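
For example, a k-means clustering sketch with scikit-learn, grouping invented customer records by annual spend and number of visits:

```python
import numpy as np
from sklearn.cluster import KMeans

# Customer records: [annual spend, visits per year]
customers = np.array([[120, 2], [130, 3], [900, 30],
                      [950, 28], [500, 15], [480, 14]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print("cluster labels: ", kmeans.labels_)
print("cluster centers:\n", kmeans.cluster_centers_)
```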

5. Big Data Analytics

Big data analytics involves analyzing vast amounts of data to uncover hidden patterns, correlations, and other insights.

  • Hadoop and Spark : Frameworks for processing and analyzing large datasets.
  • NoSQL Databases : Designed to handle unstructured data.
  • Machine Learning Algorithms : Used to analyze and predict complex patterns in big data.
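
A minimal PySpark sketch, assuming access to a Spark installation and using an illustrative file path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("big-data-example").getOrCreate()

# Read a large CSV (the path is a placeholder) and compute a grouped count;
# Spark distributes the work across the cluster's executors.
df = spark.read.csv("hdfs:///data/events.csv", header=True, inferSchema=True)
df.groupBy("event_type").count().orderBy("count", ascending=False).show(10)

spark.stop()
```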

Applications and Case Studies

Numerous fields and industries use data analysis methods, which provide insightful information and facilitate data-driven decision-making. The following case studies demonstrate the effectiveness of data analysis in research:

Medical Care:

  • Predicting Patient Readmissions: By using data analysis to create predictive models, healthcare facilities may better identify patients who are at high risk of readmission and implement focused interventions to enhance patient care.
  • Disease Outbreak Analysis: Researchers can monitor and forecast disease outbreaks by examining both historical and current data. This information aids public health authorities in putting preventative and control measures in place.

Finance:

  • Fraud Detection: To safeguard clients and lessen financial losses, financial institutions use data analysis tools to identify fraudulent transactions and activities.
  • Investing Strategies: By using data analysis, quantitative investing models that detect trends in stock prices may be created, assisting investors in optimizing their portfolios and making well-informed choices.

Marketing:

  • Customer Segmentation: Businesses may divide up their client base into discrete groups using data analysis, which makes it possible to launch focused marketing efforts and provide individualized services.
  • Social Media Analytics: By tracking brand sentiment, identifying influencers, and understanding consumer preferences, marketers may develop more successful marketing strategies by analyzing social media data.

Education:

  • Predicting Student Performance: By using data analysis tools, educators may identify at-risk children and forecast their performance. This allows them to give individualized learning plans and timely interventions.
  • Education Policy Analysis: Data may be used by researchers to assess the efficacy of policies, initiatives, and programs in education, offering insights for evidence-based decision-making.

Social Science Fields:

  • Opinion Mining in Politics: By examining public opinion data from news stories and social media platforms, academics and policymakers may get insight into prevailing political opinions and better understand how the public feels about certain topics or candidates.
  • Crime Analysis: Researchers may spot trends, anticipate high-risk locations, and help law enforcement use resources wisely in order to deter and lessen crime by studying crime data.

Data analysis is a crucial step in the research process because it enables companies and researchers to glean insightful information from data. By using diverse analytical methodologies and approaches, scholars may reveal latent patterns, arrive at well-informed conclusions, and tackle intricate research inquiries. Numerous statistical, machine learning, and visualization approaches are among the many data analysis tools available, offering a comprehensive toolbox for addressing a broad variety of research problems.

Data Analysis in Research FAQs:

What are the main phases in the process of analyzing data?

In general, the steps involved in data analysis include gathering data, preparing it, doing exploratory data analysis, constructing and testing models, interpreting the results, and reporting the results. Every stage is essential to guaranteeing the analysis’s efficacy and correctness.

What are the differences between the examination of qualitative and quantitative data?

In order to comprehend and analyze non-numerical data, such as text, pictures, or observations, qualitative data analysis often employs content analysis, grounded theory, or ethnography. Comparatively, quantitative data analysis works with numerical data and makes use of statistical methods to describe, draw inferences about, and forecast trends in the data.

What are a few popular statistical methods for analyzing data?

In data analysis, predictive modeling, inferential statistics, and descriptive statistics are often used. While inferential statistics establish assumptions and draw inferences about a wider population, descriptive statistics highlight the fundamental characteristics of the data. To predict unknown values or future events, predictive modeling is used.

In what ways might data analysis methods be used in the healthcare industry?

In the healthcare industry, data analysis may be used to optimize treatment regimens, monitor disease outbreaks, forecast patient readmissions, and enhance patient care. It is also essential for medication development, clinical research, and the creation of healthcare policies.

What difficulties may one encounter while analyzing data?

Typical problems with data quality include missing values, outliers, and biased samples, all of which may affect how accurate the analysis is. Furthermore, it might be computationally demanding to analyze big and complicated datasets, necessitating certain tools and knowledge. It’s also critical to handle ethical issues, such as data security and privacy.



Key points

  • Adverse childhood experiences can have long-term impacts on health, opportunity and well-being.
  • Adverse childhood experiences are common and some groups experience them more than others.


What are adverse childhood experiences?

Adverse childhood experiences, or ACEs, are potentially traumatic events that occur in childhood (0-17 years). Examples include: 1

  • Experiencing violence, abuse, or neglect.
  • Witnessing violence in the home or community.
  • Having a family member attempt or die by suicide.

Also included are aspects of the child’s environment that can undermine their sense of safety, stability, and bonding. Examples can include growing up in a household with: 1

  • Substance use problems.
  • Mental health problems.
  • Instability due to parental separation.
  • Instability due to household members being in jail or prison.

The examples above are not a complete list of adverse experiences. Many other traumatic experiences could impact health and well-being. This can include not having enough food to eat, experiencing homelessness or unstable housing, or experiencing discrimination. 2 3 4 5 6

Quick facts and stats

ACEs are common. About 64% of adults in the United States reported they had experienced at least one type of ACE before age 18. Nearly one in six (17.3%) adults reported they had experienced four or more types of ACEs. 7

Preventing ACEs could potentially reduce many health conditions. Estimates show up to 1.9 million heart disease cases and 21 million depression cases potentially could have been avoided by preventing ACEs. 1

Some people are at greater risk of experiencing one or more ACEs than others. While all children are at risk of ACEs, numerous studies show inequities in such experiences. These inequalities are linked to the historical, social, and economic environments in which some families live. 5 6 ACEs were highest among females, non-Hispanic American Indian or Alaska Native adults, and adults who are unemployed or unable to work. 7

ACEs are costly. ACEs-related health consequences cost an estimated economic burden of $748 billion annually in Bermuda, Canada, and the United States. 8

ACEs can have lasting effects on health and well-being in childhood and life opportunities well into adulthood. 9 Life opportunities include things like education and job potential. These experiences can increase the risks of injury, sexually transmitted infections, and involvement in sex trafficking. They can also increase risks for maternal and child health problems including teen pregnancy, pregnancy complications, and fetal death. Also included are a range of chronic diseases and leading causes of death, such as cancer, diabetes, heart disease, and suicide. 1 10 11 12 13 14 15 16 17

ACEs and associated social determinants of health, such as living in under-resourced or racially segregated neighborhoods, can cause toxic stress. Toxic stress (extended or prolonged stress) from ACEs can negatively affect children’s brain development, immune systems, and stress-response systems. These changes can affect children’s attention, decision-making, and learning. [18]

Children growing up with toxic stress may have difficulty forming healthy and stable relationships. They may also have unstable work histories as adults and struggle with finances, jobs, and depression throughout life. [18] These effects can also be passed on to their own children. [19,20,21] Some children may face further exposure to toxic stress from historical and ongoing traumas, such as experiences of racial discrimination or the impacts of poverty resulting from limited educational and economic opportunities. [1,6]

Adverse childhood experiences can be prevented. Certain factors may increase or decrease the risk that a child experiences them.

Preventing adverse childhood experiences requires understanding and addressing the factors that put people at risk for or protect them from violence.

Creating safe, stable, nurturing relationships and environments for all children can prevent ACEs and help all children reach their full potential. We all have a role to play.

  • Merrick MT, Ford DC, Ports KA, et al. Vital Signs: Estimated Proportion of Adult Health Problems Attributable to Adverse Childhood Experiences and Implications for Prevention — 25 States, 2015–2017. MMWR Morb Mortal Wkly Rep. 2019;68:999-1005. DOI: http://dx.doi.org/10.15585/mmwr.mm6844e1.
  • Cain KS, Meyer SC, Cummer E, Patel KK, Casacchia NJ, Montez K, Palakshappa D, Brown CL. Association of Food Insecurity with Mental Health Outcomes in Parents and Children. Academic Pediatrics. 2022;22(7):1105-1114. DOI: https://doi.org/10.1016/j.acap.2022.04.010.
  • Smith-Grant J, Kilmer G, Brener N, Robin L, Underwood M. Risk Behaviors and Experiences Among Youth Experiencing Homelessness — Youth Risk Behavior Survey, 23 U.S. States and 11 Local School Districts. Journal of Community Health. 2022;47:324-333.
  • Early Childhood Adversity, Toxic Stress, and the Impacts of Racism on the Foundations of Health. Annual Review of Public Health. https://doi.org/10.1146/annurev-publhealth-090419-101940.
  • Sedlak A, Mettenburg J, Basena M, et al. Fourth National Incidence Study of Child Abuse and Neglect (NIS-4): Report to Congress, Executive Summary. Washington, DC: U.S. Department of Health and Human Services, Administration for Children and Families; 2010.
  • Font S, Maguire-Jack K. Pathways from childhood abuse and other adversities to adult health risks: The role of adult socioeconomic conditions. Child Abuse Negl. 2016;51:390-399.
  • Swedo EA, Aslam MV, Dahlberg LL, et al. Prevalence of Adverse Childhood Experiences Among U.S. Adults — Behavioral Risk Factor Surveillance System, 2011–2020. MMWR Morb Mortal Wkly Rep. 2023;72:707-715. DOI: http://dx.doi.org/10.15585/mmwr.mm7226a2.
  • Bellis MA, et al. Life Course Health Consequences and Associated Annual Costs of Adverse Childhood Experiences Across Europe and North America: A Systematic Review and Meta-Analysis. Lancet Public Health. 2019.
  • Adverse Childhood Experiences During the COVID-19 Pandemic and Associations with Poor Mental Health and Suicidal Behaviors Among High School Students — Adolescent Behaviors and Experiences Survey, United States, January–June 2021. MMWR Morb Mortal Wkly Rep. 2022.
  • Hillis SD, Anda RF, Dube SR, Felitti VJ, Marchbanks PA, Marks JS. The association between adverse childhood experiences and adolescent pregnancy, long-term psychosocial consequences, and fetal death. Pediatrics. 2004;113(2):320-327.
  • Miller ES, Fleming O, Ekpe EE, Grobman WA, Heard-Garris N. Association Between Adverse Childhood Experiences and Adverse Pregnancy Outcomes. Obstetrics & Gynecology. 2021;138(5):770-776. https://doi.org/10.1097/AOG.0000000000004570.
  • Sulaiman S, Premji SS, Tavangar F, et al. Total Adverse Childhood Experiences and Preterm Birth: A Systematic Review. Matern Child Health J. 2021;25(10):1581-1594. https://doi.org/10.1007/s10995-021-03176-6.
  • Ciciolla L, Shreffler KM, Tiemeyer S. Maternal Childhood Adversity as a Risk for Perinatal Complications and NICU Hospitalization. Journal of Pediatric Psychology. 2021;46(7):801-813. https://doi.org/10.1093/jpepsy/jsab027.
  • Mersky JP, Lee CP. Adverse childhood experiences and poor birth outcomes in a diverse, low-income sample. BMC Pregnancy and Childbirth. 2019;19(1). https://doi.org/10.1186/s12884-019-2560-8.
  • Reid JA, Baglivio MT, Piquero AR, Greenwald MA, Epps N. No youth left behind to human trafficking: Exploring profiles of risk. American Journal of Orthopsychiatry. 2019;89(6):704.
  • Diamond-Welch B, Kosloski AE. Adverse childhood experiences and propensity to participate in the commercialized sex market. Child Abuse & Neglect. 2020;104:104468.
  • Shonkoff JP, Garner AS, Committee on Psychosocial Aspects of Child and Family Health, Committee on Early Childhood, Adoption, and Dependent Care, and Section on Developmental and Behavioral Pediatrics. The lifelong effects of early childhood adversity and toxic stress. Pediatrics. 2012;129(1):e232-e246. https://doi.org/10.1542/peds.2011-2663.
  • Narayan AJ, Kalstabakken AW, Labella MH, Nerenberg LS, Monn AR, Masten AS. Intergenerational continuity of adverse childhood experiences in homeless families: Unpacking exposure to maltreatment versus family dysfunction. American Journal of Orthopsychiatry. 2017;87(1):3. https://doi.org/10.1037/ort0000133.
  • Schofield TJ, Donnellan MB, Merrick MT, Ports KA, Klevens J, Leeb R. Intergenerational continuity in adverse childhood experiences and rural community environments. Am J Public Health. 2018;108(9):1148-1152. https://doi.org/10.2105/AJPH.2018.304598.
  • Schofield TJ, Lee RD, Merrick MT. Safe, stable, nurturing relationships as a moderator of intergenerational continuity of child maltreatment: A meta-analysis. J Adolesc Health. 2013;53(4 Suppl):S32-S38. https://doi.org/10.1016/j.jadohealth.2013.05.004.

