CC BY-NC-ND 4.0 · Journal of Social Health and Diabetes 2014; 02(01): 006-008
DOI: 10.4103/2321-0656.120254
Methodological Issues in Social Health and Diabetes Research
NovoNordisk Education Foundation

The first step in Data Analysis: Transcribing and managing qualitative research data

Heather L. Stuckey
Department of Medicine and Public Health Sciences, Pennsylvania State University College of Medicine, USA

Corresponding Author

Prof. Heather L. Stuckey
Department of Medicine and Public Health Sciences, Pennsylvania State University College of Medicine

Publication History

Publication Date:
20 November 2018 (online)



Researchers need to take data from the spoken text (structured, unstructured, or narrative interviews) to written form for analysis. Typically this is handled through deidentifying the participants and transcribing the data, and is considered the first step in analysis. The accuracy of the transcription plays a role in determining the accuracy of the data that are analyzed and with what degree of dependability. Analysis begins after reviewing the first interview to examine whether participants are responding to the research question related to your area of interest in diabetes, or whether your interview guide needs refining. As each interview is completed, the researcher examines its content to determine what has been learned and what still needs to be discovered or needs elaboration. Moving from raw interviews to evidence-based interpretations requires preparing transcripts so they will be ready to code. Before moving directly to analysis (or coding), it is important to recognize the task of handling the qualitative research data during and after the interview. This paper describes the process of transcription and handling the qualitative data related to diabetes research.



When I first started working with qualitative research, my professors told me to obtain Institutional Review Board approval before interviewing, and then transcribe and deidentify the participant data. It seemed straightforward, but there were bumps and curves involved in the learning process, and I realized that the quality of the transcription can impact the quality of the analysis. Recordings are transcribed into written form so they can be studied in detail and linked with analytic coding. How content is both heard and perceived by the transcriptionist and the form and accuracy of its transcription play a key role in determining what data are analyzed and with what degree of dependability.[1] The purpose of this introduction to qualitative research is to explain the importance of human protections review prior to recording the interviews, how to deidentify data in the interviews, and how to transcribe the documents as the first step in qualitative analysis.

Research Ethics Review

Before interviewing, the first step is to gain research ethics approval. The human protections review is developed to protect the privacy of participants and provide consent to perform research according to established steps in the protocol. If the researcher is conducting interviews or obtaining data through an interaction with patients, students, or any living individual, a human protections review is required. Each country has its own set of regulations regarding protection of human research subjects. The US Department of Health and Human Services maintains an international compilation of human research standards for different countries.[2] At the National Institutes of Health (NIH), individuals who are involved in the design or conduct of human subjects research must fulfill an education requirement.[3] The NIH or other funding agency typically does not endorse any specific educational programs. Instead, institutions are in the best position to determine what programs are appropriate for fulfilling the education requirement for human subjects review. Institutions may require a particular program or may choose to develop a program to meet the requirement, so researchers need to check with their institution. As a public service, the NIH Office of Extramural Research offers a free tutorial on protecting human research participants[4] that researchers may elect to use to meet the human subjects protections education requirement.


Transcription Process

After human subjects review is complete, the qualitative data collection may begin. The majority of researchers use a recording device to capture the words of the participants in interviews and observation. With a recording, the interviewer can concentrate on listening and responding to the participant, without being distracted by needing to write extensive notes. For more detailed tips on the interview process itself, a recent article in this journal described three kinds of interviews as a common source of data collection.[5] [Table 1] contains an outline of the transcription process, which is the first step in data analysis.

Table 1: The transcription process

| Deidentify participant’s data | Discuss with transcriptionist | Transmission of meaning to the text |
| --- | --- | --- |
| Participant given a deidentifying number | Purpose of the research study | Italicize or capitalize for inflection |
| Interview recorded | Types of words to deidentify | Capture meaning through use of pauses, laughter, and other indicators |
| | Management of fillers | Use of double space |

Deidentifying data

When conducting interviews and recording, a nonidentifying variable (typically a number) is given to the participant’s interview. For example, if you are holding an interview with a participant, quietly turn on the recorder (it is recommended that you practice before you interview for the first time) and state, “This is participant 2, and today is [date].” The point is that the researcher should not use the participant’s name. This participant number also will be used to identify the transcription, and any other documents that can be linked to the participant (surveys, documents, A1c values). A master list of the participant name and the number assigned to that participant should be kept at a location that is different from where the data are kept to avoid a breach in confidentiality. To ensure anonymity in the transcript, make sure that the participant’s name has been removed, as well as identifiable variables such as workplace, place of birth, profession, or any name used in the document.[1] If the information needs to remain identifiable for research purposes, such as the performance of one physician’s practice compared with another, follow the same procedure by keeping a master list of the practice’s name and the number assigned to that practice. Otherwise, replacing the identifying information with an XX or substituting the name with a role (such as “son” or “endocrinologist”) ensures anonymity of the respondents and their information.
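The substitution step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `deidentify`, the bracketed role labels, and all names in the example (Tom, Dr. Smith, Lakeside Clinic, Jane Doe) are hypothetical and not part of the original procedure. A term mapped to `None` is replaced with XX; any other term is replaced with its role label.

```python
import re

def deidentify(text, replacements, placeholder="XX"):
    """Replace identifying terms with a role label, or with a placeholder.

    `replacements` maps each identifying term to a role label, or to None
    when the term should simply be masked with the placeholder.
    """
    # Handle longer terms first so a full name is replaced before any
    # shorter term that happens to be contained in it.
    for term in sorted(replacements, key=len, reverse=True):
        label = replacements[term] or placeholder
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        # A function is used as the replacement so the label is inserted literally.
        text = pattern.sub(lambda match, label=label: label, text)
    return text

# Hypothetical master list: participant number -> name. As described above,
# it should be stored in a different location from the transcripts.
master_list = {2: "Jane Doe"}

# Hypothetical identifying terms found in participant 2's interview.
replacements = {
    "Tom": "[son]",
    "Dr. Smith": "[endocrinologist]",
    "Lakeside Clinic": None,  # no useful role, so mask with XX
}

raw = "Tom drove me to Lakeside Clinic, where Dr. Smith checked my A1c."
clean = deidentify(raw, replacements)
print(clean)
```

A simple term list like this only catches the words the researcher has already identified, so the deidentified transcript still needs to be read through by a person.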


Discussions with Transcriptionist

Funded studies often include a transcriptionist in the budget. Transcription is a time-consuming process, and the estimated ratio of transcription time to interview time is 4:1. For every hour of interview time, 4 hours of a transcriptionist’s time should be written into the budget. Verbatim transcription with cues of nonverbal behavior is necessary to establish reliability, dependability, and trustworthiness of the study.[6] If verbatim transcription is omitted to save time, bias can occur if the researcher reaches conclusions before the data are checked. Memory can be flawed and selective and is not a substitute for careful examination of the actual transcriptions. For this reason, it is preferable that the researcher produce full transcripts of the interviews. Together, the researcher and the transcriptionist will discuss the expectations in deidentification of data and the transmission of meaning to the text.
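The 4:1 estimate translates directly into a budget line. The sketch below shows the arithmetic only; the function names, the number and length of interviews, and the hourly rate are made-up values for illustration.

```python
def transcription_hours(interview_hours, ratio=4.0):
    """Estimated transcription time, using the 4:1 ratio described above."""
    if interview_hours < 0 or ratio <= 0:
        raise ValueError("interview_hours must be >= 0 and ratio must be > 0")
    return interview_hours * ratio

def transcription_budget(interview_hours, hourly_rate, ratio=4.0):
    """Estimated transcription cost for a given hourly rate."""
    return transcription_hours(interview_hours, ratio) * hourly_rate

# Hypothetical study: 12 interviews of 45 minutes each,
# with an assumed transcriptionist rate of 25 per hour.
interview_hours = 12 * 45 / 60  # 9 hours of recordings
hours = transcription_hours(interview_hours)
cost = transcription_budget(interview_hours, hourly_rate=25.0)
print(f"{hours:.0f} transcription hours, budget {cost:.2f}")
```

With these assumed numbers, 9 hours of recordings call for 36 hours of transcription time.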

Transmission of Meaning to the Text

Quality transcription is not just typing or using voice recognition software to transfer the data, because it is important to transmit the way that people speak. For example, if a person with diabetes says, “He told me to inject insulin,” the tone and inflection matter in the transcription. These words can mean different things:

  1. HE told me to inject insulin (but someone else told me something different)

  2. He TOLD ME to inject insulin (but I may or may not have done it)

  3. He told me to INJECT INSULIN (but I dislike the idea of needles or insulin)

When there is inflection, it is first the interviewer’s role to ask further questions to make sure the participant is clearly understood. For example, if a participant said, “He WANTED me to check my blood sugar,” an appropriate follow-up question would be, “But did you check your blood sugar?” and, after the response, “Tell me more about why you did [or didn’t].” This capitalization is not needed for every inflection, but the transcriptionist should be aware of how inflection impacts analysis so that he/she can manage the data to fulfill the purpose of answering the research question. If the researcher did not ask a follow-up question, then identification of pauses and inflection becomes even more critical to the analysis.

Another area to discuss is the omission or retention of fillers. Fillers are words such as “um” that the participant uses to fill space while he/she is thinking. Sandelowski[7] has pointed out that most analytic qualitative approaches, except narrative analysis, do not benefit from the transcription of these fillers. The informational content of the data has priority, and the transcription process needs to focus on the accuracy of the data content. Words such as “uh-huh” and “hmmm” by the researcher are often eliminated, because they are affirmations, rather than interpretation of the data.
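If the team decides to omit fillers, the cleanup can be partly automated. The following is a minimal sketch, assuming the transcript is held as a list of (speaker, text) turns; the filler and affirmation lists, the speaker labels, and the function names are illustrative assumptions rather than a standard. For narrative analysis, where fillers may matter, this step would simply be skipped.

```python
import re

FILLERS = ("um", "uh")                         # assumed list; adjust per study
AFFIRMATIONS = ("uh-huh", "hmmm", "mm-hmm")    # researcher affirmations to drop

def strip_fillers(utterance, fillers=FILLERS):
    """Remove standalone filler words and the commas that set them off."""
    alternation = "|".join(re.escape(f) for f in sorted(fillers, key=len, reverse=True))
    # The lookarounds keep "uh" from matching inside "uh-huh" or inside longer words.
    pattern = re.compile(
        r",?\s*(?<![\w-])(?:" + alternation + r")(?![\w-]),?\s*",
        re.IGNORECASE,
    )
    cleaned = pattern.sub(" ", utterance)
    return re.sub(r"\s{2,}", " ", cleaned).strip()

def drop_affirmations(turns, researcher="Interviewer", affirmations=AFFIRMATIONS):
    """Drop researcher turns that consist only of an affirmation such as "uh-huh"."""
    kept = []
    for speaker, text in turns:
        normalized = text.strip().strip(".,!?").lower()
        if speaker == researcher and normalized in affirmations:
            continue
        kept.append((speaker, text))
    return kept

# Hypothetical excerpt.
turns = [
    ("Participant 2", "Um, I check my, uh, blood sugar every morning."),
    ("Interviewer", "Uh-huh."),
    ("Participant 2", "But I skip it on weekends."),
]
cleaned_turns = [(speaker, strip_fillers(text)) for speaker, text in drop_affirmations(turns)]
for speaker, text in cleaned_turns:
    print(f"{speaker}: {text}")
```

The sketch does not restore capitalization when a sentence begins with a filler, so the output should still be proofread against the recording.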

The following is an example of a completed transcription segment, with addition of pauses and sounds, to demonstrate the interpretation of an exchange between a general practitioner (Doctor) and patient (Patient 10) with diabetes. The excerpt is taken from the end of a consultation, after the diagnosis of diabetes, with no medication at this time. Transcribing the verbal content alone (Option One) and transcribing it with pauses and sounds (Option Two) can produce two very different interpretations:


Option One

Doctor: I would suggest that you begin to reduce your blood sugar with exercise. It is important to get at least 30 minutes of exercise. OK?

Patient 10: I am glad that I do not have to go on medication.

Doctor: You understand the importance of exercise? We will work on diet next visit.

Patient 10: Fine. OK. Thank you very much.


Option Two

Doctor: I would suggest (…) you begin to reduce your blood sugar. With exercise, it is important to get at least 30 minutes every day (interruption by nurse with 5 minute pause).

Patient 10 apparently waits while discussion completes with nurse.

Doctor: OK?

Patient 10: I am glad I do not need to go on medication.

Doctor: You understand the importance of exercise? (Slaps hand on table, dramatic pause). We will work on diet next visit.

Patient 10: Fine (hesitation). OK. (deep inhale). Thank you very much (very softly).

Representing some nonverbal features of the interaction on the transcript changes the interpretation of this small segment of text. The transcriptionist is an important member of the research team, as this role is a critical part of the first stage of the analysis of qualitative data. It is important to hold close communication and discussion with the transcriptionist, as the transcriptionist’s contribution is the first step in the interpretation of data. In the next methodological issue in this journal, we will discuss the next phase of data analysis, which is the coding process.
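When nonverbal cues are written in parentheses, as in Option Two, they can later be separated from the spoken words so that both views of a turn are available during coding. The sketch below assumes that convention (a “Speaker: text” line with cues in parentheses); the function name `split_turn` and the returned structure are illustrative choices, not a standard transcription format.

```python
import re

# A parenthetical cue, together with any whitespace immediately before it.
ANNOTATION = re.compile(r"\s*\(([^()]*)\)")

def split_turn(line):
    """Split a transcript line into (speaker, verbal text, list of nonverbal cues)."""
    speaker, separator, speech = line.partition(":")
    if not separator:
        # No "Speaker:" prefix; treat the whole line as a descriptive note.
        return None, "", [line.strip()]
    cues = [cue.strip() for cue in ANNOTATION.findall(speech)]
    verbal = ANNOTATION.sub("", speech)
    # Removing a cue can leave doubled punctuation such as "OK.." or "exercise?."
    verbal = re.sub(r"([.?!])\.+", r"\1", verbal)
    verbal = re.sub(r"\s{2,}", " ", verbal).strip()
    return speaker.strip(), verbal, cues

line = "Patient 10: Fine (hesitation). OK. (deep inhale). Thank you very much (very softly)."
speaker, verbal, cues = split_turn(line)
print(speaker)
print(verbal)
print(cues)
```

Applied to the last turn of Option Two, the verbal text reduces to the wording of Option One, while the hesitation, the deep inhale, and the soft voice are kept in a separate list for the analyst.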

How to cite this article: Stuckey HL. The first step in Data Analysis: Transcribing and managing qualitative research data. J Soc Health Diabetes 2014;2:6-8.

Source of Support: Nil.


Conflict of Interest

None declared.
