1. Murphy, Kristen PT, MS, DPT
2. Lowe, Susan PT, DPT, MS, GCS, CEEAA


Abstract: The Timed Up and Go (TUG) is a popular, effective, and valid test of functional mobility and fall risk that is often completed by registered nurses (RNs) and physical therapists (PTs) throughout the course of a home care episode. As reimbursement becomes tied to outcomes, it is essential that all disciplines are consistent in their methods when administering the TUG. Results of this study confirm the hypothesis that test-specific training will significantly improve reliability of the TUG when completed by two different disciplines. The purpose of this article is to describe an initiative that provided tool-specific training to all clinical staff at our home care agency. The inter-rater reliability between PTs and RNs improved significantly from 0.77 to 0.86 (p = 0.001) after standardized training on administration of the TUG.




Fall Risk Assessment and Outcome and Assessment Information Set

In 1999, the Centers for Medicare and Medicaid Services (CMS) began requiring all Medicare-certified home healthcare agencies (HHAs) to collect and report performance data using a tool called the Outcome and Assessment Information Set (OASIS) (Centers for Medicare & Medicaid Services, 2012a). The data collected are meant to represent the core items needed for a comprehensive assessment of adult patients receiving home care services (Centers for Medicare & Medicaid Services, 2012a). The information can be used by the HHA and CMS to measure patient outcomes and to assist with outcome-based quality improvement. In addition to supporting patient assessment and outcome measurement, OASIS data can and should be used for care planning. In 2010, OASIS-C was rolled out as the latest version of the data set. OASIS-C added process measures related to fall risk assessment and prevention.


Process quality measures are used to assess the rate of completion of specific evidence-based processes of care for high-risk, high-volume, problem-prone areas. Fall risk is one of these areas (Centers for Medicare & Medicaid Services, 2012b). Although a fall risk assessment is not required, CMS tracks its completion and gives HHAs an incentive to complete one using a tool that is valid, standardized, and multifactorial. The completion of this fall risk assessment is documented in OASIS question M1910 (Table 1). Failure to complete a multifactor fall risk assessment will negatively impact the agency's process measure report. Public reporting of this finding is expected to encourage HHAs to follow best practices for fall prevention and assessment (Anamaet & Krulish, 2011). The Timed Up and Go (TUG) test is commonly used as a tool, along with assessment of at least one other nonmobility risk factor, to meet the criteria for question M1910 (Anamaet & Krulish, 2011).

Table 1. Outcome and Assessment Information Set Question M1910

Literature Review

Using the TUG to Determine Fall Risk

The TUG is a simple, effective, and common test of functional mobility. The TUG assesses a subject's ability to stand up, walk 3 m, turn, walk another 3 m, and then sit back down. It can provide a quick assessment of functional strength, ability to ambulate, and dynamic balance (Podsiadlo & Richardson, 1991). A great amount of clinical information can be learned from the TUG. Normative reference values are available by age to determine above and below average scores (Bohannon, 2006). Extended TUG times have been shown to be predictive of difficulty with activities of daily living (Wennie Huang et al., 2010). Herman et al. (2011) concluded that elevated TUG scores correlate with early onset of mild cognitive decline. Alexandre et al. (2012) found the TUG to be an accurate measure for screening the fall risk in older adults, using a cut-off score of 12.47 seconds. There are several different cut-offs discussed in the literature but the Alexandre study is the most recent.
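As a minimal sketch of how a screening cut-off is applied, the following Python function compares a TUG time with the 12.47-second threshold reported by Alexandre et al. (2012). The function name, the constant name, and the choice to treat a time exactly equal to the cut-off as not exceeding it are assumptions made for illustration and are not taken from the cited study.

```python
# Cut-off in seconds reported by Alexandre et al. (2012).
ALEXANDRE_CUTOFF_S = 12.47


def exceeds_tug_cutoff(tug_seconds, cutoff=ALEXANDRE_CUTOFF_S):
    """Return True when a TUG time is longer than the screening cut-off.

    Hypothetical helper: a time equal to the cut-off is treated as
    not exceeding it (an assumption, not a published rule).
    """
    if tug_seconds <= 0:
        raise ValueError("TUG time must be a positive number of seconds")
    return tug_seconds > cutoff


# Illustrative times only, not study data.
for seconds in (10.2, 12.47, 14.0):
    print(seconds, exceeds_tug_cutoff(seconds))
```

Because several cut-offs appear in the literature, the threshold is passed as a parameter rather than fixed in the function body.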


Inter-rater Reliability of the TUG

Reliability of the TUG is considered to be excellent with an intraclass correlation (ICC) greater than 0.95 (Ng & Hui-Chan, 2005). It has been shown to be reliable in a variety of clinical situations including in cases of Parkinson's disease with an ICC of 0.80 (Huang et al., 2011), dementia with an ICC of 0.90 (Blankevoort et al., 2013), and hip fracture after surgery with an ICC of 0.95 (Kristensen et al., 2011).


The TUG is both sensitive and specific with regard to identifying community-dwelling older adults (CDOAs) who are at risk of falling (Shumway-Cook et al., 2000). It has been shown to be sensitive to change and to have good reliability among clinicians of varying levels of experience (Shumway-Cook et al., 2000). Researchers have determined values for a minimally important clinical difference ranging from 3.5 seconds in patients with Parkinson's disease (Huang et al., 2011) to 6.2 seconds in patients who average an initial time of 20 seconds (Kristensen et al., 2011).


Predictive Value of the TUG

Although there are consistent data about the ability of the TUG to identify subjects with a past history of falls, there is less conclusive evidence about the ability of the TUG to predict future falls. Beauchet et al. (2011) identified only one study that found a significant association between TUG time and future falls during their systematic review. In their discussion, the variation of administration of the TUG among the various studies was identified as a possible confounding factor. Bergmann et al. (2009) showed that procedural differences, including verbal instructions, distance marker, and chair type, can negatively affect reliability of the TUG.


Importance of Good Reliability

Reliability of a clinical measure is one of the most important characteristics of a test. Reliability is the ability of a test score to be consistent when repeated. A measure is thought to be reliable if consistent scores are obtained under consistent conditions. When analyzing the effectiveness of therapy interventions, it is critical that the data used to determine these outcomes are reliable. Given the importance these data can have in determining functional outcomes, it is imperative that the testing be completed as described by best practice guidelines. The value of the score is only as good as the quality of the data.



Using TUG Scores to Determine Outcomes

Although the inter-rater reliability of the TUG is good between physical therapists (PTs), there does not appear to be any literature on the reliability of the TUG between different disciplines. The TUG test is often completed by nurses at the initial nursing visit as part of the requirement to complete a multifactor fall risk assessment. The degree to which the admitting nurse or PT is trained in the administration of the TUG varies between agencies. The variation in the way the TUG is completed can be related to a lack of consistent training. The TUG is a popular test among PTs when it is time for reassessment. There are frequent occurrences where a clinician would like to compare the reassessment TUG score to the score obtained at the admission visit for outcomes studies. For the purpose of this study, outcomes are defined as the progress made on the TUG as the result of therapy. If the initial TUG was completed by the nurse, can the therapist use that score as a baseline to determine progress?




One physical therapist (PT) and one registered nurse (RN) administered the TUG to 15 CDOAs without any specific instructions or training. The home care agency had not completed any specific training of the TUG for its clinical staff before this study. The CDOAs were volunteers recruited from an exercise class held at a local senior center. The volunteers were predominantly women (13 women and 2 men) with an average age of 68 years. The volunteers were divided into two groups. One group completed the TUG with the nurse first and then completed the TUG with the PT. The other group completed the TUG initially with the PT and then with the nurse. Each clinician had the CDOAs complete a practice run-through of the TUG and two timed trials of the TUG. The average of the two trials was used for statistical purposes. The physical setup, including measurement of the 3-m course, was established by the author and was consistent for each clinician. The data were not shared among clinicians. Inter-rater reliability was determined between the two disciplines.



One week later, the RN and PT participated in a specific training session for administration of the TUG. The training was completed in an agency-sponsored training event offered to the entire clinical staff (RNs, PTs, and occupational therapists) of the HHA. The training protocol was based on best practices as indicated in the CDC's "Tools to Implement the Otago Exercise Program" (National Center for Injury Prevention and Control, 2012) (Table 2). A post-training questionnaire was filled out by all clinicians who participated in the training session (Table 3). The questionnaire surveyed the clinicians' confidence in administration of the TUG and perception of the TUG before and after the training session.

Table 2. Standardized Timed Up and Go Protocol
Table 3. Post-training Questionnaire


Five days after the training, the PT and the RN administered the TUG to 15 different CDOAs using the same procedure. Again, inter-rater reliability was calculated between disciplines.



There were no significant differences found between the two groups of CDOAs that participated in the TUG trials in terms of age or gender. The inter-rater reliability of the TUG was calculated by using the Pearson correlation coefficient. The reliability of the TUG scores between the PT and the nurse prior to the training session was r = 0.77, indicating a level of consistency among the scores that is good but not as high as the inter-rater reliability normally found when performing the TUG. The interdisciplinary reliability of the TUG scores improved after standardized training (r = 0.86), indicating a level of consistency among scores that is considered good (Portney & Watkins, 1993). The change in inter-rater reliability was significant at a level of p = 0.001 (Figure 1).
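The calculation described above can be sketched in Python: each clinician's two timed trials are averaged per participant, and the Pearson correlation coefficient is then computed between the two clinicians' averaged scores. The function names and the trial times below are hypothetical and serve only to illustrate the arithmetic; they are not the study data.

```python
import math


def mean_of_trials(trials):
    """Average the timed trials (seconds) recorded for one participant."""
    return sum(trials) / len(trials)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: (trial 1, trial 2) in seconds for each participant.
pt_trials = [(9.8, 10.2), (12.1, 11.9), (8.4, 8.6), (14.0, 13.6)]
rn_trials = [(10.4, 10.0), (12.9, 12.5), (8.1, 8.5), (15.2, 14.4)]

pt_scores = [mean_of_trials(t) for t in pt_trials]
rn_scores = [mean_of_trials(t) for t in rn_trials]

r = pearson_r(pt_scores, rn_scores)
print(f"Inter-rater r = {r:.2f}")
```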

Figure 1. Inter-rater reliability.

Observational analysis of the testing procedures prior to training revealed several inconsistencies between the clinicians' administration of the test. Variations in verbal instructions included different instructions regarding walking speed (normal or fast pace) and the exact location to turn and return to the chair (walk past the line or walk to the line). There was a lack of consistency between clinicians, and between trials of the same clinician, with regard to the exact start and stop times recorded via the stopwatch. Several different start times were used, including when the CDOA started to move, when the CDOA actually lifted off the chair, and when the clinician said go. It was observed that during some trials no command of "go" was given and the timing started when the CDOA started to move. Stop times varied as well, though not to the same extent. After training, improved consistency of verbal instructions and timing criteria was observed.


Discussion of Findings

Results of this study confirm the hypothesis that test-specific training will significantly improve reliability of the TUG test when completed by two different disciplines. The key to improved reliability between disciplines appears to be specific instruction in best practice guidelines for proper physical setup, verbal instructions, and timing criteria.


A Model for Mandatory Training

In-service training was provided to approximately 120 clinicians at Visiting Nurses & Health Services of CT over the course of 1 day. The staff was divided into groups of four to five clinicians who were assigned a specific training time scheduled at 15-minute intervals. The RN and PT who participated in the research study attended this training event. Training consisted of instruction in proper physical setup, verbal instructions, and timing guidelines (Figure 2). Groups of clinicians timed one volunteer subject as they walked the TUG. Scores were compared between the clinicians. Clinicians were deemed competent in administration of the TUG when scores were within 0.5 seconds of each other.
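The competency criterion can be expressed as a simple check on a group's stopwatch times for the same walk. This sketch assumes that "within 0.5 seconds of each other" means the spread between the fastest and slowest recorded time is at most 0.5 seconds; that reading, the function name, and the sample times are illustrative assumptions rather than the agency's documented procedure.

```python
def within_competency_tolerance(times, tolerance=0.5):
    """Return True when all clinicians' times for one walk agree.

    Agreement is assumed to mean that the difference between the
    longest and shortest recorded time is no more than `tolerance`
    seconds (an interpretation, not the agency's stated rule).
    """
    if len(times) < 2:
        raise ValueError("need times from at least two clinicians")
    return max(times) - min(times) <= tolerance


# Hypothetical times (seconds) from a group of four clinicians
# timing the same volunteer on the same walk.
group_times = [10.1, 10.3, 10.4, 10.2]
print(within_competency_tolerance(group_times))
```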

Figure 2. A model for clinician training.

1. Consistent physical setup: As described in the original study, Podsiadlo and Richardson (1991) used an upright chair with a seat height of 47 cm. In a patient's home, it is critical to find an appropriate chair from which to start the test. Low couches, soft recliners, or chairs without armrests are not appropriate for use with the TUG. To improve consistency of the measurement of the 3-m walking course, 3-m lengths of string were provided to all clinicians for use in the home. Clinicians were instructed to use the string to mark the course and then remove the string for the actual test. It was assumed that measuring 3 m in the home was a great source of variability between clinicians. Anecdotally, during the training sessions, several clinicians attempted to confirm the accuracy of pacing out 3 m with their own steps, only to be surprised at how poorly this method approximated 3 m. Consistent physical setup is crucial to reliable data.


2. Standardized verbal instructions: Podsiadlo and Richardson (1991) published specific instructions to be used during the test (Table 2). This wording is considered to be the standard. Without specific and standardized instructions, variability among scores will increase. Patients are instructed to ambulate at a "normal pace." It is a common misconception that the patient should be told to ambulate as quickly as possible. The verbal instructions also include directions to walk "to the line." There is no need to go past the line. Reliability is greatly improved when patients are provided with the same instructions at all times.


3. Established timing criteria: Timing of the TUG test should be completed with a stopwatch or timer. A stopwatch feature can be found on almost all cell phones. Use of the second hand on a watch is not an acceptable method of timing. Instructions for start and stop times are very clear. Timing of the TUG test should start on the word "Go." It is important to capture the patient's ability to process the command and react to it. Patients with mild cognitive impairment may have delays in the processing of this command and this delay needs to be captured in the TUG score (Herman et al., 2011). By starting the timing on the word "Go," the TUG is assessing a patient's ability to rise from a chair in a timely manner. As a patient's strength improves and the sit-to-stand transfer becomes easier, the TUG score will reflect that improvement only if the start time is accurate and consistent. The timing should end when the patient has returned to sitting and the back of the patient contacts the back of the chair. By clearly defining the exact moment to start and stop the timing of the test, reliability of the TUG is greatly improved.



TUG Scores for Outcome Studies

Without specific training, the inter-rater reliability of the TUG in this pilot study was not adequate to determine progress at time of reassessment. If TUG scores obtained at admission by RNs are to be compared to TUG scores obtained by PTs at time of reassessment, HHAs should institute an education process for all staff to ensure consistency. The level of reliability can be adequate for accurate reassessment of progress if the recommendations identified are followed.


After training on the TUG, the interdisciplinary reliability improved to the level of 0.86. This is less than previously documented inter-rater reliability between PTs. Kristensen et al. (2011) reported an ICC of 0.95 in patients with hip fracture. The inter-rater reliability was found to be 0.91 in a study of the effect of cognitive deficits on the TUG (Nordin et al., 2006). The difference in the reliability between PTs and RNs may be related to the experience and training PTs have in movement observation and analysis. Additional methods of training may be beneficial, including training with actual agency patients in their own home. Future studies may be indicated to further determine how interdisciplinary reliability of functional screening tests can be improved.


Clinician Confidence Pre- and Post-training

In an effort to assess the perceptions and confidence level of the clinicians related to their ability to complete the TUG, 103 clinicians completed a questionnaire after the training session (Table 3). Only 9% of RNs reported they were "very confident" in their ability to administer the TUG before the training session. In contrast, 100% of the PTs reported they were "very confident" in their administration of the TUG. After the training session, 83% of RNs surveyed responded as being "very confident" in their ability to properly complete the TUG (Figure 3). Of these 83%, 62% stated they would now be more likely to complete the TUG at time points other than at the start of care as needed. Although the TUG is a required part of the initial start of care visit for this agency, the TUG can be used intermittently as determined by the clinician to assess progress and fall risk. One hundred percent of clinicians surveyed reported a greater appreciation for the TUG and its clinical value after the in-service training.

Figure 3. Clinician confidence.


This is a small pilot study that compared inter-rater reliability between one PT and one RN. Although the entire clinical staff of the agency (RNs, PTs, occupational therapists) completed the training, the interdisciplinary reliability was calculated with one PT and one RN on a small number of CDOAs. Future studies should include a larger number of clinicians and subjects. Another limitation was the use of healthy CDOAs as compared to actual patients. Similarly, the TUG test was completed in a controlled environment in the community (a senior center) as compared to in the home. Finally, this study addresses these concerns in one agency within one state and the results may not carry over across the country.


Discussion and Implications

The findings of the study indicate that home healthcare clinicians should complete a TUG training program and pass a competency examination based on best practice standards of administration of the TUG at time of initial hiring and annually after that as a part of annual competencies. Consistent verbal instructions, consistent timing guidelines, and consistent physical setup are required to maximize reliability and the value of any outcomes that use the TUG data. With adequate training, initial TUG scores can be used for comparison at time of reassessment regardless of which discipline completed the test at start of care.


Need for Ongoing Research

Given the collaborative nature of healthcare, it is imperative that adequate reliability is present in all assessments that are completed by multiple disciplines. Although the interdisciplinary reliability of the TUG improved significantly with this training program, there is still room for improvement. Additional ways to improve consistency across disciplines include training with actual patients, training in the home environment, and more frequent training. Future studies may want to investigate not only the TUG but also other valuable clinical tools that are commonly used across disciplines and in the home.




Alexandre T. S., Meira D. M., Rico N. C., Mizuta S. K. (2012). Accuracy of Timed Up and Go test for screening risk of falls among community-dwelling elderly. Revista Brasileira de Fisioterapia (Brazil), 16(5), 381-388.

Anamaet W. K., Krulish L. H. (2011). Fall risk assessments in home care: OASIS-C expectations. Home Health Care Management & Practice, 23(2), 125-138.

Beauchet O., Fantino B., Allali G., Muir S. W., Montero-Odasso M., Annweiler C. (2011). Timed Up and Go test and risk of falls in older adults: A systematic review. Journal of Nutrition, Health, & Aging, 15(10), 933-938.

Bergmann J. H., Alexiou C., Smith I. C. (2009). Procedural differences directly affect Timed Up and Go times. Journal of the American Geriatrics Society, 57(11), 2168-2169.

Blankevoort C. G., Van Heuvelen M. J., Scherder E. J. (2013). Reliability of six physical performance tests in older people with dementia. Physical Therapy, 93(1), 69-78.

Bohannon R. W. (2006). Reference values for the Timed Up and Go test: A descriptive meta-analysis. Journal of Geriatric Physical Therapy, 29(2), 64-68.

Herman T., Giladi N., Hausdorff J. M. (2011). Properties of the "Timed Up and Go" test: More than meets the eye. Gerontology, 57(3), 203-210.

Centers for Medicare & Medicaid Services. (2012a). Home health quality initiative. Retrieved from

Centers for Medicare & Medicaid Services. (2012b). Home health quality initiative/quality measures. Retrieved from

Huang S. L., Hsieh C. L., Wu R. M., Tai C. H., Lin C. H., Lu W. S. (2011). Minimal detectable change of the Timed "Up & Go" test and the dynamic gait index in people with Parkinson disease. Physical Therapy, 91(1), 114-121.

Kristensen M. T., Henriksen S., Site S. B., Bandholm T. (2011). Relative and absolute intertester reliability of the Timed Up and Go test to quantify functional mobility in patients with hip fracture. Journal of the American Geriatrics Society, 59(3), 565-567.

National Center for Injury Prevention and Control. (2012). Tools to implement the Otago exercise program: A program to reduce falls. Atlanta, GA: Centers for Disease Control and Prevention.

Nordin E., Rosendahl E., Lundin-Olsson L. (2006). Timed "Up & Go" test: Reliability in older people dependent in activities of daily living-focus on cognitive state. Physical Therapy, 86(5), 646-655.

Ng S. S., Hui-Chan C. W. (2005). The Timed Up & Go test: Its reliability and association with lower-limb impairments and locomotor capacities in people with chronic stroke. Archives of Physical Medicine and Rehabilitation, 86(8), 1641-1647.

Podsiadlo D., Richardson S. (1991). The Timed "Up & Go": A test of basic functional mobility for frail elderly persons. Journal of the American Geriatrics Society, 39(2), 142-148.

Portney L. G., Watkins M. P. (1993). Foundations of Clinical Research: Applications to Practice. Norwalk, CT: Prentice Hall.

Shumway-Cook A., Brauer S., Woollacott M. (2000). Predicting the probability for falls in community-dwelling older adults using the Timed Up & Go test. Physical Therapy, 80(9), 896-903.

Wennie Huang W. N., Perera S., VanSwearingen J., Studenski S. (2010). Performance measures predict onset of activity of daily living difficulty in community-dwelling older adults. Journal of the American Geriatrics Society, 58(5), 844-852.