latent variable, missing variable, phantom variable, research utilization, structural equation model



  1. Midodzi, William K.
  2. Hayduk, Leslie
  3. Cummings, Greta G.
  4. Estabrooks, Carole A.
  5. Wallin, Lars


In secondary data analysis, it is not uncommon to find that a key variable was not measured. Often the researcher has no option but to proceed without the missing indicator, but when nearly parallel datasets exist, other options may be available. In an earlier article leading up to this special issue, this research team faced the problem that research utilization had been measured in only one of two similar datasets: in the 1996 but not the 1998 Alberta Registered Nurse survey. The 1998 dataset had a larger sample (6,526 nurse respondents, compared to 600 in 1996) and a stronger set of measured variables, but it lacked the key variable of interest, research utilization. To overcome this, a regression-based strategy was used to create a research utilization score for each nurse in the 1998 survey by exploiting the availability of several anticipated causes of research utilization in both datasets. Presented here is an alternative, more complicated procedure that might be applied in future investigations. The article explains how to use a phantom variable to account for the unmeasured research utilization variable in a two-group structural equation model. This approach could overcome several of the limitations of using a regression-based approach to create a key missing variable when nearly parallel datasets are available.