HEALTHCARE COST AND UTILIZATION PROJECT HCUP
A FEDERAL-STATE-INDUSTRY PARTNERSHIP IN HEALTH DATA
Sponsored by the Agency for Healthcare Research and Quality
THE HCUP NATIONWIDE INPATIENT SAMPLE (NIS),
These pages provide only an introduction to the NIS package.
For full documentation and notification of changes,
visit the HCUP User Support (HCUP-US) Website at http://www.hcup-us.ahrq.gov.
Issued May 2008
Updated November 2015
Agency for Healthcare Research and Quality
Healthcare Cost and Utilization Project (HCUP)
NIS Data and Documentation Distributed by:
HCUP Central Distributor
Phone: (866) 556-4287 (toll-free)
Fax: (866) 792-5313
Table of Contents
- SUMMARY OF DATA USE LIMITATIONS
- HCUP CONTACT INFORMATION
- WHAT’S NEW IN THE 2006 NATIONWIDE INPATIENT SAMPLE (NIS)?
- UNDERSTANDING THE NIS
- INTRODUCTION TO THE HCUP NATIONWIDE INPATIENT SAMPLE (NIS)
- Overview Of NIS Data
- NIS Data Sources, Hospitals, and Inpatient Stays
- State-Specific Restrictions
- Contents of CD-ROM Set
- NIS Data Elements
- Getting Started
- HOW TO USE THE NIS FOR DATA ANALYSIS
- Calculating National Estimates
- Why the NIS Should not be Used to Make State-Level Estimates
- Studying Trends
- Choosing Data Elements for Analysis
- ICD-9-CM Diagnosis and Procedure Codes
- Missing Values
- Variance Calculations
- Computer Software for Variance Calculations
- Longitudinal Analyses
- Discharge Subsamples
- SAMPLING OF HOSPITALS
- Sampling of Hospitals Included in the NIS
- Hospital Sampling Frame
- Hospital Sample Design
- Design Considerations
- Overview of the Sampling Procedure
- Change to Hospital Sampling Procedure Beginning with the 1998 NIS
- Zero-Weight Hospitals
- Final Hospital Sample
- SAMPLE WEIGHTS
- APPENDIX I: TABLES AND FIGURES
- Table 1: 2006 Data Sources
- Table 2: Number of NIS States, Hospitals, and Discharges, by Year
- Table 3: Summary of NIS Releases
- Table 4: Summary of NIS Data Sources, Hospitals, and Inpatient Stays, 1988-2006
- Table 5: NIS Related Reports and Database Documentation Available on HCUP-US
- Figure 1: Hospital Universe, by Year
- Figure 2: NIS States, by Region
- Table 6: All States, by Region
- Table 7: Bed Size Categories, by Region
- Figure 3: NIS Hospital Sampling Frame, by Year
- Figure 4: Number of Hospitals in the 2006 Universe, Frame, and Sample for Frame States
- Table 8: Number of Hospitals and Discharges in 2006 AHA Universe, Frame, and NIS, by State
- Figure 5: Number of Hospitals Sampled, by Year
- Figure 6: Number of NIS Discharges, Unweighted, by Year
- Figure 7: Number of NIS Discharges, Weighted, by Year
- Figure 8: Number of Hospitals in the 2006 Universe, Frame, Sample, Target, and Surplus, by Region
- Figure 9: Percentage of U.S. Population in 2006 NIS States, by Region
- Figure 10: Number of Discharges in the 2006 NIS, by State
- APPENDIX II: STATE-SPECIFIC RESTRICTIONS
- APPENDIX III: DATA ELEMENTS
HCUP NATIONWIDE INPATIENT SAMPLE (NIS)
***** REMINDER *****
All users of the NIS must take the on–line HCUP Data Use Agreement (DUA) training course, and read and sign a Data Use Agreement.†
Authorized users of HCUP data agree to the following restrictions: ‡
Any violation of the limitations in the Data Use Agreement is punishable under Federal law by a fine of up to $10,000 and up to 5 years in prison. Violations may also be subject to penalties under State statutes.
† The on–line Data Use Agreement training session and the Data Use Agreement are available on the HCUP User Support (HCUP–US) Website at http://www.hcup-us.ahrq.gov.
HCUP CONTACT INFORMATION
All HCUP data users, including data purchasers and collaborators, must complete the online HCUP Data Use Agreement (DUA) Training Tool, and read and sign the HCUP Data Use Agreement. Proof of training completion and signed Data Use Agreements must be submitted to the HCUP Central Distributor as described below.
The on-line DUA training course is available at: http://www.hcup-us.ahrq.gov/tech_assist/dua.jsp.
The HCUP Nationwide Data Use Agreement is available on the AHRQ-sponsored HCUP User Support (HCUP-US) website at:
HCUP Central Distributor
Data purchasers will be required to provide their DUA training completion code and will execute their DUAs electronically as a part of the online ordering process. The DUAs and training certificates for collaborators and others with access to HCUP data should be submitted directly to the HCUP Central Distributor using the contact information below.
The HCUP Central Distributor can also help with questions concerning HCUP database purchases, your current order, training certificate codes, or invoices, if your questions are not covered in the Purchasing FAQs on the HCUP Central Distributor website.
Phone: 866-556-HCUP (4287) (toll free)
Fax: 866-792-5313 (toll free in the United States)
HCUP Central Distributor
Social & Scientific Systems, Inc.
8757 Georgia Ave, 12th Floor
Silver Spring, MD 20910
HCUP User Support:
Information about the content of the HCUP databases is available on the HCUP User Support (HCUP-US) website (http://www.hcup-us.ahrq.gov). If you have questions about using the HCUP databases, software tools, supplemental files, and other HCUP products, please review the HCUP Frequently Asked Questions or contact HCUP User Support:
WHAT’S NEW IN THE 2006 NATIONWIDE INPATIENT SAMPLE (NIS)?
UNDERSTANDING THE NIS
This document, Introduction to the NIS, 2006, summarizes the content of the NIS and describes the development of the NIS sample and weights. Cumulative information for all previous years is included to provide a longitudinal view of the database. Important considerations for data analysis are highlighted, and references to detailed reports are provided. In-depth documentation for the NIS is available on the HCUP User Support (HCUP-US) Website (www.hcup-us.ahrq.gov).
The Agency for Healthcare Research and Quality and
the staff of the Healthcare Cost and Utilization Project (HCUP) thank you for
purchasing the HCUP Nationwide Inpatient Sample (NIS).
HCUP Nationwide Inpatient Sample (NIS)
The Nationwide Inpatient Sample (NIS) is part of the Healthcare Cost and Utilization Project (HCUP), sponsored by the Agency for Healthcare Research and Quality (AHRQ), formerly the Agency for Health Care Policy and Research.
The NIS is a database of hospital inpatient stays. Researchers and policy makers use the NIS to identify, track, and analyze national trends in healthcare utilization, access, charges, quality, and outcomes.
The NIS is the largest all-payer inpatient care database that is publicly available in the United States, containing data from 5 to 8 million hospital stays from about 1,000 hospitals sampled to approximate a 20-percent stratified sample of U.S. community hospitals. The NIS is drawn from those States participating in HCUP, and weights are provided to calculate national estimates. See Table 1 in Appendix I for a list of the statewide data organizations participating in the NIS. The numbers of sample hospitals and discharges by State and year are available in Table 2 in Appendix I.
The NIS is available yearly, beginning with 1988, allowing analysis of trends over time. (Analyses of time trends are recommended from 1993 forward. See the report, Using the HCUP Nationwide Inpatient Sample to Estimate Trends, available on the HCUP User Support (HCUP-US) Website, for details.)
The NIS is the only national hospital database with charge information on all patients, regardless of payer, including persons covered by Medicare, Medicaid, private insurance, and the uninsured. The NIS’ large sample size enables analyses of rare conditions, such as congenital anomalies; uncommon treatments, such as organ transplantation; and special patient populations, such as the uninsured.
Inpatient stay records in the NIS include clinical and resource use information typically available from discharge abstracts. Hospital and discharge weights are provided for producing national estimates. The NIS can be linked to hospital-level data from the American Hospital Association (AHA) Annual Survey Database (Health Forum, LLC © 2012) and county-level data from the Bureau of Health Professions’ Area Resource File, except in those States that do not allow the release of hospital identifiers.
Beginning in 1998, the NIS differs from previous NIS releases: some data elements were dropped; some were added; for some data elements, the coding was changed; and the sampling and weighting strategy was revised to improve the representativeness of the data. (See the report, Changes in the NIS Sampling and Weighting Strategy for 1998, which describes these changes, available on the HCUP-US Website.) Periodically, new data elements are added to the NIS and some are dropped; see Appendix III for a summary of data elements and when they are effective.
Access to the NIS is open to users who sign data use agreements. Uses are limited to research and aggregate statistical reporting.
For more information on the NIS, please visit the AHRQ-sponsored HCUP-US Website at http://www.hcup-us.ahrq.gov.
INTRODUCTION TO THE HCUP NATIONWIDE INPATIENT SAMPLE (NIS)
OVERVIEW OF NIS DATA
The Nationwide Inpatient Sample (NIS) contains all-payer data on hospital inpatient stays from States participating in the Healthcare Cost and Utilization Project (HCUP). Each year of the NIS provides information on approximately 5 million to 8 million inpatient stays from about 1,000 hospitals. All discharges from sampled hospitals are included in the NIS database.
The NIS contains clinical and resource use information included in a typical discharge abstract. The NIS can be linked directly to hospital-level data from the American Hospital Association (AHA) Annual Survey Database (Health Forum, LLC © 2012) and to county-level data from the Health Resources and Services Administration Bureau of Health Professions’ Area Resource File (ARF), except in those States that do not allow the release of hospital identifiers.
The NIS is designed to approximate a 20-percent sample of U.S. community hospitals, defined by the AHA to be "all non-Federal, short-term, general, and other specialty hospitals, excluding hospital units of institutions." Included among community hospitals are specialty hospitals such as obstetrics-gynecology, ear-nose-throat, short-term rehabilitation, orthopedic, and pediatric institutions. Also included are public hospitals and academic medical centers. Starting in 2005, the AHA included long term acute care facilities in the definition of community hospitals. These facilities provide acute care services to patients who need long term hospitalization (stays of more than 25 days). Excluded from the NIS are short-term rehabilitation hospitals (beginning with 1998 data), long-term non-acute care hospitals, psychiatric hospitals, and alcoholism/chemical dependency treatment facilities.
This universe of U.S. community hospitals is divided into strata using five hospital characteristics: ownership/control, bed size, teaching status, urban/rural location, and U.S. region.
The NIS is a stratified probability sample of hospitals in the frame, with sampling probabilities proportional to the number of U.S. community hospitals in each stratum. The frame is limited by the availability of inpatient data from the data sources currently participating in HCUP.
In order to improve the representativeness of the NIS, the sampling and weighting strategy was modified beginning with the 1998 data. The full description of this process can be found in the special report on Changes in NIS Sampling and Weighting Strategy for 1998. This report is available on the AHRQ-sponsored HCUP-US Website at http://www.hcup-us.ahrq.gov. To facilitate the production of national estimates, both hospital and discharge weights are provided, along with information necessary to calculate the variance of estimates. Detailed information on the design of the NIS prior to 2006 is available in the year-specific special reports on Design of the Nationwide Inpatient Sample found on the HCUP-US Website (http://hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp). Starting with the 2006 NIS, the information on the design of the NIS was incorporated into this report.
- Data in fixed-width ASCII format on CD-ROM.
- Patient-level hospital discharge abstract data for 100% of discharges from a sample of hospitals in participating States.
- 5 million to 8 million inpatient records per year.
- 800-1,100 hospitals per year.
- Two 10% subsamples of discharges from all NIS hospitals (only available prior to the 2005 NIS).
- Discharge-level weights to calculate national estimates for discharges.
- Hospital Weights File to produce national estimates for hospitals and to link the NIS to data from the AHA Annual Survey Database (Health Forum, LLC © 2012).
- NIS Documentation and tools – including file specifications, programming source code for loading ASCII data into SAS and SPSS, and value labels. Beginning in 2005, code is also provided for loading the NIS ASCII file into Stata.
NIS Data Sources, Hospitals, and Inpatient Stays
Some data sources that contributed data to the NIS imposed restrictions on the release of certain data elements or on the number and types of hospitals that could be included in the database. Because of confidentiality laws, some data sources were prohibited from providing HCUP with discharge records that indicated specific medical conditions, such as HIV/AIDS or behavioral health. Detailed information on these State-specific restrictions is available in Appendix II.
Contents of CD-ROM Set
The NIS is contained on two CD-ROMs that include fixed-width ASCII formatted data files and a README.TXT file describing how to access related NIS documentation on the HCUP-US Website (http://www.hcup-us.ahrq.gov).
CD-ROM #1 contains:
Inpatient Core File: This inpatient discharge-level file contains data for 100% of the discharges from a sample of hospitals in participating States. The unit of observation is an inpatient stay record. Refer to Table 1 in Appendix III for a list of data elements in the Inpatient Core File. This file is available in all years of the NIS.
Hospital Weights File: This hospital-level file contains one observation for each hospital included in the NIS and contains weights and variance estimation data elements, as well as linkage data elements. The unit of observation is the hospital. The HCUP hospital identifier (HOSPID) provides the linkage between the NIS Inpatient Core files and the Hospital Weights file. A list of data elements in the Hospital Weights File is provided in Table 2 of Appendix III. This file is available in all years of the NIS.
CD-ROM #2 contains:
Disease Severity Measures Files: These discharge-level files contain information from four different sets of disease severity measures. Information from these severity files is to be used in conjunction with the Inpatient Core files. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Disease Severity Measures files. Refer to Table 3 in Appendix III for a list of data elements in the Severity Measures files. These files are available beginning with the 2002 NIS.
Diagnosis and Procedure Groups Files: These discharge-level files contain data elements from AHRQ software tools designed to facilitate the use of the ICD-9-CM diagnostic and procedure information in the HCUP databases. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Diagnosis and Procedure Groups files. Table 4 in Appendix III contains a list of data elements in the Diagnosis and Procedure Groups files. These files are available beginning with the 2005 NIS.
On the HCUP-US Website (http://www.hcup-us.ahrq.gov), NIS purchasers can access complete file documentation, including variable notes, file layouts, summary statistics, and related technical reports. Similarly, purchasers can also download SAS, SPSS, and Stata load programs. Available online documentation and supporting files are detailed in Appendix I, Table 5.
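The linkage described above (HOSPID between the Inpatient Core and Hospital Weights files, KEY between the Core files and the discharge-level supplemental files) can be sketched in Python. This is a minimal illustration, assuming the records have already been read into dictionaries. HOSPID and KEY are the linkage elements named above; the other field names (STRATUM_ID, SEVERITY_SCORE) and all values are hypothetical placeholders, not actual NIS data elements.

```python
# Toy records standing in for rows already read from the NIS files.
# HOSPID and KEY are the linkage elements named in the text; every other
# field name and all values below are hypothetical.
core = [
    {"KEY": 1001, "HOSPID": 11, "LOS": 3},
    {"KEY": 1002, "HOSPID": 11, "LOS": 7},
    {"KEY": 1003, "HOSPID": 22, "LOS": 2},
]
hospital_weights = [
    {"HOSPID": 11, "STRATUM_ID": "A"},
    {"HOSPID": 22, "STRATUM_ID": "B"},
]
severity = [
    {"KEY": 1001, "SEVERITY_SCORE": 2},
    {"KEY": 1003, "SEVERITY_SCORE": 1},
]


def link_files(core, hospital_weights, severity):
    """Attach hospital-level and severity fields to each Core record."""
    by_hospid = {h["HOSPID"]: h for h in hospital_weights}
    by_key = {s["KEY"]: s for s in severity}
    linked = []
    for rec in core:
        row = dict(rec)                                  # copy the Core record
        row.update(by_hospid.get(rec["HOSPID"], {}))     # hospital-level fields
        row.update(by_key.get(rec["KEY"], {}))           # discharge-level fields
        linked.append(row)
    return linked


linked = link_files(core, hospital_weights, severity)
```

Records without a match in a supplemental file simply keep their Core fields, which mirrors a left join on the Core file.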
NIS Data Elements
All releases of the NIS contain two types of data: inpatient stay records and hospital information with weights. Appendix III identifies the data elements in each NIS file:
- Table 1 for the Inpatient Core files (record = inpatient stay)
- Table 2 for the Hospital Weights files (record = hospital)
- Table 3 for the Disease Severity Measures files (record = inpatient stay)
- Table 4 for the Diagnosis and Procedure Groups files (record = inpatient stay).
Not all data elements in the NIS are uniformly coded or available across all States. The tables in Appendix III are not complete documentation for the data. Please refer to the NIS documentation located on the HCUP-US Website (http://www.hcup-us.ahrq.gov) for comprehensive information about data elements and the files.
Getting Started
The NIS data files are provided on CD-ROMs. The NIS Inpatient Core and Hospital Weights files are on CD-ROM #1, while the Disease Severity Measures and Diagnosis and Procedure Groups files are on CD-ROM #2. Comprehensive documentation for the NIS files is available on the HCUP-US Website (http://www.hcup-us.ahrq.gov).
NIS Data Files
To load and analyze the NIS data on your PC, you will need 13 gigabytes of available disk space. Because of the size of the files, the data are distributed as self-extracting PKZIP compressed files. To decompress the data, follow these steps:
- Create a directory for the 2006 NIS on your hard drive.
- Copy the self-extracting data files from the NIS CD-ROMs into the new directory.
- Unzip each file by running the corresponding *.exe file.
- Type the file name within DOS or click on the name within Windows Explorer.
- Edit the name of the "Unzip To Folder" in the WinZip Self Extractor dialog to select the desired destination directory for the extracted file.
- Click on the "Unzip" button.
The ASCII data files will then be uncompressed into this directory. After the files are uncompressed, the *.exe files can be deleted.
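For users working outside SAS, SPSS, or Stata, the uncompressed fixed-width ASCII files can also be read with a short script. The sketch below is illustrative only: the column positions in LAYOUT are hypothetical, and the actual positions must be taken from the NIS file specifications on the HCUP-US Website. Reading one line at a time avoids holding a multi-gigabyte file in memory.

```python
# Hypothetical layout: (name, start, end) using 0-based, end-exclusive
# offsets. The real column positions are listed in the NIS file
# specifications; these values are placeholders for illustration.
LAYOUT = [("KEY", 0, 8), ("HOSPID", 8, 13), ("LOS", 13, 16)]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width record into a dict of stripped strings."""
    return {name: line[start:end].strip() for name, start, end in layout}


def read_fixed_width(path, layout=LAYOUT):
    """Yield one dict per record, reading the file line by line."""
    with open(path, "r", encoding="ascii") as handle:
        for line in handle:
            yield parse_line(line.rstrip("\n"), layout)


# A made-up 16-character record that follows the hypothetical layout.
sample = "00001001" + "00011" + "  3"
record = parse_line(sample)
```

Values are kept as strings here; converting them to numbers and recognizing missing-value codes should follow the coding rules in the NIS documentation.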
NIS documentation files on the HCUP-US Website (http://www.hcup-us.ahrq.gov) provide important resources for the user. Refer to these resources to understand the structure and content of the NIS and to aid in using the database.
- To locate the NIS documentation on HCUP-US, choose "HCUP Databases" from the home page (http://www.hcup-us.ahrq.gov). The first section under Nationwide HCUP Databases is specific to the NIS.
HOW TO USE THE NIS FOR DATA ANALYSIS
This section provides a brief synopsis of special considerations when using the NIS. For more details, refer to the comprehensive documentation on the HCUP-US Website (http://www.hcup-us.ahrq.gov).
- If anyone other than the original purchaser uses the NIS data, be sure to have them read and sign a Data Use Agreement, after viewing the on-line Data Use Agreement Training Tool available on the HCUP-US Website (http://www.hcup-us.ahrq.gov). A copy of the signed Data Use Agreements must be sent to AHRQ. See page 2 for the mailing address.
- The NIS contains discharge-level records, not patient-level records. This means that individual patients who are hospitalized multiple times in one year may be present in the NIS multiple times. There is no uniform patient identifier available that allows a patient-level analysis with the NIS. This will be especially important to remember for certain conditions for which patients may be hospitalized multiple times in a single year.
Calculating National Estimates
- To produce national estimates, use one of the following discharge weights to weight discharges in the NIS Core files to the discharges from all U.S. community, non-rehabilitation hospitals. The name of the discharge weight data element depends on the year of data and the type of analysis. In order to produce national estimates, you MUST use discharge weights.
|NIS Year||Name of Discharge Weight on the Core File to Use for Creating Nationwide Estimates||Name of Discharge Weight on the 10% Subsample Core File to Use for Creating Nationwide Estimates|
|2001 - 2004||
- Because the NIS is a stratified sample, proper statistical techniques must be used to calculate standard errors and confidence intervals. For detailed instructions, refer to the special report Calculating Nationwide Inpatient Sample Variances on the HCUP-US Website.
- The NIS Comparison Report assesses the accuracy of NIS estimates. The updated report for the current NIS will be posted on the HCUP-US Website (www.hcup-us.ahrq.gov) as soon as it is completed.
- When creating national estimates, it is a good idea to check your estimates against other data sources, if available. For example, the National Hospital Discharge Survey (http://www.cdc.gov/nchs/products/series.htm#sr13) can provide benchmarks against which to check your national estimates for hospitalizations with more than 5,000 discharges.
- To ensure that you are using the weights appropriately and calculating estimates and variances accurately, you can also use HCUPnet, the free online query system (https://hcupnet.ahrq.gov/#setup). HCUPnet is a Web-based query tool for identifying, tracking, analyzing, and comparing statistics on hospitals at the national, regional, and State level. HCUPnet offers easy access to national statistics and trends and selected State statistics about hospital stays. This tool provides step-by-step guidance, helping researchers to quickly obtain the statistics they need. HCUPnet generates statistics using the NIS, KID, and SID for those States that have agreed to participate. In addition, HCUPnet provides Quick Statistics (ready-to-use tables on commonly requested information) as well as national statistics based on the AHRQ Quality Indicators.
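As an illustration of how discharge weights enter a national estimate, the sketch below computes a weighted count, a weighted total, and weighted means from a few made-up records. The field name "discwt" is an assumed stand-in for whichever discharge weight applies to the data year and analysis; the other field names and all values are hypothetical. Standard errors are not computed here, because they require the design-based methods discussed in this document.

```python
# Toy discharge records. "discwt" is an assumed name standing in for the
# applicable discharge weight; "los" and "died" are illustrative outcomes.
discharges = [
    {"discwt": 5.0, "los": 3, "died": 0},
    {"discwt": 5.0, "los": 7, "died": 1},
    {"discwt": 4.0, "los": 2, "died": 0},
]


def weighted_total(records, weight="discwt", value=None):
    """Sum of weights (an estimated count) or of weight * value (a total)."""
    if value is None:
        return sum(r[weight] for r in records)
    return sum(r[weight] * r[value] for r in records)


def weighted_mean(records, value, weight="discwt"):
    """Weighted mean: weighted total divided by the sum of the weights."""
    return weighted_total(records, weight, value) / weighted_total(records, weight)


national_discharges = weighted_total(discharges)         # estimated discharges
national_days = weighted_total(discharges, value="los")  # estimated inpatient days
mean_los = weighted_mean(discharges, "los")               # estimated mean length of stay
mortality_rate = weighted_mean(discharges, "died")        # estimated in-hospital death rate
```

Estimates produced this way can then be compared with HCUPnet or other benchmark sources, as recommended above.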
Why the NIS Should not be Used to Make State-Level Estimates
AHRQ strongly advises researchers against using the NIS to estimate State-specific statistics. Prior to 2012, State is available as a NIS data element. However, these NIS samples were not designed to yield a representative sample of hospitals at the State level. AHRQ recommends that researchers employ the SID for State-level estimates.
Each NIS sample is drawn from the sampling frame consisting of discharge data submitted by HCUP Partners, the statewide data organizations that agree to participate in the NIS. Data from non-Partner States are missing completely from the sampling frame, and data from Partner States are sometimes incomplete because of different State reporting requirements, different State restrictions, or other data omissions. The NIS is designed to represent hospitals and discharges nationally, including those outside the sampling frame.
To accomplish this, within each hospital sampling stratum the NIS draws a number of hospitals from the sampling frame required to net a total of 20 percent of hospitals nationally. The sampling strata are defined by census region (4 regions), hospital ownership (3 categories), urban-rural location, teaching status, and bed size (3 categories). As a result, the proportion of NIS hospitals in a stratum that are from a given State is unlikely to equal the State's actual proportion of hospitals in that stratum. Consequently, the sample of NIS hospitals is unlikely to be representative of hospitals in the State, and the NIS sample weights will not be appropriate at the State level.
The level of this "misrepresentation" varies across the States in any given year of the NIS, which further confounds State-to-State comparisons on the basis of State-specific estimates from the NIS. Moreover, for a given State the level of misrepresentation changes from year to year as States (and hospitals) enter and exit the sampling frame over time. This further confounds State-specific trends on the basis of State-specific estimates from the NIS.
Finally, because the NIS was not designed to be representative at the State level, design-based estimates of standard errors are not possible, which severely hampers State-level inferences. Moreover, the NIS is composed of all discharges from a sample of hospitals (a cluster sample). The hospital-to-hospital variation and the small number of hospitals available in the NIS for many States make State-level estimates very imprecise at best and biased at worst.
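A small numeric sketch may help show why weights that work nationally fail at the State level. Everything in it is hypothetical: one stratum, three made-up States, one of which is outside the sampling frame, and a simplified hospital weight equal to the number of universe hospitals divided by the number of sampled hospitals in the stratum (an assumption made for illustration, not a statement of the actual NIS weighting procedure).

```python
# One hypothetical stratum: hospitals in the universe, by State.
universe = {"A": 30, "B": 50, "C": 20}
in_frame = {"A", "B"}          # State C contributes no data to the frame

universe_total = sum(universe.values())
frame_total = sum(n for state, n in universe.items() if state in in_frame)

# Draw enough frame hospitals to equal 20 percent of the universe.
sample_size = round(0.20 * universe_total)

# Simplified stratum weight (assumption): universe hospitals per sampled hospital.
weight = universe_total / sample_size

# Expected weighted number of hospitals attributed to each State when the
# sample is drawn at random from the frame.
expected_weighted = {}
for state, n in universe.items():
    share_of_frame = n / frame_total if state in in_frame else 0.0
    expected_weighted[state] = sample_size * share_of_frame * weight

national_estimate = sum(expected_weighted.values())
```

In this toy case the weighted national count equals the 100 hospitals in the universe, but State A is credited with 37.5 hospitals instead of 30, State B with 62.5 instead of 50, and State C with none instead of 20.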
Studying Trends
- When studying trends over time using the NIS, be aware that the sampling frame for the NIS changes almost annually (i.e., more States have been added over time). Estimates from earlier years of the NIS may be subject to more sampling bias than later years of the NIS. In order to facilitate analysis of trends using multiple years of NIS data, an alternate set of NIS discharge and hospital weights for the 1988-1997 HCUP NIS was developed. These alternative weights were calculated in the same way as the weights for the 1998 and later years of the NIS. The report, Using the HCUP Nationwide Inpatient Sample to Estimate Trends, includes details regarding the alternate weights and other recommendations for trends analysis. Both the NIS trends report and the alternate weights are available on the HCUP-US Website under Methods Series (http://www.hcup-us.ahrq.gov/reports/methods/methods_topic.jsp).
- To ease the burden on researchers conducting analyses that span multiple years, NIS trends supplemental files (NIS-Trends) are available through the HCUP Central Distributor. The NIS-Trends annual files contain the alternate trend weights for data prior to 1998, in addition to renamed, recoded, and new data elements consistent with the later years of the NIS. More information on these files is available on the HCUP-US Website under NIS database documentation (http://www.hcup-us.ahrq.gov/db/nation/nis/nisdbdocumentation.jsp).
- Short-term rehabilitation hospitals are included in the 1988-1997 NIS, but are excluded from the NIS beginning in 1998. Patients treated in short-term rehabilitation hospitals tend to have lower mortality rates and longer lengths of stay than patients in other community hospitals. The elimination of rehabilitation hospitals may impact trends but the effect is likely small since only about 3% of community hospitals are short-term rehabilitation hospitals and not all State data sources included these hospitals. The NIS-Trends weights account for this change in NIS sampling.
Choosing Data Elements for Analysis
- For all data elements you plan to use in your analysis, first perform descriptive statistics and examine the range of values, including the number of missing cases. Summary statistics for the entire NIS are provided on the Summary Statistics page of the HCUP-US Website (http://www.hcup-us.ahrq.gov/db/nation/nis/nissummstats.jsp). When you detect anomalies (such as large numbers of missing cases), perform descriptive statistics by State for that variable to detect if there are State-specific differences. Sometimes performing descriptive statistics by hospital can be helpful in detecting hospital-specific data anomalies.
- Not all data elements in the NIS are provided by each State data source. These data elements are provided on the NIS because they can be valuable for research purposes but they should be used cautiously. For example, RACE is missing for a number of States; thus, national estimates using RACE should be interpreted and reported with caveats. Check the documentation and run frequencies by State to identify if a data element is not available in one or more States.
- Differences exist across the State data sources in the collection of information that could not be accounted for during HCUP processing to make the data uniform. Be sure to read State-specific notes for each data element that you use in your analysis – this information can be found on the Description of Data Elements page on the HCUP-US Website (http://www.hcup-us.ahrq.gov/db/nation/nis/nisdde.jsp).
- Data elements with "_X" suffixes contain State-specific coding (i.e., these data elements are provided by the data sources and have not been altered in any way). For some data elements (e.g., LOS_X and TOTCHG_X) this means that no edit checks have been applied. For other data elements (e.g., PAY1_X), the coding is specific to each State and may not be comparable to any other State.
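The recommendation above to run descriptive statistics by State when a data element shows many missing cases can be sketched as follows. RACE is the example element named above; the element name STATE, the State codes, and the use of None to mark a missing value are assumptions for illustration. How missing values are actually coded is described in the NIS documentation.

```python
from collections import defaultdict

# Toy records; None marks a missing value in this sketch. State codes
# "AA", "BB", "CC" are made up.
records = [
    {"STATE": "AA", "RACE": 1},
    {"STATE": "AA", "RACE": None},
    {"STATE": "BB", "RACE": None},
    {"STATE": "BB", "RACE": None},
    {"STATE": "CC", "RACE": 2},
]


def missing_rate_by_group(records, element, group="STATE"):
    """Return {group value: share of records where the element is missing}."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for rec in records:
        totals[rec[group]] += 1
        if rec[element] is None:
            missing[rec[group]] += 1
    return {g: missing[g] / totals[g] for g in totals}


rates = missing_rate_by_group(records, "RACE")

# Groups where the element is never populated, which suggests the data
# source does not supply it at all.
fully_missing = sorted(g for g, rate in rates.items() if rate == 1.0)
```

Passing a hospital identifier as the group argument gives the hospital-level check mentioned above.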
ICD-9-CM Diagnosis and Procedure Codes
- ICD-9-CM diagnosis and procedure codes provide valuable insights into the reasons for hospitalization and what procedures patients receive, but these codes need to be carefully used and interpreted. ICD-9-CM codes change every October as new codes are introduced and some codes are retired. See the Conversion Table at http://www.cdc.gov/nchs/datawh/ftpserv/ftpicd9/ftpicd9.htm, which shows ICD-9-CM code changes over time. It is critical to check all ICD-9-CM codes used for analysis to ensure the codes are in effect during the time period studied.
- Although the NIS contains up to 15 diagnoses and 15 procedures, the number of diagnoses and procedures varies by State. Some States provide as many as 30 diagnoses and 21 procedures, while other States provide as few as 10 diagnoses and 6 procedures. Because very few cases have more than 15 diagnoses or procedures, the diagnosis and procedure vectors were truncated to save space in the NIS data files. Two variables are provided which tell you exactly how many diagnoses and procedures were on the original records (NDX and NPR).
- The collection and reporting of external cause of injury (E codes) varies greatly across States. Some States have laws or mandates for the collection of E codes; others do not. Some States do not require hospitals to report E codes in the range E870-E879 — "misadventures to patients during surgical and medical care" — which means that these occurrences will be underreported. Be sure to read the State-specific notes on diagnoses for more details; this information can be found on the Description of Data Elements page on the HCUP-US Website (http://www.hcup-us.ahrq.gov/db/nation/nis/nisdde.jsp).
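Because ICD-9-CM codes change every October, a check that each code was in effect during the period studied has to work in October-to-September coding years rather than calendar years. The sketch below assumes that a discharge year and quarter are known for each record and that quarter 4 covers October through December. The code names and effective ranges in EFFECTIVE are hypothetical placeholders; real ranges would come from the Conversion Table referenced above.

```python
def icd9_coding_year(year, quarter):
    """Return the October-to-September coding year for a discharge,
    labeled by the calendar year in which that coding year ends."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return year + 1 if quarter == 4 else year


# Hypothetical effective ranges: (first coding year, last coding year).
# "CODE_A" and "CODE_B" are placeholders, not real ICD-9-CM codes.
EFFECTIVE = {"CODE_A": (1990, 2005), "CODE_B": (2006, 9999)}


def code_in_effect(code, year, quarter, effective=EFFECTIVE):
    """True if the code was valid for a discharge in the given quarter."""
    first, last = effective[code]
    return first <= icd9_coding_year(year, quarter) <= last


# A discharge in the fourth quarter of 2005 falls in coding year 2006,
# so the retired placeholder code is flagged and its successor is not.
retired_still_valid = code_in_effect("CODE_A", 2005, 4)
successor_valid = code_in_effect("CODE_B", 2005, 4)
```

A study that spans the October boundary would need both the retired code and its successor in its case definition.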
Missing Values
Missing data values can compromise the quality of estimates. If the outcome for discharges with missing values is different from the outcome for discharges with valid values, then sample estimates for that outcome will be biased and inaccurately represent the discharge population. There are several techniques available to help overcome this bias. One strategy is to use imputation to replace missing values with acceptable values. Another strategy is to use sample weight adjustments to compensate for missing values.1 Descriptions of such data preparation and adjustment are outside the scope of this report; however, it is recommended that researchers evaluate and adjust for missing data, if necessary.
On the other hand, if the cases with and without missing values are assumed to be similar with respect to their outcomes, no adjustment may be necessary for estimates of means and rates. This is because the non-missing cases would be representative of the missing cases. However, some adjustment may still be necessary for the estimates of totals. Sums of data elements (such as aggregate charges) containing missing values would be incomplete because cases with missing values would be omitted from the calculations.
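The difference between means and totals described above can be seen in a short example. It shows one simple adjustment among several possible ones: under the stated assumption that cases with and without missing values are similar, the weighted mean of the non-missing cases is multiplied by the full weighted count. The field names "discwt" and "totchg" and all values are hypothetical, and None marks a missing value in this sketch.

```python
# Toy records: "discwt" is an assumed weight name; None marks a missing charge.
records = [
    {"discwt": 5.0, "totchg": 1000.0},
    {"discwt": 5.0, "totchg": None},
    {"discwt": 10.0, "totchg": 4000.0},
]


def mean_and_totals(records, value, weight="discwt"):
    """Return the weighted mean over non-missing cases, the naive total
    that skips missing cases, and a total rescaled to all cases."""
    valid = [r for r in records if r[value] is not None]
    weight_all = sum(r[weight] for r in records)
    weight_valid = sum(r[weight] for r in valid)
    naive_total = sum(r[weight] * r[value] for r in valid)
    mean = naive_total / weight_valid
    adjusted_total = mean * weight_all   # assumes missing cases resemble valid ones
    return mean, naive_total, adjusted_total


mean, naive_total, adjusted_total = mean_and_totals(records, "totchg")
```

Here the mean needs no adjustment, while the naive total covers only 15 of the 20 weighted discharges and therefore understates aggregate charges.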
Variance Calculations
It may be important for researchers to calculate a measure of precision for some estimates based on the NIS sample data. Variance estimates must take into account both the sampling design and the form of the statistic. The sampling design consisted of a stratified, single-stage cluster sample. A stratified random sample of hospitals (clusters) was drawn and then all discharges were included from each selected hospital. To accurately calculate variances from the NIS, you must use appropriate statistical software and techniques. For details, see the special report, Calculating Nationwide Inpatient Sample Variances. This report is available on the HCUP-US Website at http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp.
If hospitals inside the frame are similar to hospitals outside the frame, the sample hospitals can be treated as if they were randomly selected from the entire universe of hospitals within each stratum. Standard formulas for a stratified, single-stage cluster sample without replacement could be used to calculate statistics and their variances in most applications.
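As a rough illustration of such a standard formula, the sketch below estimates a total and its variance for a stratified, single-stage cluster sample drawn without replacement, treating each hospital as a cluster. The function name and input layout are hypothetical, and the formula is the common textbook form with a finite-population correction; the special report on NIS variances remains the authoritative reference.

```python
from statistics import variance

def stratified_cluster_total(strata):
    """Estimate a total and its variance.

    strata: list of dicts with
      "N"      - number of universe hospitals in the stratum
      "totals" - per-hospital totals of the outcome for the sampled hospitals
                 (at least two hospitals per stratum)
    """
    estimate = 0.0
    var = 0.0
    for s in strata:
        N, y = s["N"], s["totals"]
        n = len(y)
        estimate += N * sum(y) / n           # expand sample hospitals to the stratum
        fpc = 1.0 - n / N                    # finite-population correction
        var += N * N * fpc * variance(y) / n # between-hospital variance within stratum
    return estimate, var
```

A stratum in which every universe hospital is sampled contributes nothing to the variance, because its finite-population correction is zero.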
A multitude of statistics can be estimated from the NIS data. Several computer programs are listed below that calculate statistics and their variances from sample survey data. Some of these programs use general methods of variance calculations (e.g., the jackknife and balanced half-sample replications) that take into account the sampling design. However, it may be desirable to calculate variances using formulas specifically developed for some statistics.
These variance calculations are based on finite-sample theory, which is an appropriate method for obtaining cross-sectional, nationwide estimates of outcomes. According to finite-sample theory, the intent of the estimation process is to obtain estimates that are precise representations of the nationwide population at a specific point in time. In the context of the NIS, any estimates that attempt to accurately describe characteristics and interrelationships among hospitals and discharges during a specific year should be governed by finite-sample theory. Examples of this would be estimates of expenditure and utilization patterns or hospital market factors.
Alternatively, in the study of hypothetical population outcomes not limited to a specific point in time, the concept of a "superpopulation" may be useful. Analysts may be less interested in specific characteristics from the finite population (and time period) from which the sample was drawn than they are in hypothetical characteristics of a conceptual "superpopulation" from which any particular finite population in a given year might have been drawn. According to this superpopulation model, the nationwide population in a given year is only a snapshot in time of the possible interrelationships among hospital, market, and discharge characteristics. In a given year, all possible interactions between such characteristics may not have been observed, but analysts may wish to predict or simulate interrelationships that may occur in the future.
Under the finite-population model, the variances of estimates approach zero as the sampling fraction approaches one. This is the case because the population is defined at that point in time, and because the estimate is for a characteristic as it existed when sampled. This is in contrast to the superpopulation model, which adopts a stochastic viewpoint rather than a deterministic viewpoint. That is, the nationwide population in a particular year is viewed as a random sample of some underlying superpopulation over time. Different methods are used for calculating variances under the two sample theories. The choice of an appropriate method for calculating variances for nationwide estimates depends on the type of measure and the intent of the estimation process.
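To make the contrast concrete, consider the mean of a hospital-level outcome under simple random sampling of n hospitals from a population of N. This is a simplifying assumption for illustration only (the NIS design is stratified), and the symbols n, N, and s² (the sample variance) are introduced here rather than taken from the report. The usual textbook variance estimates are:

```latex
\widehat{\mathrm{Var}}_{\text{finite}}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s^{2}}{n},
\qquad
\widehat{\mathrm{Var}}_{\text{super}}(\bar{y}) = \frac{s^{2}}{n}
```

The finite-population form shrinks to zero as the sampling fraction n/N approaches one, whereas the superpopulation form does not.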
Computer Software for Variance Calculations
The hospital weights are useful for producing hospital-level statistics for analyses that use the hospital as the unit of analysis, while the discharge weights are useful for producing discharge-level statistics for analyses that use the discharge as the unit of analysis. The discharge weights may be used to estimate nationwide population statistics.
In most cases, computer programs are readily available to perform these calculations. Several statistical programming packages allow weighted analyses.2 For example, nearly all SAS procedures incorporate weights. In addition, several statistical analysis programs have been developed to specifically calculate statistics and their standard errors from survey data. Version eight or later of SAS contains procedures (PROC SURVEYMEANS and PROC SURVEYREG) for calculating statistics based on specific sampling designs. STATA and SUDAAN are two other common statistical software packages that perform calculations for numerous statistics arising from the stratified, single-stage cluster sampling design. Examples of the use of SAS, SUDAAN, and STATA to calculate NIS variances are presented in the special report, Calculating Nationwide Inpatient Sample Variances. This report is available on the HCUP-US Website at http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp. For an excellent review of programs to calculate statistics from survey data, visit the following Website: http://www.hcp.med.harvard.edu/statistics/survey-soft/.
The NIS database includes a Hospital Weights file with variables required by these programs to calculate finite population statistics. The file includes hospital identifiers (Primary Sampling Units or PSUs), stratification variables, and stratum-specific totals for the numbers of discharges and hospitals so that finite-population corrections can be applied to variance estimates.
In addition to these subroutines, standard errors can be estimated by validation and cross-validation techniques. Given that a very large number of observations will be available for most analyses, it may be feasible to set aside a part of the data for validation purposes. Standard errors and confidence intervals can then be calculated from the validation data.
If the analytic file is too small to set aside a large validation sample, cross-validation techniques may be used. For example, ten-fold cross-validation would split the data into ten subsets of equal size. The estimation would take place in ten iterations. In each iteration, the outcome of interest is predicted for one-tenth of the observations by an estimate based on a model fit to the other nine-tenths of the observations. Unbiased estimates of error variance are then obtained by comparing the actual values to the predicted values obtained in this manner.
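The ten-fold procedure described above can be sketched as follows. This is a minimal, unweighted illustration in which the "model" is simply the mean of the training observations; the function name is hypothetical, and a real NIS analysis would also need to consider sample weights and the clustering of discharges within hospitals.

```python
import random

def ten_fold_cv_error_variance(ys, k=10, seed=0):
    """Cross-validated mean squared error of a mean-only model."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]   # k subsets of (nearly) equal size
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [ys[i] for i in idx if i not in held]
        prediction = sum(train) / len(train)  # fit on the other nine-tenths
        squared_errors.extend((ys[i] - prediction) ** 2 for i in held_out)
    return sum(squared_errors) / len(squared_errors)
```

Each observation is predicted exactly once, by a model that never saw it, so the resulting error estimate is not flattered by fitting and testing on the same data.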
Finally, it should be noted that a large array of hospital-level variables are available for the entire universe of hospitals, including those outside the sampling frame. For instance, the variables from the AHA surveys and from the Medicare Cost Reports are available for nearly all hospitals in the U.S., although hospital identifiers are suppressed in the NIS for a number of States. For these States it will not be possible to link to outside hospital-level data sources. To the extent that hospital-level outcomes correlate with these variables, they may be used to sharpen regional and nationwide estimates.
As a simple example, the number of Cesarean sections performed in each hospital would be correlated with their total number of deliveries. The figure for Cesarean sections must be obtained from discharge data, but the number of deliveries is available from AHA data. Thus, if a regression model can be fit predicting this procedure from deliveries based on the NIS data, that regression model can then be used to obtain hospital-specific estimates of the number of Cesarean sections for all hospitals in the AHA universe.
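A minimal sketch of this idea follows. The hospital counts are invented, the hospital labels are hypothetical, and a no-intercept least-squares model is assumed purely for brevity; the report does not specify the form of the regression.

```python
# Hypothetical NIS sample hospitals: (AHA deliveries, Cesarean sections from discharge data).
sample = [(1000, 300), (500, 150), (2000, 620), (800, 230)]

# Least-squares slope for the assumed model: cesareans = slope * deliveries.
slope = sum(d * c for d, c in sample) / sum(d * d for d, _ in sample)

# Hypothetical AHA universe hospitals, for which only delivery counts are known.
universe_deliveries = {"hosp_A": 1200, "hosp_B": 300}

# Hospital-specific predictions of Cesarean sections for universe hospitals.
predicted = {h: slope * d for h, d in universe_deliveries.items()}
```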
Longitudinal Analyses
Hospitals that continue in the NIS for multiple consecutive years are a subset of the hospitals in the NIS for any one of those years. Consequently, longitudinal analyses of hospital-level outcomes may be biased if they are based on any subset of NIS hospitals limited to continuous NIS membership. In particular, such subsets would tend to contain fewer hospitals that opened, closed, split, merged, or changed strata. Further, the sample weights were developed as annual, cross-sectional weights, rather than longitudinal weights. Therefore, different weights might be required, depending on the statistical methods employed by the analyst.
One approach to consider in hospital-level longitudinal analyses is to use repeated-measure models that allow hospitals to have missing values for some years. However, the data are not actually missing for some hospitals, such as those that closed during the study period. In any case, the analyses may be more efficient (e.g., produce more precise estimates) if they account for the potential correlation between repeated measures on the same hospital over time, yet incorporate data from all hospitals in the sample during the study period.
Discharge Subsamples
Prior to the 2005 NIS, two non-overlapping 10% subsamples of NIS discharges were provided each year for analytic purposes. Beginning with the 2005 NIS, 10% subsamples are no longer provided on the NIS CD-ROMs. However, users may still draw their own subsamples, if desired. One use of 10% subsamples would be to validate models and obtain unbiased estimates of standard errors. That is, one subsample may be used to estimate statistical models, while the other subsample may be used to test the fit of those models on new data. This is a very important analytical step, particularly in exploratory studies, where one runs the risk of fitting noise in the data.
It is well known that the percentage of variance explained by a regression, R², is generally overestimated by the data used to fit a model. The regression model could be estimated from the first subsample and then applied to the second subsample. The squared correlation between the actual and predicted values in the second subsample is an unbiased estimate of the model’s true explanatory power when applied to new data.
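The two-subsample check can be sketched as below. The helper names are hypothetical, the model is a one-predictor least-squares line chosen for brevity, and sample weights are ignored.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def squared_correlation(a, b):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov * cov / (va * vb)

def holdout_r_squared(x1, y1, x2, y2):
    """Fit on subsample 1, then score the predictions on subsample 2."""
    slope, intercept = fit_line(x1, y1)
    predicted = [intercept + slope * x for x in x2]
    return squared_correlation(y2, predicted)
```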
SAMPLING OF HOSPITALS
Sampling of Hospitals Included in the NIS
The NIS Hospital Universe
The hospital universe is defined as all hospitals located in the U.S. that are open during any part of the calendar year and designated as community hospitals in the AHA Annual Survey Database (Health Forum, LLC © 2012). The AHA defines community hospitals as follows: "All non-Federal, short-term, general, and other specialty hospitals, excluding hospital units of institutions." Consequently, Veterans Hospitals and other Federal facilities (Department of Defense and Indian Health Service) are excluded. Starting in 2005, the AHA included long-term acute care facilities in the definition of community hospitals. These facilities provide acute care services to patients who need long-term hospitalization (stays of more than 25 days). Beginning with the 1998 NIS, we excluded short-term rehabilitation hospitals from the universe because the type of care provided and the characteristics of the discharges from these facilities were markedly different from other short-term hospitals. Figure 1 in Appendix I displays the number of universe hospitals for each year based on the AHA Annual Survey Database (Health Forum, LLC © 2007).
For more information on how hospitals in the data set were mapped to hospitals as defined by the AHA, refer to the special report, HCUP Hospital Identifiers. For a list of all data sources, refer to Table 1 in Appendix I. Detailed information on the design of the NIS prior to 2006 is available in the year-specific special reports on Design of the Nationwide Inpatient Sample found on the HCUP-US Website. Starting with the 2006 NIS, the design information was incorporated into this report.
Hospital Merges, Splits, and Closures
All U.S. hospital entities designated as community hospitals in the AHA hospital file, except short-term rehabilitation hospitals, were included in the hospital universe. Therefore, when two or more community hospitals merged to create a new community hospital, the original hospitals and the newly-formed hospital were all considered separate hospital entities in the universe during the year they merged. Similarly, if a community hospital split, the original hospital and all newly-created community hospitals were treated as separate entities in the universe during the year this occurred. Finally, community hospitals that closed during a given year were included in the hospital universe, as long as they were in operation during some part of the calendar year.
Given the increase in the number of contributing States, the NIS team evaluated and revised the sampling and weighting strategy for 1998 and subsequent data years, in order to best represent the U.S. This included changes to the definitions of the strata variables, the exclusion of rehabilitation hospitals from the NIS hospital universe, and a change to the calculation of hospital universe discharges for the weights. A full description of this process can be found in the special report on Changes in NIS Sampling and Weighting Strategy for 1998. This report is available on the HCUP-US Website at http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp. (A description of the sampling procedures and definitions of strata variables used from 1988 through 1997 can be found in the special report: Design of the HCUP Nationwide Inpatient Sample, 1997. This report is also available on the HCUP-US Website.)
The NIS sampling strata were defined based on five hospital characteristics contained in the AHA hospital files. Beginning with the 1998 NIS, the stratification variables were defined as follows:
- Geographic Region – Northeast, Midwest, West, and South. This is an important stratification variable because practice patterns have been shown to vary substantially by region. For example, lengths of stay tend to be longer in East Coast hospitals than in West Coast hospitals. Figure 2 highlights the NIS States by region, and Table 6 lists the States that comprise each region. Both can be found in Appendix I.
- Control – government non-Federal (public), private not-for-profit (voluntary), and private investor-owned (proprietary). Depending on their control, hospitals tend to have different missions and different responses to government regulations and policies. When there were enough hospitals of each type to allow it, we stratified hospitals as public, voluntary, and proprietary. We used this stratification for Southern rural, Southern urban non-teaching, and Western urban non-teaching hospitals. For smaller strata (the Midwestern rural and Western rural hospitals), we used a collapsed stratification of public versus private, with the voluntary and proprietary hospitals combined to form a single "private" category. For all other combinations of region, location, and teaching status, no stratification based on control was advisable, given the number of hospitals in these cells.
- Location – urban or rural. Government payment policies often differ according to this designation. Also, rural hospitals are generally smaller and offer fewer services than urban hospitals. Beginning with the 2004 NIS, we changed the classification of urban or rural hospital location for the sampling strata to use the newer Core Based Statistical Area (CBSA) codes, rather than the older Metropolitan Statistical Area (MSA) codes. The CBSA groups are based on 2000 Census data, whereas the MSA groups were based on 1990 Census data. Also, the criteria for classifying the counties differ. For more information on the difference between CBSAs and MSAs, refer to the U.S. Census Bureau Website (http://www.census.gov/population/metro/).
Previously, we classified hospitals in an MSA as urban hospitals, while we classified hospitals outside an MSA as rural hospitals. Beginning with the 2004 NIS, we categorized hospitals with a CBSA type of Metropolitan or Division as urban, while we designated hospitals with a CBSA type of Micropolitan or Rural as rural. This change contributed to a slight decline in the number of hospitals that were classified as rural and a corresponding increase in the number of hospitals categorized as urban. For the 2003 NIS, 44.9% of hospitals in the AHA universe were classified as rural hospitals; for 2004, only 41.3% of AHA universe hospitals were classified as rural.
- Teaching Status – teaching or non-teaching. The missions of teaching hospitals differ from non-teaching hospitals. In addition, financial considerations differ between these two hospital groups. Currently, the Medicare Diagnosis Related Group (DRG) payments are uniformly higher to teaching hospitals. Prior to the 1998 NIS, we considered a hospital to be a teaching hospital if it had any residents or interns and met one of the following two criteria:
- Residency training approval by the Accreditation Council for Graduate Medical Education (ACGME)
- Membership in the Council of Teaching Hospitals (COTH).
Beginning with the 1998 NIS, we considered a hospital to be a teaching hospital if it met any one of the following three criteria:
- Residency training approval by the Accreditation Council for Graduate Medical Education (ACGME)
- Membership in the Council of Teaching Hospitals (COTH)
- A ratio of full-time equivalent interns and residents to beds of .25 or higher.3
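The three criteria used from the 1998 NIS onward can be expressed as a small rule. The function and parameter names below are hypothetical and do not correspond to AHA or NIS data element names; the zero-bed guard is an added assumption.

```python
def is_teaching_hospital(acgme_approved, coth_member, fte_interns_residents, beds):
    """Return True if any one of the three post-1998 criteria is met."""
    # Assumed guard: treat a hospital with no reported beds as having a zero ratio.
    ratio = fte_interns_residents / beds if beds else 0.0
    return bool(acgme_approved or coth_member or ratio >= 0.25)
```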
We did not split rural hospitals according to teaching status, because rural teaching hospitals were rare. For example, in 2006, rural teaching hospitals comprised less than 2% of the total hospital universe.
We defined the bed size categories within location and teaching status because they would otherwise have been redundant. Rural hospitals tend to be small; urban non-teaching hospitals tend to be medium-sized; and urban teaching hospitals tend to be large. Yet it was important to recognize gradations of size within these types of hospitals. For example, in serving rural discharges, the role of "large" rural hospitals (particularly rural referral centers) often differs from the role of "small" rural hospitals.
To further ensure geographic representativeness, implicit stratification variables included State and three-digit ZIP Code (the first three digits of the hospital’s five-digit ZIP Code). The hospitals were sorted according to these variables prior to systematic random sampling. Detailed information on the design of the NIS prior to 2006 is available in the year-specific special reports on Design of the Nationwide Inpatient Sample found on the HCUP-US Website. Starting with the 2006 NIS, the design information was incorporated into this report.
Hospital Sampling Frame
The universe of hospitals was established as all community hospitals located in the U.S. with the exception, beginning in 1998, of short-term rehabilitation hospitals. However, some hospitals do not supply data to HCUP. Therefore, we constructed the NIS sampling frame from the subset of universe hospitals that released their discharge data to AHRQ for research use. The number of State Partners contributing data to the NIS has expanded over the years, as shown in Table 2 of Appendix I. As a result, the number of hospitals included in the NIS sampling frame has also increased over the years, as depicted in Figure 3, also in Appendix I.
The list of the entire frame of hospitals was composed of all AHA community hospitals in each of the frame States that could be matched to the discharge data provided to HCUP. If an AHA community hospital could not be matched to the discharge data provided by the data source, it was eliminated from the sampling frame (but not from the target universe).
Figure 4 in Appendix I illustrates the number of hospitals in the universe, frame, and sample and the percentage of universe hospitals in the frame for each State in the sampling frame for 2006. In most cases, the difference between the universe and the frame represents the difference in the number of community, non-rehabilitation hospitals in the 2006 AHA Annual Survey Database (Health Forum, LLC © 2012) and the hospitals for which data were supplied to HCUP that could be matched to the AHA data.
The largest discrepancy between HCUP data and AHA data is in Texas, as is evident in Figure 4 (Appendix I). Certain Texas State-licensed hospitals are exempt from statutory reporting requirements. Exempt hospitals include:
- Hospitals that do not seek insurance payment or government reimbursement
- Rural providers.
The Texas statute that exempts rural providers from the requirement to submit data defines a hospital as a rural provider if it:
(I.) Is located in a county that:
(A.) Has a population estimated by the United States Bureau of the Census to be not more than 35,000 as of July 1 of the most recent year for which county population estimates have been published; or
(B.) Has a population of more than 35,000, but does not have more than 100 licensed hospital beds and is not located in an area that is delineated as an urbanized area by the United States Bureau of the Census; and
These exemptions apply primarily to smaller rural public hospitals and, as a result, these facilities are less likely to be included in the sampling frame than other Texas hospitals. While the number of hospitals omitted appears sizable, those available for the NIS include more than 93% of inpatient discharges from Texas universe hospitals because excluded hospitals tended to have relatively few discharges.
Refer to Table 8 of Appendix I for a full list of the number of hospitals and discharges included in the 2006 AHA universe, frame, and NIS by State. Fewer hospitals may be in a State’s frame than in the universe because data are not always received from every hospital and hospitals are sometimes excluded because of State requirements.
Hospital Sample Design
Design Considerations
The NIS is a stratified probability sample of hospitals in the frame, with sampling probabilities calculated to select 20% of the universe of U.S. community, non-rehabilitation hospitals contained in each stratum. This sample size was determined by AHRQ based on their experience with similar research databases. The overall design objective was to select a sample of hospitals that accurately represents the target universe, which includes hospitals outside the frame (i.e., having zero probability of selection). Moreover, this sample was to be geographically dispersed, yet drawn only from data supplied by HCUP Partners.
It should be possible, for example, to estimate DRG-specific average lengths of stay across all U.S. hospitals using weighted average lengths of stay, based on averages or regression coefficients calculated from the NIS. Ideally, relationships among outcomes and their correlates estimated from the NIS should accurately represent all U.S. hospitals. It is advisable to verify your estimates against other data sources, if available, because not all States contribute data to the NIS. For example, the National Hospital Discharge Survey (http://www.cdc.gov/nchs/about/major/hdasd/nhds.htm) can provide benchmarks against which to check your national estimates for hospitalizations with more than 5,000 cases. Table 2 in Appendix I lists the number of NIS States, hospitals, and discharges by year.
The NIS Comparison Report assesses the accuracy of NIS estimates by providing a comparison of the NIS with other data sources. The most recent report is available on the HCUP-US Website (http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp).
The NIS team considered alternative stratified sampling allocation schemes. However, allocation proportional to the number of hospitals was preferred for several reasons:
- AHRQ researchers wanted a simple, easily understood sampling methodology. The concept that the NIS sample could represent a "miniaturization" of the hospital universe was appealing. There were, however, obvious geographic limitations imposed by data availability.
- AHRQ statisticians considered other optimal allocation schemes, including sampling hospitals with probabilities proportional to size (number of discharges). They ultimately concluded that sampling with probability proportional to the number of hospitals was preferable. While this approach was admittedly less efficient, the extremely large sample sizes yield reliable estimates. Furthermore, because the data are to be used for purposes other than producing nationwide estimates (e.g., regression modeling), it is critical that all hospital types, including small hospitals, are adequately represented.
Overview of the Sampling Procedure
To further ensure accurate geographic representation, we implicitly stratified the hospitals by State and three-digit ZIP Code (the first three digits of the hospital’s five-digit ZIP Code). This was accomplished by sorting by three-digit ZIP Code within each stratum prior to drawing a systematic random sample of hospitals.
After stratifying the universe of hospitals, we sorted hospitals by stratum, the three-digit ZIP Code within each stratum, and by a random number within each three-digit ZIP Code. These sorts ensured further geographic generalizability of hospitals within the frame States, as well as random ordering of hospitals within three-digit ZIP Codes. Generally, three-digit ZIP Codes that are proximal in value are geographically near one another within a State. Furthermore, the U.S. Postal Service locates regional mail distribution centers at the three-digit level. Thus, the boundaries tend to be a compromise between geographic size and population size.
We then drew a systematic random sample of up to 20% of the total number of U.S. hospitals within each stratum. If too few frame hospitals appeared in a cell, we selected all frame hospitals for the NIS, subject to sampling restrictions specified by States. To simplify variance calculations, we drew at least two hospitals from each stratum. If fewer than two frame hospitals were available in a stratum, we merged it with an "adjacent" cell containing hospitals with similar characteristics.
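The sort-then-select step for a single stratum can be sketched as follows. The function name, the input layout, and the fractional-interval selection rule are assumptions made for illustration; the sketch does not implement State sampling restrictions or the merging of strata that have fewer than two frame hospitals.

```python
import random

def systematic_sample(hospitals, target_n, seed=0):
    """hospitals: list of (hospital_id, zip3) tuples for ONE stratum of the frame.
    target_n: number to draw, e.g. 20% of the universe hospitals in the stratum."""
    rng = random.Random(seed)
    if target_n >= len(hospitals):
        # Too few frame hospitals in the cell: select all of them.
        return [h for h, _ in hospitals]
    # Sort by three-digit ZIP Code, then by a random number within each ZIP Code.
    ordered = sorted(hospitals, key=lambda h: (h[1], rng.random()))
    step = len(ordered) / target_n       # sampling interval (>= 1)
    start = rng.random() * step          # random start within the first interval
    last = len(ordered) - 1
    return [ordered[min(int(start + i * step), last)][0] for i in range(target_n)]
```

Because the list is ordered geographically before every k-th hospital is taken, the selected hospitals are spread across three-digit ZIP Codes rather than bunched in a few areas.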
Prior to the 2005 NIS, we drew two non-overlapping 10% subsamples of discharges from the NIS file for each year. The subsamples were selected by drawing every tenth discharge, starting with two different starting points (randomly selected between 1 and 10). Having a different starting point for each of the two subsamples guaranteed that they would not overlap. Discharges were sampled so that 10% of each hospital’s discharges in each quarter were selected for each of the subsamples. The two samples could be combined to form a single, generalizable 20% subsample of discharges. Beginning with the 2005 NIS, 10% subsamples are no longer provided on the NIS CD-ROMs. However, users may still draw their own subsamples, if desired.
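Users who wish to draw their own subsamples could follow the same every-tenth-record logic. The sketch below is a hypothetical illustration that assumes the discharge list is already sorted by hospital and discharge quarter, and it uses zero-based starting offsets (0 to 9) in place of the starting points between 1 and 10 described above.

```python
import random

def two_tenth_subsamples(discharges, seed=0):
    """Return two non-overlapping 10% systematic subsamples."""
    rng = random.Random(seed)
    # Two different starting offsets guarantee that the subsamples cannot overlap.
    start_a, start_b = rng.sample(range(10), 2)
    return discharges[start_a::10], discharges[start_b::10]
```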
Change to Hospital Sampling Procedure Beginning with the 1998 NIS
Beginning with the 1998 NIS sampling procedures, all frame hospitals within a stratum have an equal probability of selection for the sample, regardless of whether they appeared in prior NIS samples. This deviates from the procedure used for earlier samples, which maximized the longitudinal component of the NIS series.
Further description of the sampling procedures for earlier releases of the NIS can be found in the special report: Design of the HCUP Nationwide Inpatient Sample, 1997. This report is available on the HCUP-US Website at http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp. For a description of the development of the new sample design for 1998 and subsequent data years, see the special report: Changes in NIS Sampling and Weighting Strategy for 1998. This report is available on the HCUP-US Website.
Zero-Weight Hospitals
Beginning with the 1993 NIS, the NIS samples no longer contain zero-weight hospitals. For a description of zero-weight hospitals in the 1988-1992 samples, refer to the special report: Design of the HCUP Nationwide Inpatient Sample, Release 1. This report is available on the HCUP-US Website at http://www.hcup-us.ahrq.gov/db/nation/nis/nisrelatedreports.jsp.
Final Hospital Sample
In Appendix I, we present three figures describing the final hospital sample. Figure 5 depicts the numbers of hospitals sampled each year, while Figure 6 presents the numbers of discharges in each year of the NIS. For the 1988-1992 NIS, zero-weight hospitals were maintained to provide a longitudinal sample. Therefore, two figures exist for each of these years: one number for the regular NIS sample and another number for the total sample.
Figure 7 displays the weighted number of discharges sampled each year. Note that this number decreased from 35,408,207 in 1997 to 34,874,001 in 1998, a difference of 534,206 (1.5%). This slight decline is associated with two changes to the 1998 NIS design: the exclusion of community, rehabilitation hospitals from the hospital universe, and a change to the calculation of hospital universe discharges for the weights. Prior to 1998, we calculated discharges as the sum of total facility admissions (AHA data element ADMTOT), which includes long-term care admissions, plus births (AHA data element BIRTHS) reported for each U.S. community hospital in the AHA Annual Survey Database (Health Forum, LLC © 2012).
Beginning in 1998, we calculate discharges as the sum of hospital admissions (AHA data element ADMH) plus births for each U.S. community, non-rehabilitation hospital. This number is more consistent with the number of discharges we receive from the State data sources. We also substitute total facility admissions, if the number of hospital admissions is missing. Without these changes, the weighted number of discharges for 1998 would have been 35,622,743. The exclusion of community, rehabilitation hospitals reduced the number of universe hospitals by 177 and the number of weighted discharges by 214,490. The change in the calculation of discharges reduced the weighted number of discharges by 534,252.
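The figures quoted above reconcile with one another, as the following arithmetic check shows. The variable names are illustrative; the numbers are those stated in the text.

```python
weighted_1997 = 35_408_207
weighted_1998 = 34_874_001
without_changes_1998 = 35_622_743
rehab_exclusion = 214_490        # reduction from excluding rehabilitation hospitals
calculation_change = 534_252     # reduction from the new discharge calculation

# Applying both 1998 design changes reproduces the published 1998 figure.
reconciled_1998 = without_changes_1998 - rehab_exclusion - calculation_change

# Year-over-year decline from 1997 to 1998, in discharges and percent.
decline = weighted_1997 - weighted_1998
decline_pct = 100 * decline / weighted_1997
```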
Figure 8 presents a summary of the 2006 NIS hospital sample by geographic region and the number of:
- Universe hospitals (Universe)
- Frame hospitals (Frame)
- Sampled hospitals (Sample)
- Target hospitals (Target = 20% of the universe)
- Surplus hospitals (Surplus = Sample – Target).
Figure 9 summarizes the estimated U.S. population by geographic region. For each region, the figure reveals:
- The estimated U.S. population
- The estimated population of States in the 2006 NIS
- The percentage of estimated U.S. population included in NIS States.
Figure 10 depicts the number of discharges in the 2006 sample for each State.
Special consideration was needed to handle the Massachusetts data in the 2006 NIS. Fourth quarter data from sampled hospitals in Massachusetts were unavailable for inclusion in the 2006 NIS. To account for the missing quarter of data, we sampled one fourth of the Massachusetts NIS discharges from the first three quarters and modified the records to represent the fourth quarter. To ensure a representative sample, we sorted the Massachusetts NIS discharges by hospital, discharge quarter, Clinical Classifications Software (CCS) diagnosis group for the principal diagnosis, gender, age, and a random number before selecting every fourth record. The following describes the adjustments made to the selected Massachusetts NIS records:
- We relabeled the discharge quarter (DQTR) to four and saved the original discharge quarter in a new data element (DQTR_X).
- We adjusted the admission month (AMONTH) by the number of months corresponding to the change in the discharge quarter.
- We adjusted the total charges (TOTCHG and TOTCHG_X) using quarter-specific adjustment factors calculated as the mean total charges in the fourth quarter for all Northeastern NIS States (excluding Massachusetts) divided by the mean total charges in the first, second, or third quarter for all Northeastern NIS States (excluding Massachusetts).
We then adjusted the discharge weights for the Massachusetts records to appropriately account for the shifting of quarter one through three discharges to quarter four.
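The record selection and charge adjustment can be sketched roughly as follows. The function name and the structure of `mean_charge_by_quarter` are hypothetical, selection is assumed to start with the first sorted record, and the sketch omits the AMONTH, TOTCHG_X, and discharge weight adjustments.

```python
def shift_to_fourth_quarter(records, mean_charge_by_quarter):
    """records: list of dicts with DQTR (1-3) and TOTCHG, already sorted as described.
    mean_charge_by_quarter: mean total charges for quarters 1-4 in the other
    Northeastern NIS States (hypothetical input)."""
    shifted = []
    for rec in records[::4]:                      # every fourth record
        # Quarter-specific factor: Q4 mean divided by the original quarter's mean.
        factor = mean_charge_by_quarter[4] / mean_charge_by_quarter[rec["DQTR"]]
        new = dict(rec)
        new["DQTR_X"] = rec["DQTR"]               # preserve the original quarter
        new["DQTR"] = 4                           # relabel as fourth quarter
        new["TOTCHG"] = rec["TOTCHG"] * factor
        shifted.append(new)
    return shifted
```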
SAMPLE WEIGHTS
To obtain nationwide estimates, we developed sample weights using the AHA universe as the standard. These were developed separately for hospital- and discharge-level analyses. Hospital-level weights were developed to extrapolate NIS sample hospitals to the hospital universe. Similarly, discharge-level weights were developed to extrapolate NIS sample discharges to the discharge universe.
Hospital weights to the universe were calculated by post-stratification. For each year, hospitals were stratified on the same variables that were used for sampling: geographic region, urban/rural location, teaching status, bed size, and control. The strata that were collapsed for sampling were also collapsed for sample weight calculations. Within each stratum s, each NIS sample hospital’s universe weight was calculated as:
$$W_s(\text{universe}) = \frac{N_s(\text{universe})}{N_s(\text{sample})}$$
where $W_s(\text{universe})$ was the hospital universe weight, and $N_s(\text{universe})$ and $N_s(\text{sample})$ were the number of community hospitals within stratum $s$ in the universe and sample, respectively. Thus, each hospital’s universe weight (HOSPWT) is equal to the number of universe hospitals it represents during that year. Because 20% of the hospitals in each stratum were sampled when possible, the hospital weights are usually near five.
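As an illustration of the hospital weight formula above, the following Python sketch counts hospitals per stratum in the universe and in the sample and divides the two. The function name `hospital_weights` and the stratum labels are hypothetical; real strata would be the (possibly collapsed) combinations of region, location, teaching status, bed size, and control.

```python
from collections import Counter


def hospital_weights(universe_strata, sample_strata):
    """Return {stratum: universe hospitals / sample hospitals}.

    Each argument is an iterable with one stratum label per hospital.
    """
    n_universe = Counter(universe_strata)
    n_sample = Counter(sample_strata)
    # Every sampled hospital in a stratum shares the same weight.
    return {s: n_universe[s] / n_sample[s] for s in n_sample}


# Hypothetical stratum with 10 universe hospitals, 2 of them sampled (20%).
weights = hospital_weights(["A"] * 10, ["A"] * 2)
```

With a 20% sample, each sampled hospital in this hypothetical stratum stands for five universe hospitals, which matches the "usually near five" remark in the text.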
The calculations for discharge-level sampling weights were similar to the calculations for hospital-level sampling weights. The discharge weights are usually constant for all discharges within a stratum. The only exceptions are for strata with sample hospitals that, according to the AHA files, were open for the entire year but contributed less than a full year of data to the NIS. For those hospitals, we adjusted the number of observed discharges by a factor of 4 ÷ Q, where Q was the number of calendar quarters for which the hospital contributed discharges to the NIS. For example, when a sample hospital contributed only two quarters of discharge data to the NIS, the adjusted number of discharges was double the observed number. This adjustment was performed only for weighting purposes. The NIS data set includes only the actual (unadjusted) number of observed discharges.
With that minor adjustment, each discharge weight is essentially equal to the number of AHA universe discharges that each sampled discharge represents in its stratum. This calculation was possible because the number of total discharges was available for every hospital in the universe from the AHA files. Each universe hospital’s AHA discharge total was calculated as the sum of newborns and hospital discharges.
Discharge weights to the universe were calculated by post-stratification. Hospitals were stratified just as they were for universe hospital weight calculations. Within stratum s, for hospital i, each NIS sample discharge’s universe weight was calculated as:
$$DW_{is}(\text{universe}) = \frac{DN_s(\text{universe})}{ADN_s(\text{sample})} \times \frac{4}{Q_i}$$
where $DW_{is}(\text{universe})$ was the discharge weight; $DN_s(\text{universe})$ represented the number of discharges from community hospitals in the universe within stratum $s$; $ADN_s(\text{sample})$ was the number of adjusted discharges from sample hospitals selected for the NIS; and $Q_i$ represented the number of quarters of discharge data contributed by hospital $i$ to the NIS (usually $Q_i = 4$). Thus, each discharge’s weight (DISCWT) is equal to the number of universe discharges it represents in stratum $s$ during that year. Because all discharges from 20% of the hospitals in each stratum were sampled when possible, the discharge weights are usually near five.
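The discharge weight calculation, including the 4 ÷ Q adjustment for hospitals that contributed fewer than four quarters, can be sketched for a single stratum as follows. The function name `discharge_weights` and the input layout are assumptions made for illustration, and the figures in the example are invented.

```python
def discharge_weights(universe_discharges, sample_hospitals):
    """Return {hospital_id: discharge weight} for one stratum.

    universe_discharges: AHA universe discharge total for the stratum.
    sample_hospitals: list of (hospital_id, observed_discharges, quarters)
        tuples, where quarters is the number of quarters contributed (1-4).
    """
    # Adjusted discharges: scale each hospital's observed count by 4 / Q.
    adjusted_total = sum(obs * 4 / q for _, obs, q in sample_hospitals)
    base = universe_discharges / adjusted_total
    # Partial-year hospitals carry a proportionally larger weight.
    return {hid: base * 4 / q for hid, _, q in sample_hospitals}


# Invented stratum: 10,000 universe discharges and two sample hospitals,
# one with a full year of data and one with only two quarters.
weights = discharge_weights(10000, [("h1", 1000, 4), ("h2", 500, 2)])
```

In this invented stratum the weights come out as 5 for the full-year hospital and 10 for the two-quarter hospital, so the weighted observed discharges (5 × 1,000 + 10 × 500) add back up to the 10,000 universe discharges.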
Appendix I: Tables and Figures
Table 1: 2006 Data Sources
|AR||Arkansas Department of Health & Human Services|
|AZ||Arizona Department of Health Services|
|CA||Office of Statewide Health Planning & Development|
|CO||Colorado Hospital Association|
|FL||Florida Agency for Health Care Administration|
|GA||Georgia Hospital Association|
|HI||Hawaii Health Information Corporation|
|IA||Iowa Hospital Association|
|IL||Illinois Department of Public Health|
|IN||Indiana Hospital Association|
|KS||Kansas Hospital Association|
|KY||Kentucky Cabinet for Health and Family Services|
|MA||Division of Health Care Finance and Policy|
|MD||Health Services Cost Review Commission|
|MI||Michigan Health & Hospital Association|
|MN||Minnesota Hospital Association|
|MO||Hospital Industry Data Institute|
|NC||North Carolina Department of Health and Human Services|
|NE||Nebraska Hospital Association|
|NH||New Hampshire Department of Health & Human Services|
|NJ||New Jersey Department of Health & Senior Services|
|NV||Nevada Department of Health and Human Services|
|NY||New York State Department of Health|
|OH||Ohio Hospital Association|
|OK||Oklahoma State Department of Health|
|OR||Oregon Association of Hospitals and Health Systems|
|RI||Rhode Island Department of Health|
|SC||South Carolina State Budget & Control Board|
|SD||South Dakota Association of Healthcare Organizations|
|TN||Tennessee Hospital Association|
|TX||Texas Department of State Health Services|
|UT||Utah Department of Health|
|VT||Vermont Association of Hospitals and Health Systems|
|VA||Virginia Health Information|
|WA||Washington State Department of Health|
|WI||Wisconsin Department of Health & Family Services|
|WV||West Virginia Health Care Authority|
Table 2: Number of NIS States, Hospitals, and Discharges, by Year
|Calendar Year||States in the Frame||Number of States||Sample Hospitals||Sample Discharges|
|1988||California, Colorado, Florida, Iowa, Illinois, Massachusetts, New Jersey, and Washington||8||758||5,265,756|
|1989||Added Arizona, Pennsylvania, and Wisconsin||11||875||6,110,064|
|1990||No new additions||11||861||6,268,515|
|1991||No new additions||11||847||6,156,188|
|1992||No new additions||11||838||6,195,744|
|1993||Added Connecticut, Kansas, Maryland, New York, Oregon, and South Carolina||17||913||6,538,976|
|1994||No new additions||17||904||6,385,011|
|1995||Added Missouri and Tennessee||19||938||6,714,935|
|1996||No new additions||19||906||6,542,069|
|1997||Added Georgia, Hawaii, and Utah||22||1,012||7,148,420|
|1998||No new additions||22||984||6,827,350|
|1999||Added Maine and Virginia||24||984||7,198,929|
|2000||Added Kentucky, North Carolina, Texas, and West Virginia||28||994||7,450,992|
|2001||Added Michigan, Minnesota, Nebraska, Rhode Island, and Vermont||33||986||7,452,727|
|2002||Added Nevada, Ohio, and South Dakota; Dropped Arizona||35||995||7,853,982|
|2003||Added Arizona, Indiana, and New Hampshire; Dropped Maine||37||994||7,977,728|
|2004||Added Arkansas; Dropped Pennsylvania||37||1,004||8,004,571|
|2005||Added Oklahoma; Dropped Virginia||37||1,054||7,995,048|
Table 3: Summary of NIS Releases
|Data from||Media/format options||Structure of Releases|
In ASCII format
|5 years of data in a 6-CD set,
Two 10% subsamples of discharges for each year
In ASCII format
|1 year of data in a 2-CD set, compressed files
Two 10% subsamples of discharges for each year
In ASCII format
|1 year of data in a 2-CD set,
Two 10% subsamples of discharges for
A companion file with four different sets
of severity measures
In ASCII format
|1 year of data in a 2-CD set,
A companion file with four different sets
of severity measures, and also
diagnosis and procedure groups
Table 4: Summary of NIS Data Sources, Hospitals, and Inpatient Stays, 1988-2006
|Year||Data sources||Number of hospitals||Number of discharges in the NIS, unweighted||Number of discharges in the NIS, weighted for national estimates|
|1988||CA CO FL IL IA MA NJ WA||759||5,265,756||35,171,448|
|1989||AZ CA CO FL IL IA MA NJ PA WA WI
(Added AZ, PA, WI)
|1990||AZ CA CO FL IL IA MA NJ PA WA WI
|1991||AZ CA CO FL IL IA MA NJ PA WA WI
|1992||AZ CA CO FL IL IA MA NJ PA WA WI
|1993||AZ CA CO CT FL IL IA KS MD MA NJ NY OR PA SC WA WI
(Added CT, KS, MD, NY, OR, SC)
|1994||AZ CA CO CT FL IL IA KS MD MA NJ NY OR PA SC WA WI
|1995||AZ CA CO CT FL IL IA KS MD MA MO NJ NY OR PA SC TN WA WI
(Added MO, TN)
|1996||AZ CA CO CT FL IL IA KS MD MA MO NJ NY OR PA SC TN WA WI
|1997||AZ CA CO CT FL GA HI IL IA KS MD MA MO NJ NY OR PA SC TN UT WA WI
(Added GA, HI, UT)
|1998||AZ CA CO CT FL GA HI IL IA KS MD MA MO NJ NY OR PA SC TN UT WA WI
|1999||AZ CA CO CT FL GA HI IL IA KS MD MA ME MO NJ NY OR PA SC TN UT VA WA WI
(Added ME, VA)
|2000||AZ CA CO CT FL GA HI IL IA KS KY MD MA ME MO NC NJ NY OR PA SC TN TX UT VA WA WI WV
(Added KY, NC, TX, WV)
|2001||AZ CA CO CT FL GA HI IL IA KS KY MD MA ME MI MN MO NC NE NJ NY OR PA RI SC TN TX UT VA VT WA WI WV
(Added MI, MN, NE, RI, VT)
|2002||CA CO CT FL GA HI IL IA KS KY MD MA ME MI MN MO NC NE NJ NY NV OH OR PA RI SC SD TN TX UT VA VT WA WI WV
(Added NV, OH, SD; AZ data were not available)
|2003||AZ CA CO CT FL GA HI IL IN IA KS KY MD MA MI MN MO NC NE NH NJ NY NV OH OR PA RI SC SD TN TX UT VA VT WA WI WV
(Added AZ, IN, NH; ME data were not available)
|2004||AR AZ CA CO CT FL GA HI IL IN IA KS KY MD MA MI MN MO NC NE NH NJ NY NV OH OR RI SC SD TN TX UT VA
VT WA WI WV
(Added AR; PA data were not available)
|2005||AR AZ CA CO CT FL GA HI IL IN IA KS KY MD MA MI MN MO NC NE NH NJ NY NV OH OK OR RI SC SD TN
TX UT VT WA WI WV
(Added OK; VA data were not available)
|2006||AR AZ CA CO CT FL GA HI IL IN IA KS KY MD MA MI MN MO NC NE NH NJ NY NV OH OK OR RI SC SD TN
TX UT VA VT WA WI WV
Table 5: NIS Related Reports and Database Documentation Available on HCUP-US
Restrictions on the Use of the NIS
Description of the NIS Files
Availability of Data Elements
Description of Data Elements in the NIS
Corrections to the NIS
Programs to load the ASCII data files into statistical software:
HCUP Tools: Labels and Formats
NIS Related Reports
Links to HCUP-US page with various NIS related reports such as the following:
HCUP Supplemental Files
Figure 1: Hospital Universe, by Year (see footnote 4)
Figure 2: NIS States, by Region
|1: Northeast||Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont.|
|2: Midwest||Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, Wisconsin.|
|3: South||Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia.|
|4: West||Alaska, Arizona, California, Colorado, Hawaii, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, Wyoming.|
|Location and Teaching Status||Hospital Bed Size|
|||Small||Medium||Large|
|Northeast Region|
|Rural||1 - 49||50 - 99||100+|
|Urban, non-teaching||1 - 124||125 - 199||200+|
|Urban, teaching||1 - 249||250 - 424||425+|
|Midwest Region|
|Rural||1 - 29||30 - 49||50+|
|Urban, non-teaching||1 - 74||75 - 174||175+|
|Urban, teaching||1 - 249||250 - 374||375+|
|South Region|
|Rural||1 - 39||40 - 74||75+|
|Urban, non-teaching||1 - 99||100 - 199||200+|
|Urban, teaching||1 - 249||250 - 449||450+|
|West Region|
|Rural||1 - 24||25 - 44||45+|
|Urban, non-teaching||1 - 99||100 - 174||175+|
|Urban, teaching||1 - 199||200 - 324||325+|
Figure 3: NIS Hospital Sampling Frame, by Year
Figure 4: Number of Hospitals in the 2006 Universe, Frame, and Sample for Frame States - Part A: Arkansas – North Carolina
Figure 4: Number of Hospitals in the 2006 Universe, Frame, and Sample for Frame States - Part B: Nebraska – West Virginia
|State||Number of Hospitals and Discharges in 2006 AHA Universe, Frame, and NIS, by State|
Figure 5: Number of Hospitals Sampled, by Year
Figure 6: Number of NIS Discharges, Unweighted, by Year
Figure 7: Number of NIS Discharges, Weighted, by Year
Figure 8: Number of Hospitals in the 2006 Universe, Frame, Sample, Target, and Surplus, by Region
Figure 9: Percentage of U.S. Population in 2006 NIS States, by Region
Calculated using the estimated U.S. population on July 1, 2006 (see footnote 5).
Figure 10: Number of Discharges in the 2006 NIS, by State.
Appendix II: State-Specific Restrictions
The table below enumerates the types of restrictions applied to the Nationwide Inpatient Sample. Restrictions include the following types:
- Confidentiality of hospitals
- Confidentiality of records
- Confidentiality of physicians
- Missing discharges.
For each restriction type the data sources are listed alphabetically by State. Only data sources that have restrictions are included. Data sources that do not have restrictions are not included.
|Confidentiality of Hospitals - Restricted Identification of Hospitals|
|The following data sources required that hospitals not be identified in the NIS:
|Confidentiality of Hospitals - Limitation on Sampling|
Limitations on sampling were needed for the following data sources:
|Confidentiality of Hospitals - Restricted Release of Stratifiers|
|Stratifier data elements were restricted for the following data sources to further ensure hospital confidentiality in the NIS:
|Confidentiality of Records - Restricted Release of Age in Years, or Age in Days|
|The following data sources restrict or limit the release of age:
|Confidentiality of Records – Other Restrictions|
|The following data sources restrict or limit the release of data elements for patient confidentiality:
|Confidentiality of Physicians|
|The following data sources restrict the release of physician identifiers:
|In these states the following data elements are set to missing for all records:
|The following data sources may be missing discharge records for specific populations of patients:
|Type of Data Element||HCUP Variable Name||Years Available||Coding Notes||Unavailable in 2006 for:|
|Admission day of week or weekend||AWEEKEND||1998-2006||Admission on weekend: (0) admission on Monday-Friday, (1) admission on Saturday-Sunday|
|ADAYWK||1988-1997||Admission day of week: (1) Sunday, (2) Monday, (3) Tuesday, (4) Wednesday, etc.|
|Admission month||AMONTH||1988-2006||Admission month coded from (1) January to (12) December||FL|
|Admission source||ASOURCE||1988-2006||Admission source, uniform coding: (1) ER, (2) another hospital, (3) another facility including long-term care, (4) court/law enforcement, (5) routine/birth/other|
|ASOURCE_X||1998-2006||Admission source, as received from data source using State-specific coding|
|ASOURCEUB92||2003-2006||Admission source (UB-92 standard coding). For newborn admissions (ATYPE = 4): (1) normal delivery, (2) premature delivery, (3) sick baby, (4) extramural birth; For non-newborn admissions (ATYPE NE 4): (1) physician referral, (2) clinic referral, (3) HMO referral, (4) transfer from a hospital, (5) transfer from a skilled nursing facility, (6) transfer from another healthcare facility, (7) emergency room, (8) court/law enforcement, (A) transfer from a critical access hospital||CA, MD, RI|
|Admission type||ATYPE||1988-2006||Admission type, uniform coding: (1) emergency, (2) urgent, (3) elective, (4) newborn, (5) trauma center beginning in 2003 data, (6) other||CA|
|ELECTIVE||2002-2006||Indicates elective admission: (1) elective, (0) non-elective admission|
|Age at admission||AGE||1988-2006||Age in years coded 0-124 years|
|AGEDAY||1988-2006||Age in days coded 0-365 only when the age in years is less than 1||FL, MA, NH, SC, TX|
|Clinical Classifications Software (CCS) category||DXCCS1 - DXCCS15||1998-2006||CCS category for all diagnoses for NIS beginning in 1998|
|DCCHPR1||1988-1997||CCS category for principal diagnosis for NIS prior to 1998. CCS was formerly called the Clinical Classifications for Health Policy Research (CCHPR)|
|PRCCS1 - PRCCS15||1998-2006||CCS category for all procedures for NIS beginning in 1998|
|PCCHPR1||1988-1997||CCS category for principal procedure for NIS prior to 1998. CCS was formerly called the Clinical Classifications for Health Policy Research (CCHPR)|
|Data source information||DSNUM||1988-1997||Data source number|
|DSTYPE||1988-1997||Data source type: (1) State data organization, (2) Hospital association, (3) Consortia|
|Diagnosis information||DX1 - DX15||1988-2006||Diagnoses, principal and secondary (ICD-9-CM). Beginning in 2003, the diagnosis array does not include any external cause of injury codes. These codes are stored in a separate array, ECODEn.|
|NDX||1988-2006||Number of diagnoses coded on the original record|
|DSNDX||1988-1997||Number of diagnosis fields provided by the data source|
|DXSYS||1988-1997||Diagnosis system (ICD-9-CM)|
|DXV1 - DXV15||1988-1997||Diagnosis validity flags|
|Diagnosis Related Group (DRG)||DRG||1988-2006||DRG in use on discharge date|
|DRGVER||1988-2006||Grouper version in use on discharge date|
|DRG10||1988-1999||DRG Version 10 (effective October 1992 - September 1993)|
|DRG18||1998-2005||DRG Version 18 (effective October 2000 - September 2001)|
|DRG24||2006||DRG Version 24 (effective October 2006 - September 2007)|
|Discharge quarter||DQTR||1988-2006||Coded: (1) Jan - Mar, (2) Apr - Jun, (3) Jul - Sep, (4) Oct - Dec|
|DQTR_X||2006||Discharge quarter, as received from data source|
(Weights for 1988-1993 are on Hospital Weights file)
|DISCWT||1998-2006||Discharge weight on Core file and Hospital Weights file for NIS beginning in 1998. In all data years except 2000, this weight is used to create national estimates for all analyses. In 2000 only, this weight is used to create national estimates for all analyses excluding those that involve total charges.|
|DISCWT_U||1993-1997||Discharge weight on Core file and Hospital Weights file for NIS prior to 1998|
|DISCWTcharge||2000||Discharge weight for national estimates of total charges. In 2000 only, this weight is used to create national estimates for analyses that involve total charges.|
|DISCWT10||1998-2004||Discharge weight on 10% subsample Core file for NIS 1998 to 2004. In all data years except 2000, this weight is used to create national estimates for all analyses. In 2000 only, this weight is used to create national estimates for all analyses, excluding those that involve total charges.|
|D10CWT_U||1993-1997||Discharge weight on 10% subsample Core file for NIS prior to 1998|
|DISCWTcharge10||2000||Discharge weight for national estimates of total charges on 10% subsample file. In 2000 only, this weight is used to create national estimates for analyses that involve total charges.|
|Disposition of patient (discharge status)||DISP||1988-1997||Disposition of patient, uniform coding used prior to 1998: (1) routine, (2) short-term hospital, (3) skilled nursing facility, (4) intermediate care facility, (5) another type of facility, (6) home healthcare, (7) against medical advice, (20) died|
|DIED||1988-2006||Indicates in-hospital death: (0) did not die during hospitalization, (1) died during hospitalization|
|DISPUB92||1998-2006||Disposition of patient, UB-92 coding: (1) routine, (2) short-term hospital, (3) skilled nursing facility, (4) intermediate care, (5) another type of facility, (6) home healthcare, (7) against medical advice, (8) home IV provider, (20) died in hospital, (40) died at home, (41) died in a medical facility, (42) died, place unknown, (43) alive, Federal health facility, (50) Hospice, home, (51) Hospice, medical facility, (61) hospital-based Medicare approved swing bed, (62) another rehabilitation facility, (63) long-term care hospital, (64) certified nursing facility, (65) psychiatric hospital, (66) critical access hospital, (71) another institution for outpatient services, (72) this institution for outpatient services, (99) discharged alive, destination unknown||CA, IN, MD|
|DISPUNIFORM||1998-2006||Disposition of patient, uniform coding used beginning in 1998: (1) routine, (2) transfer to short-term hospital, (5) other transfers, including skilled nursing facility, intermediate care, and another type of facility, (6) home healthcare, (7) against medical advice, (20) died in hospital, (99) discharged alive, destination unknown|
|External causes of injury and poisoning||ECODE1 - ECODE4||2003-2006||External cause of injury and poisoning code, primary and secondary (ICD-9-CM). Beginning in 2003, external cause of injury codes are stored in a separate array ECODEn from the diagnosis codes in the array DXn. Prior to 2003, these codes are contained in the diagnosis array (DXn).|
|E_CCS1 - E_CCS4||2003-2006||CCS category for the external cause of injury and poisoning codes|
|NECODE||2003-2006||Number of external cause of injury codes on the original record. A maximum of 4 codes are retained on the NIS.|
|Gender of patient||FEMALE||1998-2006||Indicates gender for NIS beginning in 1998: (0) male, (1) female|
|SEX||1988-1997||Indicates gender for NIS prior to 1998: (1) male, (2) female|
|Hospital information||DSHOSPID||1988-2006||Hospital number as received from the data source||GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HOSPID||1988-2006||HCUP hospital number (links to Hospital Weights file)|
|HOSPST||1988-2006||State postal code for the hospital (e.g., AZ for Arizona)|
|HOSPSTCO||1988-2002||Modified Federal Information Processing Standards (FIPS) State/county code for the hospital. Links to the Area Resource File (available from the Bureau of Health Professions, Health Resources and Services Administration). Beginning in 2003, this data element is available only on the Hospital Weights file.|
|NIS_STRATUM||2000-2006||Stratum used to sample hospitals, based on geographic region, control, location/teaching status, and bed size. Stratum information is also contained in the Hospital Weights file.|
|Indicates in-hospital birth||HOSPBRTH||2006||Indicator that discharge record includes diagnosis of birth that occurred in the hospital|
|Length of Stay||LOS||1988-2006||Length of stay, edited|
|LOS_X||1988-2006||Length of stay, as received from data source|
|Location of the patient||PL_UR_CAT4||2003-2006||Urban–rural designation for patient’s county of residence: (1) large metropolitan, (2) small metropolitan, (3) micropolitan, (4) non-core|
|Major Diagnosis Category (MDC)||MDC||1988-2006||MDC in use on discharge date|
|MDC10||1988-1999||MDC Version 10 (effective October 1992 - September 1993)|
|MDC18||1998-2005||MDC Version 18 (effective October 2000 - September 2001)|
|MDC24||2006||MDC Version 24 (effective October 2006 - September 2007)|
|Median household income for patient’s ZIP Code||ZIPINC_QRTL||2003-2006||Median household income quartiles for patient’s ZIP Code. For 2005, the median income quartiles are defined as: (1) $1 - $35,999; (2) $36,000 - $44,999; (3) $45,000 - $58,999; and (4) $59,000 or more.|
|ZIPINC||1998-2002||Median household income category in files beginning in 1998: (1) $1-$24,999, (2) $25,000-$34,999, (3) $35,000-$44,999, (4) $45,000 and above|
|ZIPINC4||1988-1997||Median household income category in files prior to 1998: (1) $1-$25,000, (2) $25,001-$30,000, (3) $30,001-$35,000, (4) $35,001 and above|
|ZIPINC8||1988-1997||Median household income category in files prior to 1998: (1) $1-$15,000, (2) $15,001-$20,000, (3) $20,001-$25,000, (4) $25,001-$30,000, (5) $30,001-$35,000, (6) $35,001-$40,000, (7) $40,001-$45,000, (8) $45,001 or more|
|Neonatal/ maternal flag||NEOMAT||1988-2006||Assigned from diagnoses and procedure codes: (0) not maternal or neonatal, (1) maternal diagnosis or procedure, (2) neonatal diagnosis, (3) maternal and neonatal on same record|
|Payer information||PAY1||1988-2006||Expected primary payer, uniform: (1) Medicare, (2) Medicaid, (3) private including HMO, (4) self-pay, (5) no charge, (6) other|
|PAY1_N||1988-1997||Expected primary payer, nonuniform: (1) Medicare, (2) Medicaid, (3) Blue Cross, Blue Cross PPO, (4) commercial, PPO, (5) HMO, PHP, etc., (6) self-pay, (7) no charge, (8) Title V, (9) Worker’s Compensation, (10) CHAMPUS, CHAMPVA, (11) other government, (12) other|
|PAY1_X||1998-2006||Expected primary payer, as received from the data source|
|PAY2||1988-2006||Expected secondary payer, uniform: (1) Medicare, (2) Medicaid, (3) private including HMO, (4) self-pay, (5) no charge, (6) other||AZ, CA, CO, FL, HI, IA, NH, OH, OK, RI, SD, VA|
|PAY2_N||1988-1997||Expected secondary payer, nonuniform: (1) Medicare, (2) Medicaid, (3) Blue Cross, Blue Cross PPO, (4) commercial, PPO, (5) HMO, PHP, etc., (6) self-pay, (7) no charge, (8) Title V, (9) Worker’s Compensation, (10) CHAMPUS, CHAMPVA, (11) other government, (12) other|
|PAY2_X||1998-2006||Expected secondary payer, as received from the data source||AZ, CA, CO, FL, HI, IA, NH, OH, OK, RI, SD, VA|
|MDID_S||1988-2000||Synthetic attending physician number in files prior to 2001|
|MDNUM1_R||2003-2006||Re-identified attending physician number in files starting in 2003||CA, CT, GA, HI, IL, IN, MA, NC, OH, OK, UT, VT, WI, WV|
|MDNUM1_S||2001-2002||Synthetic attending physician number in files beginning in 2001 and discontinued in 2003|
|SURGID_S||1988-2000||Synthetic secondary physician number in files prior to 2001|
|MDNUM2_R||2003-2006||Re-identified secondary physician number in files starting in 2003||CA, CT, GA, HI, IL, IN, MA, NC, OH, OK, UT, VT, WI, WV|
|MDNUM2_S||2001-2002||Synthetic secondary physician number in files beginning in 2001 and discontinued in 2003|
|Procedure information||PR1 - PR15||1988-2006||Procedures, principal and secondary (ICD-9-CM)|
|NPR||1988-2006||Number of procedures coded on the original record|
|DSNPR||1988-1997||Number of procedure fields in this data source|
|PRSYS||1988-1997||Procedure system (ICD-9-CM)|
|PRV1 - PRV15||1988-1997||Procedure validity flags|
|PRDAY1||1988-2006||Number of days from admission to principal procedure||IL, OH, OK, UT, WA, WV|
|PRDAY2 - PRDAY15||1998-2006||Number of days from admission to secondary procedures||AZ, CO, IL, IN, MI, NV, OH, OK, UT, VA, WA, WI, WV|
|Race of Patient||RACE||1988-2006||Race, uniform coding: (1) white, (2) black, (3) Hispanic, (4) Asian or Pacific Islander, (5) Native American, (6) other||GA, IL, KY, MN, NV, OH, OR, WA, WV|
|Record identifier, synthetic||KEY||1998-2006||Unique record number for file beginning in 1998|
|SEQ||1988-1997||Unique record number for NIS prior to 1998|
|SEQ_SID||1988-1997||Unique record number for NIS prior to 1998|
|PROCESS||1988-1997||Processing number for NIS prior to 1998|
|Total Charges||TOTCHG||1988-2006||Total charges, edited|
|TOTCHG_X||1988-2006||Total charges, as received from data source|
|Type of Data Element||HCUP Variable Name||Years Available||Coding Notes||Unavailable in 2006 for:|
|Discharge counts||N_DISC_U||1988-2006||Number of AHA universe discharges in the stratum|
|S_DISC_U||1988-2006||Number of sampled discharges in the sampling stratum (NIS_STRATUM or STRATUM)|
|S_DISC_S||1988-1997||Number of sampled discharges in the stratum STRAT_ST|
|N_DISC_F||1988-1997||Number of frame discharges in the stratum|
|N_DISC_S||1988-1997||Number of State’s discharges in the stratum|
|TOTAL_DISC||1998-2006||Total number of discharges from this hospital in the NIS|
|TOTDSCHG||1988-1997||Total number of discharges from this hospital in the NIS|
|Discharge weights||DISCWT||1998-2006||Discharge weight used in the NIS beginning in 1998. In all data years except 2000, this weight is used to create national estimates for all analyses. In 2000 only, this weight is used to create national estimates for all analyses, excluding those that involve total charges.|
|DISCWT_U||1988-1997||Discharge weights used in the NIS prior to 1998.|
|DISCWT_F||1988-1997||Discharge weights to the sample frame are available only in 1988-1997|
|DISCWT_S||1988-1997||Discharge weights to the State are available only in 1988-1997|
|DISCWTcharge||2000||Discharge weight for national estimates of total charges for 2000 only.|
|Discharge Year||YEAR||1998-2006||Discharge year|
|Hospital counts||N_HOSP_F||1988-1997||Number of frame hospitals in the stratum|
|N_HOSP_S||1988-1997||Number of State’s hospitals in the stratum|
|N_HOSP_U||1988-2006||Number of AHA universe hospitals in the stratum|
|S_HOSP_S||1988-1997||Number of sampled hospitals in STRAT_ST|
|S_HOSP_U||1988-2006||Number of sampled hospitals in the stratum (NIS_STRATUM or STRATUM)|
|Hospital identifiers||HOSPID||1988-2006||HCUP hospital number (links to Inpatient Core files)|
|AHAID||1988-2006||AHA hospital identifier that matches AHA Annual Survey Database (not available for all States)||GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|IDNUMBER||1988-2006||AHA hospital identifier without the leading 6 (not available for all States)||GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HOSPNAME||1993-2006||Hospital name from AHA Annual Survey Database (not available for all States)||AR, GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|Hospital location||HOSPADDR||1993-2006||Hospital address from AHA Annual Survey Database (not available for all States)||AR, GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HOSPCITY||1993-2006||Hospital city from AHA Annual Survey Database (not available for all States)||AR, GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HOSPST||1988-2006||State postal code for the hospital (e.g., AZ for Arizona)|
|HOSPSTCO||2002-2006||Modified Federal Information Processing Standards (FIPS) State/county code||GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HFIPSSTCO||2005-2006||Unmodified Federal Information Processing Standards (FIPS) State/county code for the hospital. Links to the Area Resource File (available from the Bureau of Health Professions, Health Resources and Services Administration)||GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|HOSPZIP||1993-2006||Hospital ZIP Code from AHA Annual Survey Database (not available for all States)||AR, GA, HI, IN, KS, MI, NE, OH, OK, SC, SD, TN, TX|
|Hospital characteristics||HOSP_BEDSIZE||1998-2006||Bed size of hospital: (1) small, (2) medium, (3) large|
|H_BEDSZ||1993-1997||Bed size of hospital: (1) small, (2) medium, (3) large|
|ST_BEDSZ||1988-1992||Bed size of hospital: (1) small, (2) medium, (3) large|
|HOSP_CONTROL||1998-2006||Control/ownership of hospital: (0) government or private, collapsed category, (1) government, nonfederal, public, (2) private, non-profit, voluntary, (3) private, investor-owned, (4) private, collapsed category|
|H_CONTRL||1993-1997||Control/ownership of hospital: (1) government, nonfederal, (2) private, non-profit, (3) private, investor-owned|
|ST_OWNER||1988-1992||Control/ownership of hospital: (1) public, (2) private, non-profit, (3) private, for-profit|
|HOSP_LOCATION||1998-2006||Location: (0) rural, (1) urban|
|H_LOC||1993-1997||Location: (0) rural, (1) urban|
|HOSP_LOCTEACH||1998-2006||Location/teaching status of hospital: (1) rural, (2) urban non-teaching, (3) urban teaching|
|H_LOCTCH||1993-1997||Location/teaching status of hospital: (1) rural, (2) urban non-teaching, (3) urban teaching|
|LOCTEACH||1988-1992||Location/teaching status of hospital: (1) rural, (2) urban non-teaching, (3) urban teaching|
|HOSP_REGION||1998-2006||Region of hospital: (1) Northeast, (2) Midwest, (3) South, (4) West|
|H_REGION||1993-1997||Region of hospital: (1) Northeast, (2) Midwest, (3) South, (4) West|
|ST_REG||1988-1992||Region of hospital: (1) Northeast, (2) Midwest, (3) South, (4) West|
|HOSP_TEACH||1998-2006||Teaching status of hospital: (0) non-teaching, (1) teaching|
|H_TCH||1993-1997||Teaching status of hospital: (0) non-teaching, (1) teaching|
|NIS_STRATUM||1998-2006||Stratum used to sample hospitals beginning in 1998; includes geographic region, control, location/teaching status, and bed size|
|STRATUM||1988-1997||Stratum used to sample hospitals prior to 1998; includes geographic region, control, location/teaching status, and bed size|
|STRAT_ST||1988-1997||Stratum for State-specific weights|
|Hospital weights||HOSPWT||1998-2006||Weight to hospitals in AHA universe (i.e., total U.S.) beginning in 1998|
|HOSPWT_U||1988-1997||Weight to hospitals in AHA universe (i.e., total U.S.) prior to 1998|
|HOSPWT_F||1988-1997||Weight to hospitals in the sample frame|
|HOSPWT_S||1988-1997||Weight to hospitals in the State|
|Type of Data Element||HCUP Variable Name||Years Available||Coding Notes|
|AHRQ Comorbidity Software (AHRQ)||CM_AIDS||2002-2006||AHRQ comorbidity measure: Acquired immune deficiency syndrome|
|CM_ALCOHOL||2002-2006||AHRQ comorbidity measure: Alcohol abuse|
|CM_ANEMDEF||2002-2006||AHRQ comorbidity measure: Deficiency anemias|
|CM_ARTH||2002-2006||AHRQ comorbidity measure: Rheumatoid arthritis/collagen vascular diseases|
|CM_BLDLOSS||2002-2006||AHRQ comorbidity measure: Chronic blood loss anemia|
|CM_CHF||2002-2006||AHRQ comorbidity measure: Congestive heart failure|
|CM_CHRNLUNG||2002-2006||AHRQ comorbidity measure: Chronic pulmonary disease|
|CM_COAG||2002-2006||AHRQ comorbidity measure: Coagulopathy|
|CM_DEPRESS||2002-2006||AHRQ comorbidity measure: Depression|
|CM_DM||2002-2006||AHRQ comorbidity measure: Diabetes, uncomplicated|
|CM_DMCX||2002-2006||AHRQ comorbidity measure: Diabetes with chronic complications|
|CM_DRUG||2002-2006||AHRQ comorbidity measure: Drug abuse|
|CM_HTN_C||2002-2006||AHRQ comorbidity measure: Hypertension, uncomplicated and complicated|
|CM_HYPOTHY||2002-2006||AHRQ comorbidity measure: Hypothyroidism|
|CM_LIVER||2002-2006||AHRQ comorbidity measure: Liver disease|
|CM_LYMPH||2002-2006||AHRQ comorbidity measure: Lymphoma|
|CM_LYTES||2002-2006||AHRQ comorbidity measure: Fluid and electrolyte disorders|
|CM_METS||2002-2006||AHRQ comorbidity measure: Metastatic cancer|
|CM_NEURO||2002-2006||AHRQ comorbidity measure: Other neurological disorders|
|CM_OBESE||2002-2006||AHRQ comorbidity measure: Obesity|
|CM_PARA||2002-2006||AHRQ comorbidity measure: Paralysis|
|CM_PERIVASC||2002-2006||AHRQ comorbidity measure: Peripheral vascular disorders|
|CM_PSYCH||2002-2006||AHRQ comorbidity measure: Psychoses|
|CM_PULMCIRC||2002-2006||AHRQ comorbidity measure: Pulmonary circulation disorders|
|CM_RENLFAIL||2002-2006||AHRQ comorbidity measure: Renal failure|
|CM_TUMOR||2002-2006||AHRQ comorbidity measure: Solid tumor without metastasis|
|CM_ULCER||2002-2006||AHRQ comorbidity measure: Peptic ulcer disease excluding bleeding|
|CM_VALVE||2002-2006||AHRQ comorbidity measure: Valvular disease|
|CM_WGHTLOSS||2002-2006||AHRQ comorbidity measure: Weight loss|
|All Patient Refined DRG (3M)||APRDRG||2002-2006||All Patient Refined DRG|
|APRDRG_Risk_Mortality||2002-2006||All Patient Refined DRG: Risk of Mortality Subclass|
|APRDRG_Severity||2002-2006||All Patient Refined DRG: Severity of Illness Subclass|
|All-Payer Severity-adjusted DRG (HSS, Inc.)||APSDRG||2002-2006||All-Payer Severity-adjusted DRG|
|APSDRG_Mortality_Weight||2002-2006||All-Payer Severity-adjusted DRG: Mortality Weight|
|APSDRG_LOS_Weight||2002-2006||All-Payer Severity-adjusted DRG: Length of Stay Weight|
|APSDRG_Charge_Weight||2002-2006||All-Payer Severity-adjusted DRG: Charge Weight|
|Disease Staging (Medstat)||DS_DX_Category1||2002-2006||Disease Staging: Principal Disease Category|
|DS_Stage1||2002-2006||Disease Staging: Stage of Principal Disease Category|
|DS_LOS_Level||2002-2006||Disease Staging: Length of Stay Level|
|DS_LOS_Scale||2002-2006||Disease Staging: Length of Stay Scale|
|DS_Mrt_Level||2002-2006||Disease Staging: Mortality Level|
|DS_Mrt_Scale||2002-2006||Disease Staging: Mortality Scale|
|DS_RD_Level||2002-2006||Disease Staging: Resource Demand Level|
|DS_RD_Scale||2002-2006||Disease Staging: Resource Demand Scale|
|Linkage Variables||HOSPID||2002-2006||HCUP hospital identification number|
|KEY||2002-2006||HCUP record identifier|
|Type of Data Element||HCUP Variable Name||Years Available||Coding Notes|
|Clinical Classifications Software category for Mental Health and Substance Abuse (CCS-MHSA)||CCSMGN1 – CCSMGN15||2005-2006||CCS-MHSA general category for all diagnoses|
|CCSMSP1 – CCSMSP15||2005-2006||CCS-MHSA specific category for all diagnoses|
|ECCSMGN1 – ECCSMGN4||2005-2006||CCS-MHSA general category for all external cause of injury codes|
|Chronic Condition Indicator||CHRON1 – CHRON15||2005-2006||Chronic condition indicator for all diagnoses: (0) non-chronic condition, (1) chronic condition|
|CHRONB1 – CHRONB15||2005-2006||Chronic condition indicator body system for all diagnoses: (1) Infectious and parasitic disease, (2) Neoplasms, (3) Endocrine, nutritional, and metabolic diseases and immunity disorders, (4) Diseases of blood and blood-forming organs, (5) Mental disorders, (6) Diseases of the nervous system and sense organs, (7) Diseases of the circulatory system, (8) Diseases of the respiratory system, (9) Diseases of the digestive system, (10) Diseases of the genitourinary system, (11) Complications of pregnancy, childbirth, and the puerperium, (12) Diseases of the skin and subcutaneous tissue, (13) Diseases of the musculoskeletal system, (14) Congenital anomalies, (15) Certain conditions originating in the perinatal period, (16) Symptoms, signs, and ill-defined conditions, (17) Injury and poisoning, (18) Factors influencing health status and contact with health services|
|Procedure Class||PCLASS1 – PCLASS15||2005-2006||Procedure Class for all procedures: (1) Minor Diagnostic, (2) Minor Therapeutic, (3) Major Diagnostic, (4) Major Therapeutic|
|Linkage Variables||HOSPID||2002-2006||HCUP hospital identification number|
|KEY||2002-2006||HCUP record identifier|
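The linkage variables listed above can be used to attach records from the supplemental files to the matching discharge records. The following is a minimal Python sketch, assuming each file has already been read into a list of dictionaries keyed by HCUP variable name. The helper names `link_by_key` and `count_chronic` and all sample values are hypothetical and are not part of the NIS package.

```python
def count_chronic(record):
    """Count diagnoses flagged as chronic (CHRONn == 1) on one record.

    CHRON1 through CHRON15 are checked; absent or missing values are
    treated as non-chronic, which is an assumption of this sketch.
    """
    return sum(1 for i in range(1, 16) if record.get(f"CHRON{i}") == 1)


def link_by_key(core_records, groups_records):
    """Attach each groups-file record to the core record with the same KEY."""
    groups_by_key = {rec["KEY"]: rec for rec in groups_records}
    # Core-file values take precedence when a name appears in both files.
    return [{**groups_by_key.get(core["KEY"], {}), **core}
            for core in core_records]


# Made-up records for illustration only; these are not real NIS data.
core = [
    {"KEY": 1001, "HOSPID": 501},
    {"KEY": 1002, "HOSPID": 502},
]
groups = [
    {"KEY": 1001, "HOSPID": 501, "CHRON1": 1, "CHRON2": 0, "CHRON3": 1},
    {"KEY": 1002, "HOSPID": 502, "CHRON1": 0},
]

for rec in link_by_key(core, groups):
    print(rec["KEY"], rec["HOSPID"], count_chronic(rec))
```

KEY identifies a single discharge record, so it is the natural join key for discharge-level files; HOSPID identifies the hospital and would be the join key for hospital-level information.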
ENDNOTES
1. Refer to Chapter 10 in Foreman, EK, Survey Sampling Principles. New York: Dekker, 1991.
2. Carlson BL, Johnson AE, Cohen SB. "An Evaluation of the Use of Personal Computers for Variance Estimation with Complex Survey Data." Journal of Official Statistics, vol. 9, no. 4, 1993: 795-814.
3. We used the following American Hospital Association Annual Survey Database (Health Forum, LLC © 2012) data elements to assign the NIS Teaching Hospital Indicator:
AHA Data Element Name = Description [HCUP Data Element Name].
BDH = Number of short-term hospital beds [B001H].
BDTOT = Number of total facility beds [B001].
FTRES = Number of full-time employees: interns & residents (medical & dental) [E125].
PTRES = Number of part-time employees: interns & residents (medical & dental) [E225].
MAPP8 = Council of Teaching Hospitals (COTH) indicator [A101].
MAPP3 = Residency training approval by the Accreditation Council for Graduate Medical Education (ACGME) [A102].
Prior to the 1998 NIS, we used the following SAS code to assign the NIS teaching hospital status indicator, H_TCH:
/* FIRST ESTABLISH SHORT-TERM BEDS DEFINITION */
IF BDH NE . THEN BEDTEMP = BDH ; /* SHORT TERM BEDS */
ELSE IF BDH =. THEN BEDTEMP=BDTOT ; /* TOTAL BEDS PROXY */
/* NEXT ESTABLISH TEACHING STATUS BASED ON F-T & P-T */
/* RESIDENT/INTERN STATUS FOR HOSPITALS. */
RESINT = (FTRES + .5*PTRES)/BEDTEMP ;
IF RESINT > 0 & (MAPP3=1 or MAPP8=1) THEN H_TCH=1;/* 1=TEACHING */
ELSE H_TCH=0 ; /* 0=NONTEACHING */
Beginning with the 1998 NIS, we used the following SAS code to assign the teaching hospital status indicator, HOSP_TEACH:
/* FIRST ESTABLISH SHORT-TERM BEDS DEFINITION */
IF BDH NE . THEN BEDTEMP = BDH ; /* SHORT TERM BEDS */
ELSE IF BDH =. THEN BEDTEMP = BDTOT ; /* TOTAL BEDS PROXY */
/* ESTABLISH IRB NEEDED FOR TEACHING STATUS */
/* BASED ON F-T P-T RESIDENT INTERN STATUS */
IRB = (FTRES + .5*PTRES) / BEDTEMP ;
/* CREATE TEACHING STATUS VARIABLE */
IF (MAPP8 EQ 1) OR (MAPP3 EQ 1) THEN HOSP_TEACH = 1 ;
ELSE IF (IRB GE 0.25) THEN HOSP_TEACH = 1 ;
ELSE HOSP_TEACH = 0;
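The two rules can classify the same hospital differently. The following Python sketch re-expresses both SAS fragments so they can be compared side by side. It is an illustration, not HCUP code: the function names are hypothetical, the lowercase arguments mirror the AHA data elements above, and representing a SAS missing value (or a zero bed count) as `None` is an assumption intended to approximate SAS comparison behavior. The example hospitals are made up.

```python
from typing import Optional


def resident_bed_ratio(bdh, bdtot, ftres, ptres) -> Optional[float]:
    """Interns and residents per bed, counting part-time staff as one half."""
    beds = bdh if bdh is not None else bdtot  # total beds as a proxy
    if beds is None or beds == 0 or ftres is None or ptres is None:
        return None  # stands in for a SAS missing value (assumption)
    return (ftres + 0.5 * ptres) / beds


def h_tch(bdh, bdtot, ftres, ptres, mapp3, mapp8) -> int:
    """Pre-1998 rule: some residents AND (ACGME approval OR COTH membership)."""
    ratio = resident_bed_ratio(bdh, bdtot, ftres, ptres)
    has_residents = ratio is not None and ratio > 0
    return int(has_residents and (mapp3 == 1 or mapp8 == 1))


def hosp_teach(bdh, bdtot, ftres, ptres, mapp3, mapp8) -> int:
    """1998 and later: approval OR membership OR ratio of at least 0.25."""
    if mapp8 == 1 or mapp3 == 1:
        return 1
    ratio = resident_bed_ratio(bdh, bdtot, ftres, ptres)
    return int(ratio is not None and ratio >= 0.25)


# Made-up hospitals: (bdh, bdtot, ftres, ptres, mapp3, mapp8)
examples = {
    "approved, no residents": (200, 250, 0, 0, 1, 0),
    "high ratio, no approval": (None, 100, 20, 10, 0, 0),
    "low ratio, no approval": (400, 400, 10, 0, 0, 0),
}

for label, args in examples.items():
    print(label, h_tch(*args), hosp_teach(*args))
```

In this sketch the first two hospitals are nonteaching under the pre-1998 rule but teaching under the later rule, because the later rule treats approval, membership, and the 0.25 ratio as alternatives rather than requiring residents together with approval or membership.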
4. Most AHA Annual Survey Database files do not cover a January-to-December period for every hospital. The numbers of hospitals for 1988-1991 are based on adjusted versions of the files, which we created by apportioning the data from adjacent survey files across calendar years. The numbers of hospitals for later years are based on the unadjusted AHA Annual Survey Database files.
5. Table 1: Annual Estimates of the Population for the United States, Regions, States, and Puerto Rico: April 1, 2000 to July 1, 2007 (NST-EST2007-01). Source: Population Division, U.S. Census Bureau. Release Date: December 27, 2007.
|Internet Citation: 2006 Introduction to the NIS. Healthcare Cost and Utilization Project (HCUP). July 2016. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/db/nation/nis/NIS_Introduction_2006.jsp.|
|Last modified 7/22/16|