Introduction to the HCUP State Inpatient Databases (SID)

HEALTHCARE COST AND UTILIZATION PROJECT – HCUP
A FEDERAL-STATE-INDUSTRY PARTNERSHIP IN HEALTH DATA

Sponsored by the Agency for Healthcare Research and Quality

 

 

INTRODUCTION TO

THE HCUP STATE INPATIENT DATABASES (SID)

 

 

These pages provide only an introduction to the SID package.

Full documentation is provided online at the HCUP User Support Web site:
www.hcup-us.ahrq.gov.


 

Issued October 2018

 

Agency for Healthcare Research and Quality
Healthcare Cost and Utilization Project (HCUP)
5600 Fishers Lane
Mail Stop 7W25B
Rockville, MD 20857

Phone: (866) 290-HCUP (4287)
E-mail: hcup@ahrq.gov
Web site: www.hcup-us.ahrq.gov

 

SID Data and Documentation Distributed by:
HCUP Central Distributor

Phone: (866) 556-4287 (toll free)
Fax: (866) 792-5313 (toll free)
E-mail: HCUPDistributor@ahrq.gov






HCUP STATE INPATIENT DATABASES (SID)
SUMMARY OF DATA USE LIMITATIONS

***** REMINDER *****


All users of the SID must take the online Data Use Agreement (DUA) training course, and read and sign a Data Use Agreement.†

Authorized users of HCUP data agree to the following restrictions:‡

  • Will not use the data for any purpose other than research or aggregate statistical reporting.

  • Will not re-release any data to unauthorized users.

  • Will not redistribute HCUP data by posting on any Website or other publicly-accessible online repository.

  • Will not identify or attempt to identify any individual, including by the use of vulnerability analysis or penetration testing. Methods that could be used to identify individuals directly or indirectly shall not be disclosed or published.

  • Will not release or disclose information where the number of observations (i.e., individual discharge records) in any given cell of tabulated data is <10.

  • Will not publish information that could identify individual establishments (e.g., hospitals), and will not contact establishments.

  • Will not use the data concerning individual establishments for commercial or competitive purposes involving those establishments, and will not use the data to determine rights, benefits, or privileges of individual establishments.

  • Will acknowledge in reports that data from the "Healthcare Cost and Utilization Project (HCUP)" were used, including names of the specific databases used for analysis.

Any violation of the limitations in the data use agreement is punishable under Federal law by a fine, up to five years in prison, or both. Violations may also be subject to penalties under State statutes.

† The on-line Data Use Agreement training tool and the Data Use Agreement for State Databases are available on the HCUP User Support (HCUP-US) Web site: www.hcup-us.ahrq.gov.
‡ Specific provisions are detailed in the Data Use Agreement for HCUP State Databases.




HCUP CONTACT INFORMATION

All HCUP data users, including data purchasers and collaborators, must complete the online HCUP Data Use Agreement (DUA) Training Tool, and read and sign the HCUP Data Use Agreement. Proof of training completion and signed Data Use Agreements must be submitted to the HCUP Central Distributor as described below.

The on-line DUA training course is available at: www.hcup-us.ahrq.gov/tech_assist/dua.jsp.

The HCUP Data Use Agreement for the State Databases is available on the AHRQ-sponsored HCUP User Support (HCUP-US) Web site at: www.hcup-us.ahrq.gov

HCUP Central Distributor

Data purchasers will be required to provide their DUA training completion code and will execute their DUAs electronically as a part of the online ordering process. The DUAs and training certificates for collaborators and others with access to HCUP data should be submitted directly to the HCUP Central Distributor using the contact information below.

If you have questions concerning HCUP database purchases, current HCUP database orders and invoices, downloading Nationwide HCUP databases, unzipping State or Nationwide HCUP database products, or the submission of required HCUP Data Use Agreements (DUAs), training certificate codes, or data re-use requests, review the Purchasing FAQs located at www.hcup-us.ahrq.gov/tech_assist/faq.jsp or contact the HCUP Central Distributor:

Phone: 866-556-HCUP (4287) (toll free)
Email: HCUPDistributor@AHRQ.gov
Fax: 866-792-5313 (toll free in the United States)

Mailing address:
HCUP Central Distributor
Social & Scientific Systems, Inc.
8757 Georgia Ave, 12th Floor
Silver Spring, MD 20910

HCUP User Support:

Information about the content of the HCUP databases is available on the HCUP User Support (HCUP-US) Web site (www.hcup-us.ahrq.gov). If you have questions about using the HCUP databases, software tools, supplemental files, and other products, or about data use restrictions and publishing with the data, please review the HCUP Frequently Asked Questions located at www.hcup-us.ahrq.gov/tech_assist/faq.jsp or contact HCUP User Support:

Phone: 866-290-HCUP (4287) (toll free)
Email: hcup@ahrq.gov

We would like to receive your feedback on the HCUP data products.

Please send user feedback to hcup@ahrq.gov







The Agency for Healthcare Research and Quality and
the staff of the Healthcare Cost and Utilization Project (HCUP) thank you for
purchasing the HCUP State Inpatient Databases (SID)





HCUP State Inpatient Databases (SID)

ABSTRACT

The State Inpatient Databases (SID) are part of the Healthcare Cost and Utilization Project (HCUP), sponsored by the Agency for Healthcare Research and Quality (AHRQ).

The HCUP State Inpatient Databases (SID) are a powerful set of hospital databases from data organizations in participating States.

Researchers and policymakers use SID to investigate questions unique to one State; to compare data from two or more States; to conduct market-area variation analyses; and to identify State-specific trends in inpatient care utilization, access, charges, and outcomes.

The individual State databases are in the same HCUP uniform format and represent 100 percent of records processed by AHRQ. However, the participating data organizations control the release of specific data elements. AHRQ is currently assisting the data organizations in the release of the 1990-2017 SID.

The SID can be linked to hospital-level data from the American Hospital Association's Annual Survey of Hospitals and county-level data from the Bureau of Health Professions' Area Resource File, except in those States that do not allow the release of hospital identifiers.

Thirty-two of the data organizations participating in the HCUP have agreed to release their SID files through the HCUP Central Distributor under the auspices of the AHRQ. Uses are limited to research and aggregate statistical reporting.


INTRODUCTION TO THE HCUP STATE INPATIENT DATABASES (SID)

OVERVIEW OF THE SID

The Healthcare Cost and Utilization Project (HCUP) State Inpatient Databases (SID) consist of individual data files from 49 participating data organizations. In general, the SID contain the universe of each State's hospital inpatient discharge records. They are composed of annual, State-specific files that share a common structure and common data elements. Most data elements are coded in a uniform format across all States. In addition to the core set of uniform data elements, the SID include State-specific data elements or data elements available only for a limited number of States. The uniform format of the SID helps facilitate cross-State comparisons. In addition, the SID are well suited for research that requires complete enumeration of hospitals and discharges within market areas or States.

Thirty-two of the 49 data organizations that participate in the HCUP have agreed to release their State-specific files through the HCUP Central Distributor under the auspices of AHRQ. The individual State databases are in the same HCUP uniform format and represent 100 percent of records processed by AHRQ. However, the participating data organizations control the release of specific data elements.

SID data sets are currently available for multiple States and years. Each release of the SID includes:

The SID are calendar year files for all data years except 2015. Because of the transition to ICD-10-CM/PCS on October 1, 2015, the 2015 SID are split into two parts. Nine months of the 2015 data with ICD-9-CM codes (discharges from January 1, 2015 - September 30, 2015) are in one set of files labeled Q1Q3. Three months of 2015 data with ICD-10-CM/PCS codes (discharges from October 1, 2015 - December 31, 2015) are in a separate set of files labeled Q4. More information about the changes to the HCUP databases for ICD-10-CM/PCS and use of data across the two coding systems may be found on the HCUP User Support Web site under ICD-10-CM/PCS Resources (www.hcup-us.ahrq.gov/datainnovations/icd10_resources.jsp).
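The split described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes a full discharge date is at hand, which may not be the case for every State's release.

```python
from datetime import date

# ICD-10-CM/PCS transition date stated in the text above.
ICD10_START = date(2015, 10, 1)


def sid_2015_file_set(discharge_date: date) -> str:
    """Return the 2015 SID file-set label ("Q1Q3" or "Q4") for a discharge date.

    Hypothetical helper for illustration; not part of any HCUP tool.
    """
    if discharge_date.year != 2015:
        raise ValueError("only the 2015 SID are split into Q1Q3 and Q4 file sets")
    # Discharges on or after October 1, 2015 carry ICD-10-CM/PCS codes (Q4).
    return "Q4" if discharge_date >= ICD10_START else "Q1Q3"


print(sid_2015_file_set(date(2015, 9, 30)))  # Q1Q3
print(sid_2015_file_set(date(2015, 10, 1)))  # Q4
```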

SID documentation and tools—including file specifications, programming source code for loading ASCII data into SAS (SAS Institute Inc.; Cary, NC), SPSS (IBM Corp.; Somers, NY), and Stata (StataCorp; College Station, TX), and value labels—are available online at the HCUP User Support Web site (www.hcup-us.ahrq.gov).

Starting with the 2006 SID, the AHA Linkage files are available via the HCUP User Support Web site www.hcup-us.ahrq.gov. The AHA Linkage files may not be available when the discharge-level database is released.


How the HCUP SID Differ from State Data Files

The SID available through the HCUP Central Distributor differ from the data files available from the data organizations in the following ways:

Because the data organizations dictate the data elements that may be released through the HCUP Central Distributor, the data elements on the SID are a subset of the data collected by the corresponding data organizations. HCUP uniform coding is used on most data elements on the SID. A few State-specific data elements retain the original values provided by the respective data organizations.


What Types of Hospitals Are Included in the SID?

The types of hospitals included in the SID depend on the information provided by the data organizations and how the files were handled during HCUP processing. Most State government data organizations provide information on all acute care hospitals in the respective State. Private data organizations are often restricted to member hospitals and may not provide information on all hospitals in their State.

Beginning with the 1994 SID, all hospitals reported by the data organizations were retained in the SID files. Discharges from facilities such as psychiatric facilities, alcohol and drug dependency facilities, and State, Federal, and Veterans Affairs hospitals will be in the SID, if reported by the data source. Prior to 1994, only discharges from community hospitals were retained in the SID.

Community hospitals, as defined by the AHA, include "all nonfederal, short term, general and other specialty hospitals, excluding hospital units of institutions." Included among community hospitals are academic medical centers and specialty hospitals such as obstetrics, gynecology, otolaryngology, short term rehabilitation, orthopedic, and pediatric hospitals. Noncommunity hospitals include Federal hospitals (e.g., Veterans Affairs, Department of Defense, and Indian Health Service hospitals), long-term hospitals, psychiatric hospitals, alcohol/chemical dependency treatment facilities, and hospital units within institutions such as prisons.

Some community hospitals may not be included in the SID because their data were not provided by the data source. To identify community hospitals, the SID must be linked to the AHA Annual Survey of Hospitals by the AHA hospital identifier.

Tables showing the number of hospitals in the SID can be found online at the HCUP User Support Web site (www.hcup-us.ahrq.gov). The tables present the hospitals by the number of hospitals of:

Information contained in the AHA Annual Survey of Hospitals was used to determine if a hospital was a community hospital. Some hospitals could not be categorized as community or noncommunity hospitals because they could not be matched with AHA information. This occurs when a hospital was closed in a previous year or when the hospital does not report to the AHA.


How to Identify Hospitals in the SID

Up to three hospital identifiers are on the SID:


What is the File Structure of the SID in the 2016-2017 Files?

Based on the availability of data elements across States, data elements included in the 2016-2017 SID are structured as follows:

Unavailable with the 2016-2017 SID are two file types that had been included with the SID in prior data years: the Diagnosis and Procedure Groups file and the Disease Severity file. The data elements included in those two files were derived from AHRQ software tools. If you are interested in applying the AHRQ software tools to the ICD-10-CM/PCS data in the 2016-2017 SID, beta versions of the AHRQ software tools are available on the HCUP User Support Web site at www.hcup-us.ahrq.gov/tools_software.jsp. Also available is a tutorial on how to apply the AHRQ software tools to the HCUP databases at www.hcup-us.ahrq.gov/tech_assist/tutorials.jsp.

The Core file is a discharge-level file that contains:

Core data elements meet at least one of the following criteria:

State-specific data elements meet at least one of the following criteria:

The Charges file contains detailed charge information. There are three kinds of Charges files:

  1. Summarized detail in which charge information is summed within the revenue center. This type of Charges file includes one record per discharge abstract. Each record contains three corresponding arrays with the following information:
    1. Revenue center (REVCDn)
    2. Total charge for the revenue center (CHGn)
    3. Total units of service for the revenue center (UNITn)

    For example, if a patient had five laboratory tests, REVCD1 would include the revenue code for laboratory, CHG1 would include the total charge for the five tests, and UNIT1 would be five. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  2. Collapsed detail in which charge information is summed across revenue centers. This type of Charges file includes one record per discharge abstract. Each record contains an array of collapsed charges (CHGn) that are predefined by the data organization that provided the data.

    Consider the example of a patient that had five laboratory tests from different revenue centers in the range of 300 to 319. CHG1, which was predefined as Laboratory Charges for revenue centers 300-319, would include the total charge for the five tests, but there is no detail on which specific revenue centers were used. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  3. Line item detail in which a submitted charge pertains to a specified revenue center and there may be multiple charges reported for the same revenue center. This type of Charges file includes multiple records per discharge abstract. Each record includes the following information for one service:
    1. Revenue center (REVCODE)
    2. Charge (CHARGE)
    3. Unit of service (UNITS)
    4. Day of service (SERVDAY) for some files

    For example, if a patient had five laboratory tests, there are five records in the Charges file with information on the charge for each laboratory test. Information from this type of Charges file may be combined with the Core file by the unique record identifier (KEY), but there is not a one-to-one correspondence of records.
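To make the relationship between the line item and summarized layouts concrete, the following Python sketch collapses line-item rows into one record per KEY with REVCDn, CHGn, and UNITn arrays. The sample values, the revenue codes, and the function name are assumptions for illustration; this is not how HCUP builds the files.

```python
# Hypothetical line-item Charges rows: several rows per discharge (KEY).
# Five laboratory tests plus one other service; all values are made up.
line_items = [
    {"KEY": 1001, "REVCODE": "300", "CHARGE": 40.0, "UNITS": 1},
    {"KEY": 1001, "REVCODE": "300", "CHARGE": 25.0, "UNITS": 1},
    {"KEY": 1001, "REVCODE": "300", "CHARGE": 25.0, "UNITS": 1},
    {"KEY": 1001, "REVCODE": "300", "CHARGE": 30.0, "UNITS": 1},
    {"KEY": 1001, "REVCODE": "300", "CHARGE": 30.0, "UNITS": 1},
    {"KEY": 1001, "REVCODE": "250", "CHARGE": 80.0, "UNITS": 1},
]


def summarize_by_revenue_center(records):
    """Sum charges and units within each revenue center, one output row per KEY."""
    totals = {}  # KEY -> {REVCODE: [charge_sum, unit_sum]} in first-seen order
    for rec in records:
        per_key = totals.setdefault(rec["KEY"], {})
        slot = per_key.setdefault(rec["REVCODE"], [0.0, 0])
        slot[0] += rec["CHARGE"]
        slot[1] += rec["UNITS"]

    summarized = {}
    for key, per_key in totals.items():
        row = {"KEY": key}
        # Number the parallel arrays REVCD1/CHG1/UNIT1, REVCD2/CHG2/UNIT2, ...
        for n, (revcode, (charge, units)) in enumerate(per_key.items(), start=1):
            row[f"REVCD{n}"] = revcode
            row[f"CHG{n}"] = charge
            row[f"UNIT{n}"] = units
        summarized[key] = row
    return summarized


row = summarize_by_revenue_center(line_items)[1001]
print(row["REVCD1"], row["CHG1"], row["UNIT1"])  # 300 150.0 5
```

Because the summarized result has exactly one row per KEY, it lines up one-to-one with the Core file, whereas the six input rows do not.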

Refer to the Description of Data Elements online at the HCUP User Support Web site (www.hcup-us.ahrq.gov) for more information on the charge information from the different States.

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine the discharge-level files with the hospital-level file (AHA Linkage file), merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the Hospital file contains only 1 record for DSHOSPID "A."
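The difference in aggregation levels can be illustrated with a small many-to-one merge in Python. DSHOSPID and KEY come from the text; the field name AHA_FIELD, the sample values, and the function name are hypothetical placeholders.

```python
# Hypothetical discharge-level Core rows: many rows may share a DSHOSPID.
core = [
    {"KEY": 1, "DSHOSPID": "A"},
    {"KEY": 2, "DSHOSPID": "A"},
    {"KEY": 3, "DSHOSPID": "B"},
]

# Hypothetical hospital-level AHA Linkage rows: one row per DSHOSPID.
aha_linkage = [
    {"DSHOSPID": "A", "AHA_FIELD": "link-value-A"},
    {"DSHOSPID": "B", "AHA_FIELD": "link-value-B"},
]


def attach_hospital_fields(core_rows, linkage_rows):
    """Copy hospital-level fields onto each discharge row (many-to-one merge)."""
    by_hospital = {}
    for row in linkage_rows:
        if row["DSHOSPID"] in by_hospital:
            raise ValueError("linkage file should have one row per DSHOSPID")
        by_hospital[row["DSHOSPID"]] = row

    merged = []
    for row in core_rows:
        # Discharges with no linkage row keep only their own fields.
        hospital = by_hospital.get(row["DSHOSPID"], {})
        merged.append({**hospital, **row})
    return merged


merged = attach_hospital_fields(core, aha_linkage)
print(len(merged))  # 3
```

The merged result stays at the discharge level: the row count equals the number of Core rows, and the hospital-level values repeat on every discharge from the same hospital.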

Starting with the 2006 SID, the AHA Linkage files are available via the HCUP User Support Web site (www.hcup-us.ahrq.gov). The AHA Linkage files may not be available when the discharge-level database is released.

What is the File Structure of the SID in the 2015 Files?

The file structure of the 2015 SID is similar to previous years (and future years) in terms of how data elements are split across multiple data files, but differs from other years in that the records within the 2015 files have been separated into two sets of files by discharge date, because of the transition from reporting medical diagnoses and inpatient procedures using ICD-9-CM to the ICD-10-CM/PCS code sets.

The 2015 SID are split into two separate sets of files based on the discharge date and different coding schemes:

Almost all of the diagnosis and procedure-related data elements that are based on ICD-10-CM/PCS data have been renamed with the prefix of I10 to distinguish them from the ICD-9-CM-based data elements. Exceptions include data elements that are based on third-party proprietary software such as the Diagnosis Related Groups (DRGs) and the All Patient Refined DRG (APR-DRG).
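The naming rule can be sketched as a small lookup in Python. Several details here are assumptions for illustration: the underscore after the I10 prefix, the base names DX1 and PR1, the exception names DRG and APRDRG, and the function name. The file specifications on the HCUP User Support Web site list the actual element names, and because the text says "almost all," the exception set below should not be read as complete.

```python
# Elements assumed to keep their names in the ICD-10-CM/PCS (Q4) files because
# they come from third-party proprietary software. Names are illustrative only.
UNPREFIXED_EXCEPTIONS = {"DRG", "APRDRG"}


def element_name(base_name: str, file_set: str) -> str:
    """Return the assumed element name for a 2015 file set ("Q1Q3" or "Q4")."""
    if file_set not in ("Q1Q3", "Q4"):
        raise ValueError("file_set must be 'Q1Q3' or 'Q4'")
    if file_set == "Q4" and base_name not in UNPREFIXED_EXCEPTIONS:
        # ICD-10-CM/PCS-based elements carry the I10 prefix (separator assumed).
        return "I10_" + base_name
    return base_name


print(element_name("DX1", "Q1Q3"))  # DX1
print(element_name("DX1", "Q4"))    # I10_DX1
print(element_name("DRG", "Q4"))    # DRG
```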

Based on the availability of data elements across States, data elements included in the 2015 SID are structured as follows:

The Core file is a discharge-level file that contains:

Core data elements meet at least one of the following criteria:

State-specific data elements meet at least one of the following criteria:

The Charges file contains detailed charge information. There are three kinds of Charges files:

  1. Summarized detail in which charge information is summed within the revenue center. This type of Charges file includes one record per discharge abstract. Each record contains three corresponding arrays with the following information:
    1. Revenue center (REVCDn)
    2. Total charge for the revenue center (CHGn)
    3. Total units of service for the revenue center (UNITn)

    For example, if a patient had five laboratory tests, REVCD1 would include the revenue code for laboratory, CHG1 would include the total charge for the five tests, and UNIT1 would be five. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  2. Collapsed detail in which charge information is summed across revenue centers. This type of Charges file includes one record per discharge abstract. Each record contains an array of collapsed charges (CHGn) that are predefined by the data organization that provided the data.

    Consider the example of a patient that had five laboratory tests from different revenue centers in the range of 300-319. CHG1, which was predefined as Laboratory Charges for revenue centers 300-319, would include the total charge for the five tests, but there is no detail on which specific revenue centers were used. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  3. Line item detail in which a submitted charge pertains to a specified revenue center and there may be multiple charges reported for the same revenue center. This type of Charges file includes multiple records per discharge abstract. Each record includes the following information for one service:
    1. Revenue center (REVCODE)
    2. Charge (CHARGE)
    3. Unit of service (UNITS)
    4. Day of service (SERVDAY) for some files

    For example, if a patient had five laboratory tests, there are five records in the Charges file with information on the charge for each laboratory test. Information from this type of Charges file may be combined with the Core file by the unique record identifier (KEY), but there is not a one-to-one correspondence of records.
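The collapsed detail layout can be illustrated with a short Python sketch that sums charges across a predefined range of revenue centers. Only the 300-319 laboratory range comes from the text; the dictionary of definitions, the sample charges, and the function name are hypothetical, since the actual groupings are set by each data organization.

```python
# Hypothetical collapsed-charge definitions: CHGn -> inclusive revenue-center range.
COLLAPSED_DEFS = {"CHG1": (300, 319)}  # Laboratory Charges, per the example above


def collapse_charges(services, defs=COLLAPSED_DEFS):
    """Sum (revenue_center, charge) pairs into the predefined CHGn buckets."""
    row = {name: 0.0 for name in defs}
    for revenue_center, charge in services:
        for name, (low, high) in defs.items():
            if low <= int(revenue_center) <= high:
                row[name] += charge
                break  # each service counts toward at most one bucket
    return row


# Five laboratory tests from different revenue centers (made-up charges).
lab_tests = [("300", 20.0), ("301", 35.0), ("305", 15.0), ("310", 50.0), ("319", 30.0)]
print(collapse_charges(lab_tests))  # {'CHG1': 150.0}
```

The output keeps the total but not the individual revenue centers, which is the loss of detail the text describes for this file type.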

Refer to the Description of Data Elements online at the HCUP User Support Web site (www.hcup-us.ahrq.gov) for more information on the charge information from the different States.

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine the discharge-level files with the hospital-level file (AHA Linkage file), merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the Hospital file contains only 1 record for DSHOSPID "A."

Starting with the 2006 SID, the AHA Linkage files are available via the HCUP User Support Web site http://www.hcup-us.ahrq.gov. The AHA Linkage files may not be available when the discharge-level database is released.

The Diagnosis and Procedure Groups file is a discharge-level file that contains data elements from AHRQ software tools designed to facilitate the use of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic and procedure information in the HCUP databases. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Diagnosis and Procedure Groups files. These files are available beginning with the 2005 SID.

The Disease Severity Measures file is a discharge-level file that contains information from the AHRQ Comorbidity Software. Information from these severity files is to be used in conjunction with the Inpatient Core files. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Disease Severity Measures files. These files are available beginning with the 2005 SID.

What is the File Structure of the SID in the 2005-2014 Files?

Based on the availability of data elements across States, data elements included in the 2005-2014 SID are structured as follows:

The Core file is a discharge-level file that contains:

Core data elements meet at least one of the following criteria:

State-specific data elements meet at least one of the following criteria:

The Charges file contains detailed charge information. There are three kinds of Charges files:

  1. Summarized detail in which charge information is summed within the revenue center. This type of Charges file includes one record per discharge abstract. Each record contains three corresponding arrays with the following information:
    1. Revenue center (REVCDn)
    2. Total charge for the revenue center (CHGn)
    3. Total units of service for the revenue center (UNITn)

    For example, if a patient had five laboratory tests, REVCD1 would include the revenue code for laboratory, CHG1 would include the total charge for the five tests, and UNIT1 would be five. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  2. Collapsed detail in which charge information is summed across revenue centers. This type of Charges file includes one record per discharge abstract. Each record contains an array of collapsed charges (CHGn) that are predefined by the data organization that provided the data.

    Consider the example of a patient that had five laboratory tests from different revenue centers in the range of 300-319. CHG1, which was predefined as Laboratory Charges for revenue centers 300-319, would include the total charge for the five tests, but there is no detail on which specific revenue centers were used. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  3. Line item detail in which a submitted charge pertains to a specified revenue center and there may be multiple charges reported for the same revenue center. This type of Charges file includes multiple records per discharge abstract. Each record includes the following information for one service:
    1. Revenue center (REVCODE)
    2. Charge (CHARGE)
    3. Unit of service (UNITS)
    4. Day of service (SERVDAY) for some files

    For example, if a patient had five laboratory tests, there are five records in the Charges file with information on the charge for each laboratory test. Information from this type of Charges file may be combined with the Core file by the unique record identifier (KEY), but there is not a one-to-one correspondence of records.

Refer to the Description of Data Elements online at the HCUP User Support Web site (www.hcup-us.ahrq.gov) for more information on the charge information from the different States.

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine the discharge-level files with the hospital-level file (AHA Linkage file), merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the Hospital file contains only 1 record for DSHOSPID "A."

Starting with the 2006 SID, the AHA Linkage files are available via the HCUP User Support Web site http://www.hcup-us.ahrq.gov. The AHA Linkage files may not be available when the discharge-level database is released.

The Diagnosis and Procedure Groups file is a discharge-level file that contains data elements from AHRQ software tools designed to facilitate the use of the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic and procedure information in the HCUP databases. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Diagnosis and Procedure Groups files. These files are available beginning with the 2005 SID.

The Disease Severity Measures file is a discharge-level file that contains information from the AHRQ Comorbidity Software. Information from these severity files is to be used in conjunction with the Inpatient Core files. The unit of observation is an inpatient stay record. The HCUP unique record identifier (KEY) provides the linkage between the Core files and the Disease Severity Measures files. These files are available beginning with the 2005 SID.

What is the File Structure of the SID in the 1998-2004 Files?

Based on the availability of data elements across States, data elements included in the 1998-2004 SID are structured as follows:

The Core file is a discharge-level file that contains:

Core data elements meet at least one of the following criteria:

State-specific data elements meet at least one of the following criteria:

The Charges file contains detailed charge information. There are two kinds of Charges files:

  1. Summarized detail in which charge information is summed within the revenue center. This type of Charges file includes one record per discharge abstract. Each record contains three corresponding arrays with the following information:
    1. Revenue center (REVCDn)
    2. Total charge for the revenue center (CHGn)
    3. Total units of service for the revenue center (UNITn)

    For example, if a patient had five laboratory tests, REVCD1 would include the revenue code for laboratory, CHG1 would include the total charge for the five tests, and UNIT1 would be five. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  2. Collapsed detail in which charge information is summed across revenue centers. This type of Charges file includes one record per discharge abstract. Each record contains an array of collapsed charges (CHGn) that are predefined by the data organization that provided the data.

    Consider the example of a patient that had five laboratory tests from different revenue centers in the range of 300 to 319. CHG1, which was predefined as Laboratory Charges for revenue centers 300-319, would include the total charge for the five tests, but there is no detail on which specific revenue centers were used. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

Refer to the Description of Data Elements online at the HCUP User Support Web site (www.hcup-us.ahrq.gov) for more information on the charge information from the different States.

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The Core and Charges files are discharge-level files with one observation per abstract. The same record is represented in each file, but contains different data elements. To combine data elements across discharge-level files, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.
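A one-to-one merge by KEY, with a check that the correspondence really is one-to-one, might look like the following Python sketch. The sample rows, the element names other than KEY, and the function name are hypothetical.

```python
def merge_one_to_one(core_rows, charges_rows, key="KEY"):
    """Merge two discharge-level files by record identifier, enforcing 1:1."""
    core_by_key = {row[key]: row for row in core_rows}
    charges_by_key = {row[key]: row for row in charges_rows}

    # A repeated identifier would silently overwrite a row in the dicts above.
    if len(core_by_key) != len(core_rows) or len(charges_by_key) != len(charges_rows):
        raise ValueError("duplicate record identifier in one of the files")
    # Every record must appear in both files.
    if core_by_key.keys() != charges_by_key.keys():
        raise ValueError("record identifiers do not match one-to-one")

    return [{**core_by_key[k], **charges_by_key[k]} for k in core_by_key]


# Made-up rows: CORE_FIELD and CHG1 values are placeholders.
core = [{"KEY": 1, "CORE_FIELD": "a"}, {"KEY": 2, "CORE_FIELD": "b"}]
charges = [{"KEY": 2, "CHG1": 75.0}, {"KEY": 1, "CHG1": 150.0}]

merged = merge_one_to_one(core, charges)
print(merged[0])  # {'KEY': 1, 'CORE_FIELD': 'a', 'CHG1': 150.0}
```

Failing loudly on a mismatch is a design choice for this sketch: it catches the case where a line-item Charges file, which is not one-to-one, is merged as if it were.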

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine discharge-level files with the hospital-level file (AHA Linkage file), merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the Hospital file contains only 1 record for DSHOSPID "A."


What is the File Structure of the SID in the 1995-1997 Files?

Based on the availability of data elements across States, data elements included in the 1995-1997 SID are structured as follows:

The Core file contains core data elements that form the nucleus of the SID. Core data elements meet at least one of the following criteria:

The State-specific file contains State-specific data elements intended for limited use. State-specific data elements meet at least one of the following criteria:

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The Core and State-specific files are discharge-level files with one observation per abstract. The same record is represented in each file, but each contains different data elements. To combine data elements across discharge-level files, merge the files by the unique record identifier (SEQ_SID). There will be a one-to-one correspondence of records.
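Before merging, it can be worth confirming that the identifiers really do correspond one-to-one. The following Python sketch is one possible check, not an HCUP-provided tool. The function name check_one_to_one and the sample identifier lists are hypothetical, and the lists stand in for the SEQ_SID values read from each file.

```python
from collections import Counter


def check_one_to_one(ids_a, ids_b):
    """Report identifiers that break a one-to-one correspondence."""
    count_a, count_b = Counter(ids_a), Counter(ids_b)
    return {
        "duplicated_in_a": sorted(k for k, c in count_a.items() if c > 1),
        "duplicated_in_b": sorted(k for k, c in count_b.items() if c > 1),
        "only_in_a": sorted(set(count_a) - set(count_b)),
        "only_in_b": sorted(set(count_b) - set(count_a)),
    }


# Hypothetical SEQ_SID values from the Core and State-specific files.
core_ids = [1, 2, 3, 4]
state_ids = [4, 3, 2, 2, 5]

report = check_one_to_one(core_ids, state_ids)
# The correspondence is one-to-one only when every list in the report is empty.
is_one_to_one = not any(report.values())
print(report, is_one_to_one)
```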

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine discharge-level files with the AHA Linkage file, merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the AHA Linkage file contains only 1 record for DSHOSPID "A."


What is the File Structure of the SID in the 1990-1994 Files?

Based on the availability of data elements across States, data elements included in the 1990-1994 SID are structured the same as the 1998-2004 files. This includes a maximum of three types of files:

The Core file contains:

Core data elements meet at least one of the following criteria:

State-specific data elements meet at least one of the following criteria:

The Charges file contains detailed charge information. There are two kinds of Charges files:

  1. Summarized detail in which charge information is summed within the revenue center. This type of Charges file includes one record per discharge abstract. Each record contains three corresponding arrays with the following information:
    1. Revenue center (REVCDn)
    2. Total charge for the revenue center (CHGn)
    3. Total units of service for the revenue center (UNITn)

    For example, if a patient had five laboratory tests, REVCD1 would include the revenue code for laboratory, CHG1 would include the total charge for the five tests, and UNIT1 would be five. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

  2. Collapsed detail in which charge information is summed across revenue centers. This type of Charges file includes one record per discharge abstract. Each record contains an array of collapsed charges (CHGn) that are predefined by the data organization that provided the data.

    Consider the example of a patient who had five laboratory tests from different revenue centers in the range of 300 to 319. CHG1, which was predefined as Laboratory Charges for revenue centers 300-319, would include the total charge for the five tests, but there is no detail on which specific revenue centers were used. To combine data elements between this type of Charges file and the Core file, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.
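The relationship between the two kinds of Charges files can be sketched in Python. The example starts from a summarized-detail record and derives the kind of total that a collapsed laboratory charge for revenue centers 300-319 would hold. The function names, the sample values, and the use of None for unused array positions are assumptions for illustration. The actual groupings are predefined by each data organization.

```python
def summarized_entries(record):
    """Yield (revenue center, charge, units) from the REVCDn/CHGn/UNITn arrays."""
    n = 1
    while f"REVCD{n}" in record:
        revcd = record[f"REVCD{n}"]
        if revcd is not None:  # skip unused array positions
            yield int(revcd), record[f"CHG{n}"], record[f"UNIT{n}"]
        n += 1


def collapse(record, low, high):
    """Sum charges and units over revenue centers in the range low..high."""
    charge = 0
    units = 0
    for revcd, chg, unit in summarized_entries(record):
        if low <= revcd <= high:
            charge += chg
            units += unit
    return charge, units


# Hypothetical summarized-detail record (not real HCUP data):
# five laboratory tests spread over revenue centers 300 and 301.
record = {
    "KEY": 1001,
    "REVCD1": "300", "CHG1": 100.0, "UNIT1": 2,
    "REVCD2": "301", "CHG2": 60.0, "UNIT2": 3,
    "REVCD3": "450", "CHG3": 500.0, "UNIT3": 1,
}

lab_charge, lab_units = collapse(record, 300, 319)
print(lab_charge, lab_units)  # 160.0 5
```

The collapsed total keeps the combined charge for the five tests but, as noted above, no longer shows which revenue centers contributed to it.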

Refer to the Description of Data Elements online at the HCUP User Support Web site (www.hcup-us.ahrq.gov) for more information on the charge information from the different States.

The AHA Linkage file contains AHA linkage data elements that allow the SID to be used in conjunction with the AHA Annual Survey of Hospitals data files. These files contain information about hospital characteristics and are available for purchase through the AHA. Because the data organizations in participating States determine whether the AHA linkage data elements may be released through the HCUP Central Distributor with the SID, not all SID include AHA linkage data elements.

The Core and Charges files are discharge-level files with one observation per abstract. The same record is represented in each file, but each file contains different data elements. To combine data elements across discharge-level files, merge the files by the unique record identifier (KEY). There will be a one-to-one correspondence of records.

The AHA Linkage file is a hospital-level file with one observation per hospital or facility. To combine discharge-level files with the hospital-level file (AHA Linkage file), merge the files by the hospital identifier provided by the data source (DSHOSPID), but be careful of the different levels of aggregation. For example, the Core file may contain 5,000 discharges for DSHOSPID "A," but the AHA Linkage file contains only 1 record for DSHOSPID "A."


GETTING STARTED

SID Data Files are provided on CD-ROMs. The number of CD-ROMs depends on the State and year of data.

SID Programs, Documentation, and Tools for all States and all years are available online at the HCUP User Support Web site at www.hcup-us.ahrq.gov.

SID Data Files

To load SID data onto your PC, you will need between one and four gigabytes of space available, depending on which SID database you are using. Because of the size of the files, the data are distributed as self-extracting PKZIP compressed files. To decompress the data, you should follow these steps:

  1. Create a directory for the State-specific SID on your hard drive.
  2. Copy the self-extracting data files from the SID Data Files CD-ROM(s) into the new directory.
  3. Unzip each file by running the corresponding *.exe file.
    • Type the file name within DOS or click on the name within Windows Explorer.
    • Edit the name of the "Unzip to Folder" in the WinZip Self Extractor dialog to select the desired destination directory for the extracted file.
    • Click on the "Unzip" button.

The ASCII data files will then be uncompressed into the selected destination directory. After the files are uncompressed, the *.exe files can be deleted.


SID Programs, Documentation, and Tools

The SID programs, technical documentation, and HCUP tools are available online via the Databases page at the HCUP User Support Web site (www.hcup-us.ahrq.gov/databases.jsp). The site provides important resources for SID users, and all of the files may be downloaded free of charge. A summary is provided in Table 1.

The SID programs include SAS, SPSS, and Stata load programs containing the programming code necessary to convert SID ASCII files into SAS, SPSS, or Stata format. Please note that for the 2015 SID, there is one set of load programs for the Q1-Q3 files and another set of load programs for the Q4 files.
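For users working outside SAS, SPSS, and Stata, the general idea behind a load program can be sketched in Python, assuming the ASCII files are fixed-width text with one record per line. The column positions in LAYOUT below are invented for illustration and do not describe any real SID file. The actual record layouts are given in the File Specifications on the HCUP User Support Web site.

```python
# Hypothetical layout: (data element, start column, end column),
# using 1-based inclusive column positions.
LAYOUT = [("KEY", 1, 14), ("AGE", 15, 17), ("DSHOSPID", 18, 30)]


def parse_line(line, layout=LAYOUT):
    """Split one fixed-width record into named data elements."""
    record = {}
    for name, start, end in layout:
        # Convert 1-based inclusive columns to a Python slice.
        record[name] = line[start - 1:end].strip()
    return record


def load_ascii(path, layout=LAYOUT):
    """Read every record from a fixed-width ASCII file."""
    with open(path, encoding="ascii") as f:
        return [parse_line(line.rstrip("\n"), layout) for line in f]


# Hypothetical record built to match LAYOUT (not real HCUP data).
sample = "00000000001001" + " 54" + "A".ljust(13)
parsed = parse_line(sample)
print(parsed)
```

All values are returned as text. A fuller load program would also convert numeric data elements and recode missing values, which this sketch leaves out.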

The SID technical documentation provides detailed descriptions of the structure and content of the SID.

The HCUP Tools include the Clinical Classifications Software (CCS) and general label and format information applicable to all HCUP databases.

The HCUP User Support Web site (www.hcup-us.ahrq.gov/datainnovations/icd10_resources.jsp) summarizes key issues that researchers should anticipate before analyzing health services outcomes in HCUP databases that include ICD-10-CM/PCS coding. That section discusses key differences in the structure of HCUP databases, presents preliminary coding differences that were observed in HCUP databases, and provides general guidance and forewarning to users interested in analyzing outcomes that are potentially affected by the transition from ICD-9-CM to ICD-10-CM/PCS.

Table 1. SID Related Reports and Database Documentation Available on HCUP-US

Description of the SID Database
  • SID Overview
  • Introduction to the SID (this document)
  • SID File Compositions—describes types of hospitals and types of records included in each SID (e.g., number of discharges and hospitals by year)
  • SID-Related Reports

Restrictions on the Use
  • HCUP Data Use Agreement Training
  • SID Data Use Agreement
  • Requirements for Publishing with HCUP Data

File Specifications and Load Programs
  • File Specifications—details data file names, number of records, record length, and record layout (e.g., file size by year)
  • SAS Load Programs
  • SPSS Load Programs
  • Stata Load Programs

Data Elements
  • Availability of States Across All Years
  • Availability of Data Elements by Year
  • Availability of HCUP Revisit Variables Across States and Years
  • Summary Statistics for All States Across All Years—lists means and frequencies on nearly all data elements

Additional Resources for Data Elements
  • HCUP Quality Control Procedures—describes procedures used to assess data quality
  • HCUP Coding Practices—describes how HCUP data elements are coded
  • HCUP Hospital Identifiers—explains data elements that characterize individual hospitals across States and Years

ICD-10-CM/PCS Included in 2015-2017 SID
  • Caution: 2015 SID Includes ICD-9-CM and ICD-10-CM/PCS Data
    • 2015 State Databases Revised File Structure and New Data Elements
  • Additional ICD-10-CM/PCS Resources
  • Tutorial for Loading HCUP Software Tools for ICD-10-CM/PCS

Known Data Issues
  • Includes State-specific information on databases that have been updated or have known data issues

HCUP Tools: Labels and Formats
  • Clinical Classifications Software (CCS)—a categorization scheme that groups diagnosis and procedure codes into mutually exclusive categories
  • DRG Formats Program—Creates SAS formats to label the values of each DRG and MDC category
  • HCUP Formats Program—Creates SAS formats to label the values of selected categorical data elements in HCUP files
  • HCUP Diagnosis and Procedure Groups Formats Program—Creates SAS formats to label the values of HCUP Diagnosis and Procedure Groups data elements, including Clinical Classifications Software (CCS) data elements
  • ICD-9-CM Formats Program—Creates SAS formats to label the values of ICD-9-CM Diagnoses and Procedures
  • ICD-10-CM Formats Program—Creates SAS formats to label the values of ICD-10-CM Diagnoses and Procedures
  • Severity Formats Program—Creates SAS formats to label the values of data elements in the Severity File

HCUP Supplemental Files
  • American Hospital Association Linkage Files
  • Cost-to-Charge Ratio Files
  • Hospital Market Structure (HMS) Files
  • HCUP Variables for Revisit Analysis

Obtaining HCUP Data
  • Purchase HCUP data from the HCUP Central Distributor


DATA USE AGREEMENT
for the State Databases from the Healthcare Cost and Utilization Project
Agency for Healthcare Research and Quality

This Data Use Agreement ("Agreement") governs the disclosure and use of data in the HCUP State Databases from the Healthcare Cost and Utilization Project (HCUP), which are maintained by the Center for Delivery, Organization, and Markets (CDOM) within the Agency for Healthcare Research and Quality (AHRQ). The HCUP State Databases include the State Inpatient Databases (SID), State Ambulatory Surgery and Services Databases (SASD), and State Emergency Department Databases (SEDD). Any person ("the data recipient") seeking permission from AHRQ to access HCUP State Databases must sign and submit this Agreement to AHRQ or its agent, and complete the online Data Use Agreement Training Course at www.hcup-us.ahrq.gov, as a precondition to the granting of such permission.

Section 944(c) of the Public Health Service Act (42 U.S.C. 299c-3(c)) ("the AHRQ Confidentiality Statute"), requires that data collected by AHRQ that identify individuals or establishments be used only for the purpose for which they were supplied. Pursuant to this Agreement, data released to AHRQ for the HCUP Databases are subject to the data standards and protections established by the Health Insurance Portability and Accountability Act of 1996 (HIPAA) (P.L. 104-191) and implementing regulations ("the Privacy Rule"). Accordingly, HCUP Databases may only be released in "limited data set" form, as that term is defined by the Privacy Rule, 45 C.F.R. § 164.514(e). HCUP data may only be used by the data recipient for research which may include analysis and aggregate statistical reporting. AHRQ classifies HCUP data as protected health information under the HIPAA Privacy Rule, 45 C.F.R. § 160.103. By executing this Agreement, the data recipient understands and affirms that HCUP data may only be used for the prescribed purposes, and consistent with the following standards:

No Identification of Persons: The AHRQ Confidentiality Statute prohibits the use of HCUP data to identify any person (including but not limited to patients, physicians, and other health care providers). The use of HCUP Databases to identify any person constitutes a violation of this Agreement and may constitute a violation of the AHRQ Confidentiality Statute and the HIPAA Privacy Rule. This Agreement prohibits data recipients from releasing, disclosing, publishing, or presenting any individually identifying information obtained under its terms. AHRQ omits from the data set all direct identifiers that are required to be excluded from limited data sets as consistent with the HIPAA Privacy Rule. AHRQ and the data recipient(s) acknowledge that it may be possible for a data recipient, through deliberate technical analysis of the data sets and with outside information, to attempt to ascertain the identity of particular persons. Risk of individual identification of persons is increased when the number of observations (i.e., individual discharge records) in any given cell of tabulated data is <10. This Agreement expressly prohibits any attempt to identify individuals, including by the use of vulnerability analysis or penetration testing. In addition, methods that could be used to identify individuals directly or indirectly shall not be disclosed, released, or published. Data recipients shall not attempt to contact individuals for any purpose whatsoever, including verifying information supplied in the data set. Any questions about the data must be referred exclusively to AHRQ. By executing this Agreement, the data recipient understands and agrees that actual and considerable harm will ensue if he or she attempts to identify individuals. The data recipient also understands and agrees that actual and considerable harm will ensue if he or she intentionally or negligently discloses, releases, or publishes information that identifies individuals or can be used to identify individuals.

Use of Establishment Identifiers: The AHRQ Confidentiality Statute prohibits the use of HCUP data to identify establishments unless the individual establishment has consented. Permission is obtained from the HCUP data sources (i.e., state data organizations, hospital associations, and data consortia) to use the identification of hospital establishments (when such identification appears in the data sets) for research, analysis, and aggregate statistical reporting. This may include linking institutional information from outside data sets for these purposes. Such purpose does not include the use of information in the data sets concerning individual establishments for commercial or competitive purposes involving those individual establishments, or to determine the rights, benefits, or privileges of establishments. Data recipients are prohibited from identifying establishments directly or by inference in disseminated material. In addition, users of the data are prohibited from contacting establishments for the purpose of verifying information supplied in the data set. Any questions about the data must be referred exclusively to AHRQ. Misuse of identifiable HCUP data about hospitals or any other establishment constitutes a violation of this Agreement and may constitute a violation of the AHRQ Confidentiality Statute.

The undersigned data recipients provide the following affirmations concerning HCUP data:

Protection of Individuals

Protection of Establishments

Limitations on the Disclosure of Data and Safeguards

Terms, Breach, and Compliance

Any violation of the terms of this Agreement shall be grounds for immediate termination of this Agreement. AHRQ shall determine whether a data recipient has violated any term of the Agreement. AHRQ shall determine what actions, if any, are necessary to remedy a violation of this Agreement, and the data recipient(s) shall comply with pertinent instructions from AHRQ. Actions taken by AHRQ may include but not be limited to providing notice of the termination or violation to affected parties and prohibiting data recipient(s) from accessing HCUP data in the future.

In the event AHRQ terminates this Agreement due to a violation, or finds the data recipient(s) to be in violation of this Agreement, AHRQ may direct that the undersigned data recipient(s) immediately return all copies of the HCUP State Databases to AHRQ or its designee without refund of purchase fees.

Acknowledgment

I understand that this Agreement is requested by the United States Agency for Healthcare Research and Quality to ensure compliance with the AHRQ Confidentiality Statute. My signature indicates that I understand the terms of this Agreement and that I agree to comply with its terms. I understand that a violation of the AHRQ Confidentiality Statute may be subject to a civil penalty of up to $14,140 under 42 U.S.C. 299c-3(d), and that deliberately making a false statement about this or any matter within the jurisdiction of any department or agency of the Federal Government violates 18 U.S.C. § 1001 and is punishable by a fine, up to five years in prison, or both. Violators of this Agreement may also be subject to penalties under state confidentiality statutes that apply to these data for particular states.

Signed:__________________________________________________________________ Date:_________________________

Print or Type Name:_______________________________________________________________________

Title:__________________________________________________________________________________________________

Organization:____________________________________________________________________________________________

Address:________________________________________________________________________________________________

City:_____________________________________________ State:__________ ZIP Code:________________

Phone:______________________________________________ Fax:________________________________________

E-mail:_____________________________________________________________________________________

The information above is maintained by AHRQ only for the purpose of enforcement of this Agreement and for notification in the event data errors occur.

Note to Purchaser: Shipment of the requested data product will only be made to the person who signs this Agreement, unless special arrangements that safeguard the data are made with AHRQ or its agent.

Submission Information

Please send signed HCUP Data Use Agreements and proof of online training to:

HCUP Central Distributor
Social & Scientific Systems, Inc.
8757 Georgia Avenue, 12th Floor
Silver Spring, MD 20910
E-mail: HCUPDistributor@AHRQ.gov
FAX: (866)792-5313

According to the Paperwork Reduction Act of 1995, no persons are required to respond to a collection of information unless it displays a valid OMB control number. The valid OMB control number for this information collection is 0935-0206. The time required to complete this information collection is estimated to average 30 minutes per response, including the time to review instructions, search existing data resources, gather the data needed, and complete and review the information collection. If you have any comments concerning the accuracy of the time estimate(s) or suggestions for improving this form, please write to: Agency for Healthcare Research and Quality, Attn: Reports Clearance Officer, 5600 Fishers Lane, Rockville, Maryland 20857.

OMB Control No. 0935-0206 expires 01/31/2019.


1 ICD-10-CM/PCS: International Classification of Diseases, Tenth Revision, Clinical Modification/Procedure Coding System

2 ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification

3 ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification; ICD-10-CM/PCS: International Classification of Diseases, Tenth Revision, Clinical Modification/Procedure Coding System


Internet Citation: Introduction to the HCUP State Inpatient Databases (SID). Healthcare Cost and Utilization Project (HCUP). October 2018. Agency for Healthcare Research and Quality, Rockville, MD. www.hcup-us.ahrq.gov/db/state/siddist/SID_Introduction.jsp.
If you have comments, suggestions, and/or questions, please contact hcup@ahrq.gov.
Last modified 10/17/18