Patient Discharge Data (PDD)

The Patient Discharge Dataset consists of a record for each inpatient discharge from a California-licensed hospital. Licensed hospitals include general acute care, acute psychiatric, chemical dependency recovery, and psychiatric health facilities. For more information on the data and reporting requirements, see the California Inpatient Data Reporting Manual. These datasets are available starting in 1983.

For detailed information about the data elements within the PDD, view the data dictionaries in the Data Documentation.

Emergency Department Data (ED)

The emergency department dataset includes demographic, clinical, payer, and facility information from hospitals licensed to provide emergency medical services. The ED encounters include those patients who had face-to-face contact with the provider. In the event that the patient left without being seen, the patient would not have had a face-to-face encounter with a provider and therefore the ED encounter would not be reported. A provider is defined as the person who has primary responsibility for assessing and treating the condition of a patient at a given contact and exercises independent judgment in the care of the patient. Providers include medical doctors, doctors of osteopathy, doctors of dental surgery, or doctors of podiatric medicine. If the ED encounter resulted in a same-hospital admission, the ED encounter would be combined with the inpatient record and a separate ED record would not be reported. When analyzing ED records, you may want to include the records identified in the inpatient database as having the hospital’s own ED as the source of admission. For more information on the data and reporting requirements, see the California Emergency Department and Ambulatory Surgery Data Reporting Manual. These datasets are available beginning January 2005.
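
The suggestion above, to count same-hospital admissions alongside ED records, can be sketched as follows. This is a minimal illustration, not OSHPD's method: the field name `source_of_admission` and the code value `"ED"` are hypothetical placeholders, and the actual variable names and codes must be taken from the data dictionaries.

```python
from typing import Dict, Iterable, List


def all_ed_encounters(
    ed_records: Iterable[Dict[str, str]],
    inpatient_records: Iterable[Dict[str, str]],
    ed_source_code: str = "ED",  # hypothetical code value, check the data dictionary
) -> List[Dict[str, str]]:
    """Combine ED records with inpatient records admitted through the hospital's own ED."""
    admitted_via_ed = [
        rec
        for rec in inpatient_records
        # "source_of_admission" is a hypothetical field name
        if rec.get("source_of_admission") == ed_source_code
    ]
    return list(ed_records) + admitted_via_ed
```

Inpatient records with any other source of admission are left out, so the combined list approximates all face-to-face ED encounters rather than all hospital activity.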

For detailed information about the data elements within the ED, view the data dictionaries in the Data Documentation.

Ambulatory Surgery Center Data (AS)

The ambulatory surgery dataset includes encounters from general acute care hospitals and licensed freestanding Ambulatory Surgery Centers, during which at least one ambulatory surgery procedure is performed. A freestanding ambulatory surgery clinic is defined as a surgical clinic licensed by the California Department of Public Health (CDPH). Many facilities that are called ambulatory surgery centers are not required to be licensed as surgical clinics and do not report data to the Office. An ambulatory surgery procedure is defined as those procedures performed on an outpatient basis in the general operating rooms, ambulatory surgery rooms, endoscopy units, or cardiac catheterization laboratories of a hospital or a freestanding ambulatory surgery clinic. If a procedure was done elsewhere (such as in a radiology unit), no ambulatory surgery record is required to be filed. If a hospital-based AS encounter resulted in a same-hospital admission, the AS encounter would be combined with the inpatient record and a separate AS record would not be reported. When analyzing hospital-based AS records, you may want to include AS direct admissions, which are identified in the hospital’s inpatient data as having Ambulatory Surgery at the same hospital as the source of admission. For more information on the data and reporting requirements, see the California Emergency Department and Ambulatory Surgery Data Reporting Manual. These datasets are available beginning January 2005.

For detailed information about the data elements within the AS, view the data dictionaries in the Data Documentation.


California Hospitals and Health Departments – Request Information

About

OSHPD offers several types of non-public data to licensed California hospitals and California local health departments. Eligible hospitals and local health departments may request Limited Model Data Sets of patient-level data, including Inpatient (PDD), Emergency Department (EDD), and Ambulatory Surgery (ASD) data. They may also order Patient Origin/Market Share (PO/MS) data, created to help hospitals and communities that face tremendous budgetary pressures understand key operating performance issues. In addition, Prevention Quality Indicators (PQI) are available. These are a set of measures, standardized by the Agency for Healthcare Research and Quality (AHRQ), that can be used with hospital inpatient discharge data to identify quality of care for “ambulatory care sensitive conditions.”

Limited Data Set

The Limited Data Set includes Inpatient (PDD), Emergency Department (EDD) and Ambulatory Surgery (AS) files. The contents of these files, including descriptions of the variables that they contain, are described in the non-public data documentation.  A cross-referenced list of variables across multiple years is contained in the Master Variable Grid.

PO/MS

Hospitals and communities face tremendous budgetary pressures, making the need to understand key operating performance issues critical.

The Patient Origin and Market Share (PO/MS) Report available to California licensed hospitals and Local Health Departments (AB2876-eligible requesters) includes facility, patient ZIP Code and county, age group, payer, and MS-DRG information. Data is available for 2008-2012; data can be requested for one or more years.  Visit the Open Data Portal for access to PO/MS reports for the general public. 

The PO/MS Reports support ZIP Code-based analyses, such as:

  • How big is the market?
  • Which hospital(s) control the market? (Top 5 by volume?)
  • What is the market share by service line?
  • Where do the residents of my county go for care?
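
As a rough illustration of the market-share questions above, the sketch below ranks facilities by volume for one patient ZIP Code. It assumes each PO/MS row has been reduced to a hypothetical `(patient_zip, facility, discharges)` tuple; the actual report layout may differ.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def market_share(
    rows: Iterable[Tuple[str, str, int]],  # hypothetical (patient_zip, facility, discharges)
    patient_zip: str,
    top_n: int = 5,
) -> List[Tuple[str, int, float]]:
    """Return the top facilities for a ZIP Code as (facility, volume, share)."""
    volume: Counter = Counter()
    for row_zip, facility, discharges in rows:
        if row_zip == patient_zip:
            volume[facility] += discharges
    total = sum(volume.values())
    # most_common() is empty when no rows match, so total is never used as zero
    return [(facility, n, n / total) for facility, n in volume.most_common(top_n)]
```

The same grouping could be keyed on payer or MS-DRG instead of facility to look at share by service line.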

AHRQ Quality Indicator

Two AHRQ Prevention Quality Indicator (PQI) products are available to California licensed hospitals and Local Health Departments.

These products provide indicators for preventable hospitalizations or ambulatory care-sensitive conditions, which can be used to assess healthcare quality and access. These indicators often reveal striking variations in healthcare for conditions that are potentially preventable through treatment in non-hospital settings and/or proper medication and management. The indicators are based on OSHPD’s hospital patient discharge data and include conditions such as hypertension, diabetes, and asthma in the adult population. PQI reports for the general public are also available.

The PQI Summary Table and Record-Level File can be used for community health assessments, county and ZIP Code-level “hot-spotting,” and quality improvement monitoring and evaluation. These products can help answer questions about a community’s healthcare delivery system, such as:

  • What counties and ZIP Codes have the highest preventable hospitalization rates?
  • What are the most common preventable hospitalizations in California?
  • What is the distribution of the preventable hospitalizations by age/race/ethnicity, gender, and payer?

For additional information, see our Frequently Asked Questions.


University Sponsored Researcher Data – Request Information

About

The California Office of Statewide Health Planning and Development (OSHPD) provides confidential patient-level data sets to eligible researchers. Some of these researchers are eligible through the Information Practices Act (or “IPA,” CA Civil Code Section 1798 et seq.), which permits nonprofit educational institutions (such as the University of California) and state agencies to request data for research purposes and for performing legally mandated activities. The contents of these IPA files are described in the Master Variable Grid and the Nonpublic Data Documentation.

Eligibility 

Before you start the application process, please confirm your eligibility. Nonprofit university-sponsored researchers are eligible to request IPA files. All IPA confidential data requests must have:

  • Request for Nonpublic Patient Level data approved by OSHPD
  • Research protocol approved by the Committee for the Protection of Human Subjects (CPHS). The CPHS protocol must be renewed annually and kept current while the researcher has the data. CPHS is the state Institutional Review Board for the California Health and Human Services Agency. CPHS approves requests at committee meetings, held six times per year.

State agencies are also eligible to obtain IPA files, based on the need to perform their constitutional or statutory duties; the use of the data must be compatible with the purpose for which the data was collected. The requesting agency must attest that their use of OSHPD data is to support a mandated activity and a reference to the legal citation must be included as part of the application.

Linked Files – Birth and Death Data 

The OSHPD patient record-level data is linked with the Vital Statistics Birth Statistical Master File, Birth Cohort File, and Death Statistical Master Files. The vital statistics files themselves are available from the California Department of Public Health at the Vital Statistics Data Web site. The linked data files are available to qualified researchers through requests submitted to OSHPD.

Linked Birth Files

The Linked Birth File is a research database created for the purpose of studying delivery and birth outcomes. This linkage utilizes information from the following datasets:

  • California Patient Discharge Data
  • Vital Statistics Birth Certificate Data
  • Vital Statistics Death Certificate Data
  • Vital Statistics Fetal Death File
  • Vital Statistics Birth Cohort File

It includes maternal antepartum and postpartum hospital records for the nine months prior to delivery and one-year post delivery. In addition, the linked file includes birth records and all infant readmissions occurring within the first year of life. The linked pairs of birth/delivery records include information associated with a mother/baby pair from the baby’s discharge data record, the mother’s discharge data record, and the birth certificate data. All associated records (prenatal, postnatal, transfers and infant readmissions) are identified by the variable _BRTHID and are sorted in admission date order.
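
A minimal sketch of how the associated records might be gathered per mother/baby pair is shown below. `_BRTHID` is the variable named above; the `admission_date` field name is a hypothetical placeholder for whatever admission date variable the file provides.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List


def group_by_birth_id(records: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group linked records by _BRTHID and sort each group by admission date."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        groups[rec["_BRTHID"]].append(rec)
    for recs in groups.values():
        # "admission_date" is a hypothetical field name holding a datetime.date
        recs.sort(key=lambda r: r["admission_date"])
    return dict(groups)
```

Because the file is described as already sorted in admission date order, the sort is a safeguard for records that have been filtered or merged since delivery.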

The file contains all infants who were born in a given year, including births that occurred in a California hospital that reports to OSHPD, births that occurred in a California hospital that did not report to OSHPD, and births that occurred outside California. It includes all infants and mothers irrespective of whether they were linked to a birth record or not. Linked Birth files are available to qualified researchers beginning with the 1991 calendar year reporting period. See the Master Variable Grid for available years. Note: The most recent year may not be a full cohort file depending on availability of the input cohort file.

Linked Death Files 

OSHPD has developed validated research datasets linking patient data with the state death statistical master file. These datasets allow researchers to track mortality outcomes within and outside of the hospital.

Probabilistic Linked Death File

This data file provides a unique best match of a single death record to a patient’s last identifiable record in the Patient Discharge Data. Two versions of the probabilistic Linked Death file are available beginning with calendar year 1990.

  • Version A – Death records are linked to the last PDD discharge record, regardless of type of care.
  • Version B – Death records are linked to the last PDD discharge record for acute type of care.
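
The difference between the two versions can be illustrated with a small sketch that picks each patient's last discharge record. It is not OSHPD's linkage procedure, which is probabilistic; the field names `patient_id`, `discharge_date`, and `type_of_care`, and the value `"acute"`, are hypothetical.

```python
from datetime import date
from typing import Dict, Iterable


def last_discharge(records: Iterable[dict], acute_only: bool = False) -> Dict[str, dict]:
    """Return each patient's latest discharge record.

    acute_only=False mirrors Version A (any type of care);
    acute_only=True mirrors Version B (acute type of care only).
    """
    latest: Dict[str, dict] = {}
    for rec in records:
        if acute_only and rec["type_of_care"] != "acute":
            continue
        pid = rec["patient_id"]
        if pid not in latest or rec["discharge_date"] > latest[pid]["discharge_date"]:
            latest[pid] = rec
    return latest
```

A patient whose final stay was in a non-acute setting would therefore be matched to different records in the two versions.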

Deterministic Linked Death File

The deterministic linkage, available beginning with calendar year 2005, links the state death statistical master file to the PDD, ED, and AS files. Requesting all 3 files can provide mortality outcomes for any inpatient, emergency department, or licensed ambulatory surgery care setting.

Coronary Artery Bypass Graft (CABG) Data

The Coronary Artery Bypass Graft (CABG) File is a research database created for the purpose of studying outcomes related to the most common surgical procedure for treating coronary artery disease. In this surgery, a vein or artery from another part of the body is used to create a new path for blood to flow to the heart, bypassing the blocked artery. Coronary artery disease is the leading cause of all adult non-maternal admissions to California hospitals, representing nearly 9% of all admissions. CABG data is collected from California-licensed hospitals where surgeons performed isolated CABG surgery, via the California CABG Outcomes Reporting Program (CCORP). The data is analyzed for quality of care reporting purposes, in compliance with California Health and Safety Code Sections 128745-128750. CCORP data are available for research purposes, subject to review and approval by OSHPD. CABG files are available to qualified researchers beginning with the 2006 calendar year. See the Master Variable Grid for available years.

For additional information, see our Frequently Asked Questions.


Customized Data Service – Request Information

About 

The Healthcare Analytics Branch analysts respond to customized data requests. We specialize in statistical summaries of OSHPD data that can be used to better understand healthcare delivery systems and population health and to support planning, policy development, and performance improvement and evaluation efforts. OSHPD data customers include the legislature, federal, state, and local government agencies, the healthcare industry, insurers, consumer groups, the media and the general public.

Professional analytical staff provide technical assistance on the uses of OSHPD’s data and help ensure that each request is met with the correct data and information.

OSHPD analysts cannot provide diagnostic or procedural coding advice. 

All data released is aggregate data, and is subject to the CHHS De-identification Guidelines.

Customized Resources Services Cost

Students, nonprofits, and media will receive the first 4 hours of resource services at no charge and subsequent hours at $100/hour. This includes consultation, product development, de-identification, and technical assistance needed to produce data products.
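
The fee rule above reduces to simple arithmetic, sketched here for students, nonprofits, and media only. Billing fractional hours pro rata is an assumption; the text does not say how partial hours are charged.

```python
FREE_HOURS = 4     # first 4 hours at no charge
HOURLY_RATE = 100  # dollars per subsequent hour


def service_cost(hours: float) -> float:
    """Estimated charge in dollars for the given hours of resource services."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    # Assumption: partial hours beyond the free allowance are billed pro rata
    return max(0.0, hours - FREE_HOURS) * HOURLY_RATE
```

For example, a request that takes 6.5 hours would be estimated at 2.5 billable hours.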

For additional information, see our Frequently Asked Questions.


Publicly Available Data – Information

The following products are available for anyone to order or access. 

Public Use Files

About

Each record within the data sets consists of either one inpatient discharge, or one outpatient encounter, also known as a service visit. Data included in the public datasets includes clinical, payer, and facility information. Review the documentation to determine if the PUF meets your analytical needs. 

To protect individual patient privacy and confidentiality, OSHPD no longer creates Public Use Files (PUFs) of patient discharge, emergency department, and ambulatory surgery data. 

Changes in data availability and data use, as well as changes in the specificity of the medical coding within the records, have made it impossible to create sets of individual patient records for public release that are both de-identified and retain any significant utility.  OSHPD publishes many aggregated data products and continues to offer services to create customized data sets and analyses and to answer data questions. 

For additional information, see our Frequently Asked Questions.


Annual Financial Disclosure Reports

About 

Hospital Annual Financial Disclosure Report. This report is filed annually by each hospital licensed by the State of California. The information collected includes the type of ownership, number of beds, balance sheets and income statements, revenues by payer, and expenses by natural classification.

Long-Term Care Annual Financial Disclosure and Medi-Cal Cost Report. This report is filed annually by each skilled nursing, intermediate care, mentally disordered/developmentally disabled and congregate living health facility licensed by the State of California. The information collected includes the type of ownership, number of beds, balance sheets and income statements, revenues by payer, and expenses by natural classification.

You can search for Annual Reports for individual health facilities on SIERA.

Quarterly Financial and Utilization Reports

About 

Quarterly Financial and Utilization Report. This report is filed quarterly by each hospital licensed by the State of California. The information collected includes summary financial and utilization information.

You can search for Quarterly Reports for individual health facilities on SIERA.

CHHS Open Data Portal

About

The California Health and Human Services Agency (CHHS) has launched its Open Data Portal initiative in order to increase public access to one of the State’s most valuable assets – non-confidential health and human services data. Its goals are to spark innovation, promote research and economic opportunities, engage public participation in government, increase transparency, and inform decision-making. “Open Data” describes data that are freely available, machine-readable, and formatted according to national technical standards to facilitate visibility and reuse of published data.

The portal offers access to standardized data that can be easily retrieved, combined, downloaded, sorted, searched, analyzed, redistributed and re-used by individuals, business, researchers, journalists, developers, and government to process, trend, and innovate.


Frequently Asked Questions

General Frequently Asked Questions

How much does the data cost?

OSHPD no longer charges eligible requesters for patient-level datasets ordered or filled on or after 7/1/2018. This includes PDD, EDD, and ASD datasets requested through the Limited Data Set (AB2876) and Research Data (IPA) request processes, as well as the currently available PUF files (2010-2014). There are no refunds for data requests fulfilled prior to midnight on 6/30/2018.

What am I eligible for?

Eligible entities or persons can receive patient-level data sets. Ineligible entities or persons may request the most up-to-date Public Use File (PUF), use the Open Data Portal for select datasets, or request a Customized Data Resource Service (i.e., de-identified aggregated data summaries created by professional analytical staff at OSHPD).

Limited Data Sets: California Licensed Hospitals and Local Health Departments, as well as some State and Federal agencies, are eligible to receive the Limited Data Sets (formerly known as AB2876).  These are HIPAA limited data sets. Hospitals and Local Health Departments can also request Patient Origin/Market Share reports and AHRQ Prevention Quality Indicator Products within the request.  The “Limited Data Sets” were developed by OSHPD to streamline the §128766 data request process for many hospitals and public health entities. The concept behind the limited data sets is that public health officials and hospitals could explain or justify in advance their need for certain data elements for certain common purposes so that each requester would not need to do so each time they requested non-public data under §128766. Before the first limited data sets were designed, there was a series of meetings with interested parties who explained their most common anticipated uses for the data and the data elements that were the minimum necessary for those uses. Each dataset contains a set of the least sensitive data elements that will meet the most common needs cited by these data requesters. The Limited Data Set Documentation provides the justification for the inclusion of each data element for datasets designed for certain purposes. When the data sets are requested for such purposes, the specific justification normally required for each individual data element can be waived.

If a hospital or public health entity wants to request a set of data elements different from those included in the “limited data sets” or a public file, they must justify the need for each data element individually, using the Justification Grids for a Custom Data Set, based on the same limited data files that the Limited Data Sets are developed from.

Research Data Sets: Researchers from nonprofit degree-granting research institutions (i.e., universities) can apply for Research Datasets (formerly known as IPA). These data are restricted to “minimum variables necessary” and require approval through CPHS. Requests for Linked Birth and Linked Death data must also be processed by the Vital Statistics Advisory Committee at the California Department of Public Health. The California Office of Statewide Health Planning and Development (OSHPD) provides confidential patient-level data sets to eligible researchers. Some of these researchers are eligible through the Information Practices Act (or “IPA,” CA Civil Code Section 1798 et seq.), which permits nonprofit educational institutions (such as the University of California) and state agencies to request data for research purposes and for performing legally mandated activities. The contents of these IPA files are described in the Master Variable Grid and the Nonpublic Data Documentation. Additionally, detailed submission guidelines for the patient data are available for the current year.

What are the differences between OSHPD’s confidential data and public datasets? 

Here is a side by side view of the different datasets. 

How is data shipped?

You may receive data digitally via an approved SFTP method. If you require a hard copy, shipments inside of California are sent overnight via GSO. Shipments outside of California are sent via FedEx Ground. If you are outside of California but require expedited shipping you may either pay for the shipping fees or supply us with a shipping label. 


California Hospitals and Local Health Departments

When is the data available?

Patient-level data, i.e., PDD, ED, and AS data, is generally available annually by mid-July for eligible requesters.

How long does it take to receive data after I submit a request?

Approximately 6-8 weeks, dependent on corrections needed.

What years of confidential data are available?

A Limited Data file is available for each year the inpatient discharge (PDD), emergency department (ED) or ambulatory surgery (AS) patient data was collected.

Can a private consultant or contractor order the Limited Data Sets?

No. However, a licensed California hospital or Local Health Department can request the data and submit a signed Business Associate Agreement with a contractor. The contractor would then be able to analyze the data for the hospital. All contacts for corrections and more information for Limited Data Set requests will be made through the contact listed on the request form; no information can be shared with the contractor or consultants.

Is there a template “Business Associate Agreement” available?

OSHPD does not have a template Business Associate Agreement.

What purpose does the Data Use Agreement serve?

HIPAA specifically requires that the data use agreement must:

  • Establish the permitted uses and disclosures of the health information; these must be consistent with the stated purpose. The agreement may not authorize the recipient to use or further disclose the data in a manner that would violate the HIPAA regulations if done by the covered entity;
  • Establish who is permitted to use or receive the limited data set (including any agents and subcontractors)
  • Provide that the data recipient will:
      • Not use or further disclose the information other than as permitted by the agreement or as otherwise required by law.
      • Use appropriate safeguards to prevent use or disclosure of the information other than as provided for in the agreement (i.e., provide appropriate data security).
      • Report to the entity providing the information any use or disclosure that is not provided for in the agreement of which it becomes aware.
      • Ensure that any agents who have access to the information agree to the same restrictions and conditions that apply to the limited data set recipient with respect to such information.
      • Not identify the information or contact the individuals.

What is a “Limited Data Set”?

Under HIPAA, a Limited Data Set is a set of individually identifiable health information. The term “Limited Data Set” refers to a specific subset of data created for a specific purpose.

A Limited Data Set may be created and disclosed if a set of requirements is met:

OSHPD’s Limited Data Sets are required to be consistent with the requirements of 45 CFR Section 164.514. The statutes that govern OSHPD’s release of patient record-level data specify that only the minimum necessary data for the approved purpose may be released. In consultation with hospitals and local health officers, OSHPD has developed Limited Data Sets for inpatient discharge (PDD), emergency department (ED), and ambulatory surgery (AS) data. These data sets contain the data elements typically required for the functions of healthcare operations and public health activities; a standard justification for these data elements has been pre-approved. Additional data elements may be requested based on justification of need for these elements. Note that the direct identifiers collected in the OSHPD data are not available for release under H&S Code 128766.


University Researchers

When is the data available? 

Patient-level data, i.e., PDD, ED, and AS data, is generally available annually by mid-July for eligible requesters. Linked Birth data is current through 2012. Linked Death data is current through 2013.

How long does it take to receive data after I submit a request?

Approximately 6-9 months, dependent on corrections needed.

Why does it take longer to get PDD/Linked Birth or Death data?

The California Department of Public Health is required to review all requests that contain Birth and Death Certificate data, which includes our OSHPD Patient data linked to Birth or Death Certificate data. We are not allowed to release PDD/Linked Birth data until it is approved by the CDPH Vital Statistics Advisory Committee. This review is in addition to the approval by OSHPD and CPHS.

I want to link OSHPD data to data from another source.  Is this possible?

It depends on what is being linked and how the data are being used. This is one of the factors we look at when reviewing your request. OSHPD data cannot be used to re-identify actual patients. When a linkage of this nature is needed for a specific research project, details of how the linkage can occur without the identifiers being released to the researcher need to be discussed.

What am I allowed to do with the data?

You may use the data for the project that OSHPD and CPHS approved. Within that project, you may analyze the data, prepare reports and articles, and publish your findings.

YOU MAY NOT

  • use the data for a different project
  • use someone else’s approved data for your own project
  • share the data with anyone not explicitly listed in the OSHPD request form and Data Use Agreement
  • publish patient-level data or small cell size counts less than 15
  • change your scope of work in your protocol or your OSHPD request form without proper approvals
  • keep patient-level data in your system after the project has ended

If an article is published from the project I did, am I required to give OSHPD a copy?

You are not required to submit a copy of published materials or papers to us, but we like to know how our data is being used.

Do you publish my information on your website?

We currently list all approved IPA requests on the CHHS Open Data Portal, after the data has been released.


Customized Data Resources 

How much do Customized Data Resources cost? 

Students, nonprofits, and media will receive the first 4 hours of resource services at no charge and subsequent hours at $100/hour. This includes consultation, product development, de-identification, and technical assistance needed to produce data products.

How do you handle small cell sizes?

If necessary, OSHPD will mask for small cell sizes under 11.  Complementary cell masking may also be applied.
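
The masking rule can be sketched for a single table row as follows. This is an illustration under stated assumptions, not OSHPD's procedure: zero counts are treated as not small, the row total is assumed to be published, and the complementary cell is chosen as the smallest remaining cell.

```python
from typing import List, Optional

THRESHOLD = 11  # counts under 11 are masked


def mask_row(counts: List[int], threshold: int = THRESHOLD) -> List[Optional[int]]:
    """Return the row with masked cells replaced by None."""
    # Assumption: a count of zero is not treated as a small cell
    masked: List[Optional[int]] = [None if 0 < c < threshold else c for c in counts]
    hidden = [i for i, v in enumerate(masked) if v is None]
    if len(hidden) == 1:
        # With a published row total, one masked cell could be recovered by
        # subtraction, so a second (complementary) cell is masked as well.
        visible = [i for i, v in enumerate(masked) if v is not None]
        if visible:
            complement = min(visible, key=lambda i: counts[i])
            masked[complement] = None
    return masked
```

A full table would need the same check applied to column totals, which this single-row sketch does not attempt.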


Public Data Requests 

What years do you have available for the Public Data Sets? 

We have PDD, ED and AS from 2010-2014.

How long does it take to receive data after I submit a request?

Between 1-5 business days. 

What is the difference between PDD, AS, and ED Public Use Files (PUF)?

The PDD, ED, and AS data files represent data submissions from different types of California provider organizations. Patient discharge data is submitted to OSHPD by hospitals (general acute care, acute psychiatric, chemical dependency recovery, and psychiatric health facilities), emergency department data is submitted by hospital emergency departments, and ambulatory surgery data is submitted by general acute care hospitals and licensed freestanding ambulatory surgery clinics.

Do the Public Data Sets contain demographic variables (age, gender, race, ethnicity, etc.)?

No. However, the federal Agency for Healthcare Research and Quality (AHRQ), as part of its Healthcare Cost and Utilization Project (HCUP), makes available de-identified files from the OSHPD patient-level data sets that have been statistically manipulated to render them un-linkable to other OSHPD patient-level datasets. Geographical identifiers (ZIP Code and county) have been removed from these files, but not demographic identifiers. Access to these files requires signing a detailed data use agreement and taking a short online training course on data use. More information and application kits are available at the HCUP Central Distributor Technical Assistance Center.

Are 3-digit ZIP Codes available for all records? Is there a masking rule based on population?

Three-digit ZIP Codes are available on all records. The variable is masked only when a facility has a single record, which affects only one or so records per file.

Is it possible to get County added to the Public Use File, given that Gender, Race, and Age are removed?

The Public Data Set will not be modified; however, all feedback will be considered in the future. There currently are several county-level products available and more coming soon.

Is the revised Public Use File 3-digit ZIP Code the first three digits or the last three digits of the 5-digit ZIP Code?

The 3-digit ZIP Code is the USPS prefix, or the first three digits of the 5-digit ZIP Code.
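
For readers deriving the same prefix from their own 5-digit ZIP Codes, a minimal sketch follows. Keeping the value as text is a design choice here so that leading zeros in out-of-state ZIP Codes are not lost; the function name is illustrative.

```python
def zip3(zip_code: str) -> str:
    """Return the first three digits of a 5-digit (or ZIP+4) ZIP Code."""
    cleaned = zip_code.strip()
    if len(cleaned) < 5 or not cleaned[:5].isdigit():
        raise ValueError(f"not a 5-digit ZIP Code: {zip_code!r}")
    return cleaned[:3]
```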

Can I show the PUF to my co-workers/affiliates or is this strictly for my own use?

The PUF Data Use Agreement specifies:

In accessing patient level data, I agree to the following:

  • I will not further distribute any patient-level data or individual patient records, and I will not permit others to do so.
  • I will not use or permit others to use the data to learn the identity of any individual patient.
  • I will not link or permit others to link the data with any other individual level data that would increase the potential for patient identification.

What is the difference between comma-delimited text format (.txt) and SAS (.sas7bdat) format for the public data sets?

Comma-delimited text format provides the data as ASCII text. SAS-formatted files are created using SAS, a widely used statistical data analysis software package, in a format native to the SAS program.

Why can’t I see the data? Do I need a certain version of Excel or SAS software to see the data?

Statistical analysis software (SAS, SPSS, etc.) is required to open .sas7bdat formatted files. Comma-delimited files (.txt) can be opened by multiple software programs, including Excel, Access, SAS, and SPSS.
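
Comma-delimited files can also be read with a general-purpose language. The sketch below uses Python's standard `csv` module and assumes the file starts with a header row; the column names and values in the sample are made up for illustration and do not reflect the actual file layout.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_comma_delimited(handle: TextIO) -> List[Dict[str, str]]:
    """Read a comma-delimited text file into a list of row dictionaries."""
    # Assumption: the first line holds the column names
    return list(csv.DictReader(handle))


# Made-up sample standing in for a real file
sample = io.StringIO("facility_id,zip3,payer\n000001,958,Medicare\n")
rows = read_comma_delimited(sample)
```

For a file on disk, the same function can be passed a handle opened with `open(path, newline="")`, where `path` is the location of the downloaded .txt file.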

