Registration for the Annual Research Meeting (ARM) is not required to attend the health data workshops.

Registration: To register for both the health data workshops and the ARM, use the green online registration button. If you would like to register for a health data workshop, and not the full ARM program, please use the registration form and fax, email, or mail the form to AcademyHealth.

Session #1: Introduction to NPI and NPPES


Date: Tuesday, June 4, 2019
Time: 8:30 a.m. - 11:30 a.m.
Location: Marriott Marquis - Liberty I/J (Meeting Room Level 4)
Cost: $125

The National Plan and Provider Enumeration System (NPPES) is the public dataset that lists every healthcare provider (both doctors and hospitals) in the United States. It details where they provide treatment, what type of healthcare provider they are, and basic contact information. Typically, claims data analysis and healthcare system modeling begin with this dataset.

Data Source: http://download.cms.gov/nppes/NPI_Files.html

This year, NPPES is releasing three new files: health information exchange endpoints, additional practice locations, and other business names. This class will explain all of the fields, give basic instructions for loading the data into an online database such as AWS or Google database products, and show how to filter the data on basic fields. Key topics will include:

  • Short history of HIPAA and its relationship to NPPES.
  • How to break the NPPES data into state-level data that can be loaded easily into Excel.
  • How to use csvkit to filter the NPPES csv file.
  • Understanding the National Uniform Claim Committee Healthcare Provider Taxonomy.
  • How to properly determine the “primary” provider type (taxonomy) for a healthcare provider.
  • Working with credentials in NPPES (be careful).
  • Review the address information in NPPES.
  • Review phone information in NPPES.
  • Review new endpoints file.
  • Review new other-name file.
  • Review new practice location file.
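As a rough illustration of the state-level split and primary-taxonomy topics above, here is a minimal Python sketch. It assumes the NPPES CSV uses the column headers shown in the code (check them against the file you download) and that there are up to 15 taxonomy slots. The sample rows, the NPIs, and the fallback to the first taxonomy slot are made up for illustration and are not necessarily the approach taught in the class.

```python
import csv
import io
from collections import defaultdict

# Column headers assumed to match the NPPES download; verify against your file.
STATE_COL = "Provider Business Practice Location Address State Name"
CODE_COL = "Healthcare Provider Taxonomy Code_{}"
SWITCH_COL = "Healthcare Provider Primary Taxonomy Switch_{}"

# Tiny made-up sample standing in for the real, much larger file.
HEADER = ",".join(
    ["NPI", STATE_COL,
     CODE_COL.format(1), SWITCH_COL.format(1),
     CODE_COL.format(2), SWITCH_COL.format(2)]
)
SAMPLE = HEADER + """
1000000001,TX,207Q00000X,N,208D00000X,Y
1000000002,TX,282N00000X,Y,,
1000000003,VT,207R00000X,X,,
"""


def primary_taxonomy(row, max_slots=15):
    """Return the code whose primary switch is 'Y', else the first listed code."""
    for i in range(1, max_slots + 1):
        if row.get(SWITCH_COL.format(i)) == "Y":
            return row.get(CODE_COL.format(i))
    # Assumption: fall back to slot 1 when no slot is flagged as primary.
    return row.get(CODE_COL.format(1)) or None


def split_by_state(csv_text):
    """Group rows by practice-location state, keeping NPI and primary taxonomy."""
    by_state = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_state[row[STATE_COL]].append((row["NPI"], primary_taxonomy(row)))
    return dict(by_state)


states = split_by_state(SAMPLE)
for state, providers in sorted(states.items()):
    print(state, providers)
```

For the real file, the same functions could read from an open file handle instead of a string, writing one smaller CSV per state so that each piece is easier to open in a spreadsheet.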

Session #2: Referral/Patient Sharing Data Tutorial


Date: Tuesday, June 4, 2019
Time: 12:30 p.m. - 3:30 p.m.
Location: Marriott Marquis - Liberty I/J (Meeting Room Level 4)
Cost: $125

The “DocGraph” dataset shows how Medicare providers share patients over time. This large graph dataset can reveal the structure of the healthcare system, showing how patients flow through Medicare. Learn about the dataset that is frequently studied as one of the largest publicly available social graph datasets that uses real names. This class will also cover MrPUP, which is the explicit referral dataset, as opposed to the implicit referral dataset. Key topics will include:

  • Brief history of the dataset.
  • The basic concepts of working with graph datasets, notions of centrality, etc.
  • Understanding the basic structure of the DocGraph Hop dataset.
  • Understanding how DocGraph “Hop” differs from the original FOIA version of the dataset.
  • Using NPPES to filter a subset of the graph.
  • Loading the graph of a small state.
  • Querying the graph for a single provider.
  • Looking at secondary relationships, considering the “dandelion graph.”
  • Understanding the relationships between referral data and medication/utilization data.
  • Understanding MrPUP and explicit referral relationships.
  • Review of tools for next steps: Neo4j and Gephi.
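To give a feel for the graph topics above (centrality, querying a single provider, and secondary relationships), here is a small Python sketch. The edge list is invented: the letters stand in for NPIs and the third value for a shared-patient count, and this layout is an assumption, not the actual DocGraph file format. Degree centrality is used only as one simple example of a centrality measure.

```python
from collections import defaultdict

# Made-up edges: (from_provider, to_provider, shared_patient_count).
EDGES = [
    ("A", "B", 40),
    ("A", "C", 25),
    ("B", "D", 12),
    ("C", "D", 30),
    ("D", "E", 11),
]


def build_graph(edges):
    """Build a directed adjacency map: provider -> {neighbor: patient count}."""
    graph = defaultdict(dict)
    for src, dst, patients in edges:
        graph[src][dst] = patients
        graph.setdefault(dst, {})  # make sure sink-only providers appear too
    return graph


def degree_centrality(graph):
    """Fraction of the other providers each provider is directly linked to."""
    neighbors = defaultdict(set)
    for src, targets in graph.items():
        for dst in targets:
            neighbors[src].add(dst)
            neighbors[dst].add(src)
    n = len(graph)
    return {node: len(neighbors[node]) / (n - 1) for node in graph}


def two_hop(graph, provider):
    """Return (direct targets, targets reachable only via a direct target)."""
    first = set(graph[provider])
    second = set()
    for node in first:
        second.update(graph[node])
    return first, second - first - {provider}


graph = build_graph(EDGES)
centrality = degree_centrality(graph)
first_hop, second_hop = two_hop(graph, "A")
print(centrality)
print(sorted(first_hop), sorted(second_hop))
```

Plain dictionaries are enough at this toy scale. For a full state or the national graph, a dedicated tool such as those listed above is likely a better fit.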