The following events take place Tuesday, March 26th, a day prior to the Health Datapalooza.

Health Data Workshops

In collaboration with:

CareSet

Location: Washington Hilton (1919 Connecticut Ave., NW, Washington, D.C. 20009)
Cost: $150/class
Registration: Click Here


Session 1A: Introduction to NPI and NPPES
Time: 9:00 a.m. - 12:30 p.m. ET
Location: Lincoln East

NPPES is the public dataset that lists every provider (both doctors and hospitals) in the United States. It details where they provide treatment, what type of healthcare provider they are, and basic contact information. Typically, claims data analysis and healthcare system modeling begin with this dataset. Data Source:

This year, three new files are coming out of NPPES: health information exchange endpoints, additional practice locations, and other business names. This class will explain all of the fields, give basic instructions for getting the data into an online database such as AWS or Google database products, and show how to filter the data on basic fields. Key topics will include:

  • Short history of HIPAA and its relationship to NPPES.
  • How to break the NPPES data into state-level data that can be loaded easily into Excel.
  • How to use csvkit to filter the NPPES csv file.
  • Understanding the National Uniform Claim Committee Healthcare Provider Taxonomy.
  • How to properly determine the “primary” provider type (taxonomy) for a healthcare provider.
  • Working with credentials in NPPES (be careful).
  • Review the address information in NPPES.
  • Review phone information in NPPES.
  • Review new endpoints file.
  • Review new other-name file.
  • Review new practice location file.
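
As a rough illustration of two of the topics above (breaking NPPES into state-level data and picking the "primary" taxonomy), here is a minimal Python sketch. It uses the standard library rather than csvkit, whose csvgrep command can do similar filtering from the command line. The column names are assumptions based on our reading of the NPPES header and should be checked against the actual file; the sample rows and NPIs are made up. The fallback to the first listed code when no slot is flagged primary is one possible convention, not an official rule.

```python
import csv
import io
from collections import defaultdict

# Assumed NPPES column names; verify against the header of the real file.
STATE_COL = "Provider Business Practice Location Address State Name"
TAXONOMY_SLOTS = 15  # assumed maximum number of taxonomy slots per provider

# Made-up sample rows, trimmed to two taxonomy slots for brevity.
SAMPLE_CSV = """\
NPI,Provider Business Practice Location Address State Name,Healthcare Provider Taxonomy Code_1,Healthcare Provider Primary Taxonomy Switch_1,Healthcare Provider Taxonomy Code_2,Healthcare Provider Primary Taxonomy Switch_2
1000000001,TX,207Q00000X,N,207R00000X,Y
1000000002,tx,207Q00000X,,,
1000000003,VT,,,,
"""


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def split_by_state(rows):
    """Group rows by practice-location state (upper-cased; blank becomes UNKNOWN)."""
    by_state = defaultdict(list)
    for row in rows:
        state = (row.get(STATE_COL) or "").strip().upper() or "UNKNOWN"
        by_state[state].append(row)
    return dict(by_state)


def primary_taxonomy(row):
    """Return the code whose switch is 'Y', else the first non-empty code, else None."""
    first_seen = None
    for i in range(1, TAXONOMY_SLOTS + 1):
        code = (row.get(f"Healthcare Provider Taxonomy Code_{i}") or "").strip()
        switch = (row.get(f"Healthcare Provider Primary Taxonomy Switch_{i}") or "").strip().upper()
        if not code:
            continue
        if switch == "Y":
            return code
        if first_seen is None:
            first_seen = code
    return first_seen


rows = read_rows(SAMPLE_CSV)
states = split_by_state(rows)
for state, members in sorted(states.items()):
    print(state, [(r["NPI"], primary_taxonomy(r)) for r in members])
```

For the full NPPES file, the same grouping could be done while streaming from disk with csv.DictReader and writing one output file per state, so that each piece is small enough to open in Excel.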

Session 2A: Referral/Patient Sharing Data Tutorial
Time: 1:30 p.m. – 5:00 p.m. ET
Location: Lincoln East

The “DocGraph” dataset shows how Medicare providers share patients in time. This large graph dataset can reveal the structure of the healthcare system, showing how patients flow through Medicare. Learn about a dataset that is frequently studied as one of the largest publicly available social graph datasets that uses real names. This class will also cover MrPUP, which is the explicit referral dataset, as opposed to the implicit referral dataset. Key topics will include:

  • Brief history of the dataset.
  • The basic concepts of working with graph datasets, notions of centrality, etc.
  • Understanding the basic structure of the DocGraph Hop dataset.
  • Understanding how DocGraph “Hop” differs from the original FOIA version of the dataset.
  • Using NPPES to filter a subset of the graph.
  • Loading the graph of a small state.
  • Querying the graph for a single provider.
  • Looking at secondary relationships, considering the “dandelion graph.”
  • Understanding the relationships between referral data and medication/utilization data.
  • Understanding MrPUP and explicit referral relationships.
  • Review of tools for next steps: Neo4j and Gephi.
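
To make a few of the graph topics above concrete (filtering to a subset, querying a single provider, looking at second-hop relationships, and a simple centrality measure), here is a small Python sketch using only the standard library. The edge list is made up, and the column names from_npi, to_npi, and patient_count are assumptions for illustration; check them against the header of the actual release. The second_hop helper is only a loose stand-in for the "dandelion graph" idea, and tools such as Neo4j or Gephi would be the natural choice for anything beyond a toy example.

```python
import csv
import io
from collections import defaultdict

# Made-up edge list; column names are assumed, not taken from the real file.
SAMPLE_EDGES = """\
from_npi,to_npi,patient_count
1000000001,1000000002,120
1000000001,1000000003,45
1000000002,1000000003,30
1000000003,1000000004,15
"""


def load_graph(csv_text):
    """Build a directed adjacency map: from_npi -> {to_npi: patient_count}."""
    graph = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        graph[row["from_npi"]][row["to_npi"]] = int(row["patient_count"])
    return dict(graph)


def filter_graph(graph, keep_npis):
    """Keep only edges whose two endpoints are both in keep_npis,
    for example the NPIs of one state taken from NPPES."""
    subgraph = {}
    for src, targets in graph.items():
        if src not in keep_npis:
            continue
        kept = {dst: n for dst, n in targets.items() if dst in keep_npis}
        if kept:
            subgraph[src] = kept
    return subgraph


def neighbors(graph, npi):
    """Providers that npi shares patients with directly (outgoing edges)."""
    return dict(graph.get(npi, {}))


def second_hop(graph, npi):
    """Providers exactly two hops away, excluding npi and its direct neighbors."""
    first = set(graph.get(npi, {}))
    second = set()
    for n in first:
        second.update(graph.get(n, {}))
    return second - first - {npi}


def degree_centrality(graph):
    """(in-degree + out-degree) / (node count - 1) for every node."""
    degree = defaultdict(int)
    for src, targets in graph.items():
        for dst in targets:
            degree[src] += 1
            degree[dst] += 1
    n = len(degree)
    return {node: d / (n - 1) for node, d in degree.items()} if n > 1 else {}


graph = load_graph(SAMPLE_EDGES)
print(neighbors(graph, "1000000001"))
print(second_hop(graph, "1000000001"))
print(degree_centrality(graph))
```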

Time: 9:00 a.m. - 5:00 p.m. ET
Location: AcademyHealth (1666 K Street NW, Ste. 1100, Washington, D.C.)
Cost: $50
Register: Click Here

As Application Programming Interfaces (APIs) proliferate across the industry, we still face challenges for consumers, data holders, and application developers. How can API endpoints be discovered? How can consumers have confidence in applications that are receiving their health data? How can data holders know which applications to trust? How can we address all of these challenges without limiting choice for consumers? This pre-Palooza un-conference will examine and brainstorm these issues and discuss standard approaches to addressing them in a way that unleashes innovation and accelerates the adoption and use of APIs for health data. If you are an application developer producing consumer-facing health apps, or a data holder needing to provide an API that consumers can use to share their health data, you will want to be at this event.

Supported by:

Innovation Horizon