AI for Healthcare Data Analysis

Open this on your own device

https://black-jl.github.io/data-analysis-lab/

Today's lab

You will work through two healthcare data-analysis exercises using whatever AI tool you prefer (ChatGPT, Claude, Microsoft Copilot, etc.). Both exercises follow the same arc: download a dataset, clean it, ask an AI good analytical questions, and verify the answers.

1 Patient Satisfaction 2 Supply Chain

Heads up All data used here is fictionalized for training. Never upload real patient data, PHI, or sensitive operational data to a public AI tool.

Exercise 1

AI-Assisted Patient Satisfaction Data Analysis

Joint Outpatient Experience Survey (JOES) data — Aspen Pass Army Medical Center

The scenario

You are a leader at Aspen Pass Army Medical Center tasked with improving patient satisfaction. Before launching new initiatives, you need to assess the current state of satisfaction at your facility. You have obtained one year of patient satisfaction data covering both your facility and comparable Army facilities.

About the data

The Joint Outpatient Experience Survey (JOES) combines and standardizes outpatient satisfaction surveys across Army, Navy, and Air Force medical facilities. It focuses on beneficiary experience with care received at Military Treatment Facilities (MTFs). The Defense Health Agency uses JOES data quarterly to generate "best of the best" reports ranking top-performing clinics, providers, and support staff across the Military Health System.

Key aspects evaluated:

Appointment timeliness and scheduling efficiency
Helpfulness and professionalism of front desk personnel
Provider trust and quality of clinical communication
Thoroughness of medical history review
Access to care and appointment availability
Overall satisfaction with the visit experience

This exercise uses fictionalized data modeled after authentic JOES datasets to provide realistic training while maintaining data privacy.

Download

Dataset

JOES Patient Satisfaction Data

One-year fictionalized JOES dataset, Aspen Pass + comparable Army facilities. Excel (.xlsx).

Download →

Steps

Download the patient satisfaction data file
Obtain the one-year JOES dataset above. Save it somewhere accessible on your computer.
Clean the data using AI
Healthcare data often contains inconsistencies, duplicate entries, missing values, and formatting errors that undermine analysis accuracy. Use your AI tool to identify and fix these issues. Keep a record of what changes you make.
Upload the clean data to an AI analysis platform
Once cleaning is complete and you have verified the corrections, upload the cleaned dataset to an AI platform capable of data analysis — ChatGPT, Claude, or Microsoft Copilot with data analysis capabilities.
Ask structured questions using AI prompts
Structured, context-rich prompts significantly improve AI accuracy. Try prompts like:

"Compare patient satisfaction scores for Aspen Pass Army Medical Center against the average of comparable Army facilities. Identify areas where Aspen Pass performs above or below benchmark."

"Analyze satisfaction trends by department or clinic type. Which specialties have the highest and lowest patient satisfaction ratings?"

"Identify the top three factors contributing to patient dissatisfaction at Aspen Pass based on survey responses."

"Calculate the percentage of patients rating overall satisfaction as excellent, good, fair, or poor for Aspen Pass versus comparator facilities."

Effective prompts specify the analytical goal, the desired output format, and the relevant context.
Validate the results
Don't trust AI output blindly. Validation techniques:
- Verify data completeness — are all records and fields accounted for?
- Cross-reference AI calculations with manual spot-checks of source data
- Check that findings align logically across different analytical queries
- Compare AI insights against known benchmarks or expected patterns
Never rely solely on AI output for clinical or operational decisions without human verification.

Data dictionary

Variable	Type	Label	JOES Question / Meaning
`SEX`	Categorical	Patient Sex	Patient sex/gender identifier
`MEPRS4`	Categorical	MEPRS Clinic Code	Medical Expense and Performance Reporting System (MEPRS) clinic/service code
`CLINIC_NAME`	Text	Clinic Name	Name of the clinic or outpatient service
`BENCAT`	Categorical	Beneficiary Category	Military beneficiary category of patient
`PATAGE`	Numeric	Patient Age	Patient age in years
`QGENERAL_1`	Likert (1–5)	General Satisfaction #1	"Overall, how satisfied were you with your visit?"
`QGENERAL_2`	Likert (1–5)	General Satisfaction #2	"Overall, how satisfied were you with the clinic/facility?"
`QVISITOEv1`	Open-ended	Visit Comment	"Please provide comments about your visit/provider experience."
`QFAC_OEv1`	Open-ended	Facility Comment	"Please provide comments about the facility or clinic experience."
`QSEEPROE`	Open-ended	Access to Provider Comment	"Please provide comments about your ability to see your provider when needed."
`facility_name`	Text	MTF Name	Name of military treatment facility
`return_date`	Date	Survey Return Date	Date survey was submitted or returned

BENCAT codes

Code	Meaning	Code	Meaning
`AD`	Active Duty	`RETFAM`	Retiree Family Member
`ADFAM`	Active Duty Family Member	`DR`	Dependent Retiree
`DA`	Dependent Adult	`MCR`	Medicare Eligible Retiree
`DC`	Dependent Child	`MCRFAM`	Medicare Eligible Retiree Family Member
`RET`	Retiree	`RES`	Reservist
`NG`	National Guard	`CIV`	Civilian
`OTHER`	Other Beneficiary Category

Deliverable

A "clean" data file. Keep a record of the changes you made.

Exercise 2

AI-Assisted Supply Chain Analysis

DMLSS transaction data — Medical Logistics at WHMC/SAMMC

The scenario

The Base Realignment Commission (BRAC) directed all inpatient and specialty care to relocate from Wilford Hall Medical Center (WHMC) to the new San Antonio Military Medical Center (SAMMC). The Medical Logistics Department must support both facilities during the transition.

The problem

Customer Processing Division staffing has been cut from 11 to 3 personnel
Same logistics mission: 14 customer accounts served by 2 delivery trucks
External deliveries take half a day each — the current tempo is unsustainable
You must determine the maximum delivery interval each customer can tolerate based on their ordering patterns

Your task

Using AI tools, analyze the DMLSS transaction dataset to determine each customer's consumption patterns and recommend a revised delivery schedule.

About the dataset

DMLSS transactions from 14 customer accounts (one tab per customer)
Each tab represents an external customer
Deliveries to each external customer take ½ day (4 hours)
26,500+ line-item transactions over approximately 12 months
Fields: Item ID, Description, Date Posted, Unit Price, Quantity

Download

Dataset

DMLSS Transaction Dataset

14 tabs — one per external customer. 26,500+ line items over ~12 months. Excel (.xlsx).

Download →

Key questions to answer

Maximum delivery interval per facility
What is the maximum delivery interval for each facility based on their ordering patterns?
Can the new schedule actually be executed?
Assuming you reduce the delivery schedule to account for the average / min / max service interval for each account, does this solve your problem? Can you achieve your proposed delivery schedule given:
- 3 personnel
- 5-day work week
- 4 hours per delivery
- 8-hour shifts for each employee
If it doesn't work, refine
If your proposed schedule doesn't solve the problem, how would you refine your analysis to ensure that the new delivery schedule is achievable?

Deliverables

Charts showing ordering frequency by customer
Average service interval per customer account
Recommended delivery frequency reduction
Briefing slide(s) presenting your findings

Data source: DMLSS transaction records, WHMC Lackland AFB, 2010–2011 (fictionalized for training).