QR code linking to this site

Open this on your own device

https://black-jl.github.io/data-analysis-lab/

Today's lab

You will work through two healthcare data-analysis exercises using whatever AI tool you prefer (ChatGPT, Claude, Microsoft Copilot, etc.). Both exercises follow the same arc: download a dataset, clean it, ask an AI good analytical questions, and verify the answers.

Heads up All data used here is fictionalized for training. Never upload real patient data, PHI, or sensitive operational data to a public AI tool.
Exercise 1

AI-Assisted Patient Satisfaction Data Analysis

Joint Outpatient Experience Survey (JOES) data — Aspen Pass Army Medical Center
The scenario
You are a leader at Aspen Pass Army Medical Center tasked with improving patient satisfaction. Before launching new initiatives, you need to assess the current state of satisfaction at your facility. You have obtained one year of patient satisfaction data covering both your facility and comparable Army facilities.

About the data

The Joint Outpatient Experience Survey (JOES) combines and standardizes outpatient satisfaction surveys across Army, Navy, and Air Force medical facilities. It focuses on beneficiary experience with care received at Military Treatment Facilities (MTFs). The Defense Health Agency uses JOES data quarterly to generate "best of the best" reports ranking top-performing clinics, providers, and support staff across the Military Health System.

Key aspects evaluated:

  • Appointment timeliness and scheduling efficiency
  • Helpfulness and professionalism of front desk personnel
  • Provider trust and quality of clinical communication
  • Thoroughness of medical history review
  • Access to care and appointment availability
  • Overall satisfaction with the visit experience

This exercise uses fictionalized data modeled after authentic JOES datasets to provide realistic training while maintaining data privacy.

Download

Steps

  1. Download the patient satisfaction data file

    Obtain the one-year JOES dataset above. Save it somewhere accessible on your computer.

  2. Clean the data using AI

    Healthcare data often contains inconsistencies, duplicate entries, missing values, and formatting errors that undermine analysis accuracy. Use your AI tool to identify and fix these issues. Keep a record of what changes you make.

  3. Upload the clean data to an AI analysis platform

    Once cleaning is complete and you have verified the corrections, upload the cleaned dataset to an AI platform capable of data analysis — ChatGPT, Claude, or Microsoft Copilot with data analysis capabilities.

  4. Ask structured questions using AI prompts

    Structured, context-rich prompts significantly improve AI accuracy. Try prompts like:

    "Compare patient satisfaction scores for Aspen Pass Army Medical Center against the average of comparable Army facilities. Identify areas where Aspen Pass performs above or below benchmark."
    "Analyze satisfaction trends by department or clinic type. Which specialties have the highest and lowest patient satisfaction ratings?"
    "Identify the top three factors contributing to patient dissatisfaction at Aspen Pass based on survey responses."
    "Calculate the percentage of patients rating overall satisfaction as excellent, good, fair, or poor for Aspen Pass versus comparator facilities."

    Effective prompts specify the analytical goal, the desired output format, and the relevant context.

  5. Validate the results

    Don't trust AI output blindly. Validation techniques:

    • Verify data completeness — are all records and fields accounted for?
    • Cross-reference AI calculations with manual spot-checks of source data
    • Check that findings align logically across different analytical queries
    • Compare AI insights against known benchmarks or expected patterns

    Never rely solely on AI output for clinical or operational decisions without human verification.

Data dictionary

VariableTypeLabelJOES Question / Meaning
SEXCategoricalPatient SexPatient sex/gender identifier
MEPRS4CategoricalMEPRS Clinic CodeMedical Expense and Performance Reporting System (MEPRS) clinic/service code
CLINIC_NAMETextClinic NameName of the clinic or outpatient service
BENCATCategoricalBeneficiary CategoryMilitary beneficiary category of patient
PATAGENumericPatient AgePatient age in years
QGENERAL_1Likert (1–5)General Satisfaction #1"Overall, how satisfied were you with your visit?"
QGENERAL_2Likert (1–5)General Satisfaction #2"Overall, how satisfied were you with the clinic/facility?"
QVISITOEv1Open-endedVisit Comment"Please provide comments about your visit/provider experience."
QFAC_OEv1Open-endedFacility Comment"Please provide comments about the facility or clinic experience."
QSEEPROEOpen-endedAccess to Provider Comment"Please provide comments about your ability to see your provider when needed."
facility_nameTextMTF NameName of military treatment facility
return_dateDateSurvey Return DateDate survey was submitted or returned

BENCAT codes

CodeMeaningCodeMeaning
ADActive DutyRETFAMRetiree Family Member
ADFAMActive Duty Family MemberDRDependent Retiree
DADependent AdultMCRMedicare Eligible Retiree
DCDependent ChildMCRFAMMedicare Eligible Retiree Family Member
RETRetireeRESReservist
NGNational GuardCIVCivilian
OTHEROther Beneficiary Category
Deliverable
  1. A "clean" data file. Keep a record of the changes you made.
Exercise 2

AI-Assisted Supply Chain Analysis

DMLSS transaction data — Medical Logistics at WHMC/SAMMC
The scenario
The Base Realignment Commission (BRAC) directed all inpatient and specialty care to relocate from Wilford Hall Medical Center (WHMC) to the new San Antonio Military Medical Center (SAMMC). The Medical Logistics Department must support both facilities during the transition.

The problem

  • Customer Processing Division staffing has been cut from 11 to 3 personnel
  • Same logistics mission: 14 customer accounts served by 2 delivery trucks
  • External deliveries take half a day each — the current tempo is unsustainable
  • You must determine the maximum delivery interval each customer can tolerate based on their ordering patterns

Your task

Using AI tools, analyze the DMLSS transaction dataset to determine each customer's consumption patterns and recommend a revised delivery schedule.

About the dataset

  • DMLSS transactions from 14 customer accounts (one tab per customer)
  • Each tab represents an external customer
  • Deliveries to each external customer take ½ day (4 hours)
  • 26,500+ line-item transactions over approximately 12 months
  • Fields: Item ID, Description, Date Posted, Unit Price, Quantity

Download

Key questions to answer

  1. Maximum delivery interval per facility

    What is the maximum delivery interval for each facility based on their ordering patterns?

  2. Can the new schedule actually be executed?

    Assuming you reduce the delivery schedule to account for the average / min / max service interval for each account, does this solve your problem? Can you achieve your proposed delivery schedule given:

    • 3 personnel
    • 5-day work week
    • 4 hours per delivery
    • 8-hour shifts for each employee
  3. If it doesn't work, refine

    If your proposed schedule doesn't solve the problem, how would you refine your analysis to ensure that the new delivery schedule is achievable?

Deliverables
  1. Charts showing ordering frequency by customer
  2. Average service interval per customer account
  3. Recommended delivery frequency reduction
  4. Briefing slide(s) presenting your findings

Data source: DMLSS transaction records, WHMC Lackland AFB, 2010–2011 (fictionalized for training).