Sunday, June 9, 2013

SPC for Lean Newbies

I noted my concern a couple of posts back that the Lean Healthcare Transformation Summit 2013 appeared to be way light on the technical detail of SPC (Statistical Process Control), a core component of the PDSA cycle that is otherwise touted as the foundation of the Lean process.

PDSA should really be "SPDSA" -- Study, Plan, Do, Study, Act.

I guess it's implicit in the "Plan" part (study the current state and incorporate the findings into your Plan). But, I didn't see much evidence of the quantification imperative of that in Orlando.

In fairness, my cautionary dubiety about "Six Sigma" aside, the DMAIC people are on point here (props to the Wiki). The five stages, in order, are Define, Measure, Analyze, Improve, and Control:
The purpose of this step is to clearly articulate the business problem, goal, potential resources, project scope and high-level project timeline. This information is typically captured within project charter document. Write down what you currently know. Seek to clarify facts, set objectives and form the project team. Define the following:

  • A problem statement
  • The customer(s)
  • Critical to Quality (CTQs) — what are the critical process outputs?
  • The target process subject to DMAIC and other related business processes
  • Project targets or goal
  • Project boundaries or scope
  • A project charter is often created and agreed upon during the Define step.
The purpose of this step is to objectively establish current baselines as the basis for improvement. This is a data collection step, the purpose of which is to establish process performance baselines. The performance metric baseline(s) from the Measure phase will be compared to the performance metric at the conclusion of the project to determine objectively whether significant improvement has been made. The team decides on what should be measured and how to measure it. It is usual for teams to invest a lot of effort into assessing the suitability of the proposed measurement systems. Good data is at the heart of the DMAIC process:

  • Identify the gap between current and required performance.
  • Collect data to create a process performance capability baseline for the project metric, that is, the process Y(s) (there may be more than one output).
  • Assess the measurement system (for example, a gauge study) for adequate accuracy and precision.
  • Establish a high level process flow baseline. Additional detail can be filled in later.
The purpose of this step is to identify, validate and select root cause for elimination. A large number of potential root causes (process inputs, X) of the project problem are identified via root cause analysis (for example a fishbone diagram). The top 3-4 potential root causes are selected using multi-voting or other consensus tool for further validation. A data collection plan is created and data are collected to establish the relative contribution of each root causes to the project metric, Y. This process is repeated until "valid" root causes can be identified. Within Six Sigma, often complex analysis tools are used. However, it is acceptable to use basic tools if these are appropriate. Of the "validated" root causes, all or some can be

  • List and prioritize potential causes of the problem
  • Prioritize the root causes (key process inputs) to pursue in the Improve step
  • Identify how the process inputs (Xs) affect the process outputs (Ys). Data is analyzed to understand the magnitude of contribution of each root cause, X, to the project metric, Y. Statistical tests using p-values accompanied by Histograms, Pareto charts, and line plots are often used to do this.
  • Detailed process maps can be created to help pin-point where in the process the root causes reside, and what might be contributing to the occurrence.
The purpose of this step is to identify, test and implement a solution to the problem; in part or in whole. Identify creative solutions to eliminate the key root causes in order to fix and prevent process problems. Use brainstorming or techniques like Six Thinking Hats and Random Word. Some projects can utilize complex analysis tools like DOE (Design of Experiments), but try to focus on obvious solutions if these are apparent.

  • Create innovative solutions
  • Focus on the simplest and easiest solutions
  • Test solutions using Plan-Do-Study-Act (PDSA) cycle
  • Based on PDSA results, attempt to anticipate any avoidable risks associated with the "improvement" using FMEA
  • Create a detailed implementation plan
  • Deploy improvements
The purpose of this step is to sustain the gains. Monitor the improvements to ensure continued and sustainable success. Create a control plan. Update documents, business process and training records as required.

A Control chart can be useful during the control stage to assess the stability of the improvements over time.
OK, thought experiment example. I Googled "control chart" and just picked one based on visual appeal.

So, let's call this Current State Customer Support Email Response Cycle Time and do a quick bit of Photoshopping. The idea here is cycle time from date/time receipt of a customer support email request to the time a response is recorded as "delivered" (not opened and read, just "delivered" -- because that's all we control).

I eyeballed and added the 2 sigma upper and lower "warning limit" lines in yellow.

Let's assume we culled a random sample of n=160 out of our support email server inbox. We find a current state of roughly two days' response time, ~58 hours worst case. The sample appears to be roughly normally distributed (though we could test for that), and compliant with Gaussian assumptions for our purposes (though 2 CL "outliers," i.e., points outside the 3 sigma control limits, at n=160 raises a question; it's ~5x what we might expect by The Book. Still...).
  • Standard Deviation ("1 sigma") is 3.38 hours (I had to calculate this from the original data; no biggie).
  • C.V. ("Coefficient of Variation," a.k.a. "Relative Standard Deviation" or "RSD") is ~7.1%, meaning that under the current process we can unremarkably expect roughly two-thirds of responses to fall within +/- 7.1% of the mean response time (that's what "standard deviation" means -- expected variability).
  • The variation spread between the UCL and LCL, then, is about 42.5% relative to the mean.
The RSD is simply a measure of variability relative to the mean. It is useful. High RSD is a red flag, given that a core goal of any QI method is reduction of variation.
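
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the numbers above. It assumes the control limits are simply the mean +/- 3 standard deviations and the warning limits the mean +/- 2, which is how the figures in this post work out; a production individuals chart would often estimate sigma from moving ranges instead. The function names are mine, purely for illustration.

```python
import math
import statistics


def baseline_summary(hours):
    """Summarize a baseline sample of response cycle times (in hours)."""
    mean = statistics.mean(hours)
    sd = statistics.stdev(hours)  # sample standard deviation ("1 sigma")
    return {
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,          # coefficient of variation (RSD)
        "lwl": mean - 2 * sd,     # lower 2 sigma warning limit
        "uwl": mean + 2 * sd,     # upper 2 sigma warning limit
        "lcl": mean - 3 * sd,     # lower 3 sigma control limit
        "ucl": mean + 3 * sd,     # upper 3 sigma control limit
        "spread": 6 * sd / mean,  # (UCL - LCL) relative to the mean
    }


def expected_beyond_limits(n, k=3.0):
    """Expected count of points outside +/- k sigma if the data were exactly normal."""
    return n * math.erfc(k / math.sqrt(2))
```

With n=160, `expected_beyond_limits(160)` comes to about 0.43 points outside the 3 sigma limits, so 2 observed is the "~5x" noted above. Likewise, 6 x 7.1% gives the roughly 42.5% UCL-to-LCL spread.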

OK: Notwithstanding that this appears to be (in our thought experiment) a representative baseline random sample (no evident non-zero trendline, which would be one basic marker of process instability), I'd be wanting to drill down deeper. But, that's another, more subtle issue.
For example, might we isolate all of the encounters which are, say, below -1 sigma (quicker response times), and look for any commonalities (i.e., identifiable "special causes")? As I noted in prior posts discussing HEDIS data examples, I might see a nominally random scatter depicting no apparent relation between cost and quality (below, CAD outcomes by cost proxies), but I'd be on the data-mining lookout for anything unique in that first quadrant. What are the people in the high-quality, low-cost segment doing right?
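
A rough sketch of that drill-down, assuming each encounter is a record with an "hours" field plus whatever descriptive attributes we happened to log. The field names here are hypothetical, not from any real system.

```python
import statistics
from collections import Counter


def fast_response_commonalities(records, attribute, k=1.0):
    """Tally one attribute among encounters faster than (mean - k sigma)."""
    hours = [r["hours"] for r in records]
    cutoff = statistics.mean(hours) - k * statistics.stdev(hours)
    fast = [r for r in records if r["hours"] < cutoff]
    return cutoff, Counter(r[attribute] for r in fast)


# Made-up records, purely for illustration.
records = [
    {"hours": 40, "agent": "A"},
    {"hours": 48, "agent": "B"},
    {"hours": 48, "agent": "A"},
    {"hours": 48, "agent": "C"},
    {"hours": 56, "agent": "B"},
]
cutoff, counts = fast_response_commonalities(records, "agent")
```

A tally like this only flags candidates for a closer look. Whether a commonality is a genuine "special cause" still takes follow-up.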


OK, so, back to our "control chart," we have some current state data. We then have to decide upon what will constitute a "significant" improvement should we undertake to try a process change. In science, you decide and document this prior to proceeding to your "Do" stage.

The salient (and difficult) question becomes one of declaring something along the lines of "we can reduce response cycle time by 20% with a concomitant reduction in variability" by doing "X".

At this point, "Do X," measure the upshot ("Study"), and "Act" on the basis of your findings.
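
Here is one way the pre-declared criterion might be written down before the "Do" stage and then checked in the "Study" stage. This is a sketch under my own assumptions: the 20% target comes from the example above, "reduction in variability" is read as a smaller sample standard deviation, and a Welch t statistic is computed as a rough signal-to-noise check. A real protocol would also fix the sample sizes and the significance threshold in advance.

```python
import math
import statistics


def study_results(baseline, after, min_reduction=0.20):
    """Compare post-change cycle times against a pre-declared improvement target."""
    m0, m1 = statistics.mean(baseline), statistics.mean(after)
    s0, s1 = statistics.stdev(baseline), statistics.stdev(after)
    reduction = (m0 - m1) / m0  # fractional drop in mean cycle time
    # Welch's t statistic (unequal variances), baseline minus after
    welch_t = (m0 - m1) / math.sqrt(s0**2 / len(baseline) + s1**2 / len(after))
    return {
        "reduction": reduction,
        "sd_reduced": s1 < s0,
        "welch_t": welch_t,
        "target_met": reduction >= min_reduction and s1 < s0,
    }


# Made-up numbers, purely for illustration.
result = study_results([46, 48, 50], [37, 38, 39])
```

The point of writing the threshold into the function's defaults is the same as documenting it up front: the bar does not move after the data come in.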

This stuff is no "thought experiment" abstraction to me. It was my daily life in the 1980s in Oak Ridge (below). I painstakingly wrote the code that rendered this (PDF).

This is admittedly pedestrian "old school" QC stuff, but it's at the heart of being scientific.


While attending the Lean Healthcare Transformation Summit 2013 "CEO Panel" discussion session last week, I had the irascible thought "my, my, -- what an incredibly diverse group of middle-aged white men." I noted the absence of women CEOs in a tweet.

This just came in my inbox.


Had to Photoshop it.

From a comment on The Health Care Blog today.


...SNOMED CT clinical terminology is not widely adopted among providers and vendors, yet Stage 2 starts in October 2013 for hospitals. In particular, EHRs don’t capture communication codes present in 2014 CQMs, such as a specific code that conveys among physicians the degree of a medical condition, or “exclusion” codes that give a patient’s reason for declining medication or notes a patient doesn’t qualify for the medication, DeLano explains. Nor are most providers yet familiar with using SNOMED for clinical documentation, he adds.

Further, adopting SNOMED codes for clinical documentation is a major task, not so far from the complexity of ICD-10, DeLano contends, but the time needed to focus on SNOMED isn’t available as the industry adopts ICD-10. There are benefits to using SNOMED, but if providers and vendors aren’t ready for it, then they won’t be able to attest for meaningful use, he notes. “Providers think they are good because they are on a certified EHR product, but won’t get the clinical quality measures they want if the codes aren’t properly mapped.”

Asked if the federal government recognizes a gap in SNOMED readiness for Stage 2, DeLano says, “I think there is awareness that there will be a shortfall in the reporting of CQMs.”...
Interesting. Concerns have been voiced over the utility of CQMs, e.g.,
Validity of electronic health record-derived quality measurement for performance monitoring

Amanda Parsons, Colleen McCullough, Jason Wang, and Sarah Shih

J Am Med Inform Assoc. 2012 Jul-Aug; 19(4): 604–609.
Published online 2012 January 16.
...We looked across the 11 clinical quality measures to assess where information was documented. The presence of data recognized for automated quality measurement varied widely, ranging from 10.7% to 99.9% (table 2). Measure components relying on vitals, vaccinations, and medications had the highest proportion of information documented in structured fields recognized by the automated quality measures. The majority of diagnoses for chronic conditions such as diabetes (>91.4% across measures), hypertension (89.3%), ischemic cardiovascular disease (>78.8% across measures) and dyslipidemia (75.1%) were documented in the problem list, a structured field used for automated quality measurement. Patient diagnoses not recognized for inclusion in the measure were recorded in the medical history, assessment, chief complaint, or history of present illness, sections that typically allow for free-text entries.

Diagnostic orders or results for mammogram had the lowest proportion (10.7%) of data recorded in structured fields recognized for automated quality measurement. The majority of the information for breast cancer screening was found as scanned patient documents and diagnostic imaging; both sources of information are not amenable for automated electronic queries.

Nearly half of the information for measures that require a laboratory test result, such as control of hemoglobin A1c and cholesterol, was documented in structured fields recognized for automated quality measurement (range 53.4–63.0%). Similarly, only half of the information regarding patient smoking status (53.4%) was recognized for automated quality measurement.

With the exception of medications, vaccinations, and blood pressure readings, practices varied substantially in where they chose to document the data elements required for automated quality measurement.

In estimating the denominator loss due to unrecognizable documentation, the average practice missed half of the eligible patients for three of the 11 quality measures—hemoglobin A1c control, cholesterol control, and smoking cessation intervention (table 3). No statistically significant differences were observed between the e-chart and EHR automated quality measurement scores in the number of patients captured for the denominator for the remaining eight measures. Current EHR reporting would underreport practice numerators for six of the 11 measures—hemoglobin A1c control, hemoglobin A1c screening, breast cancer screening, cholesterol control, cholesterol screening, and smoking status recorded.

...More studies are needed to assess the validity of EHR-derived quality measures and to ascertain which measures are best calculated using claims or administrative data or a combination of data sources. If provider-specific quality measurements are to be reported and made public, as is the plan for the meaningful use quality measures, further analysis is needed to understand the limitations of these data, particularly if they are prone to underestimation of true provider performance.
See also

Inaccurate quality reports could skew EHR incentives: study
By Maureen McKinney
Posted: January 15, 2013 - 1:00 pm ET

Electronically reported clinical quality measures vary widely in accuracy, an obstacle that could hinder the federal government's electronic health-record incentive program, according to a study appearing in the Annals of Internal Medicine.
The problem could lead to the highest quality providers not being given the intended incentives, the study concluded.

Beginning in 2014, participants in the CMS' EHR incentive program will be required to report quality data via EHRs. Currently, most quality-reporting initiatives rely on administrative billing data, which has drawn criticism for a lack of clinical relevance, or manual record review, which is time-consuming. Many experts have pointed to EHR-extracted quality data as the best representation of actual patient care.

But researchers, using 2008 data from more than 1,100 patients treated at a federally qualified health center, found big gaps in sensitivity from one electronic measure to another. For instance, electronic measures significantly underestimated the health center's rates of pneumococcal vaccinations and appropriate use of asthma medication, when compared with manual record review...
“If electronic reports are not proven to be accurate, their ability to change physicians' behavior to achieve higher quality, the underlying goal, will be undermined,” they said in the study...
CQMs sometimes reek of "Quadrant Three."

More to come...
