InforMedix Marketing Research, Inc. 1
Selection of a Stratified Random Sample
Steven J. Fuller
InforMedix Marketing Research, Inc.
A. A Brief Overview
Stratified random sampling is a technique used to improve the accuracy of market sizing survey results, or to lower the cost of this type of survey without losing accuracy. With a properly designed sample, the total number of survey contacts can sometimes be reduced by more than 50%, compared to simpler plans, without losing any accuracy in the results. The technique is often used in preparing random samples for large quantitative surveys.
Stratified random sampling involves dividing the market up into segments that are different from each other, studying each of the segments separately, and then putting the separate results back together using a weighted average. This weighted average of several precise measurements can be better than a general measurement of the whole diverse market.
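As a minimal sketch of the weighted average described above, the Python fragment below combines per-segment averages into one market-wide figure, weighting each segment by its share of the universe. The segment sizes are the bed-size counts used in the example later in this report; the per-segment averages are placeholder values invented for illustration, and the function name is an assumption.

```python
def stratified_mean(segment_sizes, segment_means):
    """Weighted average of per-segment means, weighted by segment size."""
    if len(segment_sizes) != len(segment_means):
        raise ValueError("need one mean per segment")
    total = sum(segment_sizes)
    return sum(n * m for n, m in zip(segment_sizes, segment_means)) / total

# Hospitals per bed-size segment (the report's universe of 5,151 hospitals).
sizes = [234, 871, 1073, 1218, 773, 443, 238, 301]
# Hypothetical average monitors per hospital in each segment (not survey data).
means = [0.5, 2, 4, 10, 20, 35, 55, 110]

market_mean = stratified_mean(sizes, means)
market_total = market_mean * sum(sizes)
print(f"average per hospital: {market_mean:.1f}, market total: {market_total:.0f}")
```

Because each segment is weighted by its size, a precise estimate in a large segment counts for more than an equally precise estimate in a small one.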
To gain the greatest benefit from this technique, it is important to have some advance information about the market being surveyed. In particular, one must recognize that the total market contains subsets which are different from each other -- and must be able to say how the subsets differ. It is common to subdivide U.S. markets into groups which differ by geography, volume of purchasing, age, sex, and so on.
This report explains the benefits of stratified random sampling, and demonstrates how to construct such a sample, using real data. The example comes from a research investigation of the market for patient monitors in U.S. hospitals, but the techniques can be applied to almost any product or customer group.
B. The Easy Way -- Proportional Sampling
To understand the method of preparing a stratified random sample, and the benefits this technique can provide, it is helpful to start by describing an easier and more common method used for selecting a sample. Very often, samples are chosen with a simple technique called “proportional sample allocation”, which is no more than picking survey respondents at random from the complete list of available respondents, called the universe.
For instance, if we want to know what fraction of the hospitals in the country use a particular type of high-tech patient monitor, we can just survey, say, 10% of all the hospitals in the AHA Guide, or another complete list of hospitals. In this case, the sampling proportion is 10%, and if the universe contains 5,151 hospitals, we would survey every tenth one in the list, until 515 sites had been interviewed. The results of the survey would give a fairly accurate estimate of the average number of monitors per hospital, and the total number of monitors in the U.S.
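The selection step described above, taking every tenth entry from a complete list, can be sketched as follows. This is only an illustration: the list of hospital identifiers is a stand-in for the AHA Guide, and the function name and the random starting point are assumptions rather than details given in this report.

```python
import random

def every_kth(universe, k, seed=None):
    """Systematic sample: pick every k-th item after a random start."""
    rng = random.Random(seed)
    start = rng.randrange(k)          # random offset in [0, k)
    return universe[start::k]

# Stand-in for a complete list of 5,151 hospitals (identifiers are hypothetical).
universe = [f"hospital-{i:04d}" for i in range(1, 5152)]

sample = every_kth(universe, k=10, seed=1)
print(len(sample))   # 515 or 516, depending on the starting offset
```

Starting from a random offset, rather than always from the first entry, keeps the selection from depending on how the list happens to begin.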
The results of such a survey are shown in the graph in Figure 1, and the horizontal line points out the average number of monitors per hospital, which is about 20.
Figure 1: Proportional Sampling
C. Wasted Telephone Calls
While this is a useful result, the question remains whether it was acquired in the most economical way. Many research analysts would look at the graph in Figure 1, and wonder what they really learned from all the data points forming the dense black cloud in the lower left corner. And they might be uncomfortable with the small amount of solid information they have about larger hospitals, where the data points are few and sparsely distributed.
There are two reasons why this sampling plan wastes the researcher’s time and adds unnecessary charges to the telephone bill or the Business Reply Mail account. One is that it may not match a company’s marketing priorities; the other is that a good deal of the information collected is statistically useless. These problems, and ways to solve them, are explained in the following paragraphs.
D. The Marketing Problem
The problem from a marketing perspective is that most manufacturers of medical products are more interested in large hospitals, which drive the majority of their sales and product development initiatives. So any survey that contacts many small hospitals at the expense of sampling the large ones will be questioned by the sales manager, if not the company’s statistician.
The proportional allocation which gave the data in the graph is presented in Figure 2.
Hospital Size (Staffed Beds) / Hospitals Available / Number of Respondents / Percent of Universe Sampled
1 - 24 / 234 / 23 / 10%
25 - 49 / 871 / 87 / 10%
50 - 99 / 1,073 / 107 / 10%
100 - 199 / 1,218 / 122 / 10%
200 - 299 / 773 / 77 / 10%
300 - 399 / 443 / 44 / 10%
400 - 499 / 238 / 24 / 10%
500 + / 301 / 30 / 10%
Total Market / 5,151 / 515 / 10%
Figure 2: Proportional Sampling Plan: 10% of Market
It is easy to see that small hospitals accounted for a very large part of this survey: 217 respondents, or over 40% of the total, were hospitals under 100 beds. This is because over 40% of the hospitals in the U.S. are under 100 beds, and the proportional allocation scheme required surveying 10% of hospitals of all sizes. But unless the company has a particular product strategy aimed at small hospitals, it would be an error to focus so much of the survey on small customers.
Ten percent of the large hospitals were contacted, too. But since there are relatively few large hospitals, only 98 responses were gathered from sites with 300 beds or more. This means that less than 20% of the survey provided information about a market segment that is usually of great importance to medical market researchers.
From a marketing standpoint, it may be acceptable to arbitrarily eliminate some parts of the hospital market altogether. Many survey planners do this by excluding hospitals under 100 beds, long-term care hospitals, psychiatric hospitals, Veterans Administration sites, and so on. Others decide from the outset to survey only 1% of the smallest hospitals, while developing a properly stratified sample for the rest. There is nothing wrong with these solutions, as long as the researcher is willing to settle for less accurate information about the under-sampled parts of the market.
E. The Statistical Problem
The second reason is statistical, and should be kept in mind when planning any market research survey: When you have enough information to draw conclusions with confidence, it is time to stop collecting data. In the example, it is clear that small hospitals don’t vary widely in their use of monitors. After this became evident (probably after a few dozen interviews), nothing useful was gained by continuing the survey to hundreds of small hospitals.
With survey results such as those in Figure 1, the researcher can be extremely confident in drawing conclusions about hospitals under 200 beds. Among larger hospitals, though, especially those over 400 beds, it is anyone’s guess what the average number of monitors per hospital might be. This is because large hospitals vary greatly in their use of this product, and the survey has not gathered enough information to state any conclusions with confidence.
To demonstrate this statistically, the chart in Figure 3 shows 90% confidence intervals for market estimates in each bed-size segment.
Hospital Size (Staffed Beds) / Percent of Universe Sampled / 90% Confidence Interval
1 - 24 / 10% / +/- 18%
25 - 49 / 10% / +/- 6%
50 - 99 / 10% / +/- 5%
100 - 199 / 10% / +/- 5%
200 - 299 / 10% / +/- 7%
300 - 399 / 10% / +/- 11%
400 - 499 / 10% / +/- 14%
500 + / 10% / +/- 20%
Total Market / 10% / +/- 5.33%
Figure 3: Accuracy of Results: Proportional Sampling
It is obvious that a proportional sampling plan has yielded little variation among small hospitals, and quite a lot of variation among large ones [1]. The most important result is that the overall accuracy is no better than plus or minus 5.33% (for a 90% confidence interval). Narrowing this wide range is a central goal of stratified random sampling.
Incidentally, it is quite common to find results like this within the hospital market -- there is nothing unique, statistically, about the market for patient monitors. In many ways, small hospitals are alike, while large hospitals differ in their buying patterns and use of products. Some factors that can cause large hospitals to be so diverse are their efforts to specialize in particular areas of medical care, participation in multiple group purchasing contracts, and widely varying economic conditions encountered by urban hospitals.
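Per-segment intervals like those in Figure 3 can be computed along the following lines. This is a sketch under stated assumptions: the report does not say which formula was used, so the code assumes a normal approximation (critical value 1.645 for 90%), a finite population correction, and an interval expressed as a fraction of the segment mean. The response lists are invented for illustration.

```python
import math
import statistics

Z_90 = 1.645  # two-sided 90% normal critical value

def relative_ci_90(responses, segment_size):
    """Half-width of the 90% interval for the segment mean, as a fraction of the mean."""
    n = len(responses)
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)             # sample standard deviation
    fpc = math.sqrt(1 - n / segment_size)        # finite population correction
    half_width = Z_90 * sd / math.sqrt(n) * fpc
    return half_width / mean

# Hypothetical monitor counts from two segments (not the survey's data).
small = [3, 4, 4, 5, 4, 3, 5, 4, 4, 4]
large = [40, 150, 75, 210, 95, 60, 180, 120, 30, 140]
print(f"small: +/- {relative_ci_90(small, 1073):.0%}")
print(f"large: +/- {relative_ci_90(large, 301):.0%}")
```

With the same number of responses, the segment whose answers vary more ends up with the wider interval, which is the pattern Figure 3 shows for large hospitals.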
F. Stratification
Optimal Allocation using a stratified random sample solves the statistical problem found with proportional allocation, by ensuring that enough respondents are surveyed in each segment to provide the greatest possible level of accuracy for the overall results.
The key lies in being able to identify subsets of the market where answers vary widely, and others where answers are essentially the same. It is very common to subdivide hospital markets by bed-size, but only a little more effort is required to segment by geography as well. Many surveys of hospitals use two- or three-dimensional stratification schemes, resulting in dozens of market segments. To keep the explanations simple, the example used here segments the hospital market only by bed-size.
G. Planning the Stratified Sample
In the example, the number of patient monitors in small hospitals is small, and does not vary greatly from one site to another. Optimal Allocation takes advantage of this observation by specifying a low number of survey contacts within this market segment. On the other hand, large hospitals often use many monitors, but the data points can be “all over the map”. Optimal Allocation solves this problem by indicating that more survey responses should be found in this segment.
Statistical textbooks provide formulas to determine just how large the sample should be in each market segment, to maximize accuracy and narrow the confidence intervals. These formulas state that allocation of the sample to each segment should be proportional to the segment’s standard deviation times the number of potential respondents in the segment.
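In symbols, the rule just described is usually written as shown below; textbooks commonly call it Neyman allocation. The notation is an assumption added here for illustration: n is the total number of respondents, N_h is the number of potential respondents in segment h, S_h is that segment's standard deviation, and n_h is the number of respondents allocated to it.

```latex
n_h = n \cdot \frac{N_h S_h}{\sum_{k} N_k S_k}
```

A segment therefore receives a larger share of the sample when it is large, when its answers vary widely, or both.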
The results of these calculations are shown in Figure 4. For each bed-size category, the number of hospitals in the U.S. has been multiplied by the standard deviation of the number of monitors measured in the first survey. The seven results show the proper weight to be applied to each segment in selecting a number of respondents. (As noted in footnote [1], the smallest category was eliminated after the initial survey.)
Hospital Size (Staffed Beds) / Segment Size / x Standard Deviation / = Weighting / Allocation of Sample
25 - 49 / 871 / x 1.1 / = 936 / 2.3%
50 - 99 / 1,073 / x 1.9 / = 2,035 / 5.1%
100 - 199 / 1,218 / x 4.9 / = 5,983 / 15.0%
200 - 299 / 773 / x 8.5 / = 6,595 / 16.5%
300 - 399 / 443 / x 13.0 / = 5,748 / 14.4%
400 - 499 / 238 / x 22.7 / = 5,397 / 13.5%
500 + / 301 / x 43.5 / = 13,103 / 32.8%
Total Sample / 100%
Figure 4: Calculations to Allocate Respondents For a Stratified Sample
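The arithmetic behind Figure 4 can be sketched in a few lines of Python. The segment sizes and standard deviations are the ones printed in the figure; the function and variable names are assumptions. Because the printed standard deviations are rounded to one decimal place, the weights and shares computed here are close to, but not identical with, the published columns.

```python
# Segment sizes and standard deviations as printed in Figure 4.
segments = {
    "25 - 49":   (871, 1.1),
    "50 - 99":   (1073, 1.9),
    "100 - 199": (1218, 4.9),
    "200 - 299": (773, 8.5),
    "300 - 399": (443, 13.0),
    "400 - 499": (238, 22.7),
    "500 +":     (301, 43.5),
}

def allocation_shares(segments):
    """Share of the sample per segment, proportional to size x standard deviation."""
    weights = {name: size * sd for name, (size, sd) in segments.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

shares = allocation_shares(segments)
for name, share in shares.items():
    print(f"{name:>9}: {share:6.1%} of sample, about {share * 515:.0f} of 515 respondents")
```

The same calculation is easy to set up in a spreadsheet, with one row per segment and a final column dividing each weight by the column total.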
To prepare for a second survey of the monitors market, a new stratified random sample was designed. The total number of respondents was kept at 515, but these were re-allocated according to the fractions given in the column on the right in Figure 4. The difference between Proportional Allocation and Optimal Allocation can be seen by comparing Figures 5 and 2.
Hospital Size (Staffed Beds) / Hospitals Available / Number of Respondents / Percent of Universe Sampled
25 - 49 / 871 / 13 / 2%
50 - 99 / 1,073 / 26 / 2%
100 - 199 / 1,218 / 77 / 6%
200 - 299 / 773 / 85 / 11%
300 - 399 / 443 / 74 / 17%
400 - 499 / 238 / 71 / 30%
500 + / 301 / 169 / 56%
Total Market / 5,151 / 515 / 10%
Figure 5: Optimal Allocation with a Stratified Sampling Plan: 10% of Market
H. Improved Results
Using the new stratified sample, the survey was conducted again, and the results are shown graphically in Figure 6.
Figure 6: Stratified Sampling
The benefits of Optimal Allocation of the sample are seen by comparing Figure 7 with Figure 3. Two changes have resulted from stratified random sampling.
First, confidence intervals are now tighter in large hospital segments, where the number of monitors is larger and less predictable from one hospital to another. Second, the overall average for the total market can be predicted with much better accuracy: the 90% confidence interval has been narrowed from more than +/- 5% with Proportional Sample Allocation, to less than +/- 3% with the Optimal Allocation.
Hospital Size (Staffed Beds) / Percent of Universe Sampled / 90% Confidence Interval
25 - 49 / 2% / +/- 18%
50 - 99 / 2% / +/- 9%
100 - 199 / 6% / +/- 6%
200 - 299 / 11% / +/- 7%
400 - 499 / 30% / +/- 8%
500 + / 56% / +/- 8%
Total Market / 10% / +/- 2.96%
Figure 7: Accuracy of Results: Stratified Random Sampling
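The overall improvement can be explored with the textbook variance formula for a stratified estimate of the mean. The sketch below is an assumption about method, not the report's own calculation: it uses the rounded standard deviations from Figure 4, the respondent counts from Figures 2 and 5 for the seven segments that were kept, a normal approximation, and a finite population correction. It reports the half-width in monitors per hospital rather than as a percentage, and it is not expected to reproduce the published percentages exactly.

```python
import math

Z_90 = 1.645  # two-sided 90% normal critical value

# (segment size, standard deviation from Figure 4,
#  respondents in Figure 2, respondents in Figure 5)
rows = [
    (871,   1.1,  87,  13),
    (1073,  1.9, 107,  26),
    (1218,  4.9, 122,  77),
    (773,   8.5,  77,  85),
    (443,  13.0,  44,  74),
    (238,  22.7,  24,  71),
    (301,  43.5,  30, 169),
]

def half_width_90(sizes, sds, samples):
    """90% half-width of the stratified estimate of the mean (monitors per hospital)."""
    total = sum(sizes)
    variance = 0.0
    for size, sd, n in zip(sizes, sds, samples):
        weight = size / total
        fpc = 1 - n / size                    # finite population correction
        variance += weight**2 * sd**2 / n * fpc
    return Z_90 * math.sqrt(variance)

sizes = [r[0] for r in rows]
sds = [r[1] for r in rows]
proportional = half_width_90(sizes, sds, [r[2] for r in rows])
optimal = half_width_90(sizes, sds, [r[3] for r in rows])
print(f"proportional: +/- {proportional:.2f}  optimal: +/- {optimal:.2f}")
```

Note that the proportional plan here covers 491 respondents, because the 23 respondents in the dropped 1 - 24 bed segment are left out, while the optimal plan covers 515.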
I. Smaller Surveys or Better Accuracy -- You Can Choose
Armed with information about individual market segments and the variation of responses to this type of survey, a market researcher can design a stratified random sample that meets the particular budget or accuracy objectives required for the next survey.
Obviously, with an unlimited budget, it would be possible to achieve the greatest possible statistical accuracy by surveying the entire available universe of respondents. In the real world, we are usually faced with a fixed budget, which means a limited number of completed interviews.
When budgets are limited, the techniques of stratified random sampling allow the analyst to distribute the sample across various market segments in a way that maximizes the accuracy of the results. For instance, in the patient monitors survey, the researchers found that a stratified sample of only 225 hospitals would have provided about the same overall accuracy as the original 10% proportional random sample. In other words, preparing a stratified random sample could save the client company the cost of 290 interviews -- more than half of the data collection cost!
Once a researcher is familiar with the methods used, some simple spreadsheets or database calculations can make it easy to calculate the required sample size. And usually, it is well worth the effort, since computerized calculations are not expensive, but unnecessarily large research surveys are.
It is also possible to use these statistics to determine what sample size would be needed to give a desired level of accuracy in the results. If the researcher needs to be able to claim “plus or minus 2% accuracy with a confidence level of 90%”, a large sample size will be dictated. If “plus or minus 7% with a 90% confidence level” is satisfactory, then a smaller sample will do.
Figure 8 shows how many responses would be needed in order to arrive at 90% confidence intervals of various sizes, for the patient monitor survey used in the example.
Number of Respondents / 90% Confidence Interval
120 / +/- 7%
225 / +/- 5.31%
515 / +/- 2.96%
800 / +/- 2%
1,300 / +/- 1%
Figure 8: Accuracy of Results with Various Sample Sizes, Stratified Random Sampling
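A sample-size table of this kind can be approximated with the standard textbook formula for the total sample needed under optimal allocation. The sketch below is hedged accordingly: it assumes the rounded standard deviations from Figure 4, a normal approximation, and a target half-width stated in monitors per hospital (1.0 is roughly 5% of an average near 20). It does not cap any segment's allocation at the segment's size, and its results should be read as rough guides rather than a reproduction of Figure 8.

```python
import math

Z_90 = 1.645  # two-sided 90% normal critical value

SIZES = [871, 1073, 1218, 773, 443, 238, 301]
SDS = [1.1, 1.9, 4.9, 8.5, 13.0, 22.7, 43.5]

def required_sample_size(sizes, sds, half_width, z=Z_90):
    """Total respondents needed under optimal allocation for a given half-width."""
    total = sum(sizes)
    weights = [size / total for size in sizes]
    sum_ws = sum(w * sd for w, sd in zip(weights, sds))
    sum_ws2 = sum(w * sd**2 for w, sd in zip(weights, sds))
    target_variance = (half_width / z) ** 2
    # The second term in the denominator is the finite population correction.
    return math.ceil(sum_ws**2 / (target_variance + sum_ws2 / total))

# Half-widths are in monitors per hospital, from loosest to tightest.
for half_width in (1.4, 1.0, 0.4, 0.2):
    print(half_width, required_sample_size(SIZES, SDS, half_width))
```

Halving the acceptable half-width raises the required sample by much more than a factor of two, which is why the tightest intervals in the table are so expensive.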
J. What About First-Time Surveys?
Stratified random sampling techniques can be applied fairly easily, but they do require advance measurements of each segment of the market to be surveyed. In the example presented here, the required Optimal Allocation of the sample could be calculated only after the first survey had measured the standard deviation of the number of monitors within each hospital bed-size segment. If an earlier survey can be used to determine the needed statistics, then setting up a stratified random sample requires little more than some careful database manipulations and spreadsheet calculations.
But for researchers trying for the first time to measure the size of a market, or to determine market shares or other quantitative information, finding the right statistics can be a problem.
At least four solutions are available to the researcher who needs the economical efficiency of a stratified random sample, but who does not have the benefit of an existing database describing the relevant market.
- Statistics texts usually recommend that a small “pilot survey” be conducted in advance. Even a relatively small survey can provide a rough estimate of the statistics needed, and this can be far better economically than using a simple proportional allocation. Unfortunately, it is sometimes difficult to delay the major investigation by a few weeks to conduct a pilot survey, and allocating a budget for this type of work can pose a special challenge.
- Sometimes, one can calculate the needed statistics from data collected in a survey of a similar market. For instance, if no information were available on the distribution of patient monitors among U.S. hospitals, the researchers might have looked for a database of some other product with some similar market characteristics. Survey data for some types of ultrasound equipment, infusion pumps, or advanced operating room devices might have been substituted, at least for the first survey, to find estimates of the required statistics.
- Begin the survey using a simple proportional allocation plan, and gather the required statistics as the data comes in. This approach requires some very quick handling of survey data, especially if the surveying process is providing many new answers every day. Still, the analyst can perform calculations on the available data every day or two, and make periodic decisions about when to cut off surveying in each market segment.
  This method can only work if there is a great degree of randomness among the early respondents -- the analyst would not want to stop telephone surveys to small hospitals if, for example, only the West Coast had been called. (This technique would be difficult to apply to a mail survey, unless one has the luxury of sending out mailings in waves.)
- Simply use a rough guess about appropriate sample sizes, and plan to improve on the allocation method the next time the survey is conducted. For example, there are probably many medical equipment markets like the patient monitors example, in which sales and usage are more diverse among large hospitals than among small ones. If the researcher thinks that the market to be studied has this characteristic, then there is probably something to be gained by using a sample allocation matching the one shown in Figure 4.
  Recognizing this, a market researcher might arbitrarily allocate more of the sample to large hospitals, and less to small ones, and hope for the best. The advantage of this approach is that it is very easy and inexpensive; the disadvantage is that it simply does not take full advantage of the methods of stratified random sampling.
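The third option above, checking the incoming data every day or two and deciding when to stop surveying a segment, can be sketched as a simple stopping rule. Everything specific here is an assumption for illustration: the 10% target, the minimum of 20 responses before any decision, the normal approximation, and the invented response lists. The rule also presumes that the early respondents are a random cross-section of the segment, as the report cautions.

```python
import math
import statistics

Z_90 = 1.645  # two-sided 90% normal critical value

def segment_done(responses, target_relative_ci, min_responses=20):
    """True once the 90% interval for the segment mean is narrow enough."""
    if len(responses) < min_responses:
        return False                      # too early to judge
    mean = statistics.mean(responses)
    if mean == 0:
        return False                      # relative width undefined
    half_width = Z_90 * statistics.stdev(responses) / math.sqrt(len(responses))
    return half_width / mean <= target_relative_ci

# Hypothetical running tallies of monitors per hospital, by segment.
collected = {
    "50 - 99": [4, 3, 5, 4, 4, 3, 5, 4, 4, 5] * 3,
    "500 +":   [40, 150, 75, 210, 95, 60, 180, 120, 30, 140] * 3,
}
for segment, responses in collected.items():
    status = "stop" if segment_done(responses, 0.10) else "keep surveying"
    print(segment, status)
```

In this made-up data the small-hospital segment reaches the target after 30 responses, while the large-hospital segment does not and would continue to be surveyed.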
K. Conclusion
Stratified random sampling is a technique that can give some dramatic benefits, by lowering the cost of surveys such as the patient monitors project described here. As was pointed out in this example, proper stratification of the sample can save hundreds of telephone interviews in surveys of hospital markets.
The techniques used for stratification also help solve some of the questions many researchers have about the accuracy of the results they have worked so hard to gather. Market analysts with a strong quantitative background can master the statistical calculations involved, and finish their research investigations with much greater confidence in the meaning of the results.
The information presented here applies to almost any quantitative survey -- not just to hospital markets for medical equipment. Stratifications by geographic region, customer size, purchasing method, age, sex, urban/rural setting, and many others are common in market research. In each case, the researcher has decided to use these statistical techniques to gain the benefits of accuracy and economy in conducting market research. Properly applied, stratified random sampling gives the researcher a much higher level of confidence in survey results and conclusions, making market research a more reliable and effective activity in any industry or application.
References
Statistical Concepts and Methods, G. K. Bhattacharyya and R. A. Johnson, John Wiley & Sons, New York, 1977.
Some Theory of Sampling, W. E. Deming, Dover Publications, New York, 1966.
[1] The smallest bed-size category, hospitals with 1-24 beds, also contains a large statistical variation. This is because all of the hospitals had either no monitors or a single device, giving an average which is not close to either of the data points. However, this finding was considered to be sufficient information about the smallest hospital group, and the segment was eliminated from any further survey work.