PawanRewariCS224W
Co-selling recommendations from an eCommerce dataset.
Pawan Rewari – SUID 01421653 – [email protected]
CS224W – Project Milestone Update

1.0 Introduction / Motivation / Problem Definition

A very simple customer use case in eCommerce is as follows. Let's say Jane wants to buy a specific product (a Kielty 2-person backpacking tent) or one of a category of items (a tent). Typically she will go to the web or an eCommerce site and do a search. She may then research the product and, after several visits on the same day or over several days or weeks, eventually converge on a purchasing decision. Each such interaction is an opportunity for the seller to showcase the specific product she is looking for and to show other products that she is likely to be interested in. From the retailer's perspective, the products showcased to the customer should stay relevant to her purchase interest while still offering an interesting variety.

The overall problem we set out to research is how one can provide a novel approach to selling more products at an eCommerce site. The real-world application is a data science team, acting as consultants, who are asked to provide strategies to sell more products, are given a slice of historical data, and are asked to work with what they are given. While this project is focused on an eCommerce dataset from Amazon (co-selling products), we would like to extrapolate the findings to inform any eCommerce retailer on the value of such network and data analysis, specifically when only a partial dataset is available, as is the case with our dataset, and there is no customer information beyond this aggregated data.

We have been provided with a dataset that includes:
- 548552 products (including 5868 discontinued products)
- for each product, relevant information includes:
  o Id: nodes or vertices numbered from 0 to 548551
  o ASIN: Amazon Standard Identification Number for a product
  o Group: Book, Video, DVD, Music (the primary groups)
  o SalesRank: computed by Amazon as a function of historical sales, based on number of products sold; a lower rank signifies a higher selling product
  o Similar: products that have co-sold (Similar == Co-Selling)
  o Categories: the categories this product belongs to, can be 10 deep
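To make the per-product fields listed above concrete, here is a minimal sketch of reading one product record into a small data structure. The text layout of the record (an `Id:` line, an `ASIN:` line, indented `group:`, `salesrank:` and `similar:` lines, and category paths written as `|`-separated lines) is an assumption about the raw file and is not shown in this report; the record itself, including its ASINs and SalesRank, is a made-up placeholder, and `Product` and `parse_record` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Product:
    node_id: int = -1
    asin: str = ""
    group: str = ""
    salesrank: Optional[int] = None
    similar: List[str] = field(default_factory=list)               # co-selling ASINs
    category_paths: List[List[str]] = field(default_factory=list)  # one list per path


def parse_record(text: str) -> Product:
    """Parse one product record; the field layout is an assumption."""
    product = Product()
    for raw in text.strip().splitlines():
        line = raw.strip()
        key, _, value = line.partition(":")
        if line.startswith("|"):
            # Assumed category line: |Books[283155]|Subjects[1000]|...
            product.category_paths.append(line.strip("|").split("|"))
        elif key == "Id":
            product.node_id = int(value)
        elif key == "ASIN":
            product.asin = value.strip()
        elif key == "group":
            product.group = value.strip()
        elif key == "salesrank":
            product.salesrank = int(value)
        elif key == "similar":
            # Assumed form: a count followed by the co-selling ASINs.
            product.similar = value.split()[1:]
    return product


# Placeholder record, not taken from the dataset.
SAMPLE = """
Id:   1
ASIN: 0000000001
  title: Placeholder title
  group: Book
  salesrank: 12345
  similar: 2  0000000002  0000000003
  categories: 1
   |Books[283155]|Subjects[1000]|History[9]
"""

product = parse_record(SAMPLE)
print(product)
```

Keeping each category path as a list of levels makes the later depth comparison between two products a simple element-by-element walk.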
The initial exploratory analysis of the dataset revealed that there were at most 5 co-selling recommendations for any product (the “Similar” products), and this was the case for only about 60% of the products in the dataset; the rest had fewer recommendations or none (see Figure 1, next page). As stated in the introductory use case, a typical customer is likely to visit an eCommerce website several times before completing a purchase. Each such visit is an opportunity to sell the product and recommend similar ones, so you want a variety of products to recommend. One to five is not enough; we need many more. Therein lies the core problem statement that we address in this paper: how do you provide an added set of recommendations that are useful and relevant and that, on average, take the retailer to a better place than where they started?
So if we look at the basic distribution of recommended/similar products provided by the initial dataset, we have:

Discontinued     5868     1%
0 Similar      163591    30%
1 Similar       11071     2%
2 Similar       10808     2%
3 Similar       10275     2%
4 Similar        9482     2%
5 Similar      337457    62%
Total          548552   100%

So the problem statement became very clear, with three objectives:
1. provide recommendations for a larger number of “Similar” products so you can show fresh data on multiple visits by the shopper (metric – more recommended products)
2. increase the “relevance” of the added similar products with stronger “categorical similarity”, explained in detail in Section 3; briefly, it is the depth of the shared categories for 2 products (metric – raise the avg categorical similarity of recommended products)
3. improve the average SalesRank of the recommended set (metric – lower the avg SalesRank of recommended products)
This translates to 3 measures that we call the 3 CoSellingMetrics. We create a baseline with the initial dataset and try to improve upon each of these measures. In the real world we would run actual experiments to see whether these recommendations actually improve sales and then iteratively improve the recommendations; in our case, since we can’t run real trials, we measure performance relative to the baseline established by the 3 CoSellingMetrics.

Going back to the simple use case, we would like to recommend 5 products for co-sale at every visit and then recommend other products on new visits, but we want to draw these from a limited set of Similar products, so that while the user sees different products recommended, they also see the same ones cycle back on subsequent visits, ones they may be interested in. So we set a target of ~15 recommended products; this is relatively arbitrary but makes sense for the use case we have here.
2.0 Related Work

As stated in the References, we did a literature review of several papers on Collaborative Filtering, Clustering Methods and Item-to-Item Collaborative Filtering. Here is a summary.

A traditional collaborative filtering algorithm represents a customer as an N-dimensional vector of items, where N is the number of distinct catalog items. The components of the vector are positive for purchased or positively rated items and negative for negatively rated items. To compensate for best-selling items, the algorithm typically multiplies the vector components by the inverse frequency (the inverse of the number of customers who have purchased or rated the item), making less well-known items more relevant. For most customers, this vector is extremely sparse. The algorithm generates recommendations based on a few customers who are most similar to the user. It can measure the similarity of two customers, A and B, in various ways; a common method is to measure the cosine of the angle between the two vectors. The algorithm can select recommendations from the similar customers’ items using various methods as well; a common technique is to rank each item according to how many similar customers purchased it.

To find customers who are similar to the user, cluster models divide the customer base into many segments and treat the task as a classification problem. The algorithm’s goal is to assign the user to the segment containing the most similar customers. It then uses the purchases and ratings of the customers in the segment to generate recommendations. Because optimal clustering over large datasets is impractical, most applications use various forms of greedy cluster generation. These algorithms typically start with an initial set of segments, which often contain one randomly selected customer each. They then repeatedly match customers to the existing segments, usually with some provision for creating new or merging existing segments. For very large datasets, especially those with high dimensionality, sampling or dimensionality reduction is also necessary. Cluster models have better online scalability and performance than collaborative filtering because they compare the user to a controlled number of segments rather than the entire customer base. However, recommendation quality is low.

Rather than matching the user to similar customers, Amazon’s item-to-item collaborative filtering matches each of the user’s purchased and rated items to similar items, then combines those similar items into a recommendation list. To determine the most-similar match for a given item, the algorithm builds a similar-items table by finding items that customers tend to purchase together. It builds a product-to-product matrix by iterating through all item pairs and computing a similarity metric for each pair based on the cosine of the two vectors. Given a similar-items table, the algorithm finds items similar to each of the user’s purchases and ratings, aggregates those items, and then recommends the most popular or correlated items. This computation depends only on the number of items the user purchased or rated.

As stated in Section 1, with our dataset we have information on products, co-selling relations & categories, but not on customers. This may very well be the case in a real-world situation where you are consulting for a customer who provides you with a limited dataset. Since we are not provided with customer purchase data, only aggregate data, we could learn from but not directly apply the above ideas. We came up with an approach that allowed us to make good recommendations by improving on the three CoSellingMetrics stated earlier.
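The cosine measure mentioned in the summary above can be sketched in a few lines. This is only an illustration of item-to-item similarity on customer purchase data, which is exactly the data our dataset lacks; the customers, items and the helper names `cosine` and `item_vectors` are made up for the example.

```python
import math
from typing import Dict, List, Set


def cosine(u: List[int], v: List[int]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def item_vectors(purchases: Dict[str, Set[str]]) -> Dict[str, List[int]]:
    """One 0/1 vector per item, with one component per customer."""
    customers = sorted(purchases)
    items = sorted({item for basket in purchases.values() for item in basket})
    return {
        item: [1 if item in purchases[c] else 0 for c in customers]
        for item in items
    }


# Hypothetical purchase history: customer -> items bought.
PURCHASES = {
    "c1": {"tent", "stove"},
    "c2": {"tent", "stove", "lamp"},
    "c3": {"lamp"},
}

vectors = item_vectors(PURCHASES)
sim_tent_stove = cosine(vectors["tent"], vectors["stove"])  # bought by the same customers
sim_tent_lamp = cosine(vectors["tent"], vectors["lamp"])    # only one customer in common
print(sim_tent_stove, sim_tent_lamp)
```

Items bought by the same customers score close to 1, while items that share only some buyers score lower, which is the ordering an item-to-item table relies on.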
3.1 Approach

We start with an initial sparse set of recommendations (represented in G1) and a tree of categories and sub-categories (G2), and use these to come up with a denser set of recommendations (G3) as shown. The 3 Graphs (Figure 1) relevant to our approach & analysis are:
• G1: G(V,E), each Vertex is a product and each Edge represents a co-selling relationship. This is an undirected graph (if I sell with you, you sell with me). Each Vertex has an ASIN, Group, SalesRank. Edges go to all Similar/co-selling products.
• G2: G(V,E) is the category tree. The Vertices are all the categories & sub-categories. The root node is product, the next level is Books, Music, Video, DVD, their sub-categories are at the next level, and so on. Each Edge indicates a category to sub-category relationship. Each product then belongs to multiple category paths, as illustrated on the next page.
• G3: G(V,E) is built on top of G1 and uses the information from G2; it contains a significant number of added edges representing co-selling relationships newly recommended by our recommendation engine, described in 3.2.

Metrics for initial Graph G1:
G1: Nodes 542684, Edges 987903
G1: Average Degree: 1.82
G1: MaxWcc: Nodes 334852, Edges 925823
G1: Diameter: 43
G1: Average Clustering Coefficient: 0.274496

Figure 1: G1, G2 & G3
Baseline and what we set out to measure. The three CoSellingMetrics are based on:
1: AvgRecommendedProducts – the number of Similar products (avg Degree)
   This is simply the # edges (degree) of each node, averaged across all nodes
2: AvgCategoricalSimilarity – the Categorical Similarity average of the recommended products
   This is explained below
3: AvgSalesRank – average SalesRank of the similar products that are recommended
   Each product (node) has a SalesRank, so we average the SalesRank of all its neighbors
The objective is to increase 1 (Co-Selling) & 2 (relevance) and decrease 3 (low SalesRank is better).

Figure 2: Partial tree of G2:
_________ Category to Sub-Category Edges
_________ Product 1 belongs to 3 category paths
_________ Product 2 belongs to 3 category paths
Product 1 & Product 2 Max Path intersection at Level 4
Categorical Similarity = 4
[Note – each category path does not need to reach a leaf of the tree]
[Figure 2 diagram: category nodes (C) arranged by depth, Level 0 through Level 6]
Average Categorical Similarity: As the real example below shows, each product (ASIN: 0486220125) belongs to a number of category paths, and each product it is similar to (ASIN: 0486401960, ASIN: 0486229076 and ASIN: 0714840343) also belongs to a number of category paths. We look for the highest Category Level that is common between the “Subject” and each product that it is “Similar To”; we then average this across all “Similar” products.
ASIN: 0486220125  title: How the Other Half Lives: Studies Among the Tenements of New York
Similar to (= Co-Purchasing Relationship): ASIN: 0486401960, ASIN: 0486229076, ASIN: 0714840343

Columns: Categorical Similarity | Category Paths the product (ASIN) is listed in (Category Levels 1 to 7, left to right)

Subject, ASIN: 0486220125
  -  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | Photo Essays[2082]
  -  Books[283155] | Subjects[1000] | History[9] | Americas[4808] | United States[4853] | General[4870]
  -  Books[283155] | Subjects[1000] | History[9] | Jewish[4992] | General[4993]
  -  Books[283155] | Subjects[1000] | Nonfiction[53] | Social Sciences[11232] | Sociology[11288] | Urban[11296]
  -  [172282] | Categories[493964] | Camera & Photo[502394] | Photography Books[733540] | Photo Essays[733676]

Similar to Subject, ASIN: 0486401960
  6  Books[283155] | Subjects[1000] | History[9] | Americas[4808] | United States[4853] | General[4870]
  3  Books[283155] | Subjects[1000] | Nonfiction[53] | Current Events[10546] | Poverty[10576] | General[10578]
  6  Books[283155] | Subjects[1000] | Nonfiction[53] | Social Sciences[11232] | Sociology[11288] | Urban[11296]

Similar to Subject, ASIN: 0486229076
  4  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | History[2032]
  4  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | Travel[2087] | United States[2099] | General[2100]
  4  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | Travel[2087] | United States[2099] | Mid Atlantic[2101]
  5  Books[283155] | Subjects[1000] | History[9] | Americas[4808] | United States[4853] | State & Local[4872]
  4  [172282] | Categories[493964] | Camera & Photo[502394] | Photography Books[733540] | Travel[733684] | United States[733708] | General[733710]
  4  [172282] | Categories[493964] | Camera & Photo[502394] | Photography Books[733540] | Travel[733684] | United States[733708] | Mid Atlantic[733712]

Similar to Subject, ASIN: 0714840343
  4  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | Photographers, A-Z[2033] | General[2050]
  5  Books[283155] | Subjects[1000] | Arts & Photography[1] | Photography[2020] | Photo Essays[2082]
  4  [172282] | Categories[493964] | Camera & Photo[502394] | Photography Books[733540] | Photographers, A-Z[733566] | General[733568]
  5  [172282] | Categories[493964] | Camera & Photo[502394] | Photography Books[733540] | Photo Essays[733676]
To illustrate this in detail: the Subject (ASIN: …0125) is listed in 5 Category Paths. It has a co-selling relationship (Similar to) with ASIN: …1960, which is listed in 3 Category Paths; with ASIN: …9076, which is listed in 6 Category Paths; and with ASIN: …0343, which is listed in 4 Category Paths. If we examine the first Category Path of each of these, they intersect with the Category Paths of the Subject at Levels 6, 4 and 4 respectively. We continue this process, compute all the path intersections (in Yellow) and take the Max Categorical Similarity in each case (in Green).

Average Categorical Similarity [for node with ASIN: …0125] = Average of (max) Similarity with ASINs …1960, …9076, …0343 = (6 + 5 + 5) / 3 = 5.33
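The worked example above can be reproduced with a short sketch. It treats each category path as a list of levels, takes the Categorical Similarity of two paths to be the number of leading levels they share, and takes the maximum over all pairs of paths. The function names and the `|`-separated string encoding are illustrative choices, not the project's actual code; the paths are the ones listed in the example.

```python
from typing import Dict, List

Path = List[str]

BOOKS = "Books[283155]|Subjects[1000]|"
CAMERA = "[172282]|Categories[493964]|Camera & Photo[502394]|Photography Books[733540]|"


def parse_paths(lines: List[str]) -> List[Path]:
    return [line.split("|") for line in lines]


def common_depth(a: Path, b: Path) -> int:
    """Number of leading category levels two paths share."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def categorical_similarity(paths_a: List[Path], paths_b: List[Path]) -> int:
    """Deepest shared level over every pair of category paths."""
    return max(common_depth(a, b) for a in paths_a for b in paths_b)


SUBJECT = parse_paths([
    BOOKS + "Arts & Photography[1]|Photography[2020]|Photo Essays[2082]",
    BOOKS + "History[9]|Americas[4808]|United States[4853]|General[4870]",
    BOOKS + "History[9]|Jewish[4992]|General[4993]",
    BOOKS + "Nonfiction[53]|Social Sciences[11232]|Sociology[11288]|Urban[11296]",
    CAMERA + "Photo Essays[733676]",
])

SIMILAR: Dict[str, List[Path]] = {
    "0486401960": parse_paths([
        BOOKS + "History[9]|Americas[4808]|United States[4853]|General[4870]",
        BOOKS + "Nonfiction[53]|Current Events[10546]|Poverty[10576]|General[10578]",
        BOOKS + "Nonfiction[53]|Social Sciences[11232]|Sociology[11288]|Urban[11296]",
    ]),
    "0486229076": parse_paths([
        BOOKS + "Arts & Photography[1]|Photography[2020]|History[2032]",
        BOOKS + "Arts & Photography[1]|Photography[2020]|Travel[2087]|United States[2099]|General[2100]",
        BOOKS + "Arts & Photography[1]|Photography[2020]|Travel[2087]|United States[2099]|Mid Atlantic[2101]",
        BOOKS + "History[9]|Americas[4808]|United States[4853]|State & Local[4872]",
        CAMERA + "Travel[733684]|United States[733708]|General[733710]",
        CAMERA + "Travel[733684]|United States[733708]|Mid Atlantic[733712]",
    ]),
    "0714840343": parse_paths([
        BOOKS + "Arts & Photography[1]|Photography[2020]|Photographers, A-Z[2033]|General[2050]",
        BOOKS + "Arts & Photography[1]|Photography[2020]|Photo Essays[2082]",
        CAMERA + "Photographers, A-Z[733566]|General[733568]",
        CAMERA + "Photo Essays[733676]",
    ]),
}

per_product = {asin: categorical_similarity(SUBJECT, paths) for asin, paths in SIMILAR.items()}
average = sum(per_product.values()) / len(per_product)
print(per_product, round(average, 2))
```

Comparing the full `Name[id]` strings, rather than names alone, keeps same-named categories in different branches (for example the several "General" nodes) from being counted as a match.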
BASELINE Metrics for G1 (Original Dataset):

                                                          AvgRecommendedProducts   AvgCategoricalSimilarity   AvgSalesRank
Whole Graph                                                                 1.82                       3.25          81729
Subset - All Products with 1 or more co-selling products                    2.69                       4.81          67662
Books                                                                       2.75                       4.85          84674
Music                                                                       2.43                       4.92          26739
Video                                                                       1.95                       3.56           7196
DVD                                                                         3.43                       5.03           8430
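As a rough sketch of how the first and third baseline metrics could be computed on a co-selling graph, the example below uses a tiny made-up graph and made-up SalesRank values. One assumption is worth flagging: the reported whole-graph AvgRecommendedProducts of 1.82 matches the number of edges divided by the number of nodes (987903 / 542684), so that convention is used here; the helper names are hypothetical.

```python
from statistics import mean
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]  # undirected: each edge is stored in both directions


def avg_recommended_products(graph: Graph) -> float:
    """Edges per node (assumed convention, consistent with the reported 1.82)."""
    n_edges = sum(len(neighbors) for neighbors in graph.values()) // 2
    return n_edges / len(graph)


def avg_neighbor_salesrank(graph: Graph, salesrank: Dict[str, int], node: str) -> Optional[float]:
    """Average SalesRank of a product's co-selling neighbors, None if it has none."""
    neighbors = graph[node]
    if not neighbors:
        return None
    return mean(salesrank[n] for n in neighbors)


# Hypothetical toy graph: A co-sells with B and C, D has no recommendations.
GRAPH: Graph = {"A": {"B", "C"}, "B": {"A"}, "C": {"A"}, "D": set()}
SALESRANK = {"A": 500, "B": 100, "C": 300, "D": 900}

toy_avg = avg_recommended_products(GRAPH)
rank_a = avg_neighbor_salesrank(GRAPH, SALESRANK, "A")
rank_d = avg_neighbor_salesrank(GRAPH, SALESRANK, "D")
reported_check = round(987903 / 542684, 2)  # node and edge counts reported for G1
print(toy_avg, rank_a, rank_d, reported_check)
```

Products with no neighbors return `None` rather than 0, so they can be excluded from a graph-wide SalesRank average instead of dragging it down artificially.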
3.2 Model / Algorithm / Method - formal description of any algorithms used

Algorithm followed:

Create G1(V,E): Read the initial dataset; each Vertex is a product and each Edge represents a co-selling relationship. For each Vertex: ASIN, Group, SalesRank; Edges to Similar co-selling products.

Create G2(V,E): a category tree with sub-categories at each sub-level. Map each product to the category paths it belongs to (typically 3-7 paths). The paths are provided by the dataset.

Compute:
AvgRecommendedProducts is the # edges (degree) of each node, averaged across all nodes.
AvgSalesRank is the average of the SalesRank of all of its neighbors.
CategoricalSimilarity for any 2 products is the highest Category Level they have in common.
The AvgCategoricalSimilarity is the average CategoricalSimilarity across all the products that are in a co-selling relationship with a specific product.

Create G3(V,E): Start with G1. For each product, develop a list of additional “similar” products by looking at the ones with the highest CategoricalSimilarity with the product; amongst those with the same CategoricalSimilarity, prefer products with a lower SalesRank; stop after ~15 or so (a “reasonable” target we established in Section 1). Add these edges to G3. To identify the additional products to create edges to:
(a) For products (nodes) which have one or more “similar” products, we follow the path of the neighbors and their neighbors … and evaluate the ones to add based on higher CategoricalSimilarity and lower SalesRank.
(b) For all products in addition (and for ones with ZERO “similar” products this is the only way to add more recommendations), look for products that are in the same categories and add based on higher CategoricalSimilarity and lower SalesRank.

Re-Compute AvgRecommendedProducts, AvgCategoricalSimilarity & AvgSalesRank and compare with the initial results. Ideally we should now have a much denser graph of co-selling relationships with more relevance and lower SalesRank.
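The selection rule in the G3 step (higher CategoricalSimilarity first, lower SalesRank as the tie-breaker, stop at the target) can be sketched as a single sort. This is a minimal illustration under stated assumptions rather than the project's implementation: how the candidate list is gathered in steps (a) and (b) is left to the caller, and the function name, parameters and toy values are hypothetical.

```python
from typing import Callable, Dict, List


def extend_recommendations(
    product: str,
    existing: List[str],
    candidates: List[str],
    similarity: Callable[[str, str], int],
    salesrank: Dict[str, int],
    target: int = 15,
) -> List[str]:
    """Keep existing recommendations, then add the best candidates up to `target`."""
    seen = {product, *existing}
    pool = []
    for candidate in candidates:
        if candidate not in seen:  # skip self, duplicates and existing edges
            seen.add(candidate)
            pool.append(candidate)
    # Higher categorical similarity first; lower SalesRank breaks ties.
    pool.sort(key=lambda c: (-similarity(product, c), salesrank[c]))
    room = max(0, target - len(existing))
    return list(existing) + pool[:room]


# Hypothetical toy data: similarity of each candidate to product "a".
SIM_TO_A = {"c": 5, "d": 6, "e": 5}
RANK = {"c": 200, "d": 900, "e": 50}

result = extend_recommendations(
    "a",
    existing=["b"],
    candidates=["c", "d", "e", "b", "a", "d"],
    similarity=lambda p, c: SIM_TO_A[c],
    salesrank=RANK,
    target=3,
)
print(result)
```

Sorting on the tuple `(-similarity, salesrank)` expresses both preferences at once: "d" wins on similarity despite its poor SalesRank, and "e" beats "c" only because the two tie on similarity.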
4.0 Results and Findings

So if we look at the example we started with, the new set of “similar” products is:

ASIN: 0486220125  title: How the Other Half Lives: Studies Among the Tenements of New York

New set of "similar" products:
ASIN: 0486401960  title: The Battle with the Slum
ASIN: 0486229076  title: Old New York in Early Photographs
ASIN: 0714840343  title: Jacob Riis (55)
ASIN: 0451628810  title: The Federalist Papers
ASIN: 0395914787  title: American Vintage: The Rise of American Wine
ASIN: 1841762121  title: Matchlock Musketeer 1588-1688 (Warrior 43)
ASIN: 0609608363  title: Wondrous Contrivances: Technology at the Threshold
ASIN: 0807120820  title: Freedom's Lawmakers: A Directory of Black Officeholders During Reconstruction
ASIN: 0060004436  title: Power Plays: Win or Lose--How History's Great Political Leaders Play the Game
ASIN: 0807841625  title: The Enduring South: Subcultural Persistence in Mass Society
ASIN: 0805072845  title: Project Orion: The True Story of the Atomic Spaceship
ASIN: 0811717003  title: Street Without Joy
ASIN: 0813911311  title: The Religious Life of Thomas Jefferson
ASIN: 0801486955  title: Going Native: Indians in the American Cultural Imagination
ASIN: 0029024803  title: An Economic Interpretation of the Constitution of The United States

First, just taking a casual look at this list, it looks like a reasonable set of recommended products. Many are in the theme of the original product, and several take you to new but related places. So every time a user visits the eCommerce site to purchase, they will be shown a subset of these under “people who looked at this also looked at this” or “people who purchased this, also purchased this”. As we look at the transformation from G1 to G3, for this ONE product we have the new metrics:
• starting with 3 (the example in Section 3.1) we now have 15 “similar” products (better)
• the average categorical similarity goes from 5.33 to 5.72 (better)
• the average SalesRank goes from 209144 to 299227 (worse)
For the overall graph of similar/co-selling products, we clearly have a transformation to a much denser graph with a significantly larger number of recommendations per product.

Metrics for initial Graph G1:                     Metrics for new Graph G3:
G1: Nodes 542684, Edges 987903                    G3: Nodes 542684, Edges 7352419
G1: Average Degree: 1.82                          G3: Average Degree: 13.55
G1: MaxWcc: Nodes 334852, Edges 925823            G3: MaxWcc: Nodes 524997, Edges 6426637
G1: Diameter: 43                                  G3: Diameter: 11
G1: Average Clustering Coefficient: 0.274         G3: Average Clustering Coefficient: 0.569

Now if we look into the details, it clearly shows that we have significantly raised the number of recommended products while simultaneously improving the average categorical similarity (relevance) and the average SalesRank (rank based on historical sales) of the recommended products. This involved significant programming with much new learning; the results are solid and we have confidence in the findings.

NEW METRICS:
                                 Original Dataset   New Recommendations   Improvement
Whole Graph
  AvgRecommendedProducts                     1.82                 13.55         7.4 X
  AvgCategoricalSimilarity                   3.25                  5.60         72.0%
  AvgSalesRank                              81729                 24343         70.2%
Graph subset - All Products with 1 or more co-selling products
  AvgRecommendedProducts                     2.69                 13.97         5.2 X
  AvgCategoricalSimilarity                   4.81                  5.77         19.9%
  AvgSalesRank                              67662                 21654         68.0%
Books
  AvgRecommendedProducts                     2.75                 14.00         5.1 X
  AvgCategoricalSimilarity                   4.85                  5.78         19.0%
  AvgSalesRank                              84674                 27450         67.6%
Music
  AvgRecommendedProducts                     2.43                 13.96         5.7 X
  AvgCategoricalSimilarity                   4.92                  5.67         15.2%
  AvgSalesRank                              26739                  8217         69.3%
Video
  AvgRecommendedProducts                     1.95                 12.73         6.5 X
  AvgCategoricalSimilarity                   3.56                  5.83         63.7%
  AvgSalesRank                               7196                  2821         60.8%
DVD
  AvgRecommendedProducts                     3.43                 14.99         4.4 X
  AvgCategoricalSimilarity                   5.03                  6.08         20.9%
  AvgSalesRank                               8430                  2646         68.6%

Since the example we have used is on Books, let me explore the Books category in more detail to explain this table. When we started, on average there were 2.75 recommendations for each book, not nearly enough to provide adequate choices. After the transformation we now have 14 recommendations per book, which is much more useful for an eCommerce engine and ensures a larger set of choices for consumers. As a measure of relevance we have used CategoricalSimilarity, and on this measure there is a 19% improvement. The SalesRank for a product is computed by Amazon as a function of historical sales; a lower SalesRank signifies a higher selling product. There is a nice improvement in the average SalesRank, by 67%, increasing the probability of selling other products.

Now let me challenge some of the assumptions made in this project:
- we increase the average # of recommended products to ~15; is this a reasonable number, do we need more, do we need fewer? With the same algorithm this target can be changed
- the “weight” of each Level in the category tree is the same; we could weigh deeper sub-categories higher and see if that yields better results, this is worth experimenting with
- lowering the average SalesRank of the co-selling products could lead to a rich-get-richer phenomenon; is this what we want, or do we want to sprinkle the recommendations with some high sellers and some low sellers? This is certainly a strategy that can be investigated
- is CategoricalSimilarity a good measure? The results seem promising, but it needs to be tested
- to the last point, and everything stated so far, the ultimate measure is whether this will lead to more eCommerce sales, and the only way to determine that is by real-world tests of the model
- finally, the assumption made in this discussion is that maximizing sales means maximizing the number of items sold; would we do anything different to maximize sales revenue? We suspect the approach is similar, but this would be another worthy path to investigate
5.0 Conclusion

We started with a dataset that provided a sparse set of recommendations for co-selling products. This is a real-world scenario where one needs to produce more, and more relevant, recommendations given a limited set of data. We looked at cluster models and collaborative filtering, both of which would require more information on customers than was available. So with the limited information available, we came up with a metric around CategoricalSimilarity and provided additional recommendations that improved upon both this measure and the average SalesRank. In a real-world scenario the next step is to run a test for a limited period of time, so we can progressively refine the approach based on real results. That said, our theoretical results show that we are headed in the right direction.

6.0 References
1. J. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of Predictive Algorithms for Collaborative Filtering,” Proc. 14th Conf. Uncertainty in Artificial Intelligence, Morgan Kaufmann, 1998, pp. 43-52.
2. B.M. Sarwar et al., “Item-Based Collaborative Filtering Recommendation Algorithms,” 10th Int’l World Wide Web Conference, ACM Press, 2001, pp. 285-295.
3. M. Balabanovic and Y. Shoham, “Content-Based Collaborative Recommendation,” Comm. ACM, Mar. 1997, pp. 66-72.
4. L. Ungar and D. Foster, “Clustering Methods for Collaborative Filtering,” Proc. Workshop on Recommendation Systems, AAAI Press, 1998.
5. G. Linden, S. Smith, and J. York, “Amazon.com Recommendations: Item to Item Collaborative Filtering,” 2003,