do user entrepreneurs speak different? applying natural ...library.usc.edu.ph/acm/chi...

7
Do User Entrepreneurs Speak Different? Applying Natural Language Processing to Crowdfunding Videos Jermain Kaminski RWTH Aachen University Technology Entrepreneurship Group TIME Research Area [email protected] Frank Piller RWTH Aachen University Technology and Innovation Management TIME Research Area [email protected] Yibin Jiang RWTH Aachen University Technology Entrepreneurship Group TIME Research Area [email protected] Christian Hopp RWTH Aachen University Technology Entrepreneurship Group TIME Research Area [email protected] Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s). CHI’17 Extended Abstracts, May 6-11, 2017, Denver, CO, USA. ACM 978-1-4503-4656-6/17/05. http://dx.doi.org/10.1145/3027063.3053223 Abstract UPDATED—February 19, 2017. In this work, we analyze video pitches of creators in crowdfunding campaigns and focus on linguistic particularities of ’lead user’ entrepreneurs. Theory suggests that lead users sense needs long before they become known to the broader public and would ben- efit greatly from finding a solution to these needs. For our study, we consider 404 video pitches of creators on Kick- starter, distinguishing lead users and regular campaigners. The study employs natural language processing (NLP) of (video-to-)speech-to-text content. Initial results indicate that lead users are more oriented towards product and problem- solving rather than focusing on pecuniary motives. Author Keywords Crowdfunding; natural language processing (NLP); video; lead users ACM Classification Keywords H.5.2 [User Interfaces]: Natural language; H.5.3 [Group and Organization Interfaces]: Web-based interaction. Introduction The angel investor Ron Conway (SV Angel) said in open online education class How To Start A Startup - Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA 2683

Upload: others

Post on 20-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

Do User Entrepreneurs SpeakDifferent? Applying Natural LanguageProcessing to Crowdfunding Videos

Jermain KaminskiRWTH Aachen UniversityTechnology EntrepreneurshipGroupTIME Research [email protected]

Frank PillerRWTH Aachen UniversityTechnology and InnovationManagementTIME Research [email protected]

Yibin JiangRWTH Aachen UniversityTechnology EntrepreneurshipGroupTIME Research [email protected]

Christian HoppRWTH Aachen UniversityTechnology EntrepreneurshipGroupTIME Research [email protected]

Permission to make digital or hard copies of part or all of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for third-party components of this work must be honored.For all other uses, contact the Owner/Author. Copyright is held by the owner/author(s).CHI’17 Extended Abstracts, May 6-11, 2017, Denver, CO, USA.ACM 978-1-4503-4656-6/17/05.http://dx.doi.org/10.1145/3027063.3053223

AbstractUPDATED—February 19, 2017. In this work, we analyzevideo pitches of creators in crowdfunding campaigns andfocus on linguistic particularities of ’lead user’ entrepreneurs.Theory suggests that lead users sense needs long beforethey become known to the broader public and would ben-efit greatly from finding a solution to these needs. For ourstudy, we consider 404 video pitches of creators on Kick-starter, distinguishing lead users and regular campaigners.The study employs natural language processing (NLP) of(video-to-)speech-to-text content. Initial results indicate thatlead users are more oriented towards product and problem-solving rather than focusing on pecuniary motives.

Author KeywordsCrowdfunding; natural language processing (NLP); video;lead users

ACM Classification KeywordsH.5.2 [User Interfaces]: Natural language; H.5.3 [Group andOrganization Interfaces]: Web-based interaction.

IntroductionThe angel investor Ron Conway (SV Angel) said in openonline education class How To Start A Startup -

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2683

Page 2: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

CS183B at Stanford University:1: "Literally while you aretalking to me in the first minute I am saying: Is this persona leader? Is this person rightful, focused, and obsessed bythe product? I am hoping, because usually the first questionI ask is: What inspired you to create this product? - I’m hop-ing that it’s based on a personal problem that that founderhad and this product is the solution to that personal prob-lem."

Kickstarter Projects537 Kickstarter campaignswith expert-evaluation of

lead-userness.

Campaign Videos404 valid campaigns withspeech content in videos.

Video to SpeechConversion of video to audiowith ffmpeg, if no subtitles

(*.vtt) is provided.

Speech to TextConversion of video to audiowith Google Cloud SpeechAPI, as of November 2016

(Python integration).

Text ProcessingLowercase conversion,

stopword cleaning, frequencythreshold.

NLP Applicationsn-grams, LIWC classification.

Conway emphasizes the vital, but often neglected role ofpersonal founder aspirations embedded in communica-tion patterns. These affect the perception of others and ev-idently also the chances for future funding. Our researchtherefore aims to analyze two under-researched areas:

1. Is language in videos predictive for funding outcomesin crowdfunding, and

2. to what extent are creators on the path to solve apersonal problem (lead users) more successful incrowdfunding.

Until January 2017, 118,000 campaigners on the crowd-funding platform Kickstarter [8] successfully raised morethan $2.47bn. The platform emerged as one of the gate-ways for entrepreneurs to present and fund novel and in-novative ideas. Through its transparency and open infor-mation flow, crowdfunding platforms provide researcherswith the opportunity to study recipes of success in theseenvironments. Firstly, the creator’s communication andentrepreneurial journey is very transparent, and secondly,crowdfunding investors show a high level of professionalismin making investment decisions [12].

1http://startupclass.samaltman.com/, Lecture 9 - How to Raise Money(Marc Andreessen, Ron Conway, Parker Conrad), Timestamp: 1:15.

Particularly interesting in the crowdfunding setting are leadusers. Lead users "anticipate relatively high benefits fromobtaining a solution to their needs" and "are at the leadingedge of an important market trend" [19]. Lead users arethe source of many successful new products [18, 20], forinstance in health, technology and sports (cf. Fig. 1 and2 for examples of ’lead-userness’). They turn into ’userentrepreneurs’ [16], when they realize that their inven-tions have an appeal for other users [6, 2]. The lead userphenomenon appears to be at odds with the dominantparadigm and logic in innovation management and policy.Here, the common perception is the that there are manu-facturers who innovate, by identifying an open need in amarket, and by engaging in creativity and problem solvingtechniques, they translate this need into a solution. Not sur-prisingly then, there is currently little research that can helpto shed more light on the lead user phenomenon and its so-cietal importance. We think that crowdfunding platforms arean ideal place to explore lead-userness and how far fundingoutcomes are driven by the language and content they useto communicate.

An increasing corpus of the crowdfunding literature an-alyzes project descriptions of crowdfunding campaigns[10, 5, 1, 3]. For instance, Mitra and Gilbert (2014) ana-lyze Kickstarter project descriptions find that the languageused by creators to pitch their ideas (phrases) account for58.56% of the variance around success.

While research indicates that the presence of a pitch videoin a Kickstarter project largely influences a campaign’s out-come [10, 11], only little work has been done on crowdfund-ing videos. As for instance an analysis of emotions to cam-paign videos [15]. Entrepreneurship researchers alreadystudied decision making processes of business angels andthe preparedness of founders by analyzing television shows

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2684

Page 3: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

[9, 14]. However, video pitches by founders in crowdfund-ing represent a different setting, where creators address"product-investors", i.e. users, for support, as compared to"company-investors" in venture capital (VC).

In this early-stage work, we analyze video pitches of cre-ators in crowdfunding, and focus on linguistic differencesof lead user entrepreneurs. By doing so, we extend the re-search on conversation and emotion in crowdfunding cam-paigns, moving from written text [10] to speech. As the pre-sentation in a video allows campaign creators to pitch theirown motivation for creating the product, they may engagein storytelling to market their product, using lead user termi-nology.

Figure 1: Arpeggio: The PortableArpeggiator, Sequencer, andSynth: "Ever since I’ve beenfascinated with musicalinstruments. I get frustrated withstandard step sequencers becausethey aren’t capable of executingmany of the kinds of melodies I findcompelling". Funded, $94,713.Image credit: Tangible Instruments.

Figure 2: HUDWAY Glass: keepsyour eyes on the road whiledriving: "The idea came fromAlex’s professional hobby - rallyracing. In rally, they have co-pilotsfor efficient navigation. Butco-pilots are humans who cannaturally make mistakes". Funded,$622,785. Image credit: HUDWAYGlass.

DataOur sample consists of 537 "Product Design" or "Technol-ogy" Kickstarter projects in the United States, initiated be-tween September 1st and October 31st 2015. Using Pythonlibrary ’Beautiful Soup’, we automatically downloaded Kick-starter campaign videos that are uploaded by creators tosubstantiate their descriptions. We receive a total of 404valid videos, as some projects have either no video presentor no spoken word (music only).

Video to TextIn our research, all related mp4 videos are firstly convertedinto audio format. The average video length in sample is2:46 min. We used ffmpeg to convert videos in mp4 for-mat into audios in wav format with a bit rate of 160 kbit/s,and a sampling rate of 44100 Hz with 2 channels. We usea Python implementation of the Google Speech API,SpeechRecognition, to further process the audio filesand convert them into text. Each audio is split into snippetsof each 10 seconds and then stored as plain text. We con-verted all video-to-text files into lowercase and tokenized

them to unigram, bigram and trigrams. We also removedstop words and numbers, and only consider phrases with afrequency of equal of greater than 5.

Lead UsersIn order to evaluate lead-userness, we recruited a total of27 individuals to assess the campaign and founder char-acteristics of the selected videos through a survey com-prising multiple questions on campaigners, campaigns,and product characteristics. Applying several robustnesschecks, the intraclass correlation coefficient varies between0.89 and 0.96. Following Franke, von Hippel, and Schreier(2006),[4] we asked raters, whether "The creator has ideasthat are ’ahead of the trend’". Similarly, we assessed the’benefit and utility expected’ from the product. We furtherassessed whether campaign founders are dissatisfied withexisting product solutions on the market and whether ornot they believed that there exists an unresolved prob-lem. We conclude ’lead-userness’, if the average score ofweighted items on ’benefit expected’ and ’ahead of trend’succeeded or equals the value 5 on a 7-p Likert scale,with 7 = ’strongly agree’ and 1 = ’strongly disagree’. Cron-bach’s alpha is 0.79. Based on this value, we divided ourtotal sample into two cohorts, that are (1) lead users (LU),whose value ≥ 5, and non-lead user (NLU) whose value< 5. As shown in Table 1, we get 102 lead users and 302non-lead user, each differentiated by funding outcome. Ourfinal sample comprises 604 unique phrases for 102 LU and1707 unique phrases for 302 NLU. While responders ratedlead-userness based on all available campaign information,there is potential for mimetic behavior and false-positives,i.e. creators pretending to have lead user motivations.

MethodOur analysis follows the approach of Mitra and Gilbert(2014) [10], who use textual descriptions of Kickstarter

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2685

Page 4: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

projects to extract phrases that predict funding outcomeson Kickstarter. In our analysis, we also apply a penalizedlogistic regression that accounts for collinearity and sparsefeatures. Using the Python library sklearn, we predict thedependent variable funded, implying ’status: successful ’,i.e. $ amount pledged ≥ $ fund-raising goal within a settime frame. The parameter α, determining the movement ofthe weights of the predictive variable, is set to α = 1.0. Assuch, our regression represents a Lasso model (as com-pared to α = 0.0, a Ridge regression model).

Results

LU NLU

Success 56 131Failed 46 171

Total 102 302

Table 1: Data samples.

LU β

Duration 1.92Comments 0.09

Total_Words -0.47Backers -0.58

Goal -0.60

NLU β

Duration 2.01Comments -0.12

Backers -0.41Total_Words -0.49

Goal -0.52

Table 2: Control variables withnon-zero predictive power forfunding outcome (Z-scores).

Figure 3: Selected LIWCcategories. The x-axis depicts the% to which phrases predict LU orNLU projects.

Table 3 and 4 present the parameter β, as generated by ourmodel. We list the top 30 phrases with positive β weightsand top 30 phrases with negative β weights for LU, andNLU respectively. As for lead users (LU), Table 3 lists thetop 30 phrases having positive β weights that predict a pos-itive funding outcome and top 30 phrases having a negativeβ weight.

Table 3 provides an indication for roughly four dimensionsthat are represented in successful LU projects:

1. Product: "batteries" (β 7.06), "bag" (β 6.43), "cam-era" (β 2.40).

2. Utility: "accessible" (β 6.67)), "affordable" (β 3.48),"bigger" (β 3.25).

3. Problem solving: "allows" (β 3.56), "complicated,"add" (β 3.26), (β 2.69), "can’t" (β 1.77).

4. Experimental: "custom" (β 1.80) , "explore" (β 1.56).

Terms in our sample further indicate that LU products ad-dress certain user groups specficially, such as chefs (β

5.42), aged people (β 3.51), dogs (β 2.23), babies ("dia-per" β 2.20), or open source technology users, such as"android" (β 2.55). As compared to non-successful LUand NLU (Table 4), LU build their product upon "already"(β 2.24) existing solutions ("based" β 1.72) or "add" (β3.26) functionality to something that is known to the user.All in all, inventions based on lead user component resultin rather incremental (competence-enhancing) solutions[17, 7]. In Table 2, we show control variables with non-zeropredictive power in funding outcomes.

To test for the robustness of our results, we also applieda Linguistic Inquiry and Word Count (LIWC) [13] analysis[10]. LIWC uses a hand-built dictionary to conduct text anal-ysis on given input text. The library analyzes texts and cal-culates a percentage value that measures to what extentwords in a selected dataset represent semantic categories.We identified a total of 38 LIWC categories and limit our re-port to eight that showed significant differences among suc-cessful LU and non-successful NLU campaigns (Figure 3).According to LIWC, LU differentiate most on positive load-ings for words related to "relativity", "motion" and "time", aswell as within the "cognitive processes" category. Also, LUseem to exhibit less focus on pecuniary motives and reportmore on prior achievements, differentiation points, as wellas insights into their product. This corroborates prior resultsdepicted in Table 3 and 4. According to our current analy-sis, they show no differences in "Affective Processes", i.e.positive or negative emotions.

Conclusions and Future WorkConsidering that campaign creators engage in various waysto emotionally appeal to potential investors, it might be im-portant to extend our results along the lines of impressionmanagement and emotional assuage. Ongoing research istherefore focused on a more nuanced analysis of speech-

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2686

Page 5: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

(LU) phrases β (LU) phrases β

batteries 7.06 block -5.73accessible 6.67 apple -3.80bag 6.43 almost -3.39chef 5.42 balance -3.35behind 4.17 attached -3.20build 4.12 across -2.97allows 3.56 ability -2.47age 3.51 capture -2.33affordable 3.48 application -2.33add 3.26 company -2.32bigger 3.25 access -2.12complicated 2.69 colors -2.08addition 2.66 built-in -2.03code 2.62 achieve -1.97android 2.55 another -1.76comes 2.50 engineering -1.74camera 2.40 babies -1.74brought 2.34 amazing -1.73creator 2.28 diapers -1.70already 2.24 bottom -1.69dog 2.23 always -1.61diaper 2.20 believe -1.60along 1.99 inch -1.45ago 1.81 fit -1.40custom 1.80 box -1.35best 1.80 away -1.28can’t 1.77 extra -1.28based 1.72 battery -1.26half 1.68 measure -1.23explore 1.56 edge -1.19

Table 3: Left: Top 30 phrases predicting positive outcome of LUcrowdfunding projects. Right: Top 30 phrases predicting negativeoutcome of LU crowdfunding projects. Z-score normalized.

(NLU) phrases β (NLU) phrases β

alternative 9.84 bill -5.48audio 8.30 bikini -5.37aware 6.16 chili -5.15click 5.35 cartridge -3.85inflatable 5.10 bought -3.80add 5.04 dirt -3.71arm 4.64 cookie -3.42ability 4.64 base -3.39app 4.56 bands -3.30alone 4.48 calendar -3.14books 4.41 blocks -3.05allowing 4.39 actual -3.01action 4.26 basically -2.98anything 4.20 book -2.97affordable 3.73 child -2.91bikes 3.67 controllers -2.85cutting 3.57 characters -2.82artist 3.56 boys -2.82helmet 3.49 android -2.81big 3.30 covers -2.66busy 3.23 balance -2.65bike 3.22 comfortable -2.64begin 3.11 current -2.64changing 3.00 attached -2.59animals 2.99 batteries -2.56bella 2.98 allowed -2.49coffee 2.96 girls -2.49addition 2.91 attach -2.46heat 2.71 hoping -2.44fresh 2.70 mounting -2.39

Table 4: Left: Top 30 phrases predicting positive outcome of NLUcrowdfunding projects. Right: Top 30 phrases predicting negativeoutcome of NLU crowdfunding projects. Z-score normalized.

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2687

Page 6: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

to-text content (n-grams, t-SNE word embeddings), as wellas objects and emotions in videos. Not only are we furtherinterested in mapping out whether campaign representa-tions are mirrored by campaign founders skills, but also weaim at testing whether they can live up to their promise ofbringing the product to market, i.e. are able to commercial-ize their project. Similarly, one may ask whether crowdfun-ders find the product as appealing as initially suggested.Sentiment analyses of forum comments on the receivedproduct, for example, could illumine our understanding fur-ther.

Overall, our initial results encourage to continue the re-search on user entrepreneurs in crowdfunding. Two po-tential tools resulting of such research could be (1) an au-dio and video input powered prediction engine for start-uppitches, in particular in the crowdfunding setting, and (2) analgorithm highlighting crowdfunding projects with apparentlead user motivation.

AcknowledgementsWe would like to thank our reviewers, as well as colleaguesat the MIT Media Lab and the TIME Research Group atRWTH Aachen for several comments that improved ourwork. Last but not least, we like to thank Oleksandr Zas-tupailo for preparing the data and all 27 experts for theirprecious contribution to this project.

References[1] Tim Althoff, Cristian Danescu-Niculescu-Mizil, and Dan

Jurafsky. 2014. How to ask for a favor: A case studyon the success of altruistic requests. arXiv preprintarXiv:1405.3282 (2014).

[2] Erkko Autio, Linus Dahlander, and Lars Frederiksen.2013. Information exposure, opportunity evaluation,and entrepreneurial action: An investigation of an on-

line user community. Academy of Management Jour-nal 56, 5 (2013), 1348–1371.

[3] Nihit Desai, Raghav Gupta, and Karen Truong. 2016.Plead or Pitch? The Role of Language in KickstarterProject Success. (2016). http://nlp.stanford.edu/courses/cs224n/2015/reports/15.pdf.

[4] Nikolaus Franke, Eric Von Hippel, and Martin Schreier.2006. Finding commercially attractive user innova-tions: A test of lead-user theory. Journal of ProductInnovation Management 23, 4 (2006), 301–315.

[5] Michael D Greenberg, Bryan Pardo, Karthic Hariharan,and Elizabeth Gerber. 2013. Crowdfunding supporttools: predicting success & failure. In CHI’13 ExtendedAbstracts on Human Factors in Computing Systems.ACM, 1815–1820.

[6] Stefan Haefliger, Peter Jäger, and Georg Von Krogh.2010. Under the radar: Industry entry by user en-trepreneurs. Research Policy 39, 9 (2010), 1198–1213.

[7] Rebecca Henderson. 1993. Underinvestment andincompetence as responses to radical innovation: Evi-dence from the photolithographic alignment equipmentindustry. The RAND Journal of Economics (1993),248–270.

[8] Kickstarter.com. 2017. Statistics. (2017). https://www.kickstarter.com/help/stats.

[9] Andrew L Maxwell, Scott A Jeffrey, and MorenLévesque. 2011. Business angel early stage decisionmaking. Journal of Business Venturing 26, 2 (2011),212–225.

[10] Tanushree Mitra and Eric Gilbert. 2014. The languagethat gets people to give: Phrases that predict successon kickstarter. In Proceedings of the 17th ACM con-ference on Computer supported cooperative work &social computing. ACM, 49–61.

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2688

Page 7: Do User Entrepreneurs Speak Different? Applying Natural ...library.usc.edu.ph/ACM/CHI 2017/2exab/ea2683.pdf · 537 Kickstarter campaigns with expert-evaluation of lead-userness. #

[11] Ethan Mollick. 2014. The dynamics of crowdfunding:An exploratory study. Journal of Business Venturing29, 1 (2014), 1–16.

[12] Ethan Mollick and Ramana Nanda. 2015. Wisdom ormadness? Comparing crowds with expert evaluation infunding the arts. Management Science 62, 6 (2015),1533–1553.

[13] J. W. Pennebaker and M. E. Francis. 1999. LinguisticInquiry and Word Count. Lawrence Erlbaum.

[14] Jeffrey M Pollack, Matthew W Rutherford, and Brian GNagy. 2012. Preparedness and cognitive legitimacy asantecedents of new venture funding in televised busi-ness pitches. Entrepreneurship Theory and Practice36, 5 (2012), 915–939.

[15] Josue Reyes and Cristina Bahm. 2016. Crowdfunding:Applying Collective Indexing of Emotions to CampaignVideos. In Proceedings of the 19th ACM Conferenceon Computer Supported Cooperative Work and Social

Computing Companion. ACM, 385–388.[16] Sonali K Shah and Mary Tripsas. 2007. The accidental

entrepreneur: The emergent and collective processof user entrepreneurship. Strategic EntrepreneurshipJournal 1, 1-2 (2007), 123–140.

[17] Michael L Tushman and Philip Anderson. 1986. Tech-nological discontinuities and organizational environ-ments. Administrative science quarterly (1986), 439–465.

[18] Glen L Urban and Eric Von Hippel. 1988. Lead useranalyses for the development of new industrial prod-ucts. Management Science 34, 5 (1988), 569–582.

[19] Eric Von Hippel. 2005. Democratizing innovation: Theevolving phenomenon of user innovation. Journal fürBetriebswirtschaft 55, 1 (2005), 63–78.

[20] Eric Von Hippel, Susumu Ogawa, and Jeroen PJDe Jong. 2011. The age of the consumer-innovator.MIT Sloan Management Review 53, 1 (2011), 27.

Late-Breaking Work CHI 2017, May 6–11, 2017, Denver, CO, USA

2689