What can a business do with a web index?
TRANSCRIPT
- 1. From Trust Flows Understanding: The Deloitte Fast 50 Big Data company you never heard of, until now.
- 2. @tryMajestic Some Stuff You'll Learn: How we built a search engine without $30 billion. How you can use it to make lots of: Predictions, Insights, Money, Data Stories.
- 3. @tryMajestic Reaching for the Stars
- 4. @tryMajestic An Inspiration of a Search Engine
- 5. @tryMajestic Majestic is a Specialist Search Engine: Digital knowledge on a grand scale. Dixon Jones
- 6. @tryMajestic The BIG specialist search engine: Twitter has 500,000,000 Tweets per day on average. In the same day, Majestic crawls well over 2,000,000,000 NEW URLs (and sees 7 billion).
- 7. @tryMajestic How do they do that? Information Retrieval in the Zeta age: 1. Data Collection 2. Data Grouping 3. Data Indexing 4. Data Matching
- 8. @tryMajestic How to Collect 7 Billion URLs a Day?
- 9. @tryMajestic How to Analyze 200 Billion URLs a Day?
- 10. @tryMajestic Groups Make Search Much Better: Find a Fact, Find a Friend, Find a Customer, Finding Anything. Library of Congress circa 1940. Research at: info.majestic.com/groupresearch
- 11. @tryMajestic We Group AND ANALYSE pages: Topical Trust Flow using decay. Algorithm ???
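The slide leaves the algorithm itself as "???", so the following is only a minimal sketch of what "trust flow using decay" could look like, not Majestic's method. It assumes a hand-picked seed set of trusted pages with per-topic scores, and it assumes trust is split evenly across a page's outbound links and multiplied by a `decay` factor at every hop. The names `propagate_trust`, `links`, `seeds`, `decay`, and `max_hops` are all hypothetical.

```python
from collections import defaultdict


def propagate_trust(links, seeds, decay=0.5, max_hops=3):
    """Spread per-topic trust from seed pages along links, decaying each hop.

    links: dict mapping a page to the list of pages it links to.
    seeds: dict mapping a seed page to {topic: starting score}.
    Both structures are illustrative assumptions, not a real data format.
    """
    trust = defaultdict(lambda: defaultdict(float))

    # Seeds keep their full starting score.
    for page, topics in seeds.items():
        for topic, score in topics.items():
            trust[page][topic] += score

    frontier = {page: dict(topics) for page, topics in seeds.items()}
    for _ in range(max_hops):
        next_frontier = defaultdict(lambda: defaultdict(float))
        for page, topics in frontier.items():
            targets = links.get(page, [])
            if not targets:
                continue  # dead end: nothing to pass on
            for topic, score in topics.items():
                # Decay the score, then split it evenly across outbound links.
                share = score * decay / len(targets)
                for target in targets:
                    next_frontier[target][topic] += share
        for page, topics in next_frontier.items():
            for topic, score in topics.items():
                trust[page][topic] += score
        frontier = next_frontier

    return {page: dict(topics) for page, topics in trust.items()}


# Tiny hypothetical graph: "a" is the trusted seed for the topic "news".
example_links = {"a": ["b", "c"], "b": ["c"]}
example_seeds = {"a": {"news": 100.0}}
example_trust = propagate_trust(example_links, example_seeds, decay=0.5, max_hops=2)
```

In this toy graph, page "c" ends up with more "news" trust than "b" because it is reached both directly from the seed and through "b". A production system would also need to rescale the raw values into the 0-100 range the slides mention.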
- 12. @tryMajestic The Index: For every page we know: its influence in a simple score; its context; its context by keyword; its influence in context! In a series of simple 0-100 scores.
- 13. @tryMajestic Works best with a Universal Data set: Every signal is small. Individually prone to error or opinion. At scale the error decreases. Confidence increases. http://info.majestic.com/universal
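The claim that error decreases at scale can be illustrated with a small simulation. This is a sketch under a strong assumption the slide does not state: that each signal is an independent, unbiased but noisy reading of the same underlying value. Under that assumption the spread of the averaged estimate shrinks roughly with the square root of the number of signals; correlated or biased signals would not average out this way. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def mean_estimate_spread(n_signals, true_value=50.0, noise_sd=20.0,
                         trials=200, seed=0):
    """Return how much an average of n noisy signals varies across trials.

    Each signal is simulated as true_value plus independent Gaussian noise
    (an assumption made for illustration only).
    """
    rng = random.Random(seed)
    estimates = [
        statistics.fmean(rng.gauss(true_value, noise_sd)
                         for _ in range(n_signals))
        for _ in range(trials)
    ]
    # Population standard deviation of the per-trial averages.
    return statistics.pstdev(estimates)


spread_few = mean_estimate_spread(10)     # few signals: noisy estimate
spread_many = mean_estimate_spread(1000)  # many signals: much tighter
```

Comparing `spread_few` with `spread_many` shows the averaged score becoming far more stable as the number of signals grows, which is one way to read "confidence increases".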
- 14. @tryMajestic Data Matching
- 15. @tryMajestic Our Data Stack (For the Techies): Crawler: C# .NET / Mono. NoSQL. Read-only file system. Java interrogation. Dynamic front end: Perl/Ruby etc. Hadoop coming soon.
- 16. @tryMajestic So we built it. Now imagine: what COULD you do with it?
- 17. @tryMajestic 1: Compare Competitor Backlinks
- 18. @tryMajestic 2: Finding influencers. Who is more popular on Twitter? Lady Gaga? Barack Obama? Trust Flow 74, Trust Flow 70
- 19. @tryMajestic 3: Predicting Elections: Boris v Ken, Obama v Romney
- 20. @tryMajestic 4: Lobbying Senators
- 21. @tryMajestic 5: Data Art (Profiling Companies)
- 22. @tryMajestic What if we Pivot? Hadoop: Imagine your OWN version of our web index. A subset of the data, prepopulated for your needs. Updated Daily / Weekly / Monthly. Stored in Open Source Hadoop instances, ready for easy interrogation. What could you do then?
- 23. @tryMajestic Data Store Examples
- 24. @tryMajestic
- 25. @tryMajestic
- 26. @tryMajestic
- 27. @tryMajestic Ways you could segment the web: All domains hosted in [Choose Country or City Here]. Most influential sites about [Insert 800 Topics Here]. Best web pages for [Choose 50 Million Phrases Here]. Spammiest pages about [Insert 800 Topics Here]. Most influential pages on [Choose any set of sites]. Create a set of pages with [Choose properties here]. Got a plan? We have the starting point for web data.
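The segmentation ideas on this slide amount to filtering and ranking page records. The sketch below assumes a hypothetical record layout (`url`, `country`, `topic`, `trust_flow`) and invented sample rows; it does not reflect Majestic's actual schema, API, or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """Hypothetical page record; field names are assumptions."""
    url: str
    country: str      # hosting country code
    topic: str        # one topical category
    trust_flow: int   # influence score on the 0-100 scale


def most_influential(pages, topic=None, country=None, limit=10):
    """Return the highest-scoring pages matching the optional filters."""
    selected = [
        p for p in pages
        if (topic is None or p.topic == topic)
        and (country is None or p.country == country)
    ]
    return sorted(selected, key=lambda p: p.trust_flow, reverse=True)[:limit]


# Invented sample rows, for illustration only.
sample_pages = [
    Page("https://example.com/a", "GB", "news", 62),
    Page("https://example.org/b", "US", "news", 48),
    Page("https://example.net/c", "GB", "sports", 71),
    Page("https://example.com/d", "GB", "news", 15),
]

top_gb_news = most_influential(sample_pages, topic="news", country="GB", limit=1)
```

Sorting ascending instead of descending would give a crude "least trusted pages about a topic" view, though identifying spam in practice needs more than a low score.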
- 28. @tryMajestic Some Takeaways: How we built a search engine without $30 billion. How you can use it to make lots of: Predictions, Insights, Money, Data Stories.
- 29. @tryMajestic Out of Trust Flows Understanding: Real insight into the World Wide Web from Majestic, the specialist search engine
- 30. From Trust Flows Understanding