Data Visualizations of HYIP Dataset

Jie Han

Quantifying the World, April 23, 2012


Financial Cryptography 2012

http://fc12.ifca.ai/pre-proceedings/paper_27.pdf

This could be you!!!


Overview

1. What's an HYIP?
2. Dataset
3. Processes
4. R graph examples
5. Google Chart examples
6. Some helpful hints


High Yield Investment Programs (HYIPs)

● Also known as a Ponzi or pyramid scheme
● Promise high returns on investment
● Pay existing investors with revenue from new investors
● Unsustainable in the long run


Why are HYIPs a problem?

● Advertised as legitimate investments

● Sophisticated online ecosystem in support of the schemes


HYIP Website


HYIP Aggregator Websites


HYIP Variables


HYIP Lifetime

Typical life cycle of an HYIP:


About the Data

● Since 11/17/2010, still running
● Collected data from nine "aggregator" websites
● Total observations: 141k+
● Total HYIPs observed: 1,576+


Process

Data collection (Python, crontab, mongoDB)

Preliminary analysis (Python, R)

Continue data collection, work on parsing all aggregators (Python)

Look at what we have, decide on what we want (R)

Difficulties in analyzing data -> create interactive data visualizations (Python, Google Charts, JS, HTML)

Use new tools to look for patterns (browser & eyes)
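The data collection and parsing steps above can be sketched as follows. This is a minimal illustration only: the HTML markup, the `program`/`name`/`status` class names, the aggregator label, and the record fields are all assumptions made up for the example, since each real aggregator site needs its own parser. A crontab entry could run a script like this on a schedule, and the resulting dictionaries are the kind of record that could then be stored in MongoDB.

```python
from datetime import datetime, timezone
from html.parser import HTMLParser

# Hypothetical markup: real aggregator pages differ and each needs its own parser.
SAMPLE_HTML = """
<table>
  <tr class="program"><td class="name">example-hyip.test</td><td class="status">Paying</td></tr>
  <tr class="program"><td class="name">another-hyip.test</td><td class="status">Not paying</td></tr>
</table>
"""


class ListingParser(HTMLParser):
    """Collect one dict per program row of the assumed listing table."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._row = None    # row currently being filled, if any
        self._field = None  # class of the <td> we are inside, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "tr" and css_class == "program":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a classed <td> of a program row.
        if self._row is not None and self._field:
            self._row[self._field] = self._row.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_listing(html, aggregator, observed_at):
    """Turn one downloaded listing page into a list of observation records."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return [
        {
            "aggregator": aggregator,
            "observed_at": observed_at.isoformat(),
            "hyip": row.get("name"),
            "status": row.get("status"),
        }
        for row in parser.rows
    ]


observations = parse_listing(
    SAMPLE_HTML, "aggregator-a", datetime(2012, 4, 23, tzinfo=timezone.utc)
)
```

Stamping every record with the aggregator and observation time is what makes later comparisons across aggregators and over time possible.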


How an R Chart Gets Generated

Data Collection (Python)

Parse data & insert into db (Python, mongoDB)

Fetch & manipulate data (Python, mongoDB, R)

Output a .pdf image to server

New user input (HTML forms)

Front End

Back End

User interacts with data in browser

Background scripts


How Can We Trust Aggregator Data?

CDF of Standard Deviations of HYIP Lifetimes

● Aggregators agree 80% of the time
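A chart like this one can be sketched in two steps: compute, for each HYIP, the standard deviation of the lifetimes reported by the different aggregators, then take the empirical CDF of those standard deviations. The lifetimes below are made-up values for illustration, and the use of the population standard deviation is an assumption; the slides do not say which estimator or which agreement threshold was used.

```python
from statistics import pstdev

# Made-up lifetimes in days, one value per aggregator that tracked the HYIP.
lifetimes_by_hyip = {
    "hyip-a": [30, 30, 30],
    "hyip-b": [12, 12, 13],
    "hyip-c": [45, 60, 90],
    "hyip-d": [7, 7],
    "hyip-e": [100, 100, 100, 100],
}


def lifetime_spreads(lifetimes):
    """Standard deviation of reported lifetimes for each HYIP seen by 2+ aggregators."""
    return {name: pstdev(days) for name, days in lifetimes.items() if len(days) >= 2}


def empirical_cdf(values):
    """Return sorted (x, fraction of values <= x) pairs, one pair per distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    points = {}
    for rank, x in enumerate(ordered, start=1):
        points[x] = rank / n  # for ties, the last (largest) rank wins
    return sorted(points.items())


spreads = lifetime_spreads(lifetimes_by_hyip)
cdf = empirical_cdf(spreads.values())
```

In this toy data the first CDF point sits at a standard deviation of zero, so its height is the share of HYIPs on which all aggregators report exactly the same lifetime.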


How Long Do HYIPs Last Before Collapsing?

Survival function of HYIP Lifetimes

● Most HYIPs collapse within a few weeks
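An empirical survival function gives, for each time t, the fraction of HYIPs whose lifetime exceeds t. The sketch below uses made-up lifetimes and assumes every program has already collapsed. It does not handle programs that are still running at the end of the observation window (censoring), which a Kaplan-Meier estimator would account for.

```python
def survival_function(lifetimes):
    """Return (t, fraction of lifetimes strictly greater than t) for each observed t."""
    n = len(lifetimes)
    return [
        (t, sum(1 for x in lifetimes if x > t) / n)
        for t in sorted(set(lifetimes))
    ]


# Made-up lifetimes in days, for illustration only.
lifetimes_days = [3, 7, 7, 14, 21, 21, 30, 45, 90, 200]
curve = survival_function(lifetimes_days)
```

Plotting the pairs in `curve` as a step function gives the kind of curve shown on the slide, starting near 1 and dropping toward 0 as programs collapse.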


What Factors Lead to Collapse?

Factors that lead to shorter HYIP lifespans:
● Higher advertised rates of return
● Shorter mandatory investment terms


R vs. Google Charts

R

● Useful if familiar with the dataset
● Good at presenting aggregate summaries
● Steep learning curve, especially when you want to do something specific
● More customizable
● Most analysis techniques are available

Google Charts

● Anyone can view & interact with the data
● See a complete data distribution
● Learning curve isn't bad
● Not as customizable
● Have to wait for updates for more functionality, or write your own


How a Google Chart Gets Generated

Data Collection (Python)

Parse data & insert into db (Python, mongoDB)

Fetch & manipulate data (Python, mongoDB, R)

Write JS & HTML page (Python, JS, HTML, CSS)

New user input (HTML forms)

User interacts with data in browser

Background scripts

Back End

Front End
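The "Write JS & HTML page" step can be sketched as a Python function that embeds the fetched rows into an HTML page that draws a Google line chart. This is an illustration under assumptions: the template, the `render_chart_page` helper, and the sample rows are invented for the example, and it uses the current `loader.js` entry point of Google Charts, which may differ from what the original pages loaded in 2012.

```python
import json

# Placeholders __ROWS__ and __TITLE__ are filled in by render_chart_page().
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
  <script src="https://www.gstatic.com/charts/loader.js"></script>
  <script>
    google.charts.load('current', {packages: ['corechart']});
    google.charts.setOnLoadCallback(drawChart);
    function drawChart() {
      var data = google.visualization.arrayToDataTable(__ROWS__);
      var chart = new google.visualization.LineChart(document.getElementById('chart'));
      chart.draw(data, {title: __TITLE__});
    }
  </script>
</head>
<body><div id="chart" style="width: 800px; height: 400px"></div></body>
</html>
"""


def render_chart_page(title, header, rows):
    """Return an HTML page whose script draws the given rows as a line chart."""
    table = [list(header)] + [list(row) for row in rows]
    # json.dumps produces literals that are also valid JavaScript.
    return (
        PAGE_TEMPLATE
        .replace("__ROWS__", json.dumps(table))
        .replace("__TITLE__", json.dumps(title))
    )


# Made-up rating values for illustration.
page = render_chart_page("Rating over time", ["Day", "Rating"], [[1, 4.5], [2, 3.0]])
```

Writing `page` to a file on the web server gives the browser everything it needs; the chart itself is rendered client-side, which is what lets anyone view and interact with the data.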


Variable Changes Over Time

cherryshares.com, aggregator rating (Link)


General Programming Tips

● Spend time on data quality
● Organize your code, variable names, and files
● Keep records of working examples
● Plan out your code to maximize pattern capture
● Error-catching, browser consoles, and regexes are friends
● Test out chunks of code before putting them together
● Google Tables take a while to load for large datasets
● Google Charts Playground allows you to test code in their environment
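As one illustration of the tips on regexes and error-catching, the sketch below pulls an advertised rate out of free-text plan descriptions and records the inputs it cannot parse instead of crashing. The text format, the `RATE_PATTERN` regex, and the helper names are assumptions made up for the example, not the patterns used on the real aggregator pages.

```python
import re

# Assumed format: a number, a percent sign, then a payout period.
RATE_PATTERN = re.compile(
    r"(?P<rate>\d+(?:\.\d+)?)\s*%\s*(?P<period>daily|weekly|monthly)",
    re.IGNORECASE,
)


def parse_rate(text):
    """Return (rate, period) from a plan description, or raise ValueError."""
    match = RATE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no rate found in {text!r}")
    return float(match.group("rate")), match.group("period").lower()


def parse_all(texts):
    """Parse every description, keeping failures for later inspection."""
    parsed, failures = [], []
    for text in texts:
        try:
            parsed.append(parse_rate(text))
        except ValueError as err:
            failures.append((text, str(err)))
    return parsed, failures


parsed, failures = parse_all(
    ["2.5% daily for 30 days", "110 % Weekly", "contact admin"]
)
```

Keeping the failures alongside the parsed values makes it easy to spot descriptions the pattern misses and to extend the regex one tested chunk at a time.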


Future Work

● Create an interactive web-based visualization for our dataset - some examples I made
● Link scams together
● Explore larger dataset


Thanks!