Big data : big issues for IP
Véronique MESGUICH
Consultant in competitive intelligence
Former co-president, ADBS (Association of information professionals), www.adbs.fr
Consultant & trainer, expert in competitive intelligence
Former co-president of ADBS, the first European association of information professionals
Co-writer of « Net recherche », a methodological guide on how to find relevant information on the Internet
Who am I ?
What is big data?
New types of patents, and new types of patent information
Technical and organizational breakthroughs
New skills for information professionals
Big data : a new paradigm for IP
What is big data
The 3 Vs of big data
Volume: petabytes of information
Velocity: data are created at very high speed, and may be analyzed in real time
Variety: big data are heterogeneous
The analysis of big data can improve decision making, prediction and forecasting, productivity, marketing, e-reputation monitoring, etc.
90% of all data available today were created in the last two years.
One of the biggest promises in big data is the possibility to reuse data produced via different sources, create new services or predict the future, via the analysis of correlations.
The big data industry is expected to grow in the next few years. Revenue from big data could be worth $100 billion by 2018 (source: ABI Research).
Big data, big opportunities
Data may be...
Linked: data linked via metadata and searchable via semantic queries
Dark: dark data are “the data being collected, but going unused despite its value” (Gartner Group)
Open: open data are data freely available to anyone to use and republish, without restrictions from copyright or patents. Open data are not always public data.
Smart: smart data are data useful for decision making
Humans who search and publish on the Internet (e-mails, SMS, photos, videos...) especially on social networks, and use smartphones or “world wide wear”
Sensors or machines create data transmitted via the Internet
Data can be created by a group of persons : establishing who owns what can be very difficult and challenging
Data may be created by...
Source : Forrester 2014
IP vs. big data?
IP: structured documents; long lifecycle; storing; protection; incentive for innovation; driven by government and companies
Big data: unstructured data; real time; sharing; reuse; decision making; driven by people and machines
1970s : Online databases and information services
1980s : Business intelligence, OLAP
1990s : The web appears
2000s : Social networks
2010s : Spread of big data
Impact of the evolution of tools and methods on IP
Unstructured data / real time: we deal not only with selected and structured data, but also with heterogeneous data, structured or unstructured, produced in real time. Hadoop/MapReduce tools can provide real-time collecting, indexing and storing.
Cloud computing: cloud architectures are linked to big data. Agile and powerful architectures are required to optimize resources.
Technological and organizational breakthroughs
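To make the MapReduce idea mentioned above concrete, here is a minimal sketch of the map / shuffle / reduce pattern applied to indexing. It is a toy, in-memory illustration in plain Python, not Hadoop itself; the function names (map_phase, shuffle, reduce_phase, build_index) and the sample documents are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per distinct word in a document."""
    for term in set(text.lower().split()):
        yield term, doc_id


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(term, doc_ids):
    """Collapse the grouped values into a sorted list of document ids."""
    return term, sorted(set(doc_ids))


def build_index(documents):
    """Run map, shuffle and reduce in sequence to build an inverted index."""
    pairs = (
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    )
    return dict(reduce_phase(term, ids) for term, ids in shuffle(pairs).items())


# Hypothetical sample documents, invented for illustration only.
documents = {
    "doc1": "patent data sensor",
    "doc2": "open data reuse",
}
index = build_index(documents)
```

In a real cluster the map and reduce steps would run in parallel on many machines; here they run in one process, which is enough to show how heterogeneous documents end up in a single searchable index.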
Social and collaborative methods: crowdsourcing and open innovation, where innovations can easily transfer inward and outward.
Data visualisation based on semantic analysis: correlations between data can be extracted automatically.
Automated analysis and discovery: text mining, graph mining, knowledge representation...
Technological and organizational breakthroughs
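As a rough illustration of how correlations between data can be extracted automatically, the sketch below counts how often pairs of terms appear together in the same text. This is only one simple text-mining technique (term co-occurrence), chosen for brevity; the function name cooccurrences and the sample abstracts are assumptions invented for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(documents):
    """Count how many documents contain each unordered pair of terms."""
    counts = Counter()
    for text in documents:
        # Sort the distinct terms so each pair is always stored in the same order.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical short abstracts, invented for illustration only.
abstracts = [
    "sensor data patent",
    "sensor data cloud",
    "cloud storage patent",
]
pair_counts = cooccurrences(abstracts)
```

Pairs with high counts are candidate links that a visualisation tool could draw as edges in a graph of related concepts.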
Key issues : searching
Datasets are very heterogeneous and, unlike classical documents, are not necessarily created for a specific purpose by the traditional “gatekeepers” (experts, analysts, researchers…)
Searching them requires new information-seeking skills
Can the patent system protect datasets, or data processing?
Are data patentable?
Is copyright applicable to big data?
Data are created, manipulated, enriched, reused... How can the process of assembling, enhancing or organizing data be patented?
Key issues : new data for IP, new IP for data
The ownership of data, and the right to reuse them.
Do the data belong to their many creators ? Is the concept of copyright adapted to data generated by machines ?
Key issues : the ownership of data
Data scientist: core skills (source: O'Reilly Radar)
Grounding in statistics, algorithms, data mining, machine learning and mathematics
Knowledge of open-source tools : Hadoop, Java, Python
Making data available to users : prototypes, using external APIs, integration with other services, visualisation
A new librarian : the Data librarian
Data Reference Services Librarian
Data Services Librarian
Social Science Data Librarian
Business and Social Sciences Librarian
Science Research Librarian
Data and eScience Librarian
Science Data Librarian
GIS Librarian
Research Data Management Librarian
Data Curation Librarian
Quantitative Data Collections Librarian
Librarian for Data Visualization
Assessment Librarian...
Data management:
data management planning
issues such as copyright, intellectual property, licensing of data, embargoes, ethics and re-use, privacy
storing and managing data during the research project (curation)
depositing data in archives at the end of the project, determining retention and disposal
open access and publishing of data
research organisation policies affecting data
Metadata management:
creating and maintaining metadata
developing and applying metadata standards
Using data (data as a resource):
finding or obtaining data for re-use
citing data
data analysis tools and support services
data literacy (an extension of information literacy to include the ability to "access, assess, manipulate, summarize and present data")
(Source : Australian National Data Service)
Data librarian : missions
Chief data officer: missions
Acquiring, storing, enriching and leveraging the company’s data assets
Data inventory
Data governance
Not a core technical profile
Chief analytics officer (source: https://infocus.emc.com/william_schmarzo/new-roles-in-the-big-data-world/)
Analytic assets: Collaborate with the data science team to inventory analytic models and algorithms throughout the organization.
Analytics valuation: Establish a framework and process for determining the relative value of the organization’s analytic assets.
Intellectual Property management: Develop processes and manage a repository for the capture and sharing of organizational IP (check-in, check-out, versioning).
Patent applications: Manage the patent application and tracking process for submitting patents to protect key organizational analytics IP.
Intellectual Property protection: Monitor industry analytics usage to identify potential IP violations, and then lead litigation efforts to stop or get licensing agreements for IP violations.
Intellectual Property Monetization: Actively look for business partners and opportunities to sell or license organizational analytics IP.
Big data will not only change patent information, but will also generate new types of patents
IP should evolve according to the development of the SMAC model (social, mobility, analytics and cloud)
It will require new skills for information professionals
Summary