Semantic Web Technologies in Data Management
TRANSCRIPT
8/13/2019 Semantic Web Technologies in Data Management
Semantic Web
Technologies in Data
Management
Theodore Cacciola
TCacciola LLC
January 21, 2014
The Semantic Web
Herein, we look at data management within the context of the Semantic Web. This is the web as we see it: a structure of standard data formats. Data formats can be textual, like a Microsoft Office document or a PDF, or software-driven, in the form of Java or C#. The structure of the Semantic Web gives users the power to share and collaboratively generate decentralized linked data. As we have touched on before, each form of data management within the Semantic Web usually requires some form of access authentication. Access authentication is the process of managing process inputs and securing data through passwords and knowledge-based questions such as "What was the name of your first pet?"
This access authentication is used to ensure the security and integrity of the data entry points being used.
Web-based data management often relies on RDF inputs, which resemble XML files. For example, a Sitemap.xml file tells search engines which of a site's files are available for indexing. Typically, within such an XML file, the inputs answer a question for search engines such as "Should I include this given file?" RDF is very similar: it is a framework that can be used to describe any internet resource.
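As a sketch of that idea, the snippet below parses a minimal sitemap.xml (the URLs are invented for illustration) and answers exactly the "which files should I include?" question:

```python
import xml.etree.ElementTree as ET

# A minimal sitemap.xml, as a search engine might fetch it
# (URLs are hypothetical examples).
sitemap = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(sitemap)

# The sitemap answers: "which files should I include?"
urls = [loc.text for loc in root.findall("sm:url/sm:loc", ns)]
print(urls)  # ['https://example.com/', 'https://example.com/about']
```

A crawler would apply the same parse to the real file it fetches from a site.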
Internet resources typically don't use Java, unless they are being served to an Android device; more typical are the languages commonly used in web development, such as HTML5, CSS3, and JavaScript. The Resource Description Framework, or RDF, can describe metadata, that is, data about data: for example, the date you see on a website. Sometimes developers will use RDF data to show the end-user their IP address or the exact date and time using simple programmatic code.
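To make "data about data" concrete, here is a small sketch of RDF-style metadata modeled as (subject, predicate, object) triples in Python; the page URL and Dublin Core-style predicate names are illustrative, not taken from any real site:

```python
from datetime import date

# RDF describes resources with (subject, predicate, object) triples.
# Here the triples carry metadata about a hypothetical web page.
page = "https://example.com/article"
triples = [
    (page, "dc:title", "Semantic Web Technologies"),
    (page, "dc:date", date(2014, 1, 21).isoformat()),
    (page, "dc:creator", "Theodore Cacciola"),
]

# "Data about data": read the publication date back out of the triples.
published = next(o for s, p, o in triples if p == "dc:date")
print(published)  # 2014-01-21
```

A real application would use an RDF library and serialization such as RDF/XML or Turtle, but the triple shape is the same.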
Editing tools such as Dreamweaver and Microsoft Visual Studio allow authorized users, typically via FTP, to view and edit an RDF file directly, and web-based control panels such as cPanel allow the same editing within a web browser.
We have talked about this before, but the concept of user authentication levels comes up again when we talk about managing read and write access without a centralized authority, enabling a collaborative authoring environment suited to the Semantic Web.
Resource Description Framework Overview
Efficient management of RDF data is an important factor in realizing the Semantic Web vision. As early as the 1980s, Steve Jobs, among others, envisioned computing in which file transfers and software adjustments could be made while a user is running a program. We have come very close to this goal. Web Consortium updates are made almost instantly, and users can log in whenever they like; the drawback lies in compiled Java and C# assets and how they are changed. On the World Wide Web, updates can be made instantly, tested on a localhost, and deployed without the end-user even knowing. Smaller updates, like those you will find on an Android phone, iPhone, PC, or Mac, still rely on program updates; the vision was never a C#-programmed computer that makes you wait for updates to occur, and instead updates are transitioning to become almost automatic.
In the W3C's (World Wide Web Consortium's) vision, end-users of the Semantic Web should be able to issue structured queries over data anywhere on the internet. When you complete such a query, or any other input for that matter, you should receive well-formed answers to your questions, and in Web 2.0 you commonly do.
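The essence of a structured query over linked data is matching a triple pattern, which is what SPARQL does at scale. Below is a minimal Python sketch of that idea over a tiny in-memory triple store; the names and facts are invented for illustration:

```python
# A tiny triple store (subject, predicate, object); facts are made up.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "TCacciola LLC"),
]

def query(s=None, p=None, o=None):
    """Return every triple matching the pattern; None is a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# A structured question with a well-formed answer: who knows whom?
print(query(p="knows"))
```

A real Semantic Web query would run over decentralized RDF data rather than one list, but the pattern-matching core is the same.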
Although HTML is a great way to share information on the web, HTML is a front-end technology oriented towards visual presentation and keyword search. Here is where XML comes into play. XML describes content and promotes machine-to-machine communication and data exchange, and it can be easily specialized to meet the needs of a wide range of data uses. XML stands for Extensible Markup Language and is used within many different interfaces, whether web- or software-related. It is visually represented within Android applications, used in software protocols, and commonly used on the web for things such as the aforementioned sitemaps to arrange data. It is a universal format for data exchange.
This universal format facilitates data integration. Extensible Markup Language comes with a wide variety of software tools, such as parsers, programming interfaces, and manipulation languages, that facilitate the development of XML-based applications.
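As a sketch of machine-to-machine exchange, the snippet below serializes a record to XML on the "sending" side and parses it back on the "receiving" side; the element names and values are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Sender: serialize a record into XML for exchange.
record = {"id": "42", "name": "sensor-a", "reading": "19.5"}

elem = ET.Element("record")
for key, value in record.items():
    ET.SubElement(elem, key).text = value
payload = ET.tostring(elem, encoding="unicode")

# Receiver: recover the same structure from the XML text alone.
received = {child.tag: child.text for child in ET.fromstring(payload)}
print(received == record)  # True
```

Because both sides agree only on the XML format, not on any shared code, either side can be rewritten in another language without breaking the exchange.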
Web Data Management
To look back at its definition, Web Data Management, or WDM, refers to a body of work concerned with leveraging large collections of structured data that can be extracted from the web. This applies inherently to web search. Think about the cookie: a file stored on a user's computer, and one explored throughout the web within XML, with the goal of improving web search and of surfacing different types of search answers.
Think of Google: when Google was established, its founders needed a way to identify the best search results. In the late 1990s, searches typically gave poor results, and there was no definitive algorithm, no concrete method, for ranking search results. This is where the backlink came into play.
What the founders of Google did was assign more of what is called Domain Authority to pages with more backlinks. When a page was indexed, they also read its HTML, which included hypertext references, or links, and these enabled better web data management within search results.
What Google and the W3C found later on was that new kinds of search results, and better search results, can be obtained by leveraging the huge collection of independent pieces of information gathered from each website the Googlebot accessed.
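The backlink idea above can be sketched in a few lines: count incoming links per page and rank by that count. The link data here is invented for illustration; real ranking (PageRank and beyond) also weighs the authority of the linking pages:

```python
from collections import Counter

# Hypothetical hyperlink references: (from_page, to_page).
links = [
    ("a.com", "c.com"), ("b.com", "c.com"),
    ("a.com", "b.com"), ("c.com", "a.com"),
]

# More backlinks -> more "authority" in this naive sketch.
backlinks = Counter(to for _, to in links)
ranked = sorted(backlinks, key=backlinks.get, reverse=True)
print(ranked)  # c.com ranks first with two backlinks
```

Even this naive count captures the key insight: the link structure of the web is itself data worth managing.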
Later, web-centric extraction came about, known more commonly as open information extraction. These are extraction systems that can effectively create relational databases, most typically used in Web 2.0 in some SQL form, minimizing the amount of data stored within one database, similar to a zipped file whose contents can be accessed without un-zipping the file or set of files. Later still, it emerged that instead of using web-centric extraction, one could employ large populations of authenticated users to generate and clean datasets, an approach used in wiki sites like Wikipedia.
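To illustrate the "extraction into a relational database" step, here is a minimal sketch loading extracted facts into SQLite; the facts and table schema are invented for illustration, and a real open information extraction system would populate them from crawled pages:

```python
import sqlite3

# Hypothetical facts as an extraction system might emit them.
facts = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (subj TEXT, rel TEXT, obj TEXT)")
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", facts)

# Once loaded, the extracted web data is queryable like any SQL table.
rows = conn.execute(
    "SELECT subj FROM facts WHERE rel = 'capital_of' ORDER BY subj"
).fetchall()
print(rows)  # [('Berlin',), ('Paris',)]
```

The same table could just as easily be filled by authenticated human contributors, which is the wiki-style alternative the text describes.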