Random Thought on Research Methods
in CS/CIS
CSCI 6530
July 1, 2010
Kwok-Bun Yue
University of Houston-Clear Lake
Random
• Random: not organized.
7/1/2010 Bun Yue: [email protected], http://dcm.uhcl.edu/yue slide 2
Merriam-Webster
• Research
– 1: careful or diligent search
– 2: studious inquiry or examination; especially: investigation or experimentation aimed at the discovery and interpretation of facts, revision of accepted theories or laws in the light of new facts, or practical application of such new or revised theories or laws
– 3: the collecting of information about a particular subject
For what?
• Finding new things: facts, theories, processes, tools, relationships, techniques.
• Solving problems
Why Research?
• Solving problems.
• Enhancing understanding.
• Career enhancement.
• Curiosity and fun.
• …
Research Methods
• Discipline dependent.
– E.g. medical research: double-blind test with control.
• Scientific methods.
• Empirical methods.
Starting Research
• What do you need to start your research?
– Talk! Talk! Talk!
– Think! Think! Think!
– Read! Read! Read!
Asking Questions
• ASK! ASK! ASK!
Not Asking Questions
• Easy
• Comfortable
• Familiar
• …
Asking is crucial
• Get a context of the problem from many angles.
• Organize your thoughts.
• Model and refine your understanding.
• Discover new information and insight.
Intellectual Curiosity
• A key for deep understanding, important discovery and … fun.
• Not always output driven: ‘down’ time is needed too.
• Recommended reading: Surely You're Joking, Mr. Feynman! (Adventures of a Curious Character) by Richard Feynman.
Keeping an open mind
• Keep an open mind as long as possible.
– Do not jump to the first solution that you have come up with.
Research in Physics
• Scientific Methods:
1. Observe, ask questions and understand.
2. Make a hypothesis and model.
3. Make (precise) predictions using the hypothesis.
4. Test the predictions.
Questions in Physics
• Fundamental questions: e.g.
– Can the four fundamental forces be unified: theory of everything?
– Where does our universe come from?
– What are elementary particles made of?
Results in Physics
• Theories: e.g.
– Superstring theory.
– Big bang theory.
– Quarks.
• New facts.
Validations in Physics
• Experiment with predictions by theories.
• E.g.: Big bang theory predicts abundance of light elements.
– Positive results: add confidence.
– Negative results: reject theory.
Questions in Computing
• Much more diverse. Have aspects from most other areas: engineering, science, humanities, …
• Can create your own ‘universe’ (vs. economics, for example).
Results in CS
• New theories, algorithms, processes, methods, facts, etc.
• New models, problems and application areas.
Validations
• Direct validation
• Theoretical analysis
• Simulation
• Benchmarking
• Statistical methods
• …
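As a rough illustration of how benchmarking and simple statistics can combine into a validation step, here is a minimal sketch using Python's standard `timeit` and `statistics` modules. The function `linear_search`, the helper `benchmark`, and the repeat counts are illustrative assumptions, not taken from the slides.

```python
import statistics
import timeit


def linear_search(items, target):
    """Toy workload under test: return the index of target, or -1."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1


def benchmark(func, *args, repeats=5, number=50):
    """Time func(*args) in several batches and summarize the per-call cost in seconds."""
    totals = timeit.repeat(lambda: func(*args), repeat=repeats, number=number)
    per_call = [total / number for total in totals]
    return {"mean": statistics.mean(per_call), "stdev": statistics.stdev(per_call)}


data = list(range(10_000))
summary = benchmark(linear_search, data, 9_999)
print(summary)
```

Reporting the spread alongside the mean makes it easier to judge whether a difference between two candidate solutions is larger than the measurement noise.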
Planning: Goals
• Output oriented incentives can be too ‘far away’.
• Setting plans and goals.
– Create a detailed plan of steps and benchmarks.
– Small goals every step.
– Consider input-oriented goals.
Early Web Business Model
Build Websites -> Attract Huge Traffic -> Something happens -> Rich!
Thesis
Understand Problem -> Design and Implement Solution -> Good thing happens -> Done!
Detailed Plan
• Create a road map with enough details to the final goals.
– Preparation.
– Planning.
– Risk Management.
• Recommended reading: Ed Viesturs, “No Shortcuts to the Top: Climbing the World's 14 Highest Peaks”
Areas of My Research Interest
• Internet Computing
• XML and semi-structured data
• CS and IS education
• Concurrent Programming
(Older) XML Projects
• Storage of XML in relational database (Used as an example)
• XML Metrics
Storing XML in RDB
• Advantages:
– Mature database technologies.
– May be queried by
  • XML technology: e.g. XPath, XQuery.
  • RDB technology: e.g. SQL.
• Disadvantages:
– Impedance mismatch: XML and relations are different data models.
Related Issues
• Effective mapping of XML DTDs (~ ordered tree model) to relational schemas.
• Mapping of XML queries (e.g. XQuery) to RDB queries (e.g. SQL).
• Mapping of RDB query results back to XML format.
Related Work and Context
• Mapping
– With or without schemas for XML.
– With or without user input.
• Schemas for XML:
– Document Type Definition (DTD)
– XML Schema
• We consider mapping with DTD and without user input.
Naïve Mapping
• An XML element is mapped to a relation.
Example 1a:
XML: <a><b><c><d>hello</d></c></b></a>
-> Relations: a, b, c and d.
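The naïve mapping of Example 1a can be sketched as follows, assuming Python's standard `xml.etree.ElementTree` and `sqlite3` modules. The helper `naive_map` and the column layout (`id`, `parent_id`, `text`) are hypothetical choices made for illustration; the slides do not specify a schema.

```python
import sqlite3
import xml.etree.ElementTree as ET


def naive_map(xml_text, conn):
    """Create one relation per element name and store one row per element."""
    root = ET.fromstring(xml_text)
    next_id = 0

    def visit(elem, parent_id):
        nonlocal next_id
        next_id += 1
        elem_id = next_id
        # Hypothetical layout: every relation has the same three columns.
        conn.execute(
            f'CREATE TABLE IF NOT EXISTS "{elem.tag}" '
            "(id INTEGER PRIMARY KEY, parent_id INTEGER, text TEXT)"
        )
        conn.execute(
            f'INSERT INTO "{elem.tag}" VALUES (?, ?, ?)',
            (elem_id, parent_id, (elem.text or "").strip() or None),
        )
        for child in elem:
            visit(child, elem_id)

    visit(root, None)


conn = sqlite3.connect(":memory:")
naive_map("<a><b><c><d>hello</d></c></b></a>", conn)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
print(tables)
```

Four nested elements yield four relations, which is the relation count the next slide identifies as a problem.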
Problems of Naïve Mapping
• Many relations.
• Inefficient queries: a single query may need multiple joins.
Example 1b:
XPath Query: //a
SQL Query: need to join the relations a, b, c and d.
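To make the join cost in Example 1b concrete, here is a minimal sketch using Python's standard `sqlite3` module. The schema (one relation per element, linked through a `parent_id` column) is an assumed layout for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical naive schema: one relation per element, linked by parent_id.
for name in ("a", "b", "c", "d"):
    conn.execute(
        f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, parent_id INTEGER, text TEXT)"
    )
conn.execute("INSERT INTO a VALUES (1, NULL, NULL)")
conn.execute("INSERT INTO b VALUES (2, 1, NULL)")
conn.execute("INSERT INTO c VALUES (3, 2, NULL)")
conn.execute("INSERT INTO d VALUES (4, 3, 'hello')")

# Reassembling the content of <a> touches all four relations.
query = """
SELECT d.text
FROM a
JOIN b ON b.parent_id = a.id
JOIN c ON c.parent_id = b.id
JOIN d ON d.parent_id = c.id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Three joins are needed to recover a single text value, and the number grows with the nesting depth of the document.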
Inlining Algorithms
• First proposed by Shanmugasundaram et al.
• Expanded by Lu, Lee, Chu and others.
• Extended in various directions by various researchers, e.g.,
– Preserving XML element orders.
– Preserving XML constraints.
• Extensions are not considered here.
Basic Idea of Inlining Algorithms
• Inline child element into the relation for the parent element when appropriate.
• Different inlining algorithms differ in inlining criteria.
Example 1c: XML: <a><b><c><d>hello</d></c></b></a>
Inlined Relation: a.
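Example 1c can be sketched in the same style, again with Python's standard `xml.etree.ElementTree` and `sqlite3` modules. The inlining criterion used here (every child occurs at most once, so every descendant is inlined) and the path-based column name `b_c_d` are simplifying assumptions; real inlining algorithms decide what to inline from the DTD.

```python
import sqlite3
import xml.etree.ElementTree as ET


def inline_columns(elem, prefix=""):
    """Flatten descendants into {column_name: text}, assuming each child occurs at most once."""
    columns = {}
    for child in elem:
        name = f"{prefix}{child.tag}"
        text = (child.text or "").strip()
        if text:
            columns[name] = text
        columns.update(inline_columns(child, prefix=name + "_"))
    return columns


root = ET.fromstring("<a><b><c><d>hello</d></c></b></a>")
columns = inline_columns(root)

conn = sqlite3.connect(":memory:")
column_defs = ", ".join(f'"{name}" TEXT' for name in columns)
conn.execute(f'CREATE TABLE "{root.tag}" (id INTEGER PRIMARY KEY, {column_defs})')
placeholders = ", ".join("?" for _ in columns)
conn.execute(
    f'INSERT INTO "{root.tag}" VALUES (1, {placeholders})', tuple(columns.values())
)

# The same value is now reachable without any join.
rows = conn.execute("SELECT b_c_d FROM a").fetchall()
print(rows)
```

The whole document lands in the single relation `a`, so the query that needed three joins under the naïve mapping needs none here.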
Inlining Algorithms
• Child elements & attributes may be inlined.
• Child elements may not have their own relations.
• Results in fewer relations.
• In general, more inlining -> fewer joins.
Inlining Algorithm Structure
1. Simplification of DTD.
2. Generation of DTD graphs.
3. Generation of Relational Schemas.
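The three steps can be sketched on a toy input as follows. This is only loosely in the spirit of published inlining algorithms and is not the algorithm from the slides: the already-simplified DTD is a hand-written dictionary, the rule for which elements keep their own relation (in-degree other than one, or reached through a `*` edge) is an assumed criterion, and the names `needs_relation` and `build_schema` are hypothetical.

```python
# Step 1 (assumed done): a simplified DTD, written as
# element -> list of (child, occurrence), with occurrence "1", "?" or "*".
# Every element must appear as a key; recursion is not handled.
dtd = {
    "book": [("title", "1"), ("author", "*")],
    "author": [("name", "1"), ("email", "?")],
    "title": [],
    "name": [],
    "email": [],
}


def needs_relation(dtd):
    """Step 2: walk the DTD graph and pick the elements that keep their own relation."""
    in_degree = {elem: 0 for elem in dtd}
    starred = set()
    for children in dtd.values():
        for child, occurrence in children:
            in_degree[child] += 1
            if occurrence == "*":
                starred.add(child)
    # Assumed criterion: roots, shared elements and repeated elements are not inlined.
    return {elem for elem in dtd if in_degree[elem] != 1 or elem in starred}


def build_schema(dtd):
    """Step 3: inline every remaining element into its nearest ancestor relation."""
    own = needs_relation(dtd)

    def collect(elem, prefix):
        columns = []
        for child, _ in dtd[elem]:
            if child in own:
                continue  # stored in its own relation (parent link omitted here)
            name = prefix + child
            columns.append(name)
            columns.extend(collect(child, name + "_"))
        return columns

    return {elem: collect(elem, "") for elem in dtd if elem in own}


schema = build_schema(dtd)
print(schema)
```

Five element types collapse into two relations here; a different inlining criterion would change only `needs_relation`.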
Our work
• Improved on simplification of DTD and generation of DTD graphs.
• Constructed a new aggressive inlining algorithm.
• Student: Alakappan.
Internet Computing
• Web bias (older project)
• Web 2.0 framework (IS project)
• Content Management Software (CMS): Joomla (CS/IS Education)
• Mashup: Yahoo Pipes (CS/IS Education)
Measuring Web Bias
• Search engines dominate how information is accessed.
• Search results have major social, political and commercial consequences.
• Are search engines biased?
• How biased are they?
Previous Work
• To measure bias, results should be compared to a norm.
• The norm may be from human experts.
• Mowshowitz and Kawaguchi: the average search result of a collection of popular search engines as the norm.
Mowshowitz and Kawaguchi
[Diagram: SE1 … SEn -> URLS1 … URLSn -> (union) NORM URLS; URLVector1 … URLVectorn compared with NORM URL Vector -> Bias1 … Biasn]
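The idea of using the combined results of several engines as the norm can be sketched as follows. This is one plausible reading rather than the published measure: the norm vector is taken as the summed URL counts of all engines, bias is taken as one minus cosine similarity to that norm, and the engine names and URLs are made-up data.

```python
import math
from collections import Counter


def url_vector(url_lists):
    """Count how often each URL appears across an engine's result lists."""
    return Counter(url for urls in url_lists for url in urls)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bias_scores(results):
    """Assumed measure: bias = 1 - similarity between an engine's vector and the norm."""
    vectors = {engine: url_vector(lists) for engine, lists in results.items()}
    norm = Counter()
    for vector in vectors.values():
        norm.update(vector)  # the norm pools the results of all engines
    return {engine: 1.0 - cosine(vector, norm) for engine, vector in vectors.items()}


# Made-up result lists, one query per engine.
results = {
    "SE1": [["u1", "u2", "u3"]],
    "SE2": [["u1", "u2", "u4"]],
    "SE3": [["u5", "u6", "u7"]],
}
scores = bias_scores(results)
print(scores)
```

In this toy data SE3 shares no URL with the other engines, so it ends up furthest from the norm.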
Limitations
• Based on URL Vector -> cannot measure bias quality.
Our Approach
• Use Kleinberg’s HITS algorithm to create clusters, authorities and hubs of the result norm URLs.
• Use them as norm clusters, authorities and hubs.
• Measure distances between norms and individual results as bias.
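For readers unfamiliar with HITS, the hub and authority scoring can be sketched in plain Python as below. This covers only the iterative scoring step, not the clustering or the distance measure used in the project, and the link graph and the function name `hits` are made up for illustration.

```python
import math


def hits(graph, iterations=50):
    """Return (hubs, authorities) for a dict mapping each page to the pages it links to."""
    nodes = set(graph) | {dst for targets in graph.values() for dst in targets}
    hubs = {node: 1.0 for node in nodes}
    auths = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auths = {node: 0.0 for node in nodes}
        for src, targets in graph.items():
            for dst in targets:
                auths[dst] += hubs[src]
        # Hub score: sum of the authority scores of pages linked to.
        hubs = {node: sum(auths[dst] for dst in graph.get(node, ())) for node in nodes}
        # Normalize both vectors so the scores stay bounded.
        auth_norm = math.sqrt(sum(v * v for v in auths.values())) or 1.0
        hub_norm = math.sqrt(sum(v * v for v in hubs.values())) or 1.0
        auths = {node: v / auth_norm for node, v in auths.items()}
        hubs = {node: v / hub_norm for node, v in hubs.items()}
    return hubs, auths


# Made-up link graph: p1 and p3 link to both x and y, p2 links only to x.
graph = {"p1": ["x", "y"], "p2": ["x"], "p3": ["x", "y"]}
hubs, auths = hits(graph)
print(max(auths, key=auths.get), max(hubs, key=hubs.get))
```

Pages with many good in-links become authorities and pages that point to many authorities become hubs, which is what lets the norm be described in terms of quality rather than raw URL overlap.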
Our Approach
[Diagram: SE1 … SEn -> URLS1 … URLSn -> (union) NORM URLS -> NORM Cluster; URLVector1 … URLVectorn -> ClusterVector1 … ClusterVectorn, compared with NORM Cluster Vector -> Bias1 … Biasn]
Recent Projects
• Web 2.0 framework:
– A model and framework to study Web 2.0 technologies, implications and trends.
– Collaborator: Mr. Tracy Gate.
– Publications: Pre-ICIS Workshop and Communications of AIS.
CMS: Joomla
• Question: Using CMS/Joomla for capstone project.
• Methodology: projects and surveys.
• Collaborators:
– Capstone project teams.
– Industrial mentor: Dilhar DeSilva
• Publication: Journal of Information Systems Education.
End User Programming
• Use of Yahoo Pipes in constructing Web mashups.
• Methodology: projects and surveys.
• Collaborators: students in the XML class in Summer 2009.
• Publication: Journal of Information Systems Education.
Ongoing projects
• Google Wave as a communication/collaboration tool in capstone projects and software project management.
• Collaborators: capstone project students.
• Publications: under preparation.
Open Source Software
• Use of OSS in educational institutions.
• Methodology: meta-analysis.
• Collaborators: two master's students.
Other recent projects
• Assessment
• Scholarship
• Student Response Systems
Interested?
• Come and talk with me.
Conclusions
• Good time to do applied computing research in the Web, XML and other areas.
• Style: hands-on supervision + publications.
• Don't forget to donate a scholarship to the School if your future research leads to a windfall.
Questions?
• Any Questions?
• Thanks!