Python Introduction
What is Information Retrieval?
This presentation was developed by Ricardo Campos, Professor of ICT at the Polytechnic Institute of Tomar and researcher at LIAAD - INESC TEC. Some of the slides in this presentation were adapted from presentations found on the internet and from the following reference bibliography:
• Dipanjan Sarkar (2016). Text Analytics with Python
• http://nbviewer.jupyter.org/github/jrjohansson/scientific-python-lectures/blob/master/Lecture-0-Scientific-Computing-with-Python.ipynb
• https://www.tutorialspoint.com/python/python_overview.htm
AGENDA: What is this talk about?
1. Why
2. History
3. Features
4. Advantages
5. Anaconda
6. PyCharm
7. Resources
8. Q&A
Python is widely used for scientific work and is a first choice for many scientists, largely because
of its large community of users and because help and documentation are easy to find.
Extensive ecosystem of scientific libraries and environments:
• numpy: http://numpy.scipy.org - Numerical Python
• scipy: http://www.scipy.org - Scientific Python
• matplotlib: http://www.matplotlib.org - graphics library
No license costs, no unnecessary use of research budget.
Python is especially good for our purposes in that it does not have a lot of “overhead” before getting
started. It is easy to jump in and experiment with Python in an interactive fashion.
Popular Domains
Python is widely used in several domains including artificial intelligence (AI), game development,
robotics, Internet of Things (IoT), computer vision, media processing, and network and system
monitoring, just to name a few. Although Python can be used for solving a lot of problems, here
are some of the most popular domains:
• Scripting: Python is known as a scripting language. It can be used to perform many tasks, such
as interfacing with networks and hardware, handling and processing files and databases,
performing OS operations, and receiving and sending email. Python is also used extensively
for server-side scripting and even for developing entire web servers for serving web pages.
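As a rough illustration of the scripting use case (handling and processing files), here is a minimal sketch that uses only the standard library. The file name, column names and figures are hypothetical sample data, not part of the original slides.

```python
import csv
import tempfile
from pathlib import Path


def total_by_category(csv_path):
    """Sum the 'amount' column of a CSV file, grouped by 'category'."""
    totals = {}
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            key = row["category"]
            totals[key] = totals.get(key, 0.0) + float(row["amount"])
    return totals


# Write a small sample file into a temporary directory (hypothetical data),
# then process it with the function above.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "expenses.csv"
    sample.write_text(
        "category,amount\nbooks,12.5\nfood,30\nbooks,7.5\n", encoding="utf-8"
    )
    totals = total_by_category(sample)

print(totals)
```

The same pattern (open a file, iterate over its records, aggregate) covers a large share of everyday scripting tasks.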
• Web development: There are a lot of robust and stable Python frameworks out there that are
used extensively for web development, including Django, Flask, Web2Py, and Pyramid.
• Graphical user interfaces (GUIs): A lot of desktop-based applications with GUIs can be easily
built with Python. Libraries and APIs like tkinter, PyQt, PyGTK, and wxPython allow developers
to develop GUI-based apps with simple as well as complex interfaces.
• Systems programming: We can use Python to perform OS operations including creating,
handling, searching, deleting, and managing files and directories. The Python standard library
(PSL) has OS and POSIX bindings that can be used for handling files, multi-threading, multi-
processing, environment variables, controlling sockets, pipes, and processes.
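The systems programming tasks listed above can be sketched with the standard library modules os, pathlib and subprocess. This is only an illustrative example: the directory layout, file names and the DEMO_SETTING variable are assumptions made for the demonstration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create a directory and a couple of files.
    (root / "logs").mkdir()
    (root / "logs" / "app.log").write_text("started\n", encoding="utf-8")
    (root / "notes.txt").write_text("hello\n", encoding="utf-8")

    # Search: collect the names of all .log files below the root.
    log_files = sorted(p.name for p in root.rglob("*.log"))

    # Delete a file and check that it is gone.
    (root / "notes.txt").unlink()
    notes_exists = (root / "notes.txt").exists()

# Read an environment variable, falling back to a default if it is not set.
setting = os.environ.get("DEMO_SETTING", "fallback")

# Start a child process (another Python interpreter) and capture its output.
result = subprocess.run(
    [sys.executable, "-c", "print('child process says hi')"],
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()

print(log_files, notes_exists, setting, child_output)
```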
• Database programming: Python is used a lot in connecting and accessing data from different
types of databases, be it SQL or NoSQL. APIs and connectors exist for these databases like
MySQL, MSSQL, MongoDB, Oracle, PostgreSQL, and SQLite. In fact, SQLite, a lightweight
relational database, now comes as a part of the Python standard distribution itself.
• Scientific computing: Python really shows its flair for being multipurpose in areas like numeric
and scientific computing. You can perform simple as well as complex mathematical operations
with Python, including algebra and calculus. Libraries like SciPy and NumPy help researchers,
scientists, and developers leverage highly optimized functions and interfaces for numeric and
scientific programming. These libraries are also used as the base for developing complex
algorithms in various domains like machine learning.
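Because SQLite ships with the standard distribution, database programming can be tried without installing anything. The sketch below uses the sqlite3 module with an in-memory database; the table and its rows are hypothetical sample data.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, year INTEGER)")

# Parameterised inserts (the ? placeholders) avoid building SQL by hand.
conn.executemany(
    "INSERT INTO books (title, year) VALUES (?, ?)",
    [("Example Book A", 2010), ("Example Book B", 2016), ("Example Book C", 2018)],
)

# Query the rows published in or after 2015, ordered by title.
rows = conn.execute(
    "SELECT title FROM books WHERE year >= ? ORDER BY title", (2015,)
).fetchall()
conn.close()

print(rows)
```

Connectors for other databases such as MySQL or PostgreSQL are third-party packages, but they generally follow the same connect / execute / fetch pattern.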
• Machine learning: Python is regarded as one of the most popular languages today for
machine learning. There is a wide suite of libraries and frameworks, like scikit-learn, h2o,
tensorflow, theano, and even core libraries like numpy and scipy, for not only implementing
machine learning algorithms but also using them to solve real-world advanced analytics
problems.
• Text analytics: Python can handle text data very well, and this has led to several popular
libraries like nltk, gensim, and pattern for NLP, information retrieval, and text analytics. You can
also apply standard machine learning algorithms to solve problems related to text analytics.
This ecosystem of readily available packages in Python reduces the time and effort required for
development.
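To give a feel for basic text handling, here is a minimal term-frequency sketch. It deliberately uses only the standard library (re and collections.Counter) rather than nltk or gensim, and the tokenization rule (lowercase runs of the letters a to z) is a simplifying assumption.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split it into alphabetic tokens, and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


sample = (
    "Python can handle text data very well. "
    "Text analytics with Python is popular."
)
freqs = term_frequencies(sample)

# Show the most frequent terms first.
for term, count in freqs.most_common(3):
    print(term, count)
```

Libraries such as nltk provide far more careful tokenizers, stop-word lists and stemmers; this sketch only shows the underlying idea.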
Python was developed by Guido van Rossum in the late eighties and early nineties
at the National Research Institute for Mathematics and Computer Science in the
Netherlands.
Guido van Rossum published the first version of the Python code (version 0.9.0) to
alt.sources in February 1991.
In 2008, Python 3 was released on an almost-unthinkable premise - a complete
overhaul of the language, with no backwards compatibility. The decision was
controversial, and born in part of the desire to clean house on Python.
• clean and simple language: Easy-to-read and intuitive code, easy-to-learn minimalistic
syntax. Reading a good Python program feels almost like reading English.
Hello.java
public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
hello.py
print("Hello world!")
• expressive language: Fewer lines of code, fewer bugs, easier to maintain.
• dynamically typed: No need to define the type of variables, function arguments or return
types.
• Free and Open Source software.
• Portable: due to its open-source nature, Python has been ported to many platforms. All your
Python programs can work on several platforms without requiring any changes.
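The "dynamically typed" point above can be illustrated with a short sketch: the same name is bound to values of different types, and no types are declared anywhere. The function name describe is hypothetical.

```python
def describe(value):
    """Return the runtime type name of any value; no type declarations needed."""
    return type(value).__name__


x = 42             # x currently refers to an int
first = describe(x)

x = "forty-two"    # the same name can later refer to a str
second = describe(x)

x = [4, 2]         # ... or to a list
third = describe(x)

print(first, second, third)
```

Types are still checked, but at run time: an operation such as "4" + 2 raises a TypeError when it is executed, not when the program is written.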
• interpreted: You just run the program directly from the source code. Internally, Python
compiles the source code into an intermediate form called bytecode, which the Python
virtual machine then executes.
This also makes your Python programs much more portable, since you can just copy your
Python program onto another computer with Python installed and it just works!
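The intermediate bytecode form can be inspected with the standard dis module. The sketch below assumes the CPython interpreter; the exact instruction names vary between Python versions, so no particular output is shown.

```python
import dis


def add(a, b):
    return a + b


# Every function carries its compiled bytecode as a bytes object.
raw_bytecode = add.__code__.co_code

# dis.get_instructions() decodes that bytecode into readable instructions.
instruction_names = [ins.opname for ins in dis.get_instructions(add)]

print(type(raw_bytecode).__name__, len(raw_bytecode), "bytes")
print(instruction_names)
```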
• Object Oriented: Python supports procedure-oriented programming as well as
object-oriented programming. Python has a very powerful yet simple way of
doing OOP, especially when compared to big languages like C++, C# or Java.
• Interactive: You can actually sit at a Python prompt and interact with the
interpreter directly to write your programs.
• Beginner’s Language: Python is a great language for the beginner-level
programmers and supports the development of a wide range of applications
from simple text processing to WWW browsers to games.
• Databases: Python provides interfaces to all major commercial databases.
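As a small sketch of the two styles mentioned above, the example below solves the same task first with a plain function (procedure-oriented) and then with a class and a subclass (object-oriented). All names (word_count, Document, Tweet, MAX_LENGTH) are hypothetical and chosen only for illustration.

```python
# Procedure-oriented style: a plain function operating on data.
def word_count(text):
    return len(text.split())


# Object-oriented style: data (attributes) and behaviour (methods) together.
class Document:
    def __init__(self, title, text):
        self.title = title
        self.text = text

    def word_count(self):
        return len(self.text.split())


class Tweet(Document):
    """A subclass inherits everything from Document and adds one method."""

    MAX_LENGTH = 280

    def is_valid(self):
        return len(self.text) <= self.MAX_LENGTH


doc = Document("Intro", "Python is easy to learn")
tweet = Tweet("Note", "hello world")

print(word_count(doc.text), doc.word_count(), tweet.word_count(), tweet.is_valid())
```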
• The main advantage is ease of programming, minimizing the time required to develop,
debug and maintain the code.
• Besides the standard library, thousands of third-party libraries are readily available on the
Internet, encouraging open source and active development. The official repository for
hosting third-party libraries and utilities for enhancing development in Python is the Python
Package Index (PyPI). Access it at https://pypi.python.org and check out the various
packages. Currently there are over 118,000 packages you can install and start using.
• The pseudo-code nature of Python is one of its greatest strengths. It allows you to
concentrate on the solution to the problem rather than the language itself.
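Before relying on a third-party library, it can be handy to check whether it is available in the current environment. The sketch below uses importlib.util.find_spec from the standard library; the helper name is_installed is hypothetical, and note that the name used with pip install is not always identical to the importable module name.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


for name in ("json", "numpy", "nltk"):
    if is_installed(name):
        print(name, "-> available")
    else:
        print(name, "-> missing, try: pip install", name)
```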
Setup
Anaconda is a complete Python distribution from Continuum Analytics, with over 700 packages,
built specially for data science and analytics. It is available at
https://www.anaconda.com/download/.
This distribution provides a lot of advantages, especially for Windows users, where installing
some packages like numpy and scipy can sometimes cause issues.
Anaconda comes with conda, an open source package and environment management system,
and Spyder (Scientific Python Development Environment), an IDE for writing and executing your
code.
IMPORTANT NOTE: Anaconda cannot be installed under usernames with spaces or accents (e.g.,
c:\simão). Thus, if your username has a space or an accent, you should install Anaconda in a
different folder. To do so, run the Anaconda installer with administrative privileges.
Jupyter notebook
Once the installation is complete, start the Jupyter notebook.
Jupyter notebook is an HTML-based notebook environment for Python, similar to Mathematica or Maple.
The default working directory of Jupyter is:
c:\user\NomeUser
(where NomeUser stands for your Windows user name)
Jupyter notebook - Change Default Directory
If you want to change that directory, do the following:
(1) On the Anaconda command line, type: jupyter notebook --generate-config
This generates the file .jupyter/jupyter_notebook_config.py in the folder indicated during the
execution of the command.
Open that file and search for: c.NotebookApp.notebook_dir
Specify your new working directory (e.g., 'H:\\JupyterNotebooks') and remove the leading # to
uncomment the line.
IMPORTANT NOTE: the folder where you intend to keep your files (e.g., 'H:\\JupyterNotebooks')
cannot have spaces or accents.
(2) Search for Jupyter Notebook with the Windows search feature, right-click it and choose "Open file
location".
Right-click the shortcut and choose Properties, then the Shortcut tab.
In the "Start in" field, enter the directory defined previously (e.g., 'H:\\JupyterNotebooks').
At the end of the "Target" field, remove: %USERPROFILE%
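For the config-file edit in step (1), the line to write can be produced and checked with a small helper. This is only a sketch: the function name notebook_dir_setting is hypothetical, and the check simply encodes the note above that the folder must not contain spaces or accents.

```python
def notebook_dir_setting(path):
    """Build the config line for a new Jupyter working directory.

    Raises ValueError if the path contains spaces or non-ASCII characters
    (such as accents), which the note above says must be avoided.
    """
    if " " in path or not path.isascii():
        raise ValueError("path must not contain spaces or accents: %r" % path)
    # repr() doubles the backslashes, as required inside a Python string literal.
    return "c.NotebookApp.notebook_dir = %r" % path


line = notebook_dir_setting("H:\\JupyterNotebooks")
print(line)
```

The printed line can then be pasted into jupyter_notebook_config.py in place of the commented-out c.NotebookApp.notebook_dir entry.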
Jupyter notebook - Change Default Browser
If you want to change the browser that Jupyter opens:
• Open (once again) the file jupyter_notebook_config.py
• Replace the line "#c.NotebookApp.browser" with the following code
• For Chrome:
import webbrowser
webbrowser.register('chrome', None, webbrowser.GenericBrowser('C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe'))
c.NotebookApp.browser = 'chrome'
• For Firefox:
import webbrowser
webbrowser.register('firefox', None, webbrowser.GenericBrowser('C:\\Program Files (x86)\\Mozilla Firefox\\firefox.exe'))
c.NotebookApp.browser = 'firefox'
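The browser paths above depend on where the browser is installed on your machine. As a hedged sketch, the helper below (the name register_browser is hypothetical) registers a browser only when the executable actually exists, which makes a wrong path easier to spot. It can be run on its own, outside the Jupyter config file.

```python
import os
import sys
import webbrowser


def register_browser(name, exe_path):
    """Register exe_path under name, but only if the file exists."""
    if not os.path.isfile(exe_path):
        return False
    webbrowser.register(name, None, webbrowser.GenericBrowser(exe_path))
    return True


# The Chrome path from the slide; adjust it to your own installation.
chrome_path = "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe"
if register_browser("chrome", chrome_path):
    print("chrome registered")
else:
    print("chrome executable not found at", chrome_path)
```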
Jupyter extensions
https://github.com/ipython-contrib/jupyter_contrib_nbextensions
In the Anaconda Prompt, run:
pip install jupyter_contrib_nbextensions
jupyter nbextensions_configurator enable --user
jupyter contrib nbextension install --user
Then, in Jupyter, select the options CodeFolding, Collapsible Headings and Table of Contents (2).
Jupyter extensions - Table of Contents (2)
Under the Nbextensions tab of the Jupyter notebook, you will also need to configure "Table of
Contents (2)":
• Increase the maximum level of nested sections to 5
• Skip H1 headings
• Add a Table of Contents cell at the top of the notebooks
• Display the toc window/sidebar at startup
PyCharm Community (https://www.jetbrains.com/pycharm/download/#section=windows)
After installing, you might want to have a look at a series of videos available on YouTube (search
for "Getting Started with PyCharm").
I also suggest having a look at the following link, which discusses:
- Choosing an interpreter
- Creating a virtual environment
- Creating a Python file
https://www.jetbrains.com/help/pycharm/creating-and-running-your-first-python-project.html
You can also run your own code online in a Jupyter / Docker environment with Binder (https://mybinder.org/).
• http://www2.ic.uff.br/~vanessa/
• https://www.programiz.com/python-programming
• https://pypi.python.org/pypi
• http://www.openbookproject.net/thinkcs/python/english2e/
• http://mcsp.wartburg.edu/zelle/python/
• https://www.codecademy.com/en/tracks/python
• https://automatetheboringstuff.com/
• https://www.coursera.org/learn/python
• https://developers.google.com/edu/python/?csw=1
• https://docs.python.org/3.3/library/index.html
• http://www.nltk.org/book/
• How to Think Like a Computer Scientist: Interactive Edition
http://interactivepython.org/runestone/static/thinkcspy/index.html
• Problem Solving with Algorithms and Data Structures using Python: Interactive Edition
http://interactivepython.org/runestone/static/pythonds/index.html
• Programs, Information and People: Interactive Edition
http://interactivepython.org/runestone/static/pip2/toc.html#
• Introduction to Programming Using Python, by Y. Daniel Liang
• Python for Informatics – Exploring Information, by Charles Severance
• Python for Everybody, by Charles Severance (http://do1.dr-chuck.com/pythonlearn/EN_us/pythonlearn.pdf)
• Programação em Python - Fundamentos e Resolução de Problemas by Ernesto Costa
• Think Python (free book: http://greenteapress.com/wp/think-python/)