data science tool set numpy & scipy · 2/4/2019  · scipy crash course functions scipy...

DATA SCIENCE TOOL SET NUMPY & SCIPY


Post on 23-Mar-2020


AGENDA

Toolsets for Python

NumPy
SciPy

NUMPY CRASH COURSE

Prepared from: https://docs.scipy.org/doc/numpy/user/quickstart.html

NumPy’s main object is the homogeneous multidimensional array.

It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.

In NumPy dimensions are called axes.

NumPy’s array class is called ndarray.
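As a rough illustration of these points (the values are made up for the example), mixing integers and a float in one array yields a single common element type, and a nested list gives an array with two axes:

```python
import numpy as np

# Mixed ints and a float are stored with one common dtype (float64).
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# A nested list produces an array with two axes: 2 rows, 3 columns.
table = np.array([[1, 2, 3], [4, 5, 6]])
print(table.ndim)         # 2
print(table.shape)        # (2, 3)
print(table[1, 2])        # 6, indexed by a tuple of integers starting at 0
print(type(table))        # <class 'numpy.ndarray'>
```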

BASICS

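ndarray.ndim: the number of axes (dimensions) of the array
ndarray.shape: a tuple of integers giving the size of the array along each axis
ndarray.size: the total number of elements
ndarray.dtype: an object describing the type of the elements
ndarray.itemsize: the size in bytes of each element
ndarray.data: the buffer containing the actual elements of the array

    import numpy as np
    ndarray = np.arange(15).reshape(3, 5)
    ndarray

    print(ndarray.ndim)      # 2
    print(ndarray.shape)     # (3, 5)
    print(ndarray.size)      # 15
    print(ndarray.dtype)     # an integer type, e.g. int64 (platform dependent)
    print(ndarray.itemsize)  # bytes per element, e.g. 8 for int64
    print(ndarray.data)      # the underlying memory buffer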

ARRAY CREATION

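    import numpy as np

    a = np.array([2, 3, 4])                        # from a list
    print(a)
    b = np.array([(1.5, 2, 3), (4, 5, 6)])         # a sequence of sequences gives a 2-D array
    print(b)
    c = np.array([[1, 2], [3, 4]], dtype=complex)  # element type given explicitly
    print(c)

    np.zeros((3, 4))                     # array of zeros
    np.ones((2, 3, 4), dtype=np.int16)   # array of ones with a given dtype
    np.empty((2, 3))                     # uninitialized; contents are arbitrary
    b = np.random.random((2, 3))         # random floats in [0, 1)

    np.arange(10, 30, 5)     # array([10, 15, 20, 25])
    np.arange(0, 2, 0.3)     # float step: 0.0, 0.3, ..., 1.8
    np.linspace(0, 2, 9)     # 9 numbers from 0 to 2, both ends included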
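The last two calls differ in how the result length is decided. A small sketch of that difference, using only the calls shown above (the variable names are illustrative):

```python
import numpy as np

# arange with a float step: the element count follows from the step size,
# so it can be hard to predict because of floating-point rounding.
stepped = np.arange(0, 2, 0.3)
print(stepped)            # 0.0, 0.3, ... up to (not including) 2
print(len(stepped))       # 7

# linspace: ask for the number of points; both endpoints are included.
spaced = np.linspace(0, 2, 9)
print(spaced)             # 9 values from 0.0 to 2.0 in steps of 0.25
print(len(spaced))        # 9
```

When a fixed number of evenly spaced floats is needed, linspace is usually the safer choice.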

    import numpy as np
    ndarray = np.arange(15).reshape(3, 5)
    x = np.zeros_like(ndarray)   # zeros with the same shape and dtype as ndarray
    x

np.ones_like
np.empty_like
np.random.rand(3,2): Random values in a given shape from a uniform distribution over [0, 1)

BASIC OPERATIONS

Unary operations
    a.sum()
    a.min()
    a.max()

Binary operations
    A * B       # elementwise product
    A @ B       # matrix product
    A.dot(B)    # another matrix product

    import numpy as np
    A = np.array([[1, 1], [0, 1]])
    B = np.array([[2, 0], [3, 4]])
    A * B       # array([[2, 0], [0, 4]])
    A @ B       # array([[5, 4], [3, 4]])

Axis parameter

    import numpy as np
    b = np.arange(12).reshape(3, 4)
    b
    b.sum(axis=0)       # sum of each column: array([12, 15, 18, 21])
    b.min(axis=1)       # min of each row: array([0, 4, 8])
    b.cumsum(axis=1)    # cumulative sum along each row

UNIVERSAL FUNCTIONS

add
average
round
sort
sqrt
ceil
exp
floor
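A short sketch of how these functions behave on arrays (the input values are made up). Strictly speaking, add, sqrt, ceil, exp and floor are ufuncs that operate element by element, while average, round and sort are ordinary NumPy functions:

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([0.5, 1.5, 2.5])

print(np.sqrt(x))            # [1. 2. 3.]
print(np.add(x, y))          # same result as x + y
print(np.floor(y))           # [0. 1. 2.]
print(np.ceil(y))            # [1. 2. 3.]
print(np.exp(np.zeros(3)))   # [1. 1. 1.]

# Ordinary functions that work on the array as a whole.
print(np.average(x))                   # mean of 1, 4 and 9
print(np.sort(np.array([3, 1, 2])))    # [1 2 3]

# The ufunc type makes the distinction visible.
print(isinstance(np.sqrt, np.ufunc))   # True
print(isinstance(np.sort, np.ufunc))   # False
```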

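max: Return the largest item in an iterable or the largest of two or more arguments.

maximum: Element-wise maximum of array elements.

min: Return the smallest item in an iterable or the smallest of two or more arguments.

minimum: Element-wise minimum of array elements.

nonzero: Return the indices of the elements that are non-zero.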
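The difference between max and maximum is easy to mix up. A minimal sketch with made-up arrays, contrasting Python's built-in max, the array method max, and the element-wise np.maximum:

```python
import numpy as np

a = np.array([1, 7, 0, 3])
b = np.array([4, 2, 0, 9])

print(max(3, 8))          # 8, Python built-in on plain arguments
print(a.max())            # 7, largest element of one array
print(np.maximum(a, b))   # [4 7 0 9], element-wise comparison of two arrays
print(np.minimum(a, b))   # [1 2 0 3]

idx = np.nonzero(a)       # tuple with one index array per axis
print(idx)                # indices 0, 1 and 3
print(a[idx])             # [1 7 3]
```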

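mean: Compute the arithmetic mean along the specified axis.

median: Compute the median along the specified axis.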

corrcoef: Return Pearson product-moment correlation coefficients.

cov: Estimate a covariance matrix, given data and weights.

std: Compute the standard deviation along the specified axis.

var: Compute the variance along the specified axis.
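These statistics functions can be tried on a tiny data set. The sketch below uses made-up values where y is exactly twice x, so the correlation is 1:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # exactly 2 * x, so perfectly correlated

print(np.mean(x))         # 2.5
print(np.median(x))       # 2.5
print(np.var(x))          # 1.25 (population variance, ddof=0 by default)
print(np.std(x))          # square root of the variance
print(np.var(x, ddof=1))  # sample variance, divides by n - 1

print(np.cov(x, y))       # 2x2 covariance matrix (divides by n - 1 by default)
print(np.corrcoef(x, y))  # 2x2 matrix; off-diagonal entries are 1.0 here
```

Note the different defaults: var and std divide by n, while cov divides by n - 1.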

all: Test whether all array elements along a given axis evaluate to True.
any
argmax: Returns the indices of the maximum values along an axis.
argmin
argsort
sum: Sum of array elements over a given axis.
trace: Return the sum along diagonals of the array.
transpose
where: Return elements chosen from x or y depending on condition.
re: Regular expression operations (a Python standard library module, not part of NumPy)
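A compact sketch of the NumPy functions in this list, applied to a small made-up matrix:

```python
import numpy as np

m = np.array([[3, 0, 5],
              [1, 4, 2]])

print(np.all(m > 0))           # False, one element is 0
print(np.any(m > 4))           # True
print(np.argmax(m, axis=1))    # [2 1], column of the maximum in each row
print(np.argmin(m, axis=0))    # [1 0 1], row of the minimum in each column
print(np.argsort(m[0]))        # [1 0 2], indices that would sort [3 0 5]
print(np.sum(m, axis=0))       # [4 4 7]
print(np.trace(m))             # 7, i.e. m[0, 0] + m[1, 1]
print(np.transpose(m).shape)   # (3, 2)
print(np.where(m > 2, m, -1))  # keeps values > 2, replaces the rest with -1
```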

INDEXING, SLICING AND ITERATING

    import numpy as np

    def f(x, y):
        return 10 * x + y

    b = np.fromfunction(f, (5, 4), dtype=int)   # b[i, j] = 10*i + j
    b
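The array built this way is convenient for trying out indexing, because each value encodes its own row and column. A sketch along the lines of the NumPy quickstart:

```python
import numpy as np

def f(x, y):
    return 10 * x + y

b = np.fromfunction(f, (5, 4), dtype=int)

print(b[2, 3])        # 23, row 2, column 3
print(b[0:5, 1])      # [ 1 11 21 31 41], column 1 of every row
print(b[:, 1])        # same as the previous line
print(b[1:3, :])      # rows 1 and 2, all columns
print(b[-1])          # [40 41 42 43], last row; missing indices mean ':'
```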

List to Array

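    from numpy import array
    # list of data
    data = [11, 22, 33, 44, 55]
    # array of data
    data = array(data)
    print(data)
    print(type(data))

    import numpy as np
    a = np.arange(10, 1, -2)       # array([10, 8, 6, 4, 2])
    a[np.array([3, 1, 2])]         # index array: array([4, 8, 6])
    a[np.array([1, 3, -2])]        # negative indices count from the end: array([8, 4, 4])
    a[slice(1, 4, 2)]              # array([8, 4])
    a[1:4:2]                       # same as the slice object above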

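For an array x with 5 axes, the dots (...) stand for as many colons as needed to complete the index:
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].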
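The same rule can be checked on a smaller array. This sketch assumes a made-up array with 3 axes rather than 5:

```python
import numpy as np

# A made-up array with 3 axes, shape (2, 3, 4).
x = np.arange(24).reshape(2, 3, 4)

print(x[1, ...].shape)    # (3, 4), same as x[1, :, :]
print(x[..., 2].shape)    # (2, 3), same as x[:, :, 2]
print(np.array_equal(x[1, ...], x[1, :, :]))   # True
print(np.array_equal(x[..., 2], x[:, :, 2]))   # True
```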

    import numpy as np
    a = np.array([[1, 2], [3, 4]])
    a

    for row in a:                 # iterates over the first axis
        print(row)

    for element in a.flat:        # iterates over every element
        print(element)

SHAPE MANIPULATION

reshape
resize

    import numpy as np
    a = np.floor(10 * np.random.random((3, 2)))   # 3x2 array of whole numbers from 0 to 9
    a
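The two differ in whether the original array is modified. A minimal sketch, using fixed values instead of random ones so the shapes are easy to follow (the variable names are illustrative):

```python
import numpy as np

a = np.arange(12.0)       # 12 elements, shape (12,)

b = a.reshape(3, 4)       # returns a new array object; a is unchanged
print(a.shape)            # (12,)
print(b.shape)            # (3, 4)

c = a.reshape(2, -1)      # -1 lets NumPy work out the remaining dimension
print(c.shape)            # (2, 6)

d = np.arange(12.0)
result = d.resize((4, 3)) # modifies d itself and returns None
print(d.shape)            # (4, 3)
print(result)             # None
```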

STACKING

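vstack
hstack

    import numpy as np
    a = np.floor(10 * np.random.random((2, 2)))
    a
    b = np.floor(10 * np.random.random((2, 2)))
    b
    np.vstack((a, b))    # stack vertically: result has shape (4, 2)
    np.hstack((a, b))    # stack horizontally: result has shape (2, 4)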

COPIES AND VIEWS

No Copy at All: Simple assignments make no copy of array objects or of their data.

View or Shallow Copy: The view method creates a new array object that looks at the same data.

Deep Copy: The copy method makes a complete copy of the array and its data.

No Copy at All
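The transcript carries no code for this case, so the following is only a sketch of what "no copy" means in practice: assignment and function calls pass the same array object around.

```python
import numpy as np

a = np.arange(12)
b = a                 # no new object is created
print(b is a)         # True, a and b are two names for the same ndarray

b[0] = 99             # a change made through b ...
print(a[0])           # 99, ... is visible through a

def same_object(x):
    # Function calls also pass the array itself, not a copy.
    return id(x)

print(same_object(a) == id(a))   # True
```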

View or Shallow Copy

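    import numpy as np
    a = np.arange(12).reshape(3, 4).copy()   # copy() so that a owns its data
    c = a.view()
    c is a              # False
    c.base is a         # True: c is a view of the data owned by a
    c.shape = 2, 6      # a's shape doesn't change
    a.shape             # (3, 4)
    c[0, 4] = 1234      # a's data changes: a[1, 0] is now 1234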

Deep Copy

    import numpy as np
    a = np.arange(12).reshape(3, 4)
    d = a.copy()    # a new array object with new data is created
    d is a          # False
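One way to tell views from copies is np.shares_memory. The sketch below is an illustration rather than part of the original slides; it also shows that slicing returns a view:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

v = a.view()          # shallow copy: new object, same data
s = a[:, 1:3]         # slicing also returns a view
d = a.copy()          # deep copy: new object, new data

print(np.shares_memory(a, v))   # True
print(np.shares_memory(a, s))   # True
print(np.shares_memory(a, d))   # False

d[0, 0] = -1          # changing the copy leaves a untouched
print(a[0, 0])        # 0
s[0, 0] = 100         # changing a view changes a
print(a[0, 1])        # 100
```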

SCIPY CRASH COURSE

INTRODUCTION

Prepared from: https://docs.scipy.org/doc/scipy/reference/tutorial/index.html

SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python.

SciPy is organized into subpackages covering differentscientific computing domains.

SCIPY ORGANIZATION

Sub-package    Description
cluster        Clustering algorithms
constants      Physical and mathematical constants
fftpack        Fast Fourier Transform routines
integrate      Integration and ordinary differential equation solvers
interpolate    Interpolation and smoothing splines
io             Input and Output
linalg         Linear algebra
ndimage        N-dimensional image processing
odr            Orthogonal distance regression
optimize       Optimization and root-finding routines
signal         Signal processing
sparse         Sparse matrices and associated routines
spatial        Spatial data structures and algorithms
special        Special functions
stats          Statistical distributions and functions

FUNCTIONS

Scipy sub-packages need to be imported separately.

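The top level of scipy also contains functions from numpy and numpy.lib.scimath, but it is better to use them directly from the numpy module. Later SciPy releases deprecate these top-level aliases.

    from scipy import linalg, optimize

    import numpy as np
    np.some_function()                # placeholder name for any NumPy function

    from scipy import some_module     # placeholder name for a sub-package
    some_module.some_function()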
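To make the import pattern concrete, here is a minimal sketch with real sub-packages. The particular functions (linalg.det and optimize.brentq) are chosen only for illustration:

```python
import numpy as np
from scipy import linalg, optimize

# linalg: determinant of a small matrix.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
det_A = linalg.det(A)
print(det_A)              # approximately -2.0

# optimize: root of x**2 - 2 on the interval [0, 2], i.e. sqrt(2).
root = optimize.brentq(lambda x: x**2 - 2, 0, 2)
print(root)               # approximately 1.41421356

# Elementary functions come from NumPy itself.
print(np.sqrt(2.0))
```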

BASIC FUNCTIONS

np.cast['f'](np.pi) (np.cast was removed in NumPy 2.0; np.asarray(np.pi, dtype='f') is the replacement)
np.r_
np.c_
np.select
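A brief sketch of what each of these helpers does, with made-up inputs. Because np.cast is no longer available in NumPy 2.0, the cast is written here with a dtype constructor instead:

```python
import numpy as np

# Cast pi to single precision (stands in for the older np.cast['f']).
pi32 = np.float32(np.pi)
print(pi32.dtype)                 # float32

# np.r_ concatenates slices and values along the first axis.
row = np.r_[0:3, 10, 11]
print(row)                        # [ 0  1  2 10 11]

# np.c_ stacks 1-D arrays as columns.
cols = np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]
print(cols.shape)                 # (3, 2)

# np.select picks from choicelist according to the first true condition.
x = np.arange(6)
picked = np.select([x < 2, x > 3], [x, x**2], default=-1)
print(picked)                     # [ 0  1 -1 -1 16 25]
```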

POLYNOMIALS

poly1d class
    Accepts coefficients or polynomial roots to initialize a polynomial

Operations
    Algebraic expressions
    Integration
    Differentiation
    Evaluation

    from numpy import poly1d
    p = poly1d([3, 4, 5])    # 3x^2 + 4x + 5
    print(p)

    print(p * p)             # 9x^4 + 24x^3 + 46x^2 + 40x + 25
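The remaining operations in the list (integration, differentiation, evaluation, and initialization from roots) can be sketched with the same polynomial:

```python
import numpy as np

p = np.poly1d([3, 4, 5])          # 3x^2 + 4x + 5

print(p.deriv())                  # derivative: 6x + 4
print(p.integ(k=6))               # antiderivative with constant 6: x^3 + 2x^2 + 5x + 6
print(p(2))                       # evaluation at x = 2 gives 25
print(p([4, 5]))                  # evaluation at several points: [ 69 100]

# Second argument True: the list is read as roots instead of coefficients.
q = np.poly1d([1, 2], True)       # (x - 1)(x - 2) = x^2 - 3x + 2
print(q.coeffs)
```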

LINEAR ALGEBRA

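scipy.linalg contains all the functions in numpy.linalg, plus some other advanced functions.

It is always compiled with BLAS/LAPACK support, so it can be very fast.

    import numpy as np
    from scipy import linalg
    A = np.array([[1, 2], [3, 4]])
    A

    linalg.inv(A)    # array([[-2. ,  1. ], [ 1.5, -0.5]])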

Finding Inverse

Solving linear system

    import numpy as np
    from scipy import linalg
    A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
    A
    b = np.array([[10], [8], [3]])
    b
    linalg.inv(A).dot(b)     # slow: computes the full inverse first
    np.linalg.solve(A, b)    # fast: solves the system directly
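Both approaches should give the same answer. The sketch below checks that, using scipy.linalg.solve in place of the NumPy call, and verifies the solution against the original system:

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
b = np.array([[10], [8], [3]])

x_inv = linalg.inv(A).dot(b)      # forms the inverse explicitly
x_solve = linalg.solve(A, b)      # solves the system directly

print(x_solve)                        # approximately [[-9.28], [5.16], [0.76]]
print(np.allclose(x_inv, x_solve))    # True, both give the same solution
print(np.allclose(A.dot(x_solve), b)) # True, the solution satisfies A x = b
```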

Finding Determinant: linalg.det
Matrix Exponential: linalg.expm
Matrix Logarithm: linalg.logm
Trigonometric functions: linalg.sinm, linalg.cosm, and linalg.tanm
Hyperbolic trigonometric functions: linalg.sinhm, linalg.coshm, and linalg.tanhm

Special matrices:
    block diagonal: scipy.linalg.block_diag
    circulant: scipy.linalg.circulant
    Pascal: scipy.linalg.pascal
    …

    from scipy.linalg import pascal
    pascal(4)                  # symmetric Pascal matrix, 4x4
    pascal(4, kind='lower')    # lower-triangular Pascal matrix
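A few of the other functions named on this slide, tried on small made-up inputs (a sketch, not an exhaustive tour):

```python
import numpy as np
from scipy import linalg

A = np.array([[1.0, 2.0], [3.0, 4.0]])
print(linalg.det(A))                      # approximately -2.0

# The exponential of the zero matrix is the identity matrix.
E = linalg.expm(np.zeros((2, 2)))
print(E)

# Two blocks placed on the diagonal of a 3x3 matrix, zeros elsewhere.
D = linalg.block_diag([[1, 2], [3, 4]], [[5]])
print(D)

# Circulant matrix whose first column is [1, 2, 3].
C = linalg.circulant([1, 2, 3])
print(C)
```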

EXPLORE PACKAGES

scipy.stats
scipy.ndimage
scipy.spatial
scipy.signal