DATA SCIENCE TOOL SET
NUMPY & SCIPY
AGENDA
Toolsets for Python
NumPy SciPy
NUMPY CRASH COURSE
Prepared from: https://docs.scipy.org/doc/numpy/user/quickstart.html
NumPy’s main object is the homogeneous multidimensional array.
It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
In NumPy dimensions are called axes.
NumPy’s array class is called ndarray.
BASICS
ndarray.ndim
ndarray.shape
ndarray.size
ndarray.dtype
ndarray.itemsize
ndarray.data
import numpy as np
ndarray = np.arange(15).reshape(3, 5)
ndarray
print(ndarray.ndim)
print(ndarray.shape)
print(ndarray.size)
print(ndarray.dtype)
print(ndarray.itemsize)
print(ndarray.data)
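As a minimal sketch of what the attributes above report for the 3 x 5 example array (the name `arr` is chosen here for illustration; the exact dtype and itemsize depend on the platform's default integer type):

```python
import numpy as np

arr = np.arange(15).reshape(3, 5)

print(arr.ndim)      # number of axes: 2
print(arr.shape)     # size along each axis: (3, 5)
print(arr.size)      # total number of elements: 15
print(arr.dtype)     # element type, e.g. int64 (platform dependent)
print(arr.itemsize)  # bytes per element, e.g. 8 for int64
print(arr.nbytes == arr.size * arr.itemsize)  # True
```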
ARRAY CREATION
import numpy as np
a = np.array([2, 3, 4])
print(a)
b = np.array([(1.5, 2, 3), (4, 5, 6)])
print(b)
c = np.array([[1, 2], [3, 4]], dtype=complex)
print(c)
np.zeros((3, 4))
np.ones((2, 3, 4), dtype=np.int16)
np.empty((2, 3))
b = np.random.random((2, 3))
np.arange( 10, 30, 5 )
np.arange( 0, 2, 0.3 )
np.linspace( 0, 2, 9)
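The difference between the two range constructors can be sketched as follows: arange takes a step and excludes the stop value, while linspace takes the number of points and includes it. The variable names are illustrative only.

```python
import numpy as np

# arange takes a step; the stop value is excluded.
steps = np.arange(10, 30, 5)
print(steps)                 # [10 15 20 25]

# With a float step the element count is harder to predict,
# which is why linspace is usually preferred for floats.
float_steps = np.arange(0, 2, 0.3)
print(len(float_steps))

# linspace takes the number of points; the stop value is included.
points = np.linspace(0, 2, 9)
print(points)                # 9 evenly spaced values from 0.0 to 2.0
```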
import numpy as np
ndarray = np.arange(15).reshape(3, 5)
x = np.zeros_like(ndarray)
x
np.ones_like
np.empty_like
np.random.rand(3,2): Random values in a given shape from a uniform distribution over [0, 1)
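A short sketch of the `*_like` constructors and `np.random.rand`, assuming a small integer template array (`template` and the other names are illustrative):

```python
import numpy as np

template = np.arange(15).reshape(3, 5)

zeros = np.zeros_like(template)    # same shape and dtype, filled with 0
ones = np.ones_like(template)      # same shape and dtype, filled with 1
scratch = np.empty_like(template)  # same shape and dtype, contents arbitrary

samples = np.random.rand(3, 2)     # uniform samples in [0, 1)

print(zeros.shape, zeros.dtype == template.dtype)
print(ones.sum())                  # 15
print(samples.shape)               # (3, 2)
```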
BASIC OPERATIONS
Unary operations
a.sum()
a.min()
a.max()
Binary operations
A * B # elementwise product
A @ B # matrix product
A.dot(B) # another matrix product
A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])
A * B
A @ B
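The two kinds of product give different results for the same pair of matrices, as this small sketch shows:

```python
import numpy as np

A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [3, 4]])

elementwise = A * B   # multiplies matching entries
matrix = A @ B        # row-by-column matrix product

print(elementwise)    # [[2 0]
                      #  [0 4]]
print(matrix)         # [[5 4]
                      #  [3 4]]
print(np.array_equal(A.dot(B), matrix))  # True: dot matches @ for 2-D arrays
```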
Axis parameter
b = np.arange(12).reshape(3, 4)
b
b.sum(axis=0) # sum of each column
b.min(axis=1) # min of each row
b.cumsum(axis=1) # cumulative sum along each row
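A worked version of the axis examples: axis=0 collapses the rows (one result per column), axis=1 collapses the columns (one result per row).

```python
import numpy as np

b = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

col_sums = b.sum(axis=0)     # [12 15 18 21]
row_mins = b.min(axis=1)     # [0 4 8]
row_cumsum = b.cumsum(axis=1)
# [[ 0  1  3  6]
#  [ 4  9 15 22]
#  [ 8 17 27 38]]

print(col_sums, row_mins)
print(row_cumsum)
```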
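UNIVERSAL FUNCTIONS
add
average
round
sort
sqrt
ceil
exp
floor
Of these, add, sqrt, ceil, exp and floor are universal functions (ufuncs) in the strict sense; average, round and sort are ordinary NumPy functions that operate on whole arrays.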
max: Python built-in. Returns the largest item in an iterable or the largest of two or more arguments.
maximum: Element-wise maximum of array elements.
min
minimum
nonzero: Return the indices of the elements that are non-zero.
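The distinction between a reduction (max) and an element-wise comparison (maximum) can be sketched like this; the arrays `x` and `y` are made up for illustration:

```python
import numpy as np

x = np.array([1, 5, 0, 3])
y = np.array([4, 2, 0, 7])

largest = max(x)               # built-in max over an iterable: 5
largest_np = x.max()           # NumPy reduction, same value: 5
pairwise_max = np.maximum(x, y)  # element-wise: [4 5 0 7]
pairwise_min = np.minimum(x, y)  # element-wise: [1 2 0 3]
nz = np.nonzero(x)             # tuple with one index array per axis

print(largest, largest_np)
print(pairwise_max, pairwise_min)
print(nz[0])                   # indices of non-zero entries: [0 1 3]
```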
mean
median
corrcoef: Return Pearson product-moment correlation coefficients.
cov: Estimate a covariance matrix, given data and weights.
std: Compute the standard deviation along the specified axis.
var: Compute the variance along the specified axis.
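A minimal sketch of the statistics functions on a tiny made-up data set. Note that var and std default to the population form (ddof=0), while cov defaults to the sample form (ddof=1).

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 * x + 1.0          # perfectly linearly related to x

print(np.mean(x))          # 3.0
print(np.median(x))        # 3.0
print(np.var(x))           # population variance: 2.0
print(np.std(x))           # square root of the variance

cov_xy = np.cov(x, y)      # 2 x 2 covariance matrix
corr_xy = np.corrcoef(x, y)  # 2 x 2 correlation matrix, all entries 1 here
print(cov_xy)
print(corr_xy)
```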
all: Test whether all array elements along a given axis evaluate to True.
any
argmax: Returns the indices of the maximum values along an axis.
argmin
argsort
sum: Sum of array elements over a given axis.
trace: Return the sum along diagonals of the array.
transpose
where: Return elements chosen from x or y depending on condition.
re: Regular expression operations (a Python standard library module, not part of NumPy)
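Several of the functions listed above, applied to one small made-up matrix `m`:

```python
import numpy as np

m = np.array([[3, 7, 1],
              [9, 2, 5],
              [4, 8, 6]])

print(np.all(m > 0))       # True: every element is positive
print(np.any(m > 8))       # True: at least one element exceeds 8

flat_argmax = m.argmax()         # index into the flattened array: 3
row_argmax = m.argmax(axis=1)    # position of the maximum in each row: [1 0 1]
order = np.argsort(m[0])         # indices that sort the first row: [2 0 1]
diag_sum = np.trace(m)           # 3 + 2 + 6 = 11
first_col = m.T[0]               # first column of m: [3 9 4]
kept = np.where(m > 4, m, 0)     # keep values above 4, replace the rest with 0

print(flat_argmax, row_argmax, order, diag_sum, first_col)
print(kept)
```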
INDEXING, SLICING AND ITERATING
def f(x, y):
    return 10*x + y
b = np.fromfunction(f, (5, 4), dtype=int)
b
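With the array built by fromfunction, each entry equals 10 * row + column, which makes the effect of indexing and slicing easy to read off. A short sketch:

```python
import numpy as np

def f(x, y):
    return 10 * x + y

b = np.fromfunction(f, (5, 4), dtype=int)

print(b[2, 3])      # row 2, column 3: 23
print(b[0:5, 1])    # column 1 of every row: [ 1 11 21 31 41]
print(b[:, 1])      # same as the line above
print(b[1:3, :])    # rows 1 and 2, all columns
print(b[-1])        # last row: [40 41 42 43]
```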
List to Array
from numpy import array
# list of data
data = [11, 22, 33, 44, 55]
# array of data
data = array(data)
print(data)
print(type(data))
a = np.arange(10, 1, -2)
a[np.array([3, 1, 2 ])]
a[np.array([1, 3, -2])]
a[slice(1,4,2)]
a[1:4:2]
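The indexing expressions above can be checked with the sketch below. It also illustrates one practical difference: indexing with an integer array returns a copy, while slicing returns a view onto the same data.

```python
import numpy as np

a = np.arange(10, 1, -2)            # [10  8  6  4  2]

picked = a[np.array([3, 1, 2])]     # index array: [4 8 6]
wrapped = a[np.array([1, 3, -2])]   # negative indices count from the end: [8 4 4]
sliced = a[1:4:2]                   # same as a[slice(1, 4, 2)]: [8 4]

print(picked, wrapped, sliced)
print(np.shares_memory(a, picked))  # False: index arrays return a copy
print(np.shares_memory(a, sliced))  # True: slices return a view
```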
For an array x with 5 axes:
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3], and
x[4,...,5,:] to x[4,:,:,5,:].
a = np.array([[1, 2], [3, 4]])
a
for row in a:
    print(row)
for element in a.flat:
    print(element)
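The ellipsis equivalences can be verified on any array with 5 axes. The shape below is an arbitrary choice, picked only so that every index used is in range.

```python
import numpy as np

x = np.arange(5 * 3 * 2 * 6 * 4).reshape(5, 3, 2, 6, 4)  # 5 axes

same_1 = np.array_equal(x[1, 2, ...], x[1, 2, :, :, :])
same_2 = np.array_equal(x[..., 3], x[:, :, :, :, 3])
same_3 = np.array_equal(x[4, ..., 5, :], x[4, :, :, 5, :])

print(same_1, same_2, same_3)   # True True True
print(x[1, 2, ...].shape)       # (2, 6, 4)
```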
SHAPE MANIPULATION
reshape
resize
a = np.floor(10*np.random.random((3, 2)))
a
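A sketch of the difference between the two methods: reshape returns an array with the new shape and leaves the original alone, while the resize method changes the array itself. Deterministic values are used here instead of random ones, and `refcheck=False` is an assumption added so the sketch also runs in interactive sessions that hold extra references to the array.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)

b = a.reshape(2, 3)        # returns a reshaped array; a keeps its shape
print(a.shape, b.shape)    # (3, 2) (2, 3)
print(a.ravel())           # flattened: [0 1 2 3 4 5]
print(a.T.shape)           # transpose: (2, 3)

c = np.arange(6)
c.resize((2, 3), refcheck=False)  # modifies c in place, returns None
print(c.shape)             # (2, 3)
```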
STACKING
vstack
hstack
a = np.floor(10*np.random.random((2, 2)))
a
b = np.floor(10*np.random.random((2, 2)))
b
np.vstack((a,b))
np.hstack((a,b))
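Because the slide arrays are random, the effect of stacking is easier to see with fixed values. A minimal sketch with two made-up 2 x 2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack((a, b))   # b is placed below a: shape (4, 2)
h = np.hstack((a, b))   # b is placed to the right of a: shape (2, 4)

print(v)
print(h)
```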
COPIES AND VIEWS
No Copy at All: Simple assignments make no copy of array objects or of their data.
View or Shallow Copy: The view method creates a new array object that looks at the same data.
Deep Copy: The copy method makes a complete copy of the array and its data.
COPIES AND VIEWS: No Copy at All
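A minimal sketch of the no-copy case: plain assignment and function calls only create another reference to the same array object. The helper `same_object` is hypothetical, defined here for illustration.

```python
import numpy as np

a = np.arange(12)
b = a               # no new object is created
print(b is a)       # True: a and b are two names for the same ndarray

b[0] = 99           # writing through b ...
print(a[0])         # ... is visible through a: 99

def same_object(x):
    # Arguments are passed as references, so no copy is made in the call.
    return x is a

print(same_object(a))  # True
```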
COPIES AND VIEWS: View or Shallow Copy
a = np.arange(12).reshape(3, 4).copy() # copy so that a owns its data
c = a.view()
c is a
c.base is a # c is a view of the data owned by a
c.shape = 2,6 # a's shape doesn't change
a.shape
c[0,4] = 1234 # a's data changes
COPIES AND VIEWS: Deep Copy
a = np.arange(12).reshape(3, 4)
d = a.copy() # a new array object with new data is created
d is a
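The contrast between a view and a deep copy can be sketched side by side. The array is built with np.array so that `a` owns its data, which is what makes the `base` check meaningful; the values are illustrative.

```python
import numpy as np

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

v = a.view()        # new array object, same data
d = a.copy()        # new array object, new data

v[0, 0] = 1234      # visible in a
d[0, 1] = -1        # not visible in a

print(v.base is a)  # True: v looks at a's data
print(d.base is a)  # False: d has its own data
print(a[0, 0], a[0, 1])                                # 1234 1
print(np.shares_memory(a, v), np.shares_memory(a, d))  # True False
```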
SCIPY CRASH COURSE
INTRODUCTION
Prepared from: https://docs.scipy.org/doc/scipy/reference/tutorial/index.html
SciPy is a collection of mathematical algorithms and convenience functions built on the NumPy extension of Python.
SciPy is organized into subpackages covering different scientific computing domains.
SCIPY ORGANIZATION
Sub-package    Description
cluster        Clustering algorithms
constants      Physical and mathematical constants
fftpack        Fast Fourier Transform routines
integrate      Integration and ordinary differential equation solvers
interpolate    Interpolation and smoothing splines
io             Input and Output
linalg         Linear algebra
ndimage        N-dimensional image processing
odr            Orthogonal distance regression
optimize       Optimization and root-finding routines
signal         Signal processing
sparse         Sparse matrices and associated routines
spatial        Spatial data structures and algorithms
special        Special functions
stats          Statistical distributions and functions
FUNCTIONS
SciPy sub-packages need to be imported separately.
Older SciPy releases also exposed functions from numpy and numpy.lib.scimath at the top level of scipy; these aliases are deprecated, so call NumPy directly.
from scipy import linalg, optimize
import numpy as np
np.some_function()
from scipy import some_module
some_module.some_function()
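As a concrete instance of the import pattern above, here is a small sketch that imports two sub-packages and calls one function from each. The choice of `optimize.brentq` and `linalg.norm` is illustrative, not something the slides prescribe.

```python
import numpy as np
from scipy import linalg, optimize

# Root of x^2 - 2 on the interval [0, 2], i.e. the square root of 2.
root = optimize.brentq(lambda x: x**2 - 2.0, 0.0, 2.0)

# Euclidean norm of the vector (3, 4).
length = linalg.norm(np.array([3.0, 4.0]))

print(root)
print(length)   # 5.0
```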
BASIC FUNCTIONS
np.cast['f'](np.pi) (np.cast was removed in NumPy 2.0; np.float32(np.pi) gives the same single-precision value)
np.r_
np.c_
np.select
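A short sketch of the helpers listed above. The cast uses `np.float32` rather than `np.cast`, on the assumption that the code should also run on NumPy 2.x; all variable names are illustrative.

```python
import numpy as np

pi32 = np.float32(np.pi)          # explicit cast to single precision

row = np.r_[1:4, 0, 10]           # concatenate a range and scalars: [ 1  2  3  0 10]
cols = np.c_[np.array([1, 2, 3]), np.array([4, 5, 6])]  # stack as columns

x = np.arange(6)
# np.select picks, per element, the choice for the first condition that holds.
picked = np.select([x < 2, x > 3], [x, x ** 2], default=0)

print(pi32.dtype)
print(row)
print(cols)
print(picked)                     # [ 0  1  0  0 16 25]
```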
POLYNOMIALS
poly1d class
Accepts coefficients or polynomial roots to initialize a polynomial
Operations:
Algebraic expressions
Integration
Differentiation
Evaluation
from numpy import poly1d
p = poly1d([3, 4, 5])
print(p)
print(p*p)
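The operations listed for poly1d can be sketched on the same polynomial, 3x^2 + 4x + 5. The second polynomial `q` is an added illustration of initializing from roots.

```python
import numpy as np
from numpy import poly1d

p = poly1d([3, 4, 5])        # 3x^2 + 4x + 5

value = p(2)                 # evaluation: 3*4 + 4*2 + 5 = 25
deriv = p.deriv()            # derivative: 6x + 4
integ = p.integ()            # antiderivative: x^3 + 2x^2 + 5x (constant 0)
square = p * p               # 9x^4 + 24x^3 + 46x^2 + 40x + 25

q = poly1d([1, 2], r=True)   # built from roots: (x - 1)(x - 2) = x^2 - 3x + 2

print(value)
print(deriv.coeffs, integ.coeffs, square.coeffs)
print(q.coeffs)
```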
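LINEAR ALGEBRA
scipy.linalg contains all the functions in numpy.linalg, plus some other advanced functions.
scipy.linalg is always compiled with BLAS/LAPACK support, so it can be faster than numpy.linalg, depending on how NumPy was installed.
import numpy as np
from scipy import linalg
A = np.array([[1, 2], [3, 4]])
A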
linalg.inv(A)
Finding Inverse
Solving linear system
import numpy as np
from scipy import linalg
A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
A
b = np.array([[10], [8], [3]])
b
linalg.inv(A).dot(b) # slow
np.linalg.solve(A, b) # fast
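Both routes give the same solution for this system; solving directly avoids forming the inverse explicitly. A sketch that checks the result against the original equations (here using `scipy.linalg.solve`, which plays the same role as `np.linalg.solve`):

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
b = np.array([[10], [8], [3]])

x_inv = linalg.inv(A).dot(b)    # forms the inverse explicitly
x_solve = linalg.solve(A, b)    # solves the system directly

print(x_solve.ravel())
print(np.allclose(x_inv, x_solve))      # True
print(np.allclose(A.dot(x_solve), b))   # True: the solution satisfies A x = b
```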
Finding Determinant: linalg.det
Matrix Exponential: linalg.expm
Matrix Logarithm: linalg.logm
Trigonometric functions: linalg.sinm, linalg.cosm, and linalg.tanm
Hyperbolic trigonometric functions: linalg.sinhm, linalg.coshm, and linalg.tanhm
Special matrices:
block diagonal: scipy.linalg.block_diag
circulant: scipy.linalg.circulant
Pascal: scipy.linalg.pascal
…
from scipy.linalg import pascal
pascal(4)
pascal(4, kind='lower')
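A few of the routines and special matrices above, sketched on small inputs chosen for illustration so the results are easy to check by hand:

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 2], [3, 4]])

det_A = linalg.det(A)                   # 1*4 - 2*3 = -2, up to rounding
exp_zero = linalg.expm(np.zeros((2, 2)))  # exp of the zero matrix is the identity

blocks = linalg.block_diag([[1, 2], [3, 4]], [[5]])  # 3 x 3 block diagonal matrix
circ = linalg.circulant([1, 2, 3])      # each column is a cyclic shift of the first
sym = linalg.pascal(4)                  # symmetric Pascal matrix
low = linalg.pascal(4, kind='lower')    # lower-triangular Pascal matrix

print(det_A)
print(exp_zero)
print(blocks)
print(circ)
print(sym)
print(low)
```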
EXPLORE PACKAGES
SCIPY.STATS
SCIPY.NDIMAGE
SCIPY.SPATIAL
SCIPY.SIGNAL