26 development
Hadley Wickham
Stat405
Development
Monday, 30 November 2009
1. Floating point math
2. Optimisation
3. Continuing education
4. Feedback
Your turn
Perform the following calculations in R. Are the answers what you expect?
seq(0.1, 0.9, by = 0.1) - 1:9 / 10
sqrt(2)^2 - 2
What is the property of these numbers that might cause the problem?
# Each number must be stored in a finite amount of space
# => each number can only have a finite number of digits
# => floating point math does not work like normal math
(1e-16 + 1) == 1
(1e-16 + 1) * 10 == 1e-16 * 10 + 1 * 10
1e9 + 2 - 0.1 - 1e9
1e10 + 2 - 0.1 - 1e10
1e11 + 2 - 0.1 - 1e11
1e12 + 2 - 0.1 - 1e12
1e13 + 2 - 0.1 - 1e13
1e14 + 2 - 0.1 - 1e14
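The pattern in these calculations follows from doubles carrying only about 15 to 17 significant decimal digits. A minimal sketch using R's built-in `.Machine` constants (the variable names `lost`, `kept`, `err_small` and `err_large` are illustrative, not from the slides):

```r
# Gap between 1 and the next larger double (about 2.2e-16)
eps <- .Machine$double.eps

# Anything well below eps is lost when added to 1
lost <- (1 + eps / 4) == 1   # TRUE: the addition has no effect
kept <- (1 + eps) == 1       # FALSE: eps is just large enough to register

# The absolute gap between neighbouring doubles grows with magnitude,
# so the same calculation loses more accuracy at 1e14 than at 1e9
err_small <- abs((1e9 + 2 - 0.1 - 1e9) - 1.9)
err_large <- abs((1e14 + 2 - 0.1 - 1e14) - 1.9)
err_large > err_small
```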
# By default R prints only 7 significant digits
# Trailing zeros are dropped after rounding, so a value very close to 1 prints as 1
(1 / 237)
(1 / 237) * 237
(1 / 237) * 237 - 1
seq(0.1, 0.9, by = 0.1)
seq(0.1, 0.9, by = 0.1) - 1:9 / 10
# Tricky to get to print exactly:
formatC((1 / 237) * 237, digits = 20)
formatC(seq(0.1, 0.9, by = 0.1), digits = 20)
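As an alternative to `formatC()`, base R's `sprintf()` and the `digits` argument of `print()` can also reveal the stored value. This is a sketch of one possible approach, not the method used on the slides; 17 significant digits is enough to tell any two distinct doubles apart.

```r
# Request 17 significant digits from sprintf()
sprintf("%.17g", 0.1)
sprintf("%.17g", (1 / 237) * 237)

# print() also accepts a digits argument
print(seq(0.1, 0.9, by = 0.1), digits = 17)
```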
# When working with floating point numbers (numeric), never test for
# equality with ==. This is the one place where the difference between
# numeric and integer matters: integers can be compared exactly.
a <- seq(0.1, 0.9, by = 0.1)
b <- 1:9 / 10
all(a == b)
all.equal(a, b)
all(abs(a - b) < 1e-6)
# Similarly, need to be careful with < and > etc
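One practical detail: `all.equal()` returns either `TRUE` or a character description of the difference, so it is usually wrapped in `isTRUE()` before being used in a condition. A small sketch follows; `near_equal()` is a hypothetical helper written for illustration, and its default tolerance is an assumption rather than a recommendation from the slides.

```r
a <- seq(0.1, 0.9, by = 0.1)
b <- 1:9 / 10

# Safe to use in if (): always a single TRUE or FALSE
same <- isTRUE(all.equal(a, b))

# Hypothetical helper: element-wise comparison within a tolerance
near_equal <- function(x, y, tol = sqrt(.Machine$double.eps)) {
  abs(x - y) < tol
}
elementwise <- near_equal(a, b)

if (same) message("a and b agree up to numerical tolerance")
```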
# Places where this matters:
#
# * sums
# * calculating the standard deviation
# * inverting a matrix (condition)
# * linear models!
# * maximum likelihood estimation
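To illustrate the standard deviation case, here is a sketch comparing the one-pass "textbook" variance formula with a two-pass version. The data and variable names are made up for illustration; the point is that subtracting two huge, nearly equal numbers throws away the digits that matter.

```r
# Three values that are exactly representable as doubles; true variance is 1
x <- 1e15 + c(1, 2, 3)
n <- length(x)

# One-pass formula: difference of two numbers of size about 3e30
naive_var <- (sum(x^2) - n * mean(x)^2) / (n - 1)

# Two-pass formula: centre first, then square the small deviations
two_pass_var <- sum((x - mean(x))^2) / (n - 1)

naive_var      # not 1: the answer is swamped by rounding error
two_pass_var   # agrees with var(x)
```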
Optimisation
If, and only if, your code is too slow
First use system.time() to figure out exactly how long things are taking: you need this so you can check your changes actually speed things up
Then see what is taking the longest amount of time with the profr package
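A minimal sketch of the timing step with `system.time()`. The function `slow_sum_sqrt()` is a hypothetical stand-in for whatever code is being measured; repeating the measurement is a suggestion, since single timings tend to be noisy.

```r
# Hypothetical slow function: sums square roots one element at a time
slow_sum_sqrt <- function(x) {
  total <- 0
  for (xi in x) total <- total + sqrt(xi)
  total
}

x <- runif(1e5)

# system.time() reports user, system and elapsed time in seconds
timing <- system.time(result <- slow_sum_sqrt(x))
timing["elapsed"]

# Repeat a few times and summarise, to get a baseline worth comparing against
timings <- replicate(5, system.time(slow_sum_sqrt(x))["elapsed"])
median(timings)
```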
General advice
• Start with the slowest part of your code
• Use built-in R functions, where possible
• Use vectorised functions, where possible
• Think through your basic algorithm
• Knowledge of basic CS algorithms and data structures is very helpful
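As an illustration of the advice on built-in and vectorised functions, the sketch below replaces an explicit loop with `sum(x^2)`. The function names `loop_ss()` and `vec_ss()` are hypothetical, and the actual speed-up will vary by machine; the check that both versions agree comes first, using `all.equal()` rather than `==`.

```r
x <- runif(1e4)

# Loop version: squares and accumulates one element at a time
loop_ss <- function(x) {
  total <- 0
  for (i in seq_along(x)) total <- total + x[i]^2
  total
}

# Vectorised version: built-in functions do the looping in compiled code
vec_ss <- function(x) sum(x^2)

# Confirm the answers agree before comparing speed
stopifnot(isTRUE(all.equal(loop_ss(x), vec_ss(x))))

system.time(for (k in 1:100) loop_ss(x))
system.time(for (k in 1:100) vec_ss(x))
```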
Continuing education
Learn more about R.
Learn more about your other tools.
Professional development
Mailing list
Sign up to R-help: https://stat.ethz.ch/mailman/listinfo/r-help
Make sure to set up filters
Skim interesting subjects and read them
Don’t be afraid to post (use a pseudonym if necessary)
Read books
Phil Spector. Data Manipulation with R.
William N. Venables and Brian D. Ripley. Modern Applied Statistics with S.
Frank E. Harrell. Regression Modelling Strategies.
Jose C. Pinheiro and Douglas M. Bates. Mixed-Effects Models in S and S-Plus.
Read papers
The R Journal: http://journal.r-project.org/
The Journal of Statistical Software: http://www.jstatsoft.org/
Learn your tools
• Touch typing
• Text editor
• Command line
• Caffeine
Professional development
The aspects of being a statistician, apart from knowing statistics.
Principally communication: written, spoken, visual and electronic.
Take every opportunity you can to practice these skills.
[Diagram: forms of communication, grouped under Electronic, Written, Spoken and Visual. Items shown: website, blog, papers, reviews, vita/resume, video, slidecast, posters, code, long talk, short talk, oral exam, bibliography, teaching, graphics.]
Written
Particularly important if you want to be an academic, or if you’re a PhD student or want to become one.
“Style: Toward Clarity and Grace”
Sign up for the thesis writing workshops when they come around.
Develop a regular habit.
My habit
• Roll out of bed at 7am
• Boil water
• Make tea
• Drink tea
• Write for an hour
Spoken
Seize every opportunity to practice.
Make use of Tracy Volz - [email protected]. She is a fantastic resource - if you had to pay for her, you wouldn’t be able to afford it.
265,000 emails
134,000 unread!
[Plot: value (0 to 1200) by year, 2007 to 2010, split into unread and read]
[Plot: read/all (0.0 to 1.0) by year, 2007 to 2010; legend: from]
[Plot: value (50 to 350) by year, 2007 to 2010, split into direct and sent]
http://www.43folders.com/izero
Merlin Mann
There is no way you will ever be able to respond to — let alone read in exquisite detail — every email you ever receive for the rest of your life. If you take issue with this, just wait six months, because, believe me, we’re all getting a lot more email (and other sundry demands on our attention) every day. What seems like a doddle today is going to get progressively
more difficult — even insurmountable — unless you put a realistic system in place now.
Inbox Zero
Your time is priceless (and wildly limited)
You need an agnostic system for dealing with mail that isn’t based on nonces, exceptions, and guilt.
[The] ultimate goal is for you to spend less time playing with your email and more time doing stuff.
Key concepts
Regularly empty your inbox
Minimal response
Delete, delete, delete
Filters
Email dashes
Response does not need to be proportional to request
“Good idea. I’ll add it to my to do list.”
“Here’s a link that might be what you’re looking for…”
http://www.43folders.com/2006/03/13/email-cheats
“Do you still need this?”
“I don’t know”
[Delete]
Delete!
Most minimal response is none.
“Just remember that every email you read, re-read, and re-re-re-re-re-read as it sits in that big dumb pile is actually incurring mental debt on your behalf.”
Be brutally honest: if you’re not going to do anything with the email, delete it now.
Filters
Grey mail: “noisy, frequent, and non-urgent items which can be dealt with all at a pass and later.”
facebook, comments, university/department memos, newsletters, mailing lists
Good catch-all: filter on messages that contain “unsubscribe”
http://www.43folders.com/2006/03/13/filters
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected] subject:(weekly message), [email protected], list:"k2i-members.rice.edu", list:"mailman.rice.edu"
[email protected], [email protected], [email protected], [email protected], [email protected]
from:([email protected]) from:([email protected])
1300/3500 (5/day!)
Patricia Wallace, a techno-psychologist, believes part of the allure of e-mail, for adults as well as teens, is similar to that of a slot machine. “You have intermittent, variable reinforcement,” she explains. “You are not sure you are going to get a reward every time or how often you will, so you keep pulling that handle.”
Email dashes
Don’t have your email open all day. Schedule times when you respond to emails.
You can tackle emails a lot faster when you batch them up.
Lack self control (like me)? Try an internet blocker: http://macfreedom.com/
http://www.43folders.com/2006/03/15/email-dash
Feedback
http://hadley.wufoo.com/forms/stat405-final-feedback/