PHP Streams: a Lucky Dip
A number of tidbits of streams information, at varying levels of difficulty
Wez Furlong
About the author
• PHP Core Developer since 2001
• Author of the Streams layer
• I hold the title “King” of PECL
• Author of most of PDO and its drivers
• Day-job is developing the fastest MTA on Earth
Lucky Dip!?
• Streams is a big topic area
• Everyone uses them
• A lot of people misuse them
• Tidbits from basic to advanced level
What is a stream?
• View of some kind of data
• Presented in chunks
• Readable
• Writable
• Sometimes seekable
File based streams
• Most common (include/require)
• Represent data held in a filesystem
Typical file reading code
$fp = fopen('myfile.txt', 'r');
$data = '';
while (!feof($fp)) {
    $data .= fread($fp, 8192);
}
# alternative: read the whole file in one call
$fp = fopen('myfile.txt', 'r');
$data = fread($fp, filesize('myfile.txt'));
Comparisons
file_get_contents() and stream_get_contents()
get all data into a variable
Fastest, most efficient
Comparisons
while (!feof($fp)) fread($fp, 8192)
Reads chunks
Most memory efficient way
8192 matches internal chunk size
Comparisons
while (!feof($fp)) fgets($fp)
Reads lines
Most memory efficient way
Slowest way to read chunks
Slowest way to read whole file
Comparisons
foreach (file($filename) as $line)
Reads lines
Fastest way
Loads whole file into memory
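The four approaches compared above can be tried side by side. This is a minimal sketch, assuming a throwaway temporary file; the file contents and variable names are illustrative and not from the talk.

```php
<?php
// Create a small throwaway file to read back (illustrative data only).
$filename = tempnam(sys_get_temp_dir(), 'dip');
file_put_contents($filename, "line one\nline two\nline three\n");

// 1. Whole file into a variable in one call.
$all = file_get_contents($filename);

// 2. Fixed-size chunks: only one chunk is read per iteration.
$chunked = '';
$fp = fopen($filename, 'r');
while (!feof($fp)) {
    $chunked .= fread($fp, 8192);
}
fclose($fp);

// 3. Line by line with fgets().
$lineCount = 0;
$fp = fopen($filename, 'r');
while (($line = fgets($fp)) !== false) {
    $lineCount++;
}
fclose($fp);

// 4. file() loads every line into an array at once.
$lines = file($filename);

unlink($filename);
```

All four see the same bytes; they differ in how much is held in memory at once and in how many calls are made.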
flock()
• Locks a file
• Shared or exclusive
• Advisory
• Only works if everyone uses flock()
• Mandatory on Windows!
• Sounds useful for web apps that use files
Using flock()
Reader:
$fp = fopen($filename, 'r');
flock($fp, LOCK_SH);
# it's now safe to read the file
$data = fread($fp, 8192);
flock($fp, LOCK_UN);
Using flock()
Writer:
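# 'c' opens for writing without truncating;
# 'w+' would empty the file before the lock is held
$fp = fopen($filename, 'c');
flock($fp, LOCK_EX);
# we're the only writer now
ftruncate($fp, 0);
fwrite($fp, $data);
fflush($fp);
flock($fp, LOCK_UN);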
flock() blocks
• flock() is “safe” because it will block until the lock is obtained
• Blocking is like sleeping
• While you’re asleep, you’re not doing any work
• Sometimes you want to do work while you wait
Using flock()
Non-blocking:
# 'c' avoids truncating the file before we hold the lock
$fp = fopen($filename, 'c');
if (flock($fp, LOCK_EX|LOCK_NB)) {
    # we got the lock
} else {
    # do something productive while we wait
    # for the lock
}
flock() summary
• Can be useful to ensure consistency
• Caveat: doesn’t work on most network filesystems
• Caveat: doesn’t work in threaded servers
• ISAPI, NSAPI, win32 apache
• threaded apache 2
• Caveat: some Linux kernels have broken flock()
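The reader, writer and non-blocking patterns can be combined into a pair of helpers. This is a sketch under assumptions: the function names `write_locked()` and `read_locked()` are hypothetical, the retry count and sleep interval are arbitrary, and the caveats listed above still apply.

```php
<?php
// Hypothetical helper: write $data under an exclusive lock,
// retrying without blocking for roughly half a second.
function write_locked(string $filename, string $data): bool {
    // 'c' opens for writing without truncating, so the file is
    // not emptied before we hold the lock.
    $fp = fopen($filename, 'c');
    if ($fp === false) {
        return false;
    }
    for ($i = 0; $i < 50; $i++) {
        if (flock($fp, LOCK_EX | LOCK_NB)) {
            ftruncate($fp, 0);
            $ok = fwrite($fp, $data) === strlen($data);
            fflush($fp);
            flock($fp, LOCK_UN);
            fclose($fp);
            return $ok;
        }
        usleep(10000); // a real script could do useful work here instead
    }
    fclose($fp);
    return false;
}

// Hypothetical helper: read the whole file under a shared lock.
function read_locked(string $filename) {
    $fp = fopen($filename, 'r');
    if ($fp === false) {
        return false;
    }
    flock($fp, LOCK_SH); // blocks until no writer holds the lock
    $data = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $data;
}

$file = tempnam(sys_get_temp_dir(), 'lck');
$written = write_locked($file, 'hello');
$readBack = read_locked($file);
unlink($file);
```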
Network Streams: sockets
• “Wormholes”
• according to the Unix Socket FAQ
• Bi-directional
• data goes both ways
• Differences in behavior from file streams
fsockopen()
An HTTP request:
$fp = fsockopen('www.php.net', 80);
fwrite($fp, "GET / HTTP/1.0\r\n" . "Host: www.php.net\r\n\r\n");
$data = fread($fp, 8192);
Alternative HTTP request:
$fp = fopen('http://www.php.net', 'r');
$data = fread($fp, 8192);
This is an example of a wrapper
Network Streams: feof()
• Common error:
• Assuming that feof() means “connection_closed()”
• feof() returns true when:
• A read fails and the buffer is empty
• or: buffer is empty and no data has been received within the socket timeout
• Lack of data is often a temporary condition for network streams
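One way to avoid mistaking a quiet connection for a closed one is to look at the 'timed_out' meta data field after an empty read. This is a minimal sketch, assuming a Unix-like system where stream_socket_pair() can create a connected local pair; `read_until_closed()` and its retry limit are hypothetical.

```php
<?php
// Hypothetical helper: read until the peer closes, treating a timeout
// as "no data yet" rather than as end of stream.
function read_until_closed($fp, int $maxTimeouts = 3): string {
    $data = '';
    $timeouts = 0;
    while (true) {
        $chunk = fread($fp, 8192);
        if (is_string($chunk) && $chunk !== '') {
            $data .= $chunk;
            $timeouts = 0;
            continue;
        }
        $meta = stream_get_meta_data($fp);
        if ($meta['timed_out']) {
            // Temporary lack of data: retry a few times before giving up.
            if (++$timeouts >= $maxTimeouts) {
                break;
            }
            continue;
        }
        // Empty read without a timeout: the peer closed, or the read failed.
        break;
    }
    return $data;
}

// Assumption: Unix domain socket pairs are available on this platform.
[$reader, $writer] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_timeout($reader, 1);
fwrite($writer, 'hello ');
fwrite($writer, 'world');
fclose($writer);
$received = read_until_closed($reader);
fclose($reader);
```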
Short reads, short writes
Another common mistake:
$fp = fopen('http://www.php.net', 'r');
echo fread($fp, 100000000);
fclose($fp);
Reads are non-greedy
Think in terms of 8k chunks
Check the return value from fwrite()!
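A loop that copes with short reads and short writes might look like the following. It is a sketch only: `fwrite_all()` and `fread_all()` are hypothetical helper names, they assume a blocking stream, and a php://temp stream stands in for a network connection.

```php
<?php
// Hypothetical helper: keep writing until everything has been accepted
// or fwrite() stops making progress.
function fwrite_all($fp, string $data): int {
    $total = strlen($data);
    $written = 0;
    while ($written < $total) {
        $n = fwrite($fp, substr($data, $written));
        if ($n === false || $n === 0) {
            break; // caller compares the result with strlen($data)
        }
        $written += $n;
    }
    return $written;
}

// Hypothetical helper: read in 8k chunks until end of stream.
// Assumes a blocking stream.
function fread_all($fp): string {
    $data = '';
    while (!feof($fp)) {
        $chunk = fread($fp, 8192);
        if ($chunk === false) {
            break;
        }
        $data .= $chunk;
    }
    return $data;
}

$payload = str_repeat('x', 100000);
$fp = fopen('php://temp', 'w+');
$sent = fwrite_all($fp, $payload);
rewind($fp);
$received = fread_all($fp);
fclose($fp);
```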
Speaking the lingo
Common mistake: “PHP hangs!”
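$fp = fsockopen('www.php.net', 80);
fwrite($fp, "GET / HTTP/1.0\r\n" . "Host: www.php.net\r\n");
# the request never ends with a blank line ("\r\n\r\n"),
# so the server keeps waiting and this read blocks
$data = fread($fp, 8192);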
Sockets block
• Sockets are blocking by default
• Blocking is like sleeping
• Can cause long delays in your scripts
• PHP has 2.5 ways to control them
1. Non-blocking mode
• stream_set_blocking($fp, false);
• Reads/writes will “fail” instead of blocking
• Check the return values from fread() and fwrite()
• zero or zero-length: try again later
• try again means: try sending that data again
• implement your own buffering
• feof() loses meaning
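The "try again later" behavior can be seen with a local socket pair. This is a sketch, assuming a Unix-like system where stream_socket_pair() is available; the polling loop and its limits are illustrative.

```php
<?php
// Assumption: Unix domain socket pairs are available on this platform.
[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
stream_set_blocking($a, false);

// Nothing has been sent yet: the read returns immediately with no data
// instead of blocking.
$empty = fread($a, 8192);

fwrite($b, 'ping');

// Poll until data shows up, doing "other work" (here: a short sleep)
// between attempts.
$data = '';
for ($i = 0; $i < 100 && $data === ''; $i++) {
    $chunk = fread($a, 8192);
    if (is_string($chunk) && $chunk !== '') {
        $data = $chunk;
    } else {
        usleep(10000);
    }
}

fclose($a);
fclose($b);
```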
Awful non-blocking code
function get_sock($host, $port) {
    $s = fsockopen($host, $port);
    stream_set_blocking($s, false);
    return $s;
}
$hosts[] = get_sock(...);
$hosts[] = get_sock(...);
$hosts[] = get_sock(...);
# repeat 20 times
# read all data without blocking
do {
    foreach ($hosts as $s)
        echo fread($s, 8192);
} while (true);
2. Timeouts
• stream_set_timeout($fp, $sec, $usec);
• Default 60 seconds
• Waits for up to the timeout period
• Returns data if it is available before then
• Otherwise
• Sets ‘timed_out’ meta data field
• Returns an empty string (or zero)
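A timed-out read can be told apart from a real result by checking the 'timed_out' field. This is a minimal sketch, assuming a Unix-like system where stream_socket_pair() is available; the 0.2 second timeout is arbitrary.

```php
<?php
// Assumption: Unix domain socket pairs are available on this platform.
[$a, $b] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);

// Wait at most 0.2 seconds for data on each read.
stream_set_timeout($a, 0, 200000);

// Nothing has been written, so this read gives up after the timeout.
$chunk = fread($a, 8192);
$meta = stream_get_meta_data($a);
$timedOut = $meta['timed_out'];

// The stream is still usable: once data arrives, the next read returns it.
fwrite($b, 'late');
$late = fread($a, 8192);

fclose($a);
fclose($b);
```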
Non-blocking vs. timeouts
• Timeouts
• Easiest when working with a single socket
• Non-blocking
• Useful for doing multiple things at once
• Painful when using what we’ve seen so far
2.5: stream_select()
• Non-blocking and timeouts combined!
• Sockets don’t have to be non-blocking
• Can wait for read/write/exceptional data
• stream_select() tells you when what you want to do will not block
stream_select()
$fp = fsockopen('www.php.net', 80);
fwrite($fp, "GET / HTTP/1.0\r\n" . "Host: www.php.net\r\n\r\n");
# wait up to 2 seconds
# stream_select() takes its arrays by reference, so pass variables
$r = array($fp);
$w = null;
$e = null;
$n = stream_select($r, $w, $e, 2);
if ($n)
    $data = fread($fp, 8192);
else
    echo "timed out";
Awful code made nicer
$sites[] = open_site('www.php.net');
$sites[] = open_site('pecl.php.net');
$sites[] = open_site(...); # repeat 20 times
# stream_select() takes its arrays by reference, so pass variables
$r = $sites;
$w = null;
$e = null;
$n = stream_select($r, $w, $e, 30);
if ($n) foreach ($r as $fp) {
    $x = fread($fp, 8192);
    if (strlen($x))
        echo $x;
    else {
        # it closed
        unset($sites[array_search($fp, $sites)]);
    }
}
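The same idea, wrapped in a loop that keeps selecting until every stream has closed, can be tested locally. This is a sketch under assumptions: `drain_all()` is a hypothetical helper, and local socket pairs (Unix-like systems only) stand in for the remote sites.

```php
<?php
// Hypothetical helper: read from several streams as data becomes
// available, until each one reaches end of stream or a select times out.
function drain_all(array $streams, int $timeout = 5): array {
    $out = [];
    foreach ($streams as $key => $s) {
        $out[$key] = '';
    }
    while ($streams) {
        $r = array_values($streams);
        $w = null;
        $e = null;
        $n = stream_select($r, $w, $e, $timeout);
        if ($n === false || $n === 0) {
            break; // error or timeout
        }
        foreach ($r as $fp) {
            $key = array_search($fp, $streams, true);
            $chunk = fread($fp, 8192);
            if ($chunk === false || ($chunk === '' && feof($fp))) {
                fclose($fp); // it closed
                unset($streams[$key]);
                continue;
            }
            $out[$key] .= $chunk;
        }
    }
    return $out;
}

// Assumption: Unix domain socket pairs are available on this platform.
[$readA, $writeA] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
[$readB, $writeB] = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
fwrite($writeA, 'one');
fwrite($writeB, 'two');
fclose($writeA);
fclose($writeB);

$result = drain_all(['a' => $readA, 'b' => $readB]);
```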
Meta data
• stream_get_meta_data();
• People misused it
• The manual says “Hands off!”
• Especially eof and unread_bytes
• “I do not think it means what you think it means” (Inigo, The Princess Bride)
pipes
• A bit like sockets
• But they’re uni-directional
• data flows only one way
• Often used to pipe the output of one process into another
• popen() and proc_open()
• STDIN, STDOUT, STDERR
popen()
$pipe = popen('ls -l', 'r');
fpassthru($pipe);
$mail = popen('sendmail -t -i', 'w');
fwrite($mail, 'Subject: hello...');
proc_open()
• More powerful than popen()
• Cwd
• Environment
• Multiple pipes
• Files
• Suppress GPF dialog (Win32)
• pty
proc_open()
$p = proc_open(
    '--command--fd 0 --status-fd 1',
    array(
        0 => array('pipe', 'r'),
        1 => array('pipe', 'w'),
        2 => array('pipe', 'w'),
    ),
    $pipes,      # will hold the pipe handles
    '/',         # child cwd will be /
    array(       # set the environment
        'LOGNAME' => 'wez',
        'HOME' => '/home/wez/',
    ),
    array('suppress_errors' => 1)
);
proc_open()
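fwrite($pipes[0], 'some gpg command');
do {
    # stream_select() takes its arrays by reference, so pass variables
    $r = array($pipes[1], $pipes[2]);
    $w = null;
    $e = null;
    if (stream_select($r, $w, $e, 30)) {
        # use a separate loop variable so the process handle in $p survives
        foreach ($r as $pipe) {
            $data = fread($pipe, 8192);
            $prefix = $pipe == $pipes[1] ? 'out' : 'err';
            echo "$prefix: $data\n";
        }
    }
} while (true);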
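A complete round trip through a child process, from opening the pipes to collecting the exit status, could be sketched as follows. It assumes a Unix-like system with `cat` on the PATH, used here only because it echoes its input; reading stdout fully before stderr is acceptable for such a small exchange, while a chattier child would need the stream_select() loop.

```php
<?php
// Assumption: `cat` exists on this system; it copies stdin to stdout.
$descriptors = [
    0 => ['pipe', 'r'], // child reads its stdin from this pipe
    1 => ['pipe', 'w'], // child writes its stdout to this pipe
    2 => ['pipe', 'w'], // child writes its stderr to this pipe
];

$proc = proc_open('cat', $descriptors, $pipes);
if ($proc === false) {
    throw new RuntimeException('could not start child process');
}

fwrite($pipes[0], "hello from the parent\n");
fclose($pipes[0]); // the child sees end of file on stdin and exits

$out = stream_get_contents($pipes[1]);
$err = stream_get_contents($pipes[2]);
fclose($pipes[1]);
fclose($pipes[2]);

// Close all pipes before proc_close() to avoid waiting on each other.
$exit = proc_close($proc);
```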
Transports and Wrappers
• Three primitives in PHP Streams:
• Streams themselves (“objects”)
• Transports (“classes”)
• Wrappers (“glue”)
Transports
• Are streams that implement the internal transports API
• fsockopen(): tcp://, udp://, ssl://
• send/receive the data you send/receive
Wrapper
• Glue code that acts as a factory for speaking a protocol
• fopen(): http://, ftp://
• Work happens, and you are handed back a stream
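To see which transports and wrappers a given PHP build offers, the registries can be listed at runtime. This is a small sketch; the exact lists depend on how PHP was compiled and which extensions are loaded.

```php
<?php
// Socket transports usable with fsockopen() / stream_socket_client(),
// for example tcp and udp.
$transports = stream_get_transports();

// URL wrappers usable with fopen(), for example file and php,
// plus http and ftp where enabled.
$wrappers = stream_get_wrappers();

echo 'transports: ', implode(', ', $transports), "\n";
echo 'wrappers:   ', implode(', ', $wrappers), "\n";
```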
User-defined streams
PHP allows you to create your own stream implementations in PHP
It’s actually a hybrid wrapper/stream
class mystream {
    # methods go here
}
stream_wrapper_register(‘myFile’, ‘mystream’);
User-defined streams
class mystream {
    var $fp;
    function stream_open($path, $mode, $options, &$opened_path) {
        # strip the 9-character "myFile://" prefix
        $url = urldecode(substr($path, 9));
        $this->fp = fopen($url, $mode);
        $opened_path = $path;
        return $this->fp;
    }
}
User-defined streams
...
    function stream_close() {
        $this->fp = null;
        return true;
    }
    function stream_read($count) {
        return fread($this->fp, $count);
    }
    function stream_write($data) {
        return fwrite($this->fp, $data);
    }
...
User-defined streams
...
    function stream_tell() {
        return ftell($this->fp);
    }
    function stream_eof() {
        return feof($this->fp);
    }
    function stream_seek($offset, $whence) {
        return fseek($this->fp, $offset, $whence) == 0 ? true : false;
    }
...
User-defined streams
...
    function stream_flush() {
        return fflush($this->fp);
    }
    function stream_stat() {
        return fstat($this->fp);
    }
    function stream_lock($action) {
        return flock($this->fp, $action);
    }
...
User-defined streams
...
    function unlink($path);
    function rename($old, $new);
    function mkdir($path, $mode, $opts);
    function rmdir($path, $opts);
    function url_stat($path, $flags);
    function dir_opendir($path, $opts);
    function dir_readdir();
    function dir_rewinddir();
    function dir_closedir();
...
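Putting a few of these methods together gives a complete, if tiny, wrapper. The sketch below is not the wrapper from the slides: the `mem://` scheme and the `MemStream` class are hypothetical, the data lives in a static array rather than a file, and only the methods needed by file_put_contents() and file_get_contents() are implemented.

```php
<?php
// Hypothetical wrapper: keeps data in memory, keyed by the part of the
// path after "mem://".
class MemStream {
    public $context;            // populated by PHP when the wrapper is created
    private static $store = [];
    private $name;
    private $pos = 0;

    function stream_open($path, $mode, $options, &$opened_path) {
        $this->name = substr($path, strlen('mem://'));
        $exists = isset(self::$store[$this->name]);
        if ($mode[0] === 'r' && !$exists) {
            return false;       // nothing to read
        }
        if ($mode[0] === 'w' || !$exists) {
            self::$store[$this->name] = '';
        }
        $this->pos = 0;
        return true;
    }

    function stream_read($count) {
        $chunk = (string) substr(self::$store[$this->name], $this->pos, $count);
        $this->pos += strlen($chunk);
        return $chunk;
    }

    function stream_write($data) {
        // Overwrite in place, or append when positioned at the end.
        self::$store[$this->name] = substr_replace(
            self::$store[$this->name], $data, $this->pos, strlen($data)
        );
        $this->pos += strlen($data);
        return strlen($data);
    }

    function stream_eof() {
        return $this->pos >= strlen(self::$store[$this->name]);
    }

    function stream_tell() {
        return $this->pos;
    }

    function stream_stat() {
        return ['size' => strlen(self::$store[$this->name])];
    }
}

stream_wrapper_register('mem', 'MemStream');

file_put_contents('mem://greeting', 'hello');
$result = file_get_contents('mem://greeting');
```

Once registered, the scheme works with the ordinary file functions, which is the point of the wrapper layer.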
Questions?
• My blog: http://netevil.org
• These slides on my blog and on slideshare.net