Cyrion Technologies PCL Corner

Joachim E. Deußen Page 1 22.05.2007

Capturing printer data streams

Version 1.3 (22.05.2007)

Dipl. - Ing. Joachim Deußen

Overview

Windows

UNIX

Macintosh

Port redirection

Network capturing

Network tools

Copyright © 2005 by Dipl. - Ing. Joachim E. Deußen. All rights reserved. All trademarks belong to their respective owners.

Overview

When you support printers or work on a project involving printers, you need to know exactly which data is sent to a printer. This data is normally called a printer data stream. Sometimes people refer to this data as the print job, but as we will see later, that is not always what we want. We are really only interested in the data that arrives at the printer's inputs, such as the USB, parallel or network ports. As discussed in another article, only this data stream can be used for a deep analysis of a printing problem that goes beyond bare connectivity. So we must differentiate between two problems:

1. No data arrives at the printer's input ports: connectivity check
2. Wrong data arrives at the printer's input ports: data stream analysis

This article discusses the various possibilities to get such a printer data stream for analysis. The analysis itself is discussed elsewhere.

[Figure: print workflow from User/Document through Application, Printer Driver, Port and Connectivity to the Printer, with the printer data stream marked at the source and at the drain.]

First let us set a basic workflow that is more or less used on all operating systems:

1. Document
2. Application
3. Printer driver or similar operating system instance
4. Connectivity via a virtual port (USB, parallel, FireWire, Network etc.)
5. Printer

Or in words: A document is sent from an application to an operating system instance that creates a printer data stream. This instance is normally a printer driver. The data stream is then sent over some kind of connectivity to the printer, where the actual output (the pages) is produced.

Since we are interested in the data stream, we have to look at (3). This is the point where the actual data stream comes into existence. So to get the data stream we have to intercept it between (3) and (4) or between (4) and (5). Let us call the first point the source and the second point the drain.

Capturing directly at the source is the most convenient way to get a printer data stream. Often features of the generating application or printer driver can be used to redirect the data stream into a file that can then be transported for analysis.

Windows

Print to file

Almost everybody knows that Microsoft Windows can print to a file from certain applications. You will most often find this option in the print dialog of the application, for example in Microsoft Word.

When you select that option, the printer data stream is produced and redirected to a file in the file system, regardless of the port the printer driver is originally connected to.

Print to FILE:

If you have an older or proprietary application that has no print-to-file option in its print dialog, there is always another way to print to a file: the virtual printer port called FILE:

[Figure: User/Document, Application, Printer Driver, Port, Printer Data Stream]

If you look at the list of available ports (found on the <ports> tab of the printer driver's properties page), you will find a port called FILE: in all versions of Windows. After selecting this port for a printer driver, you can print from any application to this printer, and it will ask you for a destination where the printer data stream is to be saved.

If you redirect the printer data stream to a FILE: port, you must ensure that you write printer data and not Windows data to the file. Windows print data, also called EMF (enhanced metafile), is the default if you print to a network printer that is attached not directly to your computer but to a print server. Please note that the EMF passing through the printing subsystem is not the standard EMF as defined in public documents by Microsoft:

You can probably locate the information about the EMF structs in the spool file format. But note that one major caveat with this approach is that since the spool file format is MS proprietary, it might change in future releases without notice. So if your driver is dependent on a particular spool file format, it could break on future releases. This is the reason for us recommending that you not rely on the spool file format. //Ashwin, Microsoft

So if you grab EMF by accident (or on purpose), the chance of having a usable file is very, very low! On Windows XP and up, you will always get RAW printer data if you print to FILE:, regardless of whether the print processor is set to RAW or EMF. So the following procedure (even though it is documented here from within Windows XP) applies only to Windows NT 4 and Windows 2000 based computers.

So check first if the print processor WINPRINT is set to RAW and not to any of the EMF data types!
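Once a file has been captured, a quick look at its first bytes can tell whether it is plausibly RAW printer data. The following is only a minimal sketch: the function names are made up for this illustration, and the list of markers (the PJL UEL sequence, the PostScript "%!" prefix, the PCL 5 reset ESC E and the "HP-PCL XL" stream header) covers common cases but is by no means complete. It does not try to recognise EMF; anything it does not know is simply reported as unknown.

```python
# Sketch: guess whether a captured file looks like RAW printer data.
# The marker list is an assumption for illustration and is not complete.

UEL = b"\x1b%-12345X"  # PJL Universal Exit Language sequence


def guess_stream_type(data: bytes) -> str:
    """Return a rough label based on the first bytes of a captured file."""
    head = data[:512]
    if head.startswith(UEL):
        head = head[len(UEL):]
        if head.startswith(b"@PJL"):
            return "PJL job header"
    if head.startswith(b"%!"):
        return "PostScript"
    if b"HP-PCL XL" in head[:64]:
        return "PCL XL (PCL 6)"
    if head.startswith(b"\x1bE"):
        return "PCL 5"
    return "unknown (possibly EMF or another non-printer format)"


def guess_file(path: str) -> str:
    """Read the start of a file (hypothetical helper) and classify it."""
    with open(path, "rb") as handle:
        return guess_stream_type(handle.read(512))
```

A file that starts with a PJL header is reported as such even if PCL or PostScript follows later in the job, so "unknown" is the result to be suspicious about.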

Point-and-Print redirection:

In a corporate environment, Windows workstations normally use a Windows print server to spool their printer data streams. This method is also called point-and-print. To return control to the calling application faster, the Windows workstation does not generate the target printer language but an intermediate language called enhanced metafile, EMF (see the notice above on the true nature of this file format).

This EMF file is then spooled to the print server, where the actual conversion from the Windows internal representation (EMF) to the target printer language (RAW) is performed. If we now want to capture a data stream from a Windows workstation, we can use the same technique as described in the previous section and redirect the data stream to a file using the virtual printer port FILE: on the Windows print server.

On true point-and-print connections (also called RPC connections, established between Windows NT-based computers) the pop-up dialog for the file name is shown on the client computer and not on the server. The file is also saved in RAW format on the client and not on the server! If you use SMB connections (from Windows 9x based computers or by using the “Local Port” for connecting to the shared printer), the dialog box is shown on the server and the file is also saved on the server.

Print Services for UNIX:

Most people do not know that Windows NT based operating systems offer an LPD service. On Windows NT 4.0 it is called “TCP/IP Print Service”, while newer versions (Windows 2000, Windows XP and Windows 2003) call it “Print Services for UNIX”. The Windows print services are the first drain capture method discussed in this paper.

You can install the LPD-Service when you select the following:

• START
• Settings
• Control Panel
• Add or Remove Programs
• Add/Remove Windows Components

Navigate to the option “Other Network File and Print Services” and then select [Details].

Now check “Print Services for UNIX”. Confirm the selection with [OK] and then choose [Next >] to start the installation of the additional components.

After completion, the LPD service is installed but not started. To enable the service, you must open the Computer Management console (choose “Manage” from the pop-up menu of “My Computer” and then select “Services”).

Either start the service manually or select “Automatic” for the “Startup type”.

Now, how do you use the TCP/IP Print Server? By default the LPD server listens on port 515 on all local IP interfaces. This means that if you have more than one network card, or more than one IP address assigned to a network card, the LPD service is accessible from all of these interfaces.
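To confirm that the LPD service is actually reachable on port 515 before sending any print job, a plain TCP connection attempt is enough. This is a minimal sketch using only the Python standard library; the function name and the default timeout are choices made for this example.

```python
import socket


def port_is_open(host: str, port: int = 515, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unreachable host, unknown name, ...
        return False
```

Calling port_is_open("10.0.0.200") would check the default LPD port, and port_is_open("10.0.0.200", 9100) the usual RAW printing port; the address here is only an example. A True result says that something accepts connections on that port, not that it is a working LPD service.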

If you have ever configured an LPR port, you know that you need an IP address and a printer name (or queue name) for it. Windows NT 4 uses the names of shared printers for this. Imagine you have installed a printer “HP LaserJet 4100” in your Printers and Faxes folder.

If you have shared this printer on the Windows network under the name “HPLJ4100” and then enable the TCP/IP Print Service, you can also print to that printer from any LPR port, using the IP address of the sharing computer and the printer's share name as the queue name.

On Windows 2000/XP/2003 and up there are some extensions to this behaviour:

• You can alternatively use the printer name “HP LaserJet 4100” as the queue name (take care that your LPR can handle spaces in queue names, or rename the printer accordingly).
• The printer no longer needs to be shared to be accessed.
• If a printer is shared anyway, you can use both the share name and the printer name.

Since the LPD service conforms to RFC 1179, you can print to it from any operating system that has an LPR port:

• from other Windows workstations or servers,
• from Mac OSX,
• from any UNIX,
• from an AS/400 or
• from an application like SAP R/3 or anything else.

Recalling our knowledge from chapter (4), we know that the shared printer itself can be redirected to print to a file by using the virtual printer port FILE: on the Windows NT-based server or workstation that offers the TCP/IP print service, a.k.a. LPD service.

Thus we can use this method, for instance, to capture print jobs from any operating system or server that has no print-to-file option, like AS/400 or SAP R/3. Since this is not considered a point-and-print connection, the save dialog box will pop up on the server.

A rather clever method is as follows: Suppose you must capture the data stream from an SAP R/3 application to an old printer with the IP address “11.22.33.44”, an LPD service running on port 515 and the queue name “RAW”. Now do this:

• Disconnect the original printer from the network.
• Configure a laptop with the IP address of the old printer.
• Install a PCL printer like a HP LaserJet 4100 or similar.
• Use the FILE: port for this printer.
• Share that printer with the name “RAW” to the network.
• Enable or start the TCP/IP print server on the laptop.
• Connect the laptop to the network.
• Print from SAP R/3 to the <printer>.

Now you will be asked to enter a file name on the laptop, because the print job has not gone to the old printer but has been redirected to your laptop and then to your hard disk for later analysis.

MiniLPD:

There is another LPD service that can be used on Windows operating systems: MiniLPD. It can be found on a CANON web page (I don’t know if it is an official or unofficial site; it seems to be run by some CANON engineers): http://www.digitalissues.co.uk

Download MiniLPD.zip, create a new folder and unzip the program into that directory. You can do this for instance by using the “Extract to folder …” option of the WinZip context menu. When you start MiniLPD you may receive an error message: Winsock Error 10078. This is because your “TCP/IP Print Server” (as discussed in the previous chapter) is active and blocks access to port 515 which is needed for MiniLPD to run. So stop the “TCP/IP Print Service” first!

Now you can start MiniLPD. It will listen on port 515 of every network interface of the computer and accept ANY queue name.

You can test MiniLPD by sending a test job with Windows' own LPR.exe command from a command shell. If you have a file called “testpage.prn”, use:

C:\> LPR -S <IP address> -P ANY testpage.prn

Replace <IP address> with one of your computer's IP addresses. As stated before, the queue name is not important; MiniLPD reacts to any name.

Now look in the folder where MiniLPD.exe is located: You will find two new files. The BIN-file is the actual print job and the CTL-file is the file with the LPD-specific control commands. The BIN-file is what we want so just delete the CTL-file.
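Before deleting the CTL-file it can be worth a glance, because it names the sending host, the user and the job. The sketch below assumes that the CTL-file holds the plain RFC 1179 control file text (one command letter followed by its operand per line); whether MiniLPD stores it exactly like this has not been verified here. The field names and the function name are invented for this example, and only a subset of the RFC 1179 command letters is given a readable name.

```python
# Subset of RFC 1179 control file commands; any other command letter is
# kept under the letter itself. The readable names are assumptions.
FIELD_NAMES = {
    "H": "host",
    "P": "user",
    "J": "job_name",
    "N": "source_file",
    "U": "unlink_data_file",
    "l": "print_raw_data_file",
    "f": "print_formatted_data_file",
}


def parse_control_file(text: str) -> dict:
    """Collect the operands of each control file command into lists."""
    fields = {}
    for line in text.splitlines():
        if not line:
            continue
        key, value = line[0], line[1:]
        fields.setdefault(FIELD_NAMES.get(key, key), []).append(value)
    return fields
```

The file content can be read with open(path, encoding="latin-1").read() and passed to parse_control_file; the result is a dictionary such as {"host": [...], "user": [...]}.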

MiniRAW:

There is also a TCP raw capture tool accompanying MiniLPD, called MiniRAW. It runs on Windows only and listens on TCP port 9100. It can be found on a CANON web page (I don’t know if it is an official or unofficial site; it seems to be run by some CANON engineers): http://www.digitalissues.co.uk

Download MiniRAW.zip, create a new folder and unzip the program into that directory. You can do this, for instance, by using the “Extract to folder …” option of the WinZip context menu.

Now you can start MiniRAW, and it will listen on TCP port 9100. There is no way to change that port, but it is by far the most common RAW printing port, so there should be no need to do so.

You can test MiniRAW by creating a Standard TCP/IP Port, assigning it to a printer driver and then sending a test page to it.

Now look in the folder where MiniRAW.exe is located: You will find a new file.

The BIN-file is the actual print job. This file is what we need. The naming scheme is YYMMDDHHMMSS.bin.
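The idea behind such a raw capture tool is simple enough to sketch in a few lines. The following is not MiniRAW's implementation, just an illustration under the assumption that one TCP connection carries one print job: it accepts a connection, writes everything received to a file named after the YYMMDDHHMMSS scheme, and stops when the sender closes the connection. The function names are made up for this example.

```python
import socket
import time
from pathlib import Path


def open_listener(port: int = 9100, host: str = "") -> socket.socket:
    """Open a listening TCP socket; an empty host means all interfaces."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, port))
    server.listen(1)
    return server


def capture_one_job(server: socket.socket, out_dir: Path) -> Path:
    """Accept one connection and write everything it sends to a .bin file."""
    conn, _peer = server.accept()
    target = out_dir / (time.strftime("%y%m%d%H%M%S") + ".bin")
    with conn, open(target, "wb") as out:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # sender closed the connection: job is complete
                break
            out.write(chunk)
    return target


# Possible use (blocks until a job arrives):
#     server = open_listener()
#     print(capture_one_job(server, Path(".")))
```

Two jobs arriving within the same second would overwrite each other with this naming scheme, and the sketch never answers the sender, so drivers that expect status information back from the printer may behave differently than with a real device.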

cti:Downloader 2005:

If you need to capture raw IP transmissions, such as port 9100 printing or LPD communication on port 515, you can use the Downloader 2005 by CTi for this purpose. This program features a raw port capture function for any port and an LPD service.

First select a destination where the files should go and, if you like, select a local network interface to listen on. By default all incoming communication will be monitored. Then select the capture method, LPD or RAW, and if necessary the port.

Now start the port monitoring and capturing with [Execute]. For every incoming data stream it will create a separate file. The naming convention is according to the date and time set by your operating system.

If you capture LPD data, you can have one file that includes data and control commands, or two separate files (as with MiniLPD). You must stop the capture by pressing the [Stop] button; the program cannot decide on its own when to stop the capture and return to normal operation.

UNIX

UNIX and all its relatives are ancient operating systems. There is no common printer driver system, so the approach to printing is very basic. Normally the application produces some kind of printer data stream and then sends it to the printer using the LPR, TELNET or FTP protocol. There are many places where these files can be captured, but since there is no real printer driver system, it is currently impossible to show in general how this can be accomplished.

MacOSX

On the Macintosh, PostScript is the only supported printer language, so you need a PostScript-enabled printer to actually print something. The operating system itself does not support any other printer language such as PCL 5 or PCL 6. The standard print dialog of Mac OSX offers by default the possibility to produce a PDF or PostScript file.

Port redirection

Port redirection is a method that is also known from NAT (network address translation) gateways: one computer is put in the middle of the communication and forwards all data received on one port to another port and vice versa. In NAT gateways this allows many computers to share one outgoing (internet) line. By using port redirectors in a printing environment we can capture the data stream on its way from the source (the computer system) to the drain (the printer).

Port redirection is a drain capture method, since it requires reconfiguring the source computer system or replacing the drain printer system. In contrast to the methods introduced above, however, it lets us capture all communication protocols, not just RAW-IP and LPR/LPD data streams.

[Figure: a host printing on port 5001 to a printer, with a port redirector inserted between them.]

In the above illustration you see a host printing an IPDS data stream to an IPDS-enabled printer on port 5001. To capture this data stream you can insert a port redirector on a laptop that captures the data stream, saves it to hard disk and forwards all communication directly to the printer. It also forwards all bi-directional TCP communication, so you get not only the data stream but also the matching printout.

Please note that currently there are no UDP port forwarders available!
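What a TCP port redirector does for a single connection can be sketched as follows. This is an illustration only and not how any of the tools named in this paper are implemented; the function names are invented, only one connection is handled, and only the host-to-printer direction is written to the capture file. The printer-to-host direction is forwarded in a second thread so that the host still sees the printer's answers.

```python
import socket
import threading


def pump(src, dst, log=None):
    """Copy bytes from src to dst until src closes; optionally log them."""
    try:
        while True:
            chunk = src.recv(65536)
            if not chunk:
                break
            if log is not None:
                log.write(chunk)
            dst.sendall(chunk)
    except OSError:
        pass  # one side went away; stop copying in this direction
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream on
        except OSError:
            pass


def relay_one_connection(listener, printer_addr, capture_path):
    """Relay one host connection to the printer, saving host-to-printer data."""
    host_conn, _peer = listener.accept()
    printer_conn = socket.create_connection(printer_addr)
    with host_conn, printer_conn, open(capture_path, "wb") as log:
        back = threading.Thread(target=pump, args=(printer_conn, host_conn))
        back.start()                         # printer -> host, not logged
        pump(host_conn, printer_conn, log)   # host -> printer, captured
        back.join()
```

For the scenario in the illustration, the listener would be a socket bound to port 5001 on the laptop and printer_addr the address and port of the real printer.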

RelayTCP:

DLCsistemas (http://dlcsistemas.com) has a freeware application called RelayTCP that can act as a port redirector. The command line version can be used to capture one TCP communication:

C:\> relaytcp <listenport> <remoteip> <remoteport> -d

For <listenport> insert the port on which you want to capture the data. <remoteip> and <remoteport> are the printer's IP address and port. The -d option instructs RelayTCP to save the data streams to a file in the same directory.

Example:

          Original      Capture
Host      10.0.0.15     10.0.0.15
Printer   10.0.0.200    10.0.0.199
Laptop    -             10.0.0.200

1. Change the IP address of the printer to a free IP address: 10.0.0.199
2. Give the original IP address of the printer to the laptop: 10.0.0.200
3. Open a command prompt on the laptop and issue

   relaytcp 5001 10.0.0.199 5001 -d

4. Now print something to the printer. The data is sent to the laptop, captured to a file and forwarded to the real printer. All bi-directional communication is also forwarded and the host reacts normally.
5. Now remove the laptop and change everything back to the original state.

RelayTCP can also be used to implement more than one port forwarding. You can install it as a service under Windows NT and above and configure more than one port. Please read the documentation for instructions on how to accomplish this.

Interactive TCP relay (ITR):

ITR from Imperva (http://www.imperva.com/) is a more visual implementation of a TCP forwarder. To use ITR instead of RelayTCP for the above example, enter the following settings:

To save the communication to a file, check “Save Log”. And if the communication is time-critical, check “Don’t show messages” to set ITR to stealth mode.

Network capturing

The previously described methods capture the printer data stream either at the source or at the drain. But what if we cannot use any of them? If your printer is connected to a network, you can capture the network traffic directly and extract the printer data stream from it. This method is called network capturing.

The most common problem with today’s network capture software is the use of switches as internetworking devices. A switch by design sends data arriving on one port only out on the port where the destination device is connected.

So if we want to capture a data stream between the source (the computer system) and the drain (the printer), we must either temporarily replace the switch with a hub, which sends all data to all connected devices, or persuade the switch to send the data to the computer system running the capture software as well. With small office switches you will have to use a hub. More sophisticated, manageable switches can normally fall back to hub mode or feature a monitor port or promiscuous port. A system connected to this monitor/promiscuous port receives all data packets sent through the switch.

Now there is a second step to be taken before using network capture software: activating the network card's promiscuous mode! Since network cards and hubs were designed at the same time (when there were no switches around), the designers built a packet filter into the card to ease the load on the higher levels of the network protocols, which were implemented in software. So only packets that are addressed to the card itself are delivered to the higher protocol stack levels; all others are silently discarded. To use a standard computer as a network capturing station, this network card filter must be bypassed. This is called promiscuous mode.

Only in promiscuous mode can the network capturing software see packets on the network that are addressed to other computers and systems attached to the network. Otherwise the software will only see the packets addressed to the computer itself, which is the same as using the switch in its native mode.

Sometimes the network capture software is able to switch this mode by itself, but this of course has its limits with newer network cards, especially if they are integrated on a motherboard, because the network capture software must know the card and how to switch it. So sometimes the network capture software uses a special driver such as the WinPcap driver for this. But using sophisticated network capture software is far beyond the scope of this paper. Please refer to the user manual of your favourite network capture and analysis software for more information. In the screen shot you will see an example of Packetyzer capturing some network data:

In the addendum you will find some (free) network capture and analysis tools that may help you in getting your data stream.
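Once the traffic has been saved by a capture tool, the printer data stream still has to be pulled out of the packets. The sketch below shows the principle on a file in the classic pcap format with Ethernet framing and IPv4: it collects the TCP payload of all segments sent to the printer port, drops retransmitted duplicates and joins the rest in sequence number order. It is a simplification under several assumptions: the capture contains one print connection only, there are no VLAN tags, sequence numbers do not wrap around and segments do not overlap. The function names are made up for this example; a real analyser does this far more robustly.

```python
import struct


def tcp_payloads(pcap_bytes: bytes, dst_port: int = 9100):
    """Yield (sequence_number, payload) for TCP segments sent to dst_port."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        order = "<"          # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        order = ">"          # file written in big-endian byte order
    else:
        raise ValueError("not a classic pcap file")
    linktype = struct.unpack(order + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled")
    pos = 24                 # skip the 24-byte global header
    while pos + 16 <= len(pcap_bytes):
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            order + "IIII", pcap_bytes[pos:pos + 16])
        frame = pcap_bytes[pos + 16:pos + 16 + incl_len]
        pos += 16 + incl_len
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue         # too short or not IPv4
        ip = frame[14:]
        if ip[9] != 6:
            continue         # not TCP
        ihl = (ip[0] & 0x0F) * 4
        total_len = struct.unpack("!H", ip[2:4])[0]
        tcp = ip[ihl:total_len]   # total length cuts off Ethernet padding
        if len(tcp) < 20:
            continue
        _src_port, port, seq = struct.unpack("!HHI", tcp[:8])
        payload = tcp[(tcp[12] >> 4) * 4:]
        if port == dst_port and payload:
            yield seq, payload


def extract_stream(pcap_bytes: bytes, dst_port: int = 9100) -> bytes:
    """Join the payloads in sequence order, ignoring retransmissions."""
    seen = {}
    for seq, payload in tcp_payloads(pcap_bytes, dst_port):
        seen.setdefault(seq, payload)
    return b"".join(seen[seq] for seq in sorted(seen))
```

The result of extract_stream can be written to a file and analysed like any other captured printer data stream. For LPD traffic on port 515 the payload also contains the protocol commands and the control file, which would have to be stripped separately.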

Network tools

Netcat - The network swiss army knife
http://www.vulnwatch.org/netcat/

Ethereal - A network protocol analyser
http://www.ethereal.com/

Packetyzer - Windows Interface for Ethereal
http://www.networkchemistry.com/products/packetyzer/

RelayTCP - TCP/IP relay software
http://www.dlcsistemas.com/

Interactive TCP relay - TCP/IP relay software
http://www.imperva.com/application_defense_center/tools.asp

MiniLPD - small footprint LPD-to-file capture tool
MiniRAW - TCP-RAW-communication (Port 9100) capture tool
http://www.digitalissues.co.uk

cti:downloader 2006 - multi-purpose printer troubleshooting tool
http://www.cyrtech.de/