
Appendix C:   Perl Scripts

Here I describe some Perl scripts that use the output from dvii. To use these scripts you must have Perl installed on your computer. Perl is freely available for nearly every computing platform. If you work on a Unix or Unix-like platform, Perl is probably already installed. For more information on obtaining Perl, go to the Comprehensive Perl Archive Network (CPAN) at www.cpan.org.

C.1   fontdiff.pl

C.1.1   Introduction

The Perl script fontdiff.pl compares the fonts used in two dvi files and displays the differences. This can be useful when you have compiled a TeX file on two different computer systems and are worried that the fonts on the two systems may be installed differently.

Let us look at an example.

We have two dvi files: test1.dvi and test2.dvi. Here are the font listings for these two files:
d:\tex\dvii>dvii -f test1.dvi
f:[50/cmbx10/1100]::1af22256
f:[33/cmsl10/1000]::70ae304a
f:[0/cmr10/1000]::4bf16079

d:\tex\dvii>dvii -f test2.dvi
f:[51/cmbx10/1100]::1af22256
f:[50/cmr10/1100]::4bf16079
f:[0/cmr10/1000]::4bf16079
If we run the Perl script fontdiff.pl on these two files we get the following:
d:\tex\dvii>perl fontdiff.pl test1.dvi test2.dvi
Fonts in test1.dvi NOT in test2.dvi:
------------------------------
  f:[NN/cmsl10/1000]::70ae304a
------------------------------

Fonts in test2.dvi NOT in test1.dvi:
------------------------------
  f:[NN/cmr10/1100]::4bf16079
------------------------------

Fonts in common that have DIFFERENT checksums:
------------------------------
------------------------------
Note that by default the fontdiff.pl script ignores font numbering when finding font differences. If you also want to see which fonts are common to both files, add the -l option:
d:\tex\dvii>perl fontdiff.pl -l test1.dvi test2.dvi
Fonts in test1.dvi NOT in test2.dvi:
NOTE: fonts marked with * are in BOTH files
------------------------------
* f:[NN/cmbx10/1100]::1af22256
  f:[NN/cmsl10/1000]::70ae304a
* f:[NN/cmr10/1000]::4bf16079
------------------------------

Fonts in test2.dvi NOT in test1.dvi:
NOTE: fonts marked with * are in BOTH files
------------------------------
* f:[NN/cmbx10/1100]::1af22256
  f:[NN/cmr10/1100]::4bf16079
* f:[NN/cmr10/1000]::4bf16079
------------------------------

Fonts in common that have DIFFERENT checksums:
------------------------------
------------------------------

C.1.2   How are fonts "different"?

What does it mean for two TeX fonts to be "different"? There are four characteristics of a TeX font called from a .dvi file:
  1. name (e.g., cmr10 or ptmb)
  2. font number
  3. scaling (e.g., 1000 or 1200)
  4. checksum
The Perl script fontdiff.pl can ignore any of these parameters (except font name) when finding font differences. Here are the options:
-c : ignore checksum when finding font differences
-C : do NOT ignore checksum when finding font differences (DEFAULT)
-n : ignore font number when finding font differences (DEFAULT)
-N : do NOT ignore font number when finding font differences
-s : ignore font scaling when finding font differences
-S : do NOT ignore font scaling when finding font differences (DEFAULT)
-l : list fonts for each file
Continuing the example from the previous section: if we do not want to ignore font numbering, we use the -N option:
d:\tex\dvii>perl fontdiff.pl -N test1.dvi test2.dvi
Fonts in test1.dvi NOT in test2.dvi:
------------------------------
  f:[50/cmbx10/1100]::1af22256
  f:[33/cmsl10/1000]::70ae304a
------------------------------

Fonts in test2.dvi NOT in test1.dvi:
------------------------------
  f:[51/cmbx10/1100]::1af22256
  f:[50/cmr10/1100]::4bf16079
------------------------------

Fonts in common that have DIFFERENT checksums:
------------------------------
------------------------------
Note that more fonts are considered different than before: in file test1.dvi the font cmbx10 scaled at 1100 has font number 50, while in test2.dvi that same font has font number 51.

Note. If you look back at the output of the example at the end of the previous section, you will see that the fonts are listed with their font numbers replaced with NN. This is because, unless overridden by the -N option, font numbers are ignored, and hence we do not even want to see them.

Here is the output when we ignore everything we can ignore: checksum, scaling, and font number:
d:\tex\dvii>perl fontdiff.pl -n -s -c test1.dvi test2.dvi
Fonts in test1.dvi NOT in test2.dvi:
------------------------------
  f:[NN/cmsl10/SSSS]::70ae304a
------------------------------

Fonts in test2.dvi NOT in test1.dvi:
------------------------------
------------------------------
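
The comparison rules described above can be sketched in a few lines. What follows is a Python illustration of the idea, not the source of fontdiff.pl. It assumes that the font name is always part of the comparison key, that the font number and scaling are dropped from the key when ignored, and that checksums are reported separately for fonts the two files have in common. The names parse_fonts, font_key, only_in_first and checksum_mismatches are my own, hypothetical ones.

```python
import re

# One line of `dvii -f` output, e.g. f:[50/cmbx10/1100]::1af22256
FONT_LINE = re.compile(r"^f:\[(-?\d+)/([^/\]]+)/(\d+)\]::([0-9a-fA-F]+)$")

def parse_fonts(listing):
    """Return a list of (number, name, scale, checksum) tuples."""
    fonts = []
    for line in listing.splitlines():
        m = FONT_LINE.match(line.strip())
        if m:
            fonts.append((int(m.group(1)), m.group(2), int(m.group(3)), m.group(4)))
    return fonts

def font_key(font, ignore_number=True, ignore_scale=False):
    """Comparison key; the defaults mirror the documented -n and -S defaults."""
    number, name, scale, _checksum = font
    return (None if ignore_number else number,
            name,
            None if ignore_scale else scale)

def only_in_first(first, second, **opts):
    """Fonts of `first` whose key does not occur in `second`."""
    keys = {font_key(f, **opts) for f in second}
    return [f for f in first if font_key(f, **opts) not in keys]

def checksum_mismatches(first, second, **opts):
    """Fonts present in both lists (by key) whose checksums differ."""
    other = {font_key(f, **opts): f[3] for f in second}
    return [f for f in first
            if font_key(f, **opts) in other and other[font_key(f, **opts)] != f[3]]

# The two listings from the example in section C.1.1.
TEST1 = parse_fonts("\n".join([
    "f:[50/cmbx10/1100]::1af22256",
    "f:[33/cmsl10/1000]::70ae304a",
    "f:[0/cmr10/1000]::4bf16079",
]))
TEST2 = parse_fonts("\n".join([
    "f:[51/cmbx10/1100]::1af22256",
    "f:[50/cmr10/1100]::4bf16079",
    "f:[0/cmr10/1000]::4bf16079",
]))

if __name__ == "__main__":
    for number, name, scale, checksum in only_in_first(TEST1, TEST2):
        print("  f:[NN/%s/%d]::%s" % (name, scale, checksum))
```

Calling only_in_first(TEST1, TEST2, ignore_number=False) corresponds to the -N example, and ignore_scale=True to the -s option.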

C.2   specials.pl

This Perl script takes as its single argument a dvi file and outputs a dvips-style list of pages containing \special's.

Let us look at what happens with our old friend test.dvi. Here is a list of the \special's:
d:\tex\dvii>dvii -s test.dvi
s:[2/2]:: A short special
s:[3/3]::
abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567893.141592
653589793238462643383279502884197169399375105820974944592
s:[4/4]:: PSfile 1.eps
s:[4/4]:: PSfile 2.eps
s:[4/4]:: PSfile 3.EPS
s:[4/4]:: PSfile dog1.gif
s:[4/4]:: PSfile cat.eps
s:[6/-3]:: Some control characters: []
and here is the output from the specials.pl script:
d:\tex\dvii>perl specials.pl test.dvi
-pp2,3,4,-3
You could then take this string and print just the pages containing the \special's using dvips:
d:\tex\dvii>dvips -pp2,3,4,-3 test.dvi
Although the most common \special's are those that include EPS (Encapsulated PostScript) figures, and in most cases those are the only \special's, a document may also have pages containing other kinds of \special's (for example, when it uses color). So how can you get a list of just those pages that have included figures? The answer is to use the --grep option to select only those \special's that contain the string PSfile (the standard indicator of an included EPS file). So, to list only those pages that contain an included EPS figure:
d:\tex\dvii>perl specials.pl --grep PSfile test.dvi
-pp4
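
To make the page-list construction concrete, here is a small Python sketch of the idea. It is not the source of specials.pl. It assumes, judging from the example output above, that the second number inside the brackets is the page number handed to dvips, and that each page is listed once, in order of first appearance. The function name pages_with_specials and its grep parameter are hypothetical. Continuation lines of a long \special are simply skipped, so the filter only sees the text on the s: line itself.

```python
import re

# One line of `dvii -s` output, e.g. s:[4/4]:: PSfile 1.eps
SPECIAL_LINE = re.compile(r"^s:\[(-?\d+)/(-?\d+)\]::(.*)$")

def pages_with_specials(listing, grep=None):
    """Build a dvips -pp argument from a listing of specials.

    If `grep` is given, only specials whose text contains that
    string are considered.
    """
    pages = []
    for line in listing.splitlines():
        m = SPECIAL_LINE.match(line)
        if not m:
            continue  # continuation of a long special, or unrelated line
        page, text = m.group(2), m.group(3)
        if grep is not None and grep not in text:
            continue
        if page not in pages:  # list each page only once
            pages.append(page)
    return "-pp" + ",".join(pages) if pages else ""

# The listing for test.dvi shown above.
SAMPLE = "\n".join([
    "s:[2/2]:: A short special",
    "s:[3/3]::",
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567893.141592",
    "653589793238462643383279502884197169399375105820974944592",
    "s:[4/4]:: PSfile 1.eps",
    "s:[4/4]:: PSfile 2.eps",
    "s:[4/4]:: PSfile 3.EPS",
    "s:[4/4]:: PSfile dog1.gif",
    "s:[4/4]:: PSfile cat.eps",
    "s:[6/-3]:: Some control characters: []",
])

if __name__ == "__main__":
    print(pages_with_specials(SAMPLE))
    print(pages_with_specials(SAMPLE, grep="PSfile"))
```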

C.3   details.pl

The Perl script details.pl uses the dump opcode option -d of dvii (see section 3.3) to generate opcode statistics. For example, to list the opcodes of a dvi file in order, type details.pl -d as in this example:
details.pl -d test

o:pre
o:bop
o:push
o:down3
...
o:post
where ... are the intervening opcodes.

To get statistics on how many times each opcode is called, use the -D option:
details.pl -D test

set_char_0:0
set_char_1:0
...
pop:25
right1:0
right2:7
...
(I have shown only a few of the 256 possible opcodes.)
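
The counting step itself is simple. As an illustration only (this is Python, not the source of details.pl), the sketch below tallies lines of the form o:name, the form shown in the -d listing above. The function name opcode_counts is hypothetical, and the sample dump is a made-up fragment rather than real output. Unlike the -D listing, it records only opcodes that actually occur, although asking for an absent opcode still yields 0.

```python
from collections import Counter

def opcode_counts(dump):
    """Count how often each opcode name occurs in an `o:name` listing."""
    counts = Counter()
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("o:"):
            counts[line[2:]] += 1
    return counts

# A made-up fragment in the style of the -d listing, for illustration.
SAMPLE_DUMP = "\n".join([
    "o:pre",
    "o:bop",
    "o:push",
    "o:down3",
    "o:pop",
    "o:push",
    "o:pop",
    "o:eop",
    "o:post",
])

if __name__ == "__main__":
    for name, count in sorted(opcode_counts(SAMPLE_DUMP).items()):
        print("%s:%d" % (name, count))
```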

Finally, if you want a succinct summary of the opcodes used, use the -u option, or no option at all:
details.pl -u test

SUMMARY
Number of characters set (total) : 314
  at or below position 127       : 314
  above position 127             : 0
Number of rules                  : 1
Number of nops                   : 0
Number of bops                   : 6
Number of eops                   : 6
Number of font changes           : 9
Number of specials               : 8
Number of font definitions       : 3
Number of undefined opcodes      : 0
The "Number of characters set" is the total number of characters typeset, that is, how many times you would have to hit a key if you were to type this document on a typewriter. (Sort of.)

Note that strictly speaking this script could just about as easily use dvitype to do its work, but where is the fun in that?

C.4   changes.pl

The page difference procedure (see page X) is automated in the changes.pl script. Note that this script needs the file difference utility diff to be available.

Let file1 be the TeX file consisting of
% file1.tex
a\eject b\eject c\eject d\eject e\eject f\end
while file2 is the file
% file2.tex
a\eject b\eject C\eject d\eject e\eject f\end
So, by eye we can see that the two files differ on the third page. If we compile the two files and then use the changes.pl script, we get
changes.pl file1.dvi file2.dvi
4c4
< p:[3/3]::9C8E26478F1B039011D2F28DA18B18CC
---
> p:[3/3]::9C8E26478F1AE39011D2F28DA18B18CC
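
If diff is not at hand, the same comparison can be approximated directly. The Python sketch below is not the source of changes.pl. It assumes that each page appears as one line of the form p:[n/m]::digest, as in the diff output above, and that the two bracketed numbers identify the page. The function names are hypothetical, and the digests for pages other than page 3 are made-up placeholders, since the real values are not shown here.

```python
import re

# One page line, e.g. p:[3/3]::9C8E26478F1B039011D2F28DA18B18CC
PAGE_LINE = re.compile(r"^p:\[(-?\d+)/(-?\d+)\]::([0-9A-Fa-f]+)$")

def page_digests(listing):
    """Map (first number, second number) of each page line to its digest."""
    digests = {}
    for line in listing.splitlines():
        m = PAGE_LINE.match(line.strip())
        if m:
            digests[(int(m.group(1)), int(m.group(2)))] = m.group(3)
    return digests

def changed_pages(old_listing, new_listing):
    """Pages whose digest differs, or that exist in only one listing."""
    old, new = page_digests(old_listing), page_digests(new_listing)
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))

# Placeholder digests (made up) for the unchanged pages.
SAME = "00000000000000000000000000000000"
OLD = "\n".join(
    "p:[%d/%d]::%s" % (n, n, "9C8E26478F1B039011D2F28DA18B18CC" if n == 3 else SAME)
    for n in range(1, 7)
)
NEW = "\n".join(
    "p:[%d/%d]::%s" % (n, n, "9C8E26478F1AE39011D2F28DA18B18CC" if n == 3 else SAME)
    for n in range(1, 7)
)

if __name__ == "__main__":
    for page in changed_pages(OLD, NEW):
        print("page [%d/%d] differs" % page)
```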
