When revising this, see also resources/IndexGoferCompanion.html

IndexGofer - A Companion

Building an index for your new book is like cleaning a garage: when you get done it looks just like it should. All the hard work is suddenly invisible. IndexGofer offloads the clerical work, leaving you only the intellectual effort.

This document is a companion to using IndexGofer. When you click a document icon on the IndexGofer workflow diagram, this document is opened to the corresponding section. Each section contains instructions on using the associated program and additional Background information you may need to understand.

Read Me First

◆ Here's what you do

You begin by choosing a list of terms that should appear in the index. IndexGofer helps by picking out the proper noun (capitalized) phrases. Then after the page proofs have arrived, you go through the IndexEdit process. It displays each page and you pick which terms should appear in the index with that page number. Once chosen, they are "entries" for that page. Finally you click a button and the index appears.

◆ Trigger phrases

At the heart of IndexGofer is a unique concept: "trigger phrases". When you are editing the index, IndexGofer shows you each page of your book together with the current index entries for that page. Just before showing a page, IndexGofer scans it and suggests entries where this page should be listed in the final index. You can delete any inappropriate entries and add those that have been missed.

There is no magic. Before starting with IndexGofer's IndexEdit tool you created a list of possible index terms. With each you provided a "trigger phrase". In scanning a page, IndexGofer looks for each trigger phrase. If it finds one, the corresponding index term is added as an entry for that page.

Say one term is "George Washington". You might list it with triggers "washington", "general", and "father of the country". If you chose these triggers carefully, "Washington" will be proposed as an entry for only those pages that discuss him. You want enough triggers that Washington is suggested for each page where he is discussed, but not for many pages where he is not discussed. If any of your triggers appears frequently without reference to Washington, you will have to delete Washington on each inappropriate page. It will be a challenge if the book also talks about the city or state of Washington.
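The scan described above can be pictured with a short sketch. It is only an illustration: the function name suggest_entries is hypothetical, and the case-insensitive substring test is an assumption, since IndexGofer's actual matching rules are not spelled out here.

```python
def suggest_entries(page_text, triggers):
    """Return the index terms whose trigger phrases occur in page_text.

    triggers maps each trigger phrase to one index term.
    ASSUMPTION: matching is a case-insensitive substring test;
    the real IndexGofer rules may differ.
    """
    lowered = page_text.lower()
    suggested = []
    for phrase, term in triggers.items():
        if phrase.lower() in lowered and term not in suggested:
            suggested.append(term)
    return suggested


triggers = {
    "washington": "George Washington",
    "general": "George Washington",
    "father of the country": "George Washington",
}

page_12 = "The general crossed the Delaware at night."
page_13 = "Seattle is the largest city in Washington."
page_14 = "Prices rose sharply that winter."

print(suggest_entries(page_12, triggers))  # suggested, correctly
print(suggest_entries(page_13, triggers))  # suggested, but it is the state
print(suggest_entries(page_14, triggers))  # nothing suggested
```

Page 13 shows the cost of a loose trigger: the term is proposed for a page about the state, and you would have to delete it by hand.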

◆ Talk to your Publisher

When you talk to your publisher about the index, ask the questions in the sample email below. You can send it as written or adapt it.


I plan to create an index for my book with ZweiBieren's IndexGofer software. That system recommends that I ask for a couple of items from you right now.

First, what file format do you need for the index file? HTML and Microsoft Word are both easy with IndexGofer. Other formats may be a problem.

Second, what style should we use? Chicago Manual of Style? Or some other? Do you have a document for authors describing indexing and indexes? The standard IndexGofer style appears in the sample below.

Third, are the proofs in PDF? If so, can you send us a sample set of proofs from some other project? We need to test to be sure the proofs can be read by IndexGofer. Please don't go to any lengths to do this. If you don't have a sample set of proofs, IndexGofer may still succeed when faced with my book's proofs. In the past, IndexGofer has had problems with encoded PDFs and PDFs written with options that it cannot understand. Exposing these problems before my own proofs are ready can avoid eventual hangups.

(your humble author)

Sample Index

Achen, Christopher, 233


Afrobarometer, 205, 335

American National Election Studies (ANES), 57

AMLO: See López Obrador, Andrés Manuel


Calderón, Felipe, 85-87, 90-91, 119

campaign volatility, 28, 67-70; aggregate level, 78; individual level, 24, 73

◆ Install IndexGofer

2. Install IndexGofer

The simplest approach is to install IndexGofer into the project directory. Visit the IndexGofer download page and click the button to Download IndexGofer. When the browser prompts for a destination directory, give the project directory. After downloading, double click the newly downloaded file. One item installed is the IndexGofer logo as a link to the IndexGofer application. Click this logo to start IndexGofer.

IndexGofer requires the Java runtime environment (JRE). See the appendix.

IndexGofer main window (blurred)

IndexGofer, at the right, displays the pages of a book and the index entries for each page. The index entries for the current page are highlighted with a yellow background.

With IndexGofer, you scroll through the book adding/deleting entries for each page. You can also edit and extend the list of terms available to be assigned to pages as entries.

Index Terms window, blurred

Index terms are listed in a separate window like the one at the right. Double clicking an item adds it as an entry for the current page. Typing in the list window scrolls it to the first term starting with what you have typed. You can have multiple terms windows scrolled to different places in the list.

The TermsEditor window

TermsEditor (new in Version 3.3) manages the list of index terms. It displays the terms alphabetically and offers tools for adding, revising, and deleting terms. A good place to start is with the File menu's "Scan for Proper Nouns ..." command. You select text files and it scans them for word strings that might be proper nouns, instantly creating lots of noun phrases, the majority of which should be in the index.

Finally, the Create Index menu command combines the entries from all pages to produce an index. See the sample at the right. At present other index styles are produced manually. Send your index to me along with the desired format.

labor force
composition of, 67, 94, 96, 103, 131, 188n8–9, 191n10
growth in, 103, 108, 110, 111, 117, 174, 177, 180
labor party, xi, 14, 15, 20, 21, 122, 124–127, 184n3
labor union
membership, 4, 15, 16, 35, 36, 52, 57, 59, 74, 81, 93, 94, 159, 189n21, 194n19
term
An item that may appear in the index. It will not appear if no page has an entry for it.
entry
A term that has been chosen for a page. That term will appear in the index and will have the current page among its page numbers. For instance, if "labor party" is a term and it is added as an entry for page 20, then in the final index the entry for "labor party" will list 20 as one of the pages.
project directory
The directory with all files for generating an index for one book. The files include the list of all terms, one text file for each chapter, the index entries for each chapter, and the final output index file.
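
To make the relation between pages, entries, and the final index concrete, here is a minimal sketch of combining per-page entries into index lines. The function build_index and its data layout are hypothetical, it handles only decimal page numbers, and it does not merge consecutive pages into ranges; the real Create Index command may work differently.

```python
from collections import defaultdict


def build_index(entries_by_page):
    """Combine per-page entries into alphabetized index lines.

    entries_by_page maps a decimal page number to the terms chosen
    as entries for that page. (Hypothetical layout, for illustration.)
    """
    pages_by_term = defaultdict(set)
    for page, terms in entries_by_page.items():
        for term in terms:
            pages_by_term[term].add(page)

    lines = []
    for term in sorted(pages_by_term, key=str.lower):
        pages = ", ".join(str(p) for p in sorted(pages_by_term[term]))
        lines.append(f"{term}, {pages}")
    return lines


entries_by_page = {
    14: ["labor party"],
    20: ["labor party", "labor union"],
    15: ["labor party"],
}
for line in build_index(entries_by_page):
    print(line)
# labor party, 14, 15, 20
# labor union, 20
```

A term that was never chosen for any page has no key in pages_by_term, so it never reaches the output, which matches the definition of "term" above.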

Entries in the left column above are links to more help:
Prepare to Use IndexGofer  
To prepare for creating an index with IndexGofer:
  1. Choose a project directory
  2. Install IndexGofer
  3. Adapt the book text
  4. Generate a list of potential index terms


◆ Demo

To better understand how index terms are assigned to pages, now is a good time to run the IndexGofer Demo.

You will see a wide main window and a narrow terms list window, both above a help window.  The main window has three columns: the page number, the text of the page, and a list of the terms assigned to that page.
IndexGofer main window with areas marked

The other window (at the right) is a list of all terms so far defined for the project. ☀Double click a term and it is added to the list for the current page. That is, the page with the yellow background for its terms list. ☀Click a term in a page's list and type the DELETE key. The term is removed from the page.

Below the two windows appears the help window, also available from the Help menu. It always describes whatever the mouse is pointing at. The beige area is basic stuff and the pink area is more advanced. ☀Hold down the Control key for how-to-type-accent-characters and the Alt key for  a list of the shortcut keys.

A sample terms list window

When you first move to view a page and its terms area becomes yellow, that terms area will suddenly be populated with terms (wow!) and some phrases in the page's text are colored blue or red. The phrases in the text correspond to trigger phrases associated with the terms. If you ☀click a blue phrase, the corresponding term is highlighted in the yellow area and its trigger phrases are listed at the bottom of the terms list window.

Red highlights in the page text are for phrases that trigger two or more terms. For instance, "Washington" could be the trigger for the terms "George Washington," "Washington, D.C.," and "Washington, state of." ☀Click a red phrase. The terms it triggers are shown in the bottom of the terms window. ☀Double click one to add it to the terms in the yellow area.
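
The blue/red distinction comes down to how many terms share a trigger. The sketch below is an illustration only: classify_triggers is a hypothetical name, and the case-insensitive substring match is an assumption rather than IndexGofer's documented behavior.

```python
from collections import defaultdict


def classify_triggers(page_text, term_triggers):
    """Split the triggers found on a page into blue and red hits.

    term_triggers is a list of (trigger, term) pairs.
    blue: trigger -> its single term (would be added automatically)
    red:  trigger -> list of candidate terms (left for you to choose)
    ASSUMPTION: matching is a case-insensitive substring test.
    """
    terms_by_trigger = defaultdict(list)
    for trigger, term in term_triggers:
        terms_by_trigger[trigger].append(term)

    lowered = page_text.lower()
    blue, red = {}, {}
    for trigger, terms in terms_by_trigger.items():
        if trigger.lower() not in lowered:
            continue
        if len(terms) == 1:
            blue[trigger] = terms[0]
        else:
            red[trigger] = terms
    return blue, red


pairs = [
    ("Washington", "George Washington"),
    ("Washington", "Washington, D.C."),
    ("Washington", "Washington, state of"),
    ("Franklin", "Franklin, Benjamin"),
    ("peace", "truce"),
]
blue, red = classify_triggers("Franklin wrote to Washington in 1776.", pairs)
print(blue)  # {'Franklin': 'Franklin, Benjamin'}
print(red)   # 'Washington' maps to all three candidate terms
```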

Very often the trigger string for a term is the same as the term itself. Sometimes it is part of the term as Franklin may be the trigger for "Franklin, Benjamin." But there is no rule; "peace" may be a trigger for "truce," or vice versa. As an author you need to decide what terms will appear in the index and what text phrases will be the trigger for each term.


1. Choose a Project Directory

IndexGofer runs in a project directory devoted to indexing one book. To begin, this directory must be populated with a few files - IndexGofer itself, the book chapters  xxxx.txt, and a list of index terms, indexterms.txt. See the following sections.

After running IndexGofer, more files will appear in the project directory:

xxx-index.txt - one for each chapter. These contain the index entries assigned to each page. If you really need to change an index term that has already been assigned, you can edit these files.

index.txt or index.html - One of these files is created by the Create Index ... option in the File menu. It contains the generated index. Manual processing is generally required to reformat it for submission to the publisher.


◆ Distill your manuscript to plain text

The next steps scan plain UTF-8 text - text without markup. So you need to convert your manuscript to that form. ("UTF-8" originally stood for UCS Transformation Format 8, but now it's just called "you-tee-eff-eight".)

Once your text is in UTF-8, do not edit it. Many editors will blithely trash accented characters in your text.

Instructions follow for converting your text to UTF-8 via a word processor. BUT. Perhaps you have been busy and put this off (aka "procrastinated") so long that a .pdf of your book has arrived from the publisher. If so, you may be able to use Paginator to extract the UTF-8 text. Try the steps in the Paginator section.

Most manuscript editors can produce UTF-8 text. Here's how to do it with MS Word 2010. Other systems may be similar.

  1. Open the manuscript with MS Word.
  2. Open the File menu and click its Save as ... option
  3. Click in the "Save as type" box and then click the option for "Plain text (*.txt)" (On older versions of MS Word you may have to click the down arrow circled here in red.)
    SaveAs dialog box from MS windows
  4. Enter the name "chapter__" in the File name box, where __ is the chapter number. If you are converting the entire manuscript, name the output "chapter0".
  5. Click "Save"
  6. A dialog box appears requesting a character encoding. Click the dot next to "Other encoding," scroll the list, and select "Unicode (UTF-8)." (If there is no dialog box, don't worry. No special encoding is needed for your text and it will be acceptable as UTF-8.)

Extracting the Proper-Nouns Index Terms

3. Adapt the Book Text

Convert the publisher's page-proof file to text.  Most editors provide an option for this. IndexGofer expects the "UTF-8" encoding; if your text has no special characters, this will be the same as ASCII. (But beware, even without European alphabets Microsoft Word uses non-ASCII characters for quotes. So you really need UTF-8.)

Break the text file into conveniently sized chunks; IndexGofer calls these chunks "chapters," but any division is acceptable, including the "division" that puts the entire text into one file. The file extension must be ".txt".

IndexGofer needs the page numbers to put into the index. You provide these by inserting code lines in the .txt chapter files. Before the text for each page insert a line having ONLY

$@xxx

where xxx is the page number for the subsequent page. The number may be decimal or roman. It may be ppnmm for note mm on page pp. Non-numeric values may also succeed. More about the chapter text files is provided in the Admin Guide.


If a text file is in ISO-8859-1 rather than UTF-8, it can be converted with the iconv command:
iconv -f iso-8859-1 -t utf-8 accented-authors.txt > accented-authors-utf8.txt

◆ Partition the text

IndexGofer has two schemes for finding proper nouns and they operate on distinct portions of the text. In this task you prepare for this by separating the text into three partitions: Text, Bibliography, and Other. IndexGofer scans the Text for proper nouns. You massage the Bibliography to extract author names. Nothing is done with the Other, so put in it captions, figures, tables, equations, and anything else where capital letters do not signify proper nouns.

In the Partitioner task, you are marking the text with one of three colors.

Color    Partition       Treatment
White    Text            scanned for proper nouns
Green    Bibliography    lines begin with author names
Pink     Other           ignored

Each time you click the mouse, it makes the clicked paragraph the start of a partition and changes the color of that partition: white to green, green to pink, and pink back to white. Clicking in the last partition will change the color all the way to the bottom of the file. Then click the start of the next partition and give it a different color. Go through the whole file until all partitions have the right color for their contents.

Maybe your chapters are all text, with no bibliography, captions, or tables. Perhaps the chapter title should be in the Other partition (if it has extra capital letters) and otherwise the whole file should be white, which is how it starts. IndexGofer is a tad anal and wants you to run Partitioner on that file anyway. Just start Partitioner on it and then quit or move on to the next file.

Partition boundaries are saved in files called chapter__-partitions.txt.

Click a chapter__.txt file in this table to partition it.

Chapter text file its -partitions file
(Rows of chapter__.txt and chapter__-partitions.txt files)


When you have partitioned all your chapters, you can have the text partitions scanned for proper noun phrases. Click:

Details on the heuristic scanning process are in ProperNounScan.

You can again edit the terms list:

Editing here is important because the proper noun scan is imperfect. Among other reasons, it cannot reliably distinguish sentence-starting capitals from sentence-starting noun phrases. Look for phrases that make no sense and delete them with the red octagon at the end of their row. Fret not the names of authors, they are the work of the next step.

Goal: This task is complete if you have chapter__-partitions.txt for each chapter__.txt and have added proper noun candidate phrases with the scan button above.

Marking the Text


Proper Nouns List

Proper nouns found by the scan are added to the list of index terms. This is a heuristic scan that identifies nouns and noun phrases by capitalization. Since capitalization also indicates sentence start, errors occur: some phrases are missed; some spurious phrases are found. The heuristics are especially inappropriate for reference and bibliography sections. These should NOT be included in the text file. Recommendation: Run this option first and prune the list before going on. More about proper nouns below.

To get started building an index terms list it may be useful to have a list of the proper nouns that appear in the text. The installation includes a rudimentary program for sifting your text for proper nouns. No such program will be perfect and this one is a tad simplistic. Here are some of the phrases it extracted from one manuscript:

H. L. Mencken of the Baltimore Sun
Obama Justice Department
Obama and McCain
Obama and the Democratic Party
Office of Faith-Based Initiatives
pro-Israel AIPAC
Roe v. Wade

Some of the principles the tool employs are these:

Results will be best if the references section is NOT scanned with this tool. Author names are usually last-name-comma-first-name, which will be parsed as two names by this tool. I suggest emacs or Excel for processing references.

When execution begins, the proper noun scanner will prompt for the name of the file to scan. IndexGofer accepts some words in lower-case within noun phrases. The default list is all pronouns, articles, and prepositions. Words can be added to this list by putting them in a file called phraseControl.txt, one word per line.
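
For a feel of how a capitalization heuristic behaves, here is a rough sketch. It is NOT IndexGofer's scanner: the function names, the short CONNECTORS list, and the punctuation rules are all assumptions made for illustration. Only the file name phraseControl.txt and the idea of allowing listed lower-case words inside a phrase come from the description above.

```python
# Hypothetical stand-in for the default list (pronouns, articles, prepositions).
CONNECTORS = {"of", "the", "and", "in", "for", "v."}
CLAUSE_ENDERS = (",", ";", ":", "!", "?")


def load_extra_connectors(path="phraseControl.txt"):
    """Read extra lower-case words, one per line, to allow inside phrases."""
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def proper_noun_phrases(text, connectors=CONNECTORS):
    """Collect runs of capitalized words, allowing connectors in the middle."""
    phrases = []
    current = []  # words of the phrase being built

    def flush():
        # A phrase may not end with a connector word.
        while current and current[-1] in connectors:
            current.pop()
        if current:
            phrases.append(" ".join(current))
        current.clear()

    for raw in text.split():
        word = raw.strip("\"'()[]")
        # A trailing period ends the phrase unless the word is an initial like "H."
        ends_clause = word.endswith(CLAUSE_ENDERS) or (
            word.endswith(".") and len(word) > 2)
        word = word.rstrip(",;:!?")
        if ends_clause:
            word = word.rstrip(".")
        if word[:1].isupper():
            current.append(word)
        elif current and word in connectors:
            current.append(word)
        else:
            flush()
        if ends_clause:
            flush()
    flush()
    return phrases


sample = ("Critics such as H. L. Mencken of the Baltimore Sun covered the trial. "
          "Obama and the Democratic Party praised the Office of Faith-Based Initiatives.")
for phrase in proper_noun_phrases(sample):
    print(phrase)
# Critics        <- spurious: a sentence-starting capital
# H. L. Mencken of the Baltimore Sun
# Obama and the Democratic Party
# Office of Faith-Based Initiatives
```

The spurious "Critics" is exactly the kind of phrase you will prune from the list afterward. This sketch also mishandles abbreviations such as "Mr.", which it treats as a sentence end.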


Choosing / Creating the Index Style


First choose a basic style. Press the little triangle to see the options and their samples.

main term, 12, 13; subterm, 123-5
maintenance, 5, 6; See also repairs


◆ Identify concept terms

Before the proofs arrive is the time to build the list of terms that will appear in the index. First among these are the concept terms. What is your book about? What is a reader likely to look for in the index? Remember that the index reader may not use the same words for a concept as you do; try to anticipate as many ways of looking up a concept as possible. For war, say, a reader may look under: conflict, belligerence, strife, struggle, battle, conflagration, bloodshed, combat, hostility. At this time there is no need to list proper nouns; IndexGofer provides tools to extract those, as described in the next few sections. (You can run those tools now, if you like.)

I recommend listing the concept terms in a spreadsheet. Terms will occupy the first two or three columns:

The trigger phrase is what IndexGofer looks for when it scans a page. If the trigger is found in the text, the term/subterm on that line is added as an index entry for that page. Here's a sample. The "//" in the first line marks it as a comment.

// Trigger    Term                  Subterm
Washington    Washington, George
Father        Washington, George
lie           Washington, George    lies
              Washington, George    teeth
lie           Truth
truth         Truth

The third and fourth rows after the comment line are subterms of "Washington, George." Note that the main term is repeated in the entry for each subterm. This is required by IndexGofer. It is also valuable to you if you want to sort the table in other ways, such as by trigger phrase. All but one entry has a trigger phrase; the fourth has no trigger, so that term will never be automatically suggested when the main IndexGofer application scans a page. Although never suggested by a scan, it will be available for you to add as a term for a page.

The seventh and subsequent columns are ignored. Comments may be entered there.

Additional columns are used for cross reference entries: entries that will appear in the index as "see such and such" or "see also such and such." These are indicated with the word "SEE" in a column. If the previous two columns have text, they are a term and a subterm. The subterm may be omitted. The one column after SEE is the main term that is referred to. If the next column has a word, the reference is to that subterm. Here are some example cross reference entries:

   Commander-in-Chief SEE Washington, George army
Founder SEE Washington, George    
Truth eternal SEE Tao cosmic

These rows will result in the following entries in the index:

Commander-in-Chief see Washington, George, army

Founder see Washington, George

Truth 12, 34, ...

eternal see Tao, cosmic

You needn't worry about being complete; you can add terms as you assign index terms to pages. However, deleting terms is not as easy; if a term has already been assigned to a page, that term will appear in the index, even if it is deleted from the list of terms. The final index can be edited to fix such problems.

If you want a part of an entry to be italic, surround it with <i> and </i>, as in "<i>Ledbetter</i> decision". Blank lines are ignored and so are those whose first non-blank character is not a letter, digit, or <. The recommended way to start a comment line is "//".
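
The row rules above can be sketched as a small parser. Everything in it is an assumption made for illustration: the function name parse_concepts, the tab-separated layout, and the way SEE rows are read (the two columns just before SEE hold term and subterm, the two just after hold the target). It takes already-decoded lines and says nothing about how IndexGofer itself reads concepts.txt.

```python
def parse_concepts(lines):
    """Parse tab-separated concept rows.

    Returns (triggers, see_refs):
      triggers: (trigger or None, term, subterm or None)
      see_refs: (term, subterm or None, target term, target subterm or None)
    """
    triggers, see_refs = [], []
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and lines whose first non-blank character is
        # not a letter, digit, or "<" (this covers "//" comment lines).
        if not stripped or not (stripped[0].isalnum() or stripped[0] == "<"):
            continue
        # The seventh and later columns are ignored.
        cells = [c.strip() for c in line.rstrip("\n").split("\t")][:6]
        if "SEE" in cells:
            at = cells.index("SEE")
            before = [c for c in cells[max(0, at - 2):at] if c]
            after = [c for c in cells[at + 1:at + 3] if c]
            if before and after:
                see_refs.append((
                    before[0],
                    before[1] if len(before) > 1 else None,
                    after[0],
                    after[1] if len(after) > 1 else None,
                ))
            continue
        cells += [""] * (3 - len(cells))
        trigger, term, subterm = cells[:3]
        if term:
            triggers.append((trigger or None, term, subterm or None))
    return triggers, see_refs


rows = [
    "// Trigger\tTerm\tSubterm",
    "Washington\tWashington, George\t",
    "lie\tWashington, George\tlies",
    "\tWashington, George\tteeth",
    "",
    "\tTruth\teternal\tSEE\tTao\tcosmic",
    "\tFounder\t\tSEE\tWashington, George",
]
triggers, see_refs = parse_concepts(rows)
print(triggers)
print(see_refs)
```

The row with an empty first column yields a trigger of None: the term exists but is never suggested by a scan.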

As you work, save the file as a spreadsheet file, concept-terms.xlsx (or .xls). You can save your work and continue at a later time. When the spreadsheet is complete, save it one last time. Then use the spreadsheet program's SaveAs... option from its File menu to copy the file to a plain-text file named concepts.txt. In the Save As dialog box you must pull down the menu from the Save as type box and select the option "Unicode Text (*.txt)", as indicated by the red pointer in this picture:

Excel SaveAs dialog with Unicode choice highlighted

To parse concepts.txt into the available terms list, click:

If you later want to add terms, you can replace concepts.txt and click the button again. Duplicating terms is okay: copies are automatically removed.

GOAL: This task is complete when you have incorporated concepts.txt into indexterms.txt.

Generating Index Terms for Concepts

4. Generate Index Terms

Creating indexterms.txt

Here's how to get started with an empty indexterms.txt. Run IndexGofer (by double clicking its icon). If there is no indexterms.txt file, IndexGofer will prompt you to let it create one in the current directory. Accept the offer. Now click the "Switch to Terms Editor" button and start entering terms.

To get a lot of terms automatically, choose "Scan for Proper Nouns..." in the File menu. The terms created are capitalized phrases from throughout the text. Since sentences start with capital letters that are NOT part of noun phrases, some spurious phrases will be collected. Bad news: this is annoying; good(ish) news: you get to practice deleting terms.

When providing a text to scan for proper nouns, omit the reference or bibliography section. Capitalization will create spurious terms. Instead, edit the references/bibliography to make a list of authors, one per line. (Lines without commas will be converted by assuming that the last full word on the line is the last name.) Supply this file via the Read Author's List ... option of the File menu.
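
The last-word rule for author names is easy to picture in code. This is a sketch only; convert_name is a hypothetical helper, not IndexGofer's own routine, and it simply follows the rule stated above: lines with a comma are left alone, otherwise the last full word becomes the last name.

```python
def convert_name(name):
    """Turn "first middle last" into "last, first middle".

    Names that already contain a comma, or have only one word,
    are returned unchanged (apart from trimming whitespace).
    """
    name = name.strip()
    parts = name.split()
    if "," in name or len(parts) < 2:
        return name
    return f"{parts[-1]}, {' '.join(parts[:-1])}"


for author in ["Christopher Achen",
               "Calderón, Felipe",
               "Andrés Manuel López Obrador"]:
    print(convert_name(author))
# Achen, Christopher
# Calderón, Felipe
# Obrador, Andrés Manuel López
```

The third result shows where the rule falls short: a two-word surname such as "López Obrador" is split. For names like that, put the comma in yourself before supplying the list.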


Editing the Index Terms

◆ Construct and edit the terms list

As you scan pages of your text to assign index terms, you will choose from the list in file indexterms.txt; in the previous step you constructed the first version and will make additions in later steps. The list will need work for many reasons, not least because reviewing it in another form will suggest changes. To review and edit the list, click:

The terms editor has its own help file; here are some highlights:

Goal: This task is at least started when you have indexterms.txt.
Build the Terms List  

Indexing with IndexGofer means assigning index terms as entries for each page. Central to this task is the list of index terms that can be assigned. This list is kept in the project directory as file indexterms.txt, as described in the Admin Guide.

The project directory must have a file indexterms.txt. The format is described in the Admin Guide; it may be edited with any text-oriented editor. However, IndexGofer now offers TermsEditor for creating and revising indexterms.txt.

Terms editor window

The TermsEditor window, at right, has, top-to-bottom, a menu bar, the Find/Create line, the table of index terms, and a message line.

Typing into the Find/Create box causes the table to scroll to the term beginning with the current string. Click the Add Term button to add a new main term with the name in the Find/Create box.

Editing Cells

"Opening" a cell - When you click on a cell in one of the text columns, it "opens" for editing; a box surrounds it and the background is white. You can edit text in an open cell with all the usual text editing operations: mouse selection, text typing, backspace, and all the others. The cell is closed by typing enter or clicking elsewhere.

If a cell is selected, but not open, it can be opened by clicking or by typing F2.

Unscrolling - If the contents of a cell are changed, it is resorted to its new alphabetic location in the table. This often scrolls the table. To return to the previous scroll position, type alt-left arrow. (This same keystroke is used in browsers for going back to the previous page.)

Closing cells - An open cell can be closed by typing ENTER.

If a cell is open and has changed in value, no other operation can be done until the cell is closed. Attempting another operation will close the cell, but not do the operation. To remind you that a cell has closed, every time a cell closes a small whoosh is sounded. So if you try to do an operation and hear a whoosh, you will know the operation did not get done.

Special keystrokes

A number of special key strokes are defined.

Arrow keys - If no cell is open, the arrow keys will move the selection from one cell to the next. In an open cell the arrows move the cursor through the text.

ENTER - If a cell is open, ENTER will close it. If no cell is open, and a non-text (insert or delete) column is selected, the ENTER key will perform the insert or delete operation dictated by the column.

F2 - Open cell - If a cell is selected and not open, F2 will open it. (This is the same as in Excel.)

^Z  - Undo - Same as the Undo operation in the Edit menu.

^Y - Redo - Same as the Redo operation in the Edit menu.

^N - Convert name - Same as the Convert Name operation in the Edit menu.

^S - Save - Same as the Save operation in the File menu.

F1 - Help - Same as the Show Help operation in the Help menu. (Use the Browse User's Guide option in the Help menu to open this Guide.)

table of how to type accents Typing accented characters

For an accented letter, hold the control key and type the prefix, release control and type the letter. Similarly for upper case. See the table to the right.

On US keyboards, quote is an upper case apostrophe, tilde is an upper case grave, and circumflex is an upper case 6.

The TermsEditor window

Columns

Here are the columns of the table.

4 Index Terms. A main term is unindented, or is shown as ditto marks to repeat the term on the line above it. A subterm is indented and prefixed with a colon. A "referer" term is indented and surrounded with (^ and ). Referer terms are not available to be assigned as entries to pages. Instead, when the index is generated there will be an entry for the referer. Its contents will be "See" followed by the parent main term.

Click on an item in column 4 and it is opened for editing. If the text is revised, the term will be moved in the table to its proper alphabetic position.

A word about acronyms - If a main term has an acronym, as in

Gross State Product (GSP)

then a referer from the acronym to the term is automatically generated. The acronym will not appear in the termslist window, but will appear in the generated index.

2 Trigger phrases. When a page is made "current", the background for its entries becomes yellow. At the same time the page text is scanned to see if it has any trigger phrases in terms newly defined since the page was last scanned. When a trigger phrase is found, the phrase in the text is colored. If the trigger applies to only one term, the phrase is colored blue and the term is added to the page's entries. If more than one term applies, the phrase is colored red, but no entries are added.

1 "Add phrase" arrow. Clicking an arrow in the left column causes an additional instance of the term to be added, and its phrase is opened so you can type in a phrase. (Until a phrase is typed in, the phrase internally has the value "~~~".)

3 "Add subterm" arrow. Click an arrow in the third column and a new subterm line is added to the main term and opened for editing. (Until a subterm is entered, the subterm internally has the value "~~~".)

5 Clicking a red X in the fifth column will delete the term on that line. Main terms can only be deleted if they have no subterms.

Referers are added in column 4 with the AddReferer menu item. It is in the Edit menu in the menu bar and also on a popup. A referer line is added and opened for editing. (Until a referer is entered, it internally has the value "~~~".)


In the screen shots, one letter is underlined in each row. This is the menu item's "mnemonic key". Typing that key while the menu is visible will perform that menu item.

Terms Editor File menu

File menu

Save Saves the terms to indexterms.txt. (This will be done automatically when switching back to IndexGofer.)

Scan for Proper Nouns ... Prompts for the name of a .txt file and scans it for proper nouns, adding those found to the terms table. See Getting Ready.

Read Author's List ... Prompts for the name of a file with one name per line and adds the names to the terms table. See Getting Ready.

Exit Saves the terms list and closes the IndexGofer and TermsEditor applications.

Terms Editor File menu

Edit menu

Undo xxx Reverses the effect of the last operation. The xxx names the sort of operation. Major operations like Scan for Proper Nouns and Read Author's List cannot be undone.

Redo xxx If an operation has been undone, this operation does it again. A single operation can be undone and redone any number of times, but the number of saved operations is no more than 25.

Delete Term The current row is deleted. Terms cannot be deleted until all subterms and referers have been deleted.

Add Phrase A new row is inserted for the term on the current line. The phrase column is blank and opened for editing.

Add Subterm A new row is inserted below the current line. The term field is opened as a subterm (with a leading colon.)

Add Referer A new row is inserted below the current row and its term field is opened as a referer, inside (^ and ).

Convert Name The term field is modified as though it were a name in form first-middle-last. The new version is in form last-comma-first-middle. That is, the last full word is moved to the front and a comma is placed after it. Typing control-N has the same effect.

Terms Editor File menu

Help menu

Show Help Opens the Context Help window. As you move the mouse over the Terms Editor window, the Context Help window is scrolled to a description of the item under the mouse. To scroll in the Context Help window, type the F1 key; it jumps the mouse to the Context Help window without intervening mouse motion. Then you can scroll within the Context Help window to read descriptions.

Browse User's Guide Opens your local browser showing your local copy of the IndexGofer User's Guide. (If the local copy downloaded with InstallIndexGofer is not available, the website copy will be shown.)

About IndexGofer Displays a small dialog box with the IndexGofer version number, the current directory, and the current file.

Switch to IndexGofer button

Clicking the Switch to IndexGofer button closes the table and returns to viewing pages and their index entries. At this time, the index terms are written to their file, indexterms.txt. The terms are also scanned for consistency; redundant terms are deleted and necessary terms are added. Examples of necessary terms include the main term for a subterm or a mnemonic. The design of the TermsEditor makes it unlikely that any such modifications are needed.




Paginating and Extracting an HTML

If Paginator cannot be made to work, or if the proofs are not even in .pdf, you can fall back to plain text. Extract the text however you can. For .pdf files this means selecting text and typing control-C. Then edit the text to insert page number tags. Before each page insert a line containing "$@nnn", where nnn is the page number. Save this file as pages.txt. IndexEdit will read from this file.

Processing the PDF File from the Publisher

◆ Check the Sample PDF

If your publisher sends a sample .pdf file, put it in your project folder. Click it in this list.

(List of .pdf files in project folder)

You should see a page of the document in the Paginator application:

SaveAs dialog box from MS windows

If the text is not visible, there may be an error message to explain the problem. Possibly the file is encrypted or locked. Negotiation with the publisher may help. If the file cannot be read, you may have a similar problem when your own proofs arrive. There is a work-around that will add an hour or two.

◆ Choose index entries for each page

This is it. Time to populate the index. 

IndexGofer will show you a scrolled document with all the pages of each chapter:

IndexGofer main window with areas marked

The yellow area lists the terms that have been selected for that page. In the page text, phrases in blue have automatically generated entries. Phrases in red are linked to several terms, as shown at the bottom of the terms list window:

For each chapter__.txt, processing produces a chapter__-index.txt file. You need to process every page, but IndexGofer does not keep track of which pages you have processed. This table shows your chapters and the -index files that have been produced.

Click a chapter__.txt file in the table to begin or resume processing it for index entries.

Chapter file Resulting -index file
(List of chapter__.txt and chapter__-index.txt files.)

To open the help window, click in the IndexGofer window and type F1.

GOAL: This task is complete when you have checked every page of every chapter__.txt. IndexGofer will then have created a chapter__-index.txt for every chapter.

Assigning Terms to Pages

Create Index Entries for a Chapter

Once the IndexGofer window is open, select from the File menu the option for Open Chapter. Some terms will be automatically added to pages because trigger phrases appear on those pages. You may delete these entries. To add other entries, double click on a term in a Terms List window.

IndexGofer will automatically save your work every few minutes and when you exit. To be safe, you can choose the "Save" item from the File menu or type control-S.


◆ Paginate the text

To do its work, IndexGofer needs a file where the text of each page is preceded by a marker giving the page number. A marker is a line beginning "$@" and followed by the page number. If possible, the markered text file is generated from a PDF file sent by the publisher. Otherwise, markers must be inserted in a text version of the book, as described in Inserting Page Number.
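To make the marker format concrete, here is a minimal Python sketch that splits markered text into pages. It is not how IndexGofer reads the file; the function name split_pages is invented, and dropping any text that precedes the first marker is an assumption.

```python
def split_pages(text: str) -> dict[str, str]:
    """Map each page number to the text that follows its "$@" marker line."""
    pages: dict[str, list[str]] = {}
    current = None  # no page yet; text before the first marker is dropped
    for line in text.splitlines():
        if line.startswith("$@"):
            # Page numbers stay strings so that "iv" and "12" both work.
            current = line[2:].strip()
            pages[current] = []
        elif current is not None:
            pages[current].append(line)
    return {number: "\n".join(lines).strip() for number, lines in pages.items()}


sample = "$@iv\nPreface text\n$@1\nChapter 1.\nCall me Ishmael. ...\n"
pages = split_pages(sample)
print(list(pages))
```

A quick check like this on your own file is a cheap way to confirm that every page got a marker before you start choosing entries.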

When the PDF arrives, place a copy in your project folder. (I recommend keeping the original pristine in a separate folder reserved solely for source material.) Paginating text from the PDF is the task of the Paginator tool. To start paginator, click the name of a PDF file in this list.

(List of .pdf files in project folder)

You will see a window like this, with empty page number cells on the left, page images on the right, and some tools between.

SaveAs dialog box from MS windows

To assign a number to the visible page, type the number into the "Pg #" box. Usually a book begins with front-matter pages numbered in lower case Roman numerals. So click in the top-left page number cell and type an "i" in the "Pg #" box. This amuses me because Paginator immediately numbers all pages in Roman numerals. Now click page number cells until you find a body page and then click on preceding pages to select the one that should be numbered "1"; sometimes it is a blank page. Type "1" into the "Pg #" box. Now, in the best case, you have numbered all the pages. Click pages near the end to be sure the number sequence has not been besmirched with unnumbered pages. (It has happened.) (For even more amusement, you can have Paginator number the pages in Arabic numerals. Type control-alt-digit for each.)

By default, Paginator will extract all the text on every page, including headers and footers. To have it extract just the text, drag the edges of the rectangle named "text" until it surrounds the portion of the page that contains the text. (You can also drag the "header" rectangle to surround the page header. Paginator will parse out the page numbers and check them against your manually set numbers.)

When you extract the text, you need only those pages that have text you will index. You can skip pages that contain the table of contents, tables, figures, and the references section. To do so, select the page numbers and click the Skip Pages button. To select multiple consecutive pages, click the first number and shift-click the last. To select scattered pages, control-click each.

While choosing index terms for pages, you may want to work on the entire manuscript or you may want to break it up and work on one chapter or section at a time. To create a separate file starting at a given page, select that page and click the "Chap. Name" box. Chapter files are numbered sequentially: chapter0.txt, chapter1.txt, ...

All your work must be done in one session. When you quit, you will be prompted to see if you want to extract chapters according to what you have done.

Your project folder has these chapter files with page markers:
(List of chapter__.txt files with page markers)

Other kinds of numbers

In some cases you want to have index entries refer to figures, tables, or notes. Publisher guidelines usually want these in italic, bold, or the form 123n4 (for note 4 on page 123). Paginator does not support these forms, but they can be used. Please refer to Note Numbers for details of including such numbers in your chapter__.txt files.
GOAL: This task is complete when every chapter__.txt file contains page marker lines: $@number.


Choose Terms for Each Page

The main IndexGofer window names the current file and directory in the title bar. Three rows follow: the menu bar, the pages table, and a message line at the bottom.
the IndexGofer main window

The three columns of the pages table are the page number, the contents of the page, and the index terms that have been selected for that page. As the page was read in, IndexGofer scanned it for trigger phrases (as given in indexterms.txt). In the image above, the phrases "race to the bottom" and "slavery" resulted in index entries of the same. "Interstate competition" and "Levi" resulted in "labor costs, state" subhead "interstate competition" and "Levi, Margaret." The term United States of America was added with the Add Entry command. The phrase "labor costs" is red because that phrase is the trigger for two different index terms. Neither was automatically listed, so you need to review red phrases to see if any index terms should be added for that page. Selecting the entire red phrase will make the Index Terms window scroll to the alphabetically first term in the Index Terms window. Selecting a blue phrase will cause the selection to jump to the index entry made for that term.
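The difference between blue and red phrases comes down to how many index terms share a trigger phrase. The Python sketch below illustrates that idea only; it is not IndexGofer's code. The function name classify_triggers is invented, the term "labor costs, federal" is made up to give "labor costs" a second term, and matching without regard to case is an assumption.

```python
from collections import defaultdict

# Hypothetical (trigger phrase, index term) pairs, loosely modeled on the
# example above. "labor costs, federal" is invented for illustration.
TERMS = [
    ("slavery", "slavery"),
    ("levi", "Levi, Margaret"),
    ("labor costs", "labor costs, state"),
    ("labor costs", "labor costs, federal"),
]


def classify_triggers(pairs):
    """Split trigger phrases into unambiguous ("blue") and ambiguous ("red")."""
    by_phrase = defaultdict(set)
    for phrase, term in pairs:
        by_phrase[phrase.lower()].add(term)
    # One term per phrase: the entry can be made automatically.
    blue = {p: next(iter(t)) for p, t in by_phrase.items() if len(t) == 1}
    # Several terms per phrase: the user has to pick.
    red = {p: sorted(t) for p, t in by_phrase.items() if len(t) > 1}
    return blue, red


blue, red = classify_triggers(TERMS)
print(blue)
print(red)
```

Seen this way, a red phrase is not an error. It is a place where only you can decide which of the candidate terms, if any, belongs on the page.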

The index entries on the "active" page are highlighted in yellow. Additions and removal of index entries occur there. When you scroll the text, IndexGofer makes one of the visible pages active and colors its entries section in yellow. As the text is scrolling you will see empty entry areas. That is because the text is not scanned for trigger phrases until the page is made active (and thus has a yellow area).

Entering Accented Characters

Letters for European alphabets can be entered with prefix control characters. For example, type control-apostrophe and then the letter a to enter á (a-acute). The supported letters are these:

type a control ↓ and then a letter →

accent      control key               a e i o u y A E I O U Y c C n N
acute       control-' (apostrophe)    á é í ó ú ý Á É Í Ó Ú Ý
umlaut      control-" (double-quote)  ä ë ï ö ü ÿ Ä Ë Ï Ö Ü
circumflex  control-6 (digit-6)       â ê î ô û   Â Ê Î Ô Û
grave       control-` (grave)         à è ì ò ù   À È Ì Ò Ù
tilde       control-~ (tilde)         ã     õ     Ã     Õ         ñ Ñ
cedilla     control-, (comma)                                 ç Ç
slash       control-/ (slash)               ø           Ø
ring        control-o (letter-o)      å           Å

Command Buttons

The menu bar has four buttons for the commands of IndexGofer.

Rescanning is usually unnecessary. Every time a page is made active it is scanned for terms that have been added since the last time the page entries were modified. However, once an entry has been deleted for a page, the only way to get it back is by selecting the entry in a terms window and using the Add Entry button.

Commands can be invoked from menus, and also from the keyboard:

Add Entry: Insert or Control-a, or double-click on a term in the Index Terms window
Remove Entry: Delete or Control-d
Create new index term ...
Save entries

With Create new index term, you can add a new term or add a cross reference. For adding a term you will see three fields:
Adding an index term
The trigger phrase is one or more words; when a page is scanned, IndexGofer looks for these phrases. If one is found, its term is added to the entries for the page. The index entry is the main heading field together with the optional sub heading field. A new term is rejected if all three fields exactly match those of an existing term. Otherwise the term is added at its alphabetic location in the Index Terms window and immediately inserted in the indexterms.txt file. It is not added to the active page; to do so, type the INSERT key.

Clicking the "Cross reference" tab at the top of the dialog box brings up the fields for entering a cross reference:
Four fields for entering a cross-reference: the term where the reference will appear and the term it refers to.
The "for nickname" term is the term in the index where this cross reference will appear. The "See ..." term is the one that is referred to. The "under" term might be NEA and the "See" term "National Education Association (NEA)". Then the index would have entries
National Education Association (NEA) 12, 20, 44
NEA. See National Education Association
(Note the special case for acronyms. The trailing instance of "(NEA)" is stripped from the entry for NEA, but appears in the other entry.)
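The acronym special case can be sketched in Python as follows. This is an illustration, not IndexGofer's implementation: the function name cross_reference_line is invented, and stripping the parenthesized suffix only when it matches the nickname exactly is an assumption.

```python
def cross_reference_line(nickname: str, target: str) -> str:
    """Build a "nickname. See target" line for the index.

    Hypothetical helper. If the target ends with the nickname in
    parentheses, e.g. "(NEA)", that suffix is dropped from the
    cross reference (assumed to apply only on an exact match).
    """
    suffix = f"({nickname})"
    shown = target
    if target.endswith(suffix):
        shown = target[: -len(suffix)].rstrip()
    return f"{nickname}. See {shown}"


print(cross_reference_line("NEA", "National Education Association (NEA)"))
```

The full term, with "(NEA)" still attached, is what appears at its own alphabetical place with the page numbers.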

The 'file' menu

The File Menu

Open Chapter - Prompts for a new chapter and opens it. The file must be a text file with extension .txt. Pages in the text must each be preceded with a line having $@xxx, where xxx is the page number. The directory for Chapter files is remembered from one editing session to the next.

Save Entries - For chapter xxx.txt, this command creates file xxx-index.txt and stores into it all the index entries. It remembers which entries you have deleted. The chapter is rescanned every time it becomes active, but deleted entries do not come back. Entries are saved automatically when you open another chapter, or you exit the program, or when a five minute timer fires.
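The way deleted entries stay deleted across rescans can be modeled with two sets per page. The Python below is a sketch of that bookkeeping under my own assumptions; the class name PageEntries and its methods are invented, and the real -index.txt file format is not shown here.

```python
class PageEntries:
    """Entries for one page plus a memory of what the user removed.

    Hypothetical model of the behavior described above, not IndexGofer's code.
    """

    def __init__(self):
        self.entries: set[str] = set()
        self.deleted: set[str] = set()

    def rescan(self, triggered: set[str]) -> None:
        # Terms found by the scan are added unless they were removed earlier.
        self.entries |= triggered - self.deleted

    def remove(self, term: str) -> None:
        self.entries.discard(term)
        self.deleted.add(term)

    def add(self, term: str) -> None:
        # An explicit Add Entry overrides an earlier removal.
        self.deleted.discard(term)
        self.entries.add(term)


page = PageEntries()
page.rescan({"slavery", "Levi, Margaret"})
page.remove("slavery")
page.rescan({"slavery", "Levi, Margaret", "race to the bottom"})
print(sorted(page.entries))
```

In this model the second rescan picks up the newly added term but leaves "slavery" out, and only an explicit add brings it back.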

Create Index ... - You are prompted with a list of all the ...-index.txt files in the current directory. When you click "Index in text" or "Index in html", the checked files are read, the entries are sorted, and an index is created in index.txt or index.html, respectively. The html file can be edited with Microsoft Word to convert it to some other format, or with emacs to modify line endings conveniently.

New Terms Window - A new instance of the Index Terms window is opened. All such windows look and behave alike, except that they may be scrolled differently and each may have its own set of selected entries. The selection is visible only when the window has the input focus.

Exit - IndexGofer saves any entries. For filename.txt, entries are saved to filename-index.txt. Entries are automatically saved when you switch to another file or exit the program. They are also saved every five minutes.

The 'Entries' menu

The Entries Menu

Insert Term

Delete Term

New Term ...

Rescan Page


The Windows Menu

A- Displ

A- Displ

A- Displ


The Help Menu

Show Help - Brings up a window displaying the ContextHelp file. As the mouse moves across the IndexGofer windows, the help window scrolls to describe what is under the mouse. F1 will also raise the ContextHelp window. In addition, it jumps the mouse to that window without changing the main window; thus you can explore the Context help.

Enter demo mode - In demo mode, IndexGofer works on a single built-in file and set of index terms. Creating an index shows it on the screen instead of saving it to a file.  Things to try:

Choose menu item File/CreateIndex and either html or text.
    See the nice index.
Click "shadow" at the bottom of the terms list window.
    It turns blue.
Click on a page in the main window.
    Its index entries turn yellow.
Click the Add Entry button at the top.
    "shadow" gets added to the entries in the yellow area.
Choose File/CreateIndex again.
    Now the index has an entry for "shadow" and a cross reference to it.

If you add "dark" as a term on some page(s), more cross references will appear. (Cross references do not appear unless the term they point at has associated entries.)

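Browse User's Guide - Opens your browser on the IndexGofer User's Guide, using the local copy if one is installed and the website copy otherwise.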

About IndexGofer - Displays some mildly useful information, especially the current directory and file name. You should report the version number in error reports.

The bottom lines of the About window display the current directory and current file.

"Index Terms" Window

Available Index Terms window
Any term in the "Index Terms" window can be assigned to any page in the text. Scroll through the list. Select a term. It turns blue. Click the Add Entry button, and that term becomes an entry for the current page. Select two or more consecutive terms. They turn blue. Click the Add Entry button, and they all become entries for the page.

If you want a new term, use the Create new index term ... button. If you want another copy of the entire window, use New Terms Window in the File menu. The contents of the window are derived from indexterms.txt in the same directory as the open chapter.



◆ Shazam! The index appears

Click here to create your index.

Your index has been stored in the project folder as "index.html." It should also now be showing in your browser. You can edit index.html with Microsoft Word to adapt it as you wish.

Goal: This task is complete when index.html exists.

That's it! Cheers and cupcakes to you.

If IndexGofer helped, tell a colleague.  And I'd be delighted to hear from you, good or bad.



After terms have been chosen for each page, it is time to make the index. At the end of the IndexGoferEscort document, click the "Create Index" button. File index.html will be created. The GenIndex program will have created styles embedded in the document, in a separate index.css file, or both. To view the generated index, use your browser to visit the generated index.html.


Formatting the Index in HTML

One way to get the index formatted is via a CSS stylesheet. The actual generated index.html file starts:

<dl style="margin:0;">
<dt class='indexgofermainterm'>labor force</dt>
<dd class='indexgofersubterm'>composition of, 67, 94,
96, 103, 131, 188n8&ndash;9, 191n10</dd>

Each main term is of class indexgofermainterm and each subterm is of class indexgofersubterm. By adjusting the stylesheet, you can change the appearance. The IndexGofer default style is:

   <style type="text/css">
      .indexgofermainterm { padding-left: 1em; text-indent: -1em; }
      .indexgofersubterm { padding-left: 1em; text-indent: -1em; }
   </style>


Install IndexGofer  

Installing IndexGofer

The download will also have created a shortcut in the same directory. You can click it to start IndexGofer. Or copy the icon to your desktop, another directory, or the start program menu and click it there.

IndexGofer requires JRE, the Java runtime environment for J2SE 1.6 or later. Check your java version at http://www.java.com/en/download/. If your system has not got the latest Java, the site will offer to download it.

Input files to IndexGofer are ASCII text files. They are described in the Getting Ready page and in further detail below.

Installation on MSWindows

First check that you have the runtime for J2SE 1.6 or later. (Check at http://www.java.com/en/download/.)

Create an installation directory, such as
C:\Program Files\physpics\IndexGofer.
Download InstallIndexGofer.jar to the installation directory and double click on it. Three items are installed: IndexGofer.jar, a shortcut to IndexGofer, and a directory containing the icon used by the shortcut. You can now delete InstallIndexGofer.jar.

To run IndexGofer double click on IndexGofer.jar or on the shortcut. If clicking the shortcut fails, see the "Advanced" section below.

For convenience copy the shortcut to your desktop or a project directory where you are making an index. To create an entry for IndexGofer in your Start Menu, drag the desktop icon into the "Start" button, pause for the menu to appear, and then continue dragging to the desired place in the menu.

To view the full help file locally, you can download InstallIndexGoferGuide.jar. Double click to do the install. If you install it in the same directory as IndexGofer.jar, the Help menu item "Browse full Help" will fetch it from your file system instead of the web.

ADVANCED for MSWindows

If typing "java" on the command line does NOT produce

	"Usage: java [-options] class [args...]"
and forty more lines, you may need to reinstall Java. Another option is to explicitly name the Java directory in the command. If Java is installed in c:\Program Files\Java\jdk1.6 then the command to run IndexGofer is "c:\Program Files\Java\jdk1.6\bin\java" -jar "xxx\IndexGofer.jar" where xxx is the installation directory you chose.

You may want a desktop icon where you can drop a file to edit its index entries. Here's how. Put the following in a file indexgofer.bat:

	start "IndexGofer" /min java.exe -jar "c:\mydir\IndexGofer.jar" %*
Create an icon (with "paste shortcut") and edit its Properties to change the "Target" to the location and name of your new indexgofer.bat file. When you drop a file on the icon, IndexGofer will open and start with that file. The approach above creates a terminal window. Closing that window will terminate IndexGofer. To avoid having the terminal, change indexgofer.bat to:
	start "IndexGofer" "c:\mydir\IndexGofer.jar" %*

To ease the task of adding page numbers to the source file, I wrote a GNU emacs macro. The current page number is in emacs register p. To set it, give the command

	C-u number C-x r n p
To insert the current page number and increment the number, invoke the macro by typing control-Z, the key it is bound to below.
Here is the macro definition:
	;; Keyboard macro: open a new line holding "$@", bump the number
	;; in register p, insert it after the "$@", and go to end of line.
	(fset 'page-number
	   [return return ?$ ?@ return left ?\C-x ?r ?+ ?p 
			 ?\C-x ?r ?g ?p  ?\C-e] )
	;; Key code 26 is control-Z.
	(global-set-key [26] 'page-number)

Other platforms (tested on Mac)

Launching from desktop icons and the Start Menu (Microsoft Windows and Unix running GNOME 2.0+)

Java Web Start technology can automatically create shortcuts for your application on the desktop and in the Start Menu for Web-deployed applications developed with Java technology. You can use the Java Control Panel to control the shortcut settings. Shortcuts can also be added by using the Java Web Start Cache Viewer, using the install shortcut menu item.

Using Java Web Start Software Behind a Proxy Server/Firewall

Java Web Start software must be configured with the correct proxy settings in order to launch applications from outside your firewall. Java Web Start software will automatically try to detect the proxy settings from the default browser on your system (Internet Explorer or Netscape browsers on Microsoft Windows, and Netscape browsers on the Solaris Operating Environment and Linux). Java Web Start technology supports most web proxy auto-configuration scripts. It can detect proxy settings in almost all environments.

You can also use the Java Control Panel to view or edit the proxy configuration.

To find the Java Cache Viewer is an art. Start by launching the Java Control Panel; your desktop may have a shortcut to it, or you will have to find it under the name javacpl in the bin/ directory of the Java runtime (jre) installation. Click the "General" tab and then the button labelled "View" under "Temporary Internet Files". (The control panel layout has changed many times, so look around if you don't find it under exactly the names listed here.)




Appendix: File Formats


To remember the project directory, IndexGofer creates a file ".IndexGofer.ini" in the user's home directory. Thus after the first run, any IndexGofer binary will open the project directory last used. To switch to another directory, use the file:Open menu item to open a chapter file in that directory.

Chapter text files: xxxx.txt

Each section of the book needs to be in the project directory as a text file. Use UTF-8 if an encoding is necessary to report all characters (especially European alphabets and 6's / 9's quotation marks).

Each page of text must begin with a line containing "$@" and the page number:
          $@1
          Chapter 1.
          Call me Ishmael. ...

The initial .txt file for the book can be created by "Save as" from most word processors. In Microsoft Word, the option appears in the dialog box as "Plain Text (*.txt)". (If your word processor lacks this amenity, email me.) When the document contains special characters, MS Word will prompt you for an encoding. Choose "UTF-8" or "Unicode(UTF-8)." After creating the text file, break it up into sections and add page number lines with a text editor. Wordpad works well. Or emacs, if you have it.

IndexGofer does rudimentary formatting on text:

Headings are bold and centered.

More about indexterms.txt follows.


The lines of indexterms.txt mostly define index terms. The simplest form is
      phrase  WHITE  term
where WHITE is some combination of tabs and spaces. Since phrase and term can each have spaces, WHITE must be at least one tab or two spaces. More are okay.

The phrase is employed when IndexGofer scans a chapter text; it scans for instances of the phrase and where it finds one, inserts the corresponding term as an entry for the page. When inserting terms from the Index Terms window, only the term is employed.

Phrase words can contain only letters, hyphens, and apostrophes. Other characters are ignored.  The phrase can be omitted and then that term is never automatically added to a page by the initial scan. If the phrase is left out, there must be leading white space, as in
     WHITE term

For narrower categories, index terms are often subdivided with subterms. An index term with a subterm is written in the form
    phrase WHITE term SPACE COLON SPACE subterm

The corresponding index entry will appear as
    term
       subterm  xx, xx, ... (page numbers)

Besides terms, indexterms.txt may contain blank and comment lines. Comments begin with "//". One comment line can have the form
    // title: title words ...
When the index is generated in html, this book title will appear as the page title for the html page.
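The line forms described so far can be summed up in a small Python parser. Treat it as a sketch of the format, not as the code IndexGofer runs. The names WHITE and parse_term_line are mine, accepting any amount of leading white space for a phrase-less term is an assumption, and cross-reference lines (those containing .SEE.) are simply skipped here.

```python
import re

# WHITE: at least one tab, or two or more spaces (more are okay).
WHITE = re.compile(r"[ \t]*\t[ \t]*| {2,}")


def parse_term_line(line: str):
    """Return (phrase, term, subterm) for a term line, else None.

    Hypothetical helper. Blank lines, "//" comments, and ".SEE." lines
    are not term definitions and yield None.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("//") or ".SEE." in stripped:
        return None
    if line[0] in " \t":
        # Leading white space: the phrase was left out.
        phrase, rest = None, stripped
    else:
        parts = WHITE.split(line.rstrip(), maxsplit=1)
        if len(parts) != 2:
            return None  # no WHITE separator found; treat as malformed
        phrase, rest = parts
    # A subterm, if any, follows SPACE COLON SPACE.
    main, _, sub = rest.partition(" : ")
    return phrase, main.strip(), sub.strip() or None


print(parse_term_line("labor costs\tlabor costs, state : interstate competition"))
print(parse_term_line("\tUnited States of America"))
```

Because a single space can occur inside a phrase or a term, only a tab or a run of two or more spaces is taken as the separator.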

The first book indexed had phrases for both New York and New York Times. This works because the longest phrase found is the one used. But "York Times" would not work; the text "New York Times" would be recognized as "New York" and not as an instance of "York Times".
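A left-to-right scan that always prefers the longest phrase behaves the way this paragraph describes. The Python below is my own minimal sketch of such a scan, not IndexGofer's scanner. The function name scan_page is invented, matching is assumed to ignore case, and for brevity only unaccented letters, hyphens, and apostrophes count as word characters.

```python
import re


def scan_page(text: str, phrases: dict[str, str]) -> list[str]:
    """Return the index terms triggered by a longest-match scan of a page."""
    # Simplification: ASCII letters, apostrophes, and hyphens only.
    words = re.findall(r"[a-z'-]+", text.lower())
    keys = {tuple(p.lower().split()): term for p, term in phrases.items()}
    longest = max((len(k) for k in keys), default=0)
    found, i = [], 0
    while i < len(words):
        # Try the longest candidate phrase first at this position.
        for n in range(min(longest, len(words) - i), 0, -1):
            term = keys.get(tuple(words[i:i + n]))
            if term is not None:
                if term not in found:
                    found.append(term)
                i += n  # matched words are consumed
                break
        else:
            i += 1  # no phrase starts here; move on one word
    return found


page = "The New York Times reported it."
print(scan_page(page, {"new york": "New York", "new york times": "New York Times"}))
print(scan_page(page, {"new york": "New York", "york times": "York Times"}))
```

In the second call "York Times" never fires, because "York" has already been consumed by the match on "New York".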

As terms are added, they are appended to indexterms.txt. Preceding each new term is a comment like this

// 1308343209406 end of session Fri Jun 17 16:40:09 EDT 2011
Only the long number matters. It is the internal form for the time when the entry was created.
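If you are curious, the long number in the sample line reads as milliseconds since January 1, 1970 (UTC), which agrees with the date printed beside it. The Python sketch below decodes it on that assumption; the function name session_time is invented.

```python
from datetime import datetime, timezone


def session_time(comment_line: str) -> datetime:
    """Decode the timestamp in an "end of session" comment, in UTC.

    Assumes the number is milliseconds since the Unix epoch; the
    fractional second is dropped.
    """
    millis = int(comment_line.split()[1])
    return datetime.fromtimestamp(millis // 1000, tz=timezone.utc)


stamp = session_time("// 1308343209406 end of session Fri Jun 17 16:40:09 EDT 2011")
print(stamp.isoformat())  # 2011-06-17T20:40:09+00:00, i.e. 16:40:09 EDT
```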

Cross reference entries

Cross references are index entries that direct the reader to look at other index entries. They appear in the index as "see ..." and "see also ...," as in
NYT. See New York Times
    home ownership and, 25n3
    equality, struggle for (see racial equality, struggle for)
    political party polarization, 102-3 (see also polarization, racial)
    See also Eastern Europeans; Asians.
These are incorporated in indexterms.txt with lines having the form
 index term .SEE. index term
where either index term may be just a main heading, or may be a main heading, " : ", and a sub heading.  For instance
NEA : members .SEE. National Education Association (NEA) : membership
which will generate in the index as
National Education Association (NEA)
membership 23, 25, 167-71
members (see National Education Association, membership)

Neither the term before .SEE. nor the term after it can have an associated phrase. To assign a phrase, put in another line that gives the phrase and its index term.

For the indexterms.txt line "xxx .SEE. yyy", the Index Terms window will have a listing of yyy.


Appendix - Parsing author names

Author's List

Author names are sometimes included in the index. The reference or bibliography section is usually a comprehensive list of these. Unfortunately the large variety of rules for punctuating references makes it non-trivial to identify them by program. Fortunately, they are not hard to isolate with any competent text editor like emacs. To enter author names into IndexGofer, put them in a file with one author per line. Lines without commas will be converted to last-name-comma-given-names by assuming that the last name is a single word. Where this is not the case, enter the name in that format.

◆ Parse author names

Authors of cited works are usually indexed and so the bibliography is a potent source of proper nouns. Sadly, bibliography conventions are a bit too idiosyncratic for automatic name extraction. So you have to do it yourself. The desired result is a file authors-list.txt with one author name per line. (Lines without commas will be converted by moving the last word to first and following it with a comma.) If you can make such a list from other tools, go ahead. Put your list in authors-list.txt. If it has characters other than a-z and A-Z, be sure it is encoded in UTF-8 (or Unicode).

To parse the author names from the bibliography partition, click to run

This task extracts the Bibliography partition into authors-list.txt and shows it to you for editing. As you work, it modifies the file. You can interrupt your work and resume later.

Editing: You can treat AuthorParse as a text editor. The goal is one author name per line. Most keys behave as usual, including BACKSPACE, DELETE, HOME, END, and the arrow keys. Mouse clicks change the selection as usual. In addition, AuthorParse defines special keys for morphing typical bibliography entries into author names:

TAB advances the selection to the next punctuation group and its adjacent spaces. Typical groups are comma-space, semi-colon-space, or period-space. The program usually ignores initials -- space-capitalletter-period-space.
ENTER as usual, replaces the selection with a newline. If the selection is a punctuation group after an author name, this action has the effect of leaving the author name on a line of its own.
ENTER (after replacing a punctuation group) deletes the rest of the line. That is, to delete the rest of a line, type ENTER again immediately after an ENTER that ended an author name.

So if a citation is

Heilemann, John, and Mark Halperin. Game Change. New York: HarperCollins, 2010.

And the selection is at the beginning of the line, you would reduce the citation to two lines containing "Heilemann, John" and "Halperin, Mark" by typing:

TAB select the ", " after Heilemann
TAB select the ", and "
ENTER convert ", and " to a newline, leaving "Heilemann, John" on its own line
TAB select the ". " after Halperin
ENTER convert the ". " to a newline and convert name to "Halperin, Mark"
ENTER delete the rest of the citation and any surrounding newlines

After this the cursor is at the beginning of the next bibliography entry, ready to go again.

For other deletions, select the text and type the DELETE key. In particular, to delete lines, I mouse click at the left of the first line, type control-down-arrow to select additional lines, and then type DELETE.

A few other key combinations have special definitions.  Look for them on the right in the Edit menu. To convert names from first-last to last-comma-first type control-n. To do the reverse, type control-u. To process bibliography entries from another file, choose Append File ... from the File menu. The chosen file will be copied to the end of the working file, authors-list.txt.

When you have converted all of authors-list.txt, click this button to insert those names in the terms list.

To see the current list of terms:

To convert names from first-last to last-comma-first, type control-n.  (Control-u is not implemented here.)  After typing control-n, the list will be scrolled to the place where the revised entry fits. To scroll back to where you were, type alt-left-arrow. (This is the same key that browsers offer as a shortcut for the BACK button.)

GOAL: This task is complete when you have added the list of authors to indexterms.txt.