A struct src * is a source for fetching or scanning a stream of (8 bit)
characters. There may be a string component and a file component.
Fetching characters with getSrc takes characters from the string until it is empty,
and then from the file. When the stream is exhausted, getSrc returns -1. The ungetSrc method prepends a character to the string component, creating it if need be. Some methods provide tools for scanning the stream of characters.
If a string argument to any method must be retained, a copy is made; the original string that was passed
as an argument remains the responsibility of the client.
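The string-then-file order described above can be illustrated with a small stand-alone model. This is only a sketch: struct toy_src, toyNew, toyGet, toyUnget and toyFree are hypothetical names invented for the illustration, and the code is not the library's implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct src: a pending string plus an optional file. */
struct toy_src {
    char *str;   /* unread characters, NUL-terminated, heap-allocated */
    FILE *file;  /* may be NULL */
};

/* Build a toy source from a COPY of text; the caller keeps ownership of text. */
struct toy_src *toyNew(const char *text, FILE *file) {
    assert(text != NULL);
    struct toy_src *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->str = malloc(strlen(text) + 1);
    if (s->str == NULL) { free(s); return NULL; }
    strcpy(s->str, text);
    s->file = file;
    return s;
}

/* Take from the string first, then from the file; -1 when both are exhausted. */
int toyGet(struct toy_src *s) {
    assert(s != NULL);
    if (s->str[0] != '\0') {
        int c = (unsigned char)s->str[0];
        memmove(s->str, s->str + 1, strlen(s->str)); /* also moves the NUL */
        return c;
    }
    if (s->file != NULL) {
        int c = fgetc(s->file);
        if (c != EOF) return c;
        fclose(s->file);          /* close the file at EOF */
        s->file = NULL;
    }
    return -1;
}

/* Prepend c to the string so the next toyGet returns it. 0 on success, -1 on failure. */
int toyUnget(struct toy_src *s, char c) {
    assert(s != NULL);
    size_t len = strlen(s->str);
    char *grown = realloc(s->str, len + 2);
    if (grown == NULL) return -1;
    memmove(grown + 1, grown, len + 1);
    grown[0] = c;
    s->str = grown;
    return 0;
}

/* Release the string, the file (if still open), and the object itself. */
void toyFree(struct toy_src *s) {
    if (s == NULL) return;
    if (s->file != NULL) fclose(s->file);
    free(s->str);
    free(s);
}
```

In this model a character pushed back with toyUnget is always the next one fetched, whether or not the file has been reached yet.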
Create and discard struct src objects
struct src *newSrc(int initlen);
Create a new struct src with no file or string.
struct src *newSrcFromString(char *string);
Create a new struct src containing a given string.
void closeSrc(struct src* s);
Finish using a struct src. Free s's string, close s's file, and free s itself.
void resetSrc(struct src *s);
Empty a struct src, but do not free it. Reset s's string to zero length and close s's file. For code clarity it is usually preferable to call closeSrc(s) and create a new value with newSrc.
struct src *cloneSrc(struct src *s);
Return a new struct src whose string is a copy of s's string.
The file is NULL in the returned value.
struct src *encodedSrc(char *txt);
Return a static struct src containing an HTML-encoded version of txt, where:
These characters are not encoded: 0-9 A-Z a-z - + * / ( ) _ . , ' ! $
Space is converted to +
Other characters are encoded to %xx where xx is the hex for the character.
Slashes are unencoded only because we may be
encoding an entire URL. THIS IS BOGUS.
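The encoding rules listed for encodedSrc can be sketched as a stand-alone function. This is an illustration only: encodeSketch is a hypothetical name, it returns a heap-allocated string rather than a static struct src, and the use of lowercase hex digits is an assumption, since the text does not say which case is used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed encoded copy of txt, or NULL if allocation fails.
   Hypothetical helper; lowercase hex digits are an assumption. */
char *encodeSketch(const char *txt) {
    static const char safe[] = "-+*/()_.,'!$";   /* punctuation left unencoded */
    static const char hex[] = "0123456789abcdef";
    assert(txt != NULL);
    char *out = malloc(3 * strlen(txt) + 1);     /* worst case: every byte -> %xx */
    if (out == NULL) return NULL;
    char *p = out;
    for (; *txt != '\0'; txt++) {
        unsigned char c = (unsigned char)*txt;
        if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
            (c >= 'a' && c <= 'z') || strchr(safe, c) != NULL) {
            *p++ = (char)c;                      /* copied as is */
        } else if (c == ' ') {
            *p++ = '+';                          /* space becomes + */
        } else {
            *p++ = '%';                          /* everything else becomes %xx */
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0f];
        }
    }
    *p = '\0';
    return out;
}
```

With these rules the input "a b&c/d" encodes to "a+b%26c/d": the slash passes through unchanged, as the note about entire URLs describes.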
Fetch From Stream
A struct src is a stream of characters. They are typically fetched one at a time with getSrc.
int getSrc(struct src *s);
Get the next character from s;
from s's string if not empty, or from s's file. At EOF return -1 and close the file.
void markSrc(struct src *s);
Remember the location of the current next character.
This will be the beginning of the text
copied with a later retrieveSrc().
char *retrieveSrc(struct src *s);
Returns a new string having the characters between
the position at the last call to markSrc() and the character most recently fetched with getSrc.
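The mark and retrieve pair can be pictured as two indexes into the stream. The sketch below models that idea over a plain string; struct toy_cursor and the cursor* functions are hypothetical names, and the real struct src may track positions differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model: a read position and a remembered mark over one string. */
struct toy_cursor {
    const char *text;  /* whole stream, NUL-terminated */
    size_t pos;        /* index of the next character to fetch */
    size_t mark;       /* index remembered by cursorMark */
};

/* Fetch the next character, or -1 at the end of the text. */
int cursorGet(struct toy_cursor *c) {
    assert(c != NULL);
    if (c->text[c->pos] == '\0') return -1;
    return (unsigned char)c->text[c->pos++];
}

/* Remember the location of the current next character. */
void cursorMark(struct toy_cursor *c) {
    assert(c != NULL);
    c->mark = c->pos;
}

/* Return a malloc'ed copy of everything fetched since the mark (NULL on failure). */
char *cursorRetrieve(const struct toy_cursor *c) {
    assert(c != NULL && c->mark <= c->pos);
    size_t len = c->pos - c->mark;
    char *copy = malloc(len + 1);
    if (copy == NULL) return NULL;
    memcpy(copy, c->text + c->mark, len);
    copy[len] = '\0';
    return copy;
}
```

A scanner would typically mark at the start of a token, fetch until the token ends, and then retrieve the token text; in this sketch the caller frees the returned copy.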
void ungetSrc(struct src *s, char c);
Insert a character at the front of the string
to be returned by getSrc.
void unAddSrc(struct src *s, char *victim);
Remove from the end of s the number of characters in victim.
(No check is made to see that the removal is actually equal to victim.)
Append to String
The string appended is inserted between the string and the file (if any).
void putSrc(struct src *s, char c);
Add a single character to s;
this character will be fetched by getSrc between s's string and its file.
void addSrc(struct src *s, char *newchars);
Append characters to the string portion of s.
They will be fetched before the file, if any.
This is a convenience method that replaces
a series of putSrc calls.
void concatSrc(struct src *s, int n, arg2, ...);
Append to s each of the n strings: arg2, arg3, ...
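A count-prefixed variadic call such as concatSrc is usually written with stdarg. The following sketch shows the pattern on a plain string result; concatSketch is a hypothetical name and builds a new heap string instead of appending to a struct src.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate n C strings passed as variadic arguments.
   Returns a malloc'ed string, or NULL if allocation fails. */
char *concatSketch(int n, ...) {
    va_list ap;
    size_t total = 0;
    assert(n >= 0);

    /* First pass: measure. */
    va_start(ap, n);
    for (int i = 0; i < n; i++)
        total += strlen(va_arg(ap, const char *));
    va_end(ap);

    char *out = malloc(total + 1);
    if (out == NULL) return NULL;

    /* Second pass: copy each piece in order. */
    char *p = out;
    va_start(ap, n);
    for (int i = 0; i < n; i++) {
        const char *piece = va_arg(ap, const char *);
        size_t len = strlen(piece);
        memcpy(p, piece, len);
        p += len;
    }
    va_end(ap);
    *p = '\0';
    return out;
}
```

As with any count-prefixed variadic function, the caller must pass exactly n string arguments; the compiler cannot check this.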
void addIntSrc(struct src *s, int j);
Append to s the decimal character representation of j.
int insertSrcFile(struct src *s, char *fname);
Read an entire file and append the contents at the end of s's string.
Return -1 for error, or otherwise the number of characters inserted.
Match string contents
The string portion of the struct src is examined, but not the file.
int deblankSrc(struct src *s);
Advance s past all white space. The next getSrc() will return the first character after the white space.
That same next character is also returned by this method.
char *searchSrc(struct src *s, char *target);
Return a pointer to the first instance of target in s's string, or else NULL. See beheadSrc.
int startsWithSrc(struct src *s, char *target);
Report whether s begins with the target string, ignoring case differences. Return 1 for a match, and 0 otherwise.
int endsWithSrc(struct src *s, char *target);
Check whether target matches the end of the
string portion of s. Ignore differences in case.
Return 1 for a match, and 0 otherwise.
int matchSrcTail(struct src *s, char *target);
An alternate name for endsWithSrc.
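The case-insensitive tail match can be sketched on plain C strings as follows. endsWithNoCase is a hypothetical helper written for this illustration, not the library's code.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Return 1 if s ends with target, ignoring case differences; 0 otherwise. */
int endsWithNoCase(const char *s, const char *target) {
    assert(s != NULL && target != NULL);
    size_t slen = strlen(s);
    size_t tlen = strlen(target);
    if (tlen > slen) return 0;               /* target cannot fit */
    const char *tail = s + (slen - tlen);    /* last tlen characters of s */
    for (size_t i = 0; i < tlen; i++) {
        /* Cast to unsigned char: tolower is undefined for negative values. */
        if (tolower((unsigned char)tail[i]) != tolower((unsigned char)target[i]))
            return 0;
    }
    return 1;
}
```

In this sketch an empty target matches any string, which is the usual convention for suffix tests.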
Access Fields
void setSrcFile(struct src *s, FILE *newf);
Set the file for s to read from.
If s already has a file,
it is closed and discarded, and
an error is printed. The struct src takes responsibility for newf and will close it.
char *getSrcString(struct src *s);
Return the string portion of s.
This value is internal to the struct src and
MUST NOT BE MODIFIED or FREED by the client.
int lenSrc(struct src *s);
Return the length remaining in s.
int eofSrc(struct src *s);
Return 1 if s has no string and is at end of file, else 0.
void beheadSrc(struct src *s, char *newstart);
Remove the front of s's string by setting s's start to newstart. If newstart is not in s's string, nothing is done. The only recommended use of this method is when newstart is derived from the result of a call to searchSrc.
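The search-then-behead idiom can be modelled with a start pointer into a buffer. In this sketch struct toy_str, toySearch and toyBehead are hypothetical names, and the buffer is owned by the caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the string portion: start points at the unread text. */
struct toy_str {
    char *start;
};

/* Return a pointer to the first instance of target, or NULL. */
char *toySearch(struct toy_str *s, const char *target) {
    assert(s != NULL && target != NULL);
    return strstr(s->start, target);
}

/* Move start to newstart, but only if newstart lies within the current string. */
void toyBehead(struct toy_str *s, char *newstart) {
    assert(s != NULL);
    for (char *p = s->start; ; p++) {
        if (p == newstart) {      /* found: newstart is inside the string */
            s->start = newstart;
            return;
        }
        if (*p == '\0') return;   /* reached the end: do nothing */
    }
}
```

A typical use skips a prefix: search for a separator, then behead at the hit plus the separator's length so that the next fetch starts just after it.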
Create a new string containing the concatenation of the given arguments.
char *strappend(char *dest, char *appendee);
Copy appendee to dest, returning the location of the end of the string.
That is, the returned value can serve as dest for another strappend.
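The chaining behaviour described for strappend can be sketched as below. strappendSketch is a hypothetical re-implementation for illustration; it assumes the returned location is the terminating NUL of the copied text, and that dest has room for everything appended.

```c
#include <assert.h>
#include <string.h>

/* Copy appendee to dest and return a pointer to the terminating NUL,
   so the result can be passed as dest to the next call. */
char *strappendSketch(char *dest, const char *appendee) {
    assert(dest != NULL && appendee != NULL);
    while (*appendee != '\0')
        *dest++ = *appendee++;
    *dest = '\0';
    return dest;
}

/* Example of chaining: builds "foo" "/" "bar" into out (caller supplies the buffer). */
void joinPathSketch(char *out, const char *dir, const char *name) {
    char *end = strappendSketch(out, dir);
    end = strappendSketch(end, "/");
    strappendSketch(end, name);
}
```

Chaining this way avoids rescanning the destination on every append, which repeated strcat calls would do.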
int startsWith(char *s, char *target);
Return 1 if s starts with target, 0 otherwise.
int endsWith(char *s, char *target);
Return 1 if s ends with target, 0 otherwise.
char *escapeSrc(char *orig, char **encoding);
Return a copy of orig with certain specific characters replaced by
strings from the encoding.
Caller is responsible for freeing the returned string.
The second argument is an encoding vector, as an array of strings. Usually the encoding is a value returned by getEscapeSrcEncoding.
Each string in the array is the encoding for a character;
in order, the encoded characters are:
\a (BEL)
\b (BS)
\t (TAB)
\n (NL)
\v (VT)
\f (FF)
\r (CR)
! (bang)
" (quote)
# (hash)
' (apos)
? (query)
\ (backsl)
Unicode‑prefix
UNPRINTABLE
"Unicode‑prefix" is the prefix to replace \u.
The UNPRINTABLE string is a code letter for treatment of unprintables:
"?" means replace with "?"
"x" means use \0xdd
"u" means \uxxxx
Returns one of the
predefined encoding vectors suitable for passing to escapeSrc.
Appropriate index constants are defined in src.h. They are as given in this table (where white-space is BS TAB NL VT FF CR):
index constant         characters escaped with \x                   Unicode prefix  Unprintable
SRC_ESCAPE_IDENTITY    (none)                                       \u              ?
SRC_ESCAPE_MAKEFILE    # backslash                                  \u              ?
SRC_ESCAPE_PHP         apostrophe backslash                         \u              \0xdd
SRC_ESCAPE_JAVASCRIPT  apostrophe backslash quote white-space       \u              \uxxxx
SRC_ESCAPE_C           apostrophe backslash quote white-space ? BEL \x              \0xdd
SRC_ESCAPE_SHELL       ! apostrophe backslash                       \u              ?
In all cases, any character not noted is replaced with itself.
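The encoding-vector lookup described for escapeSrc can be sketched as follows. This is a reduced illustration under stated assumptions: escapeSketch is a hypothetical name, each of the first thirteen vector entries is assumed to hold the complete replacement text for its character, and the Unicode-prefix and UNPRINTABLE entries are ignored, so all other bytes are copied unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* The thirteen listed characters, in the documented order:
   BEL BS TAB NL VT FF CR ! " # ' ? backslash */
static const char listed[] = "\a\b\t\n\v\f\r!\"#'?\\";

/* Return a malloc'ed copy of orig with listed characters replaced by
   encoding[i]; NULL on allocation failure. Caller frees the result. */
char *escapeSketch(const char *orig, const char *const *encoding) {
    assert(orig != NULL && encoding != NULL);

    /* First pass: compute the output length. */
    size_t total = 0;
    for (const char *p = orig; *p != '\0'; p++) {
        const char *hit = strchr(listed, *p);
        total += (hit != NULL) ? strlen(encoding[hit - listed]) : 1;
    }

    char *out = malloc(total + 1);
    if (out == NULL) return NULL;

    /* Second pass: copy, substituting the listed characters. */
    char *q = out;
    for (const char *p = orig; *p != '\0'; p++) {
        const char *hit = strchr(listed, *p);
        if (hit != NULL) {
            size_t len = strlen(encoding[hit - listed]);
            memcpy(q, encoding[hit - listed], len);
            q += len;
        } else {
            *q++ = *p;   /* any character not listed is replaced with itself */
        }
    }
    *q = '\0';
    return out;
}
```

Under these assumptions an identity vector maps every listed character to itself, and a stricter vector changes only the entries for the characters it needs to escape.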