RWWString(3C++) RWWString(3C++)
Name
RWWString - Rogue Wave library class
Synopsis
#include <rw/wstring.h>
RWWString a;
Description
Class RWWString offers powerful and convenient facilities for
manipulating wide character strings. This string class manipulates wide
characters of the fundamental type wchar_t. These characters are
generally two or four bytes, and can be used to encode richer code sets
than the classic "char" type. Because wchar_t characters are all the
same size, indexing is fast.

Conversions to and from multibyte and ASCII forms are provided by the
RWWString constructors, and by the RWWString member functions isAscii(),
toAscii(), and toMultiByte().

Stream operations implicitly translate to and from the multibyte stream
representation. That is, on output, wide character strings are converted
into multibyte strings, while on input they are converted back into wide
character strings. Hence, the external representation of wide character
strings is usually as multibyte character strings, saving storage space
and making interfaces with devices (which usually expect multibyte
strings) easier.

RWWStrings tolerate embedded nulls.

Parameters of type "const wchar_t*" must not be passed a value of zero.
This is detected in the debug version of the library.

The class is implemented using a technique called copy on write. With
this technique, the copy constructor and assignment operators still
reference the old object and hence are very fast. An actual copy is made
only when a "write" is performed, that is, when the object is about to
be changed. The net result is excellent performance, but with
easy-to-understand copy semantics.

A separate RWWSubString class supports substring extraction and
modification operations.
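
The copy-on-write technique described above can be illustrated with a small standard-library sketch. This is not the RWWString implementation: the class CowWString and its members are hypothetical, and the sketch assumes single-threaded use. It only shows the idea that copies share one buffer until one of them is written to.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical copy-on-write wrapper; NOT the RWWString implementation.
class CowWString {
public:
    explicit CowWString(const std::wstring& s)
        : rep_(std::make_shared<std::wstring>(s)) {}

    // Copy construction and assignment are compiler-generated: they copy
    // only the shared_ptr, so both objects reference the same buffer.

    // Read access never copies.
    wchar_t get(std::size_t i) const { return (*rep_)[i]; }
    std::size_t length() const { return rep_->size(); }

    // Write access makes a private copy first if the buffer is shared.
    void set(std::size_t i, wchar_t c) {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::wstring>(*rep_);
        (*rep_)[i] = c;
    }

    // True while both objects still reference the same buffer.
    bool sharesWith(const CowWString& other) const {
        return rep_ == other.rep_;
    }

private:
    std::shared_ptr<std::wstring> rep_;
};
```

After `CowWString b = a;` the two objects share storage; the first `b.set(...)` gives b its own buffer and leaves a unchanged.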
Persistence
Simple
Example
#include <rw/rstream.h>
#include <rw/wstring.h>
int main(){
  RWWString a(L"There is no joy in Beantown");
  a.subString(L"Beantown") = L"Redmond";
  cout << a << endl;
  return 0;
}
Program output:
There is no joy in Redmond
Enumerations
enum RWWString::caseCompare { exact, ignoreCase };
Used to specify whether comparisons, searches, and hashing functions
should use case sensitive (exact) or case-insensitive (ignoreCase)
semantics.
enum RWWString::multiByte_ { multiByte };
Allow conversion from multibyte character strings to wide character
strings. See constructor below.
enum RWWString::ascii_ {ascii };
Allow conversion from ASCII character strings to wide character strings.
See constructor below.
Public Constructors
RWWString();
Creates a string of length zero (the null string).
RWWString(const wchar_t* cs);
Creates a string from the wide character string cs. The created string
will copy the data pointed to by cs, up to the first terminating null.
RWWString(const wchar_t* cs, size_t N);
Constructs a string from the character string cs. The created string
will copy the data pointed to by cs. Exactly N characters are copied,
including any embedded nulls. Hence, the buffer pointed to by cs must be
at least N* sizeof(wchar_t) bytes or N wide characters long.
RWWString(RWSize_T ic);
Creates a string of length zero (the null string). The string's capacity
(that is, the size it can grow to without resizing) is given by the
parameter ic.
RWWString(const RWWString& str);
Copy constructor. The created string will copy str's data.
RWWString(const RWWSubString& ss);
Conversion from sub-string. The created string will copy the substring
represented by ss.
RWWString(char c);
Constructs a string containing the single character c.
RWWString(char c, size_t N);
Constructs a string containing the character c repeated N times.
RWWString(const char* mbcs, multiByte_ mb);
Construct a wide character string from the multibyte character string
contained in mbcs. The conversion is done using the Standard C library
function ::mbstowcs(). This constructor can be used as follows:
RWWString a("\306\374\315\313\306\374", multiByte);
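
The kind of conversion this constructor is documented to perform can be sketched with the standard library alone. The helper name widenMultiByte is hypothetical and the code does not use RWWString; it assumes the multibyte encoding of the current C locale, as mbstowcs() does.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: convert a null-terminated multibyte string to wide
// characters with std::mbstowcs(), using the current C locale.
// Returns an empty string if the input is not valid in that locale.
std::wstring widenMultiByte(const char* mbcs) {
    // A multibyte string never yields more wide characters than it has bytes.
    std::size_t maxLen = std::strlen(mbcs);
    std::vector<wchar_t> buf(maxLen + 1);   // room for the terminating null
    std::size_t n = std::mbstowcs(buf.data(), mbcs, buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::wstring();              // invalid multibyte sequence
    return std::wstring(buf.data(), n);
}
```

The result depends on the locale in effect when the function is called, so a program that expects a particular multibyte encoding would normally call setlocale() first.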
RWWString(const char* acs, ascii_ asc);
Construct a wide character string from the ASCII character string
contained in acs. The conversion is done by simply stripping the high-
order bit and, hence, is much faster than the more general constructor
given immediately above. For this conversion to be successful, you must
be certain that the string contains only ASCII characters. This can be
confirmed (if necessary) using RWCString::isAscii(). This constructor
can be used as follows:
RWWString a("An ASCII character string", ascii);
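
As a rough illustration of why the ASCII path is cheap and why non-ASCII input is unsafe, the sketch below widens each byte after masking off its high-order bit. The names widenAscii and isAllAscii are hypothetical; this is a standard-library stand-in, not the library's code.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: widen a char string by masking off the high-order
// bit of each byte. No locale lookup is involved.
std::wstring widenAscii(const char* acs) {
    std::size_t n = std::strlen(acs);
    std::wstring out(n, L'\0');
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<wchar_t>(
            static_cast<unsigned char>(acs[i]) & 0x7F);
    return out;
}

// Hypothetical check: true if every byte already has its high bit clear.
bool isAllAscii(const char* s) {
    for (; *s; ++s)
        if (static_cast<unsigned char>(*s) & 0x80)
            return false;
    return true;
}
```

A byte such as 0xE1 is silently turned into 0x61 ('a') by the mask, which is why the input should be checked first when its origin is uncertain.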
RWWString(const char* cs, size_t N, multiByte_ mb);
RWWString(const char* cs, size_t N, ascii_ asc);
These two constructors are similar to the two constructors immediately
above except that they copy exactly N characters, including any embedded
nulls. Hence, the buffer pointed to by cs must be at least N bytes long.
Type Conversion
operator
const wchar_t*() const;
Access to the RWWString's data as a null terminated wide string. This
datum is owned by the RWWString and may not be deleted or changed. If
the RWWString object itself changes or goes out of scope, the pointer
value previously returned will become invalid. While the string is
null-terminated, note that its length is still given by the member
function length(). That is, it may contain embedded nulls.
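
The difference between the stored length and the length seen through a null-terminated pointer can be shown with std::wstring, which also tolerates embedded nulls. The function names are hypothetical and the sketch does not use RWWString.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Length as seen by C functions that stop at the first null character.
std::size_t cLength(const std::wstring& s) {
    return std::wcslen(s.c_str());
}

// A five-character string with an embedded null at index 2.
std::wstring makeEmbedded() {
    return std::wstring(L"ab\0cd", 5);
}
```

Here `makeEmbedded().size()` is 5 while `cLength(makeEmbedded())` is 2, so code that receives only the pointer loses everything after the first null.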
Assignment Operators
RWWString&
operator=(const char* cs);
Assignment operator. Copies the null-terminated character string pointed
to by cs into self. Returns a reference to self.
RWWString&
operator=(const RWWString& str);
Assignment operator. The string will copy str's data. Returns a
reference to self.
RWWString&
operator=(const RWWSubString& sub);
Assignment operator. The string will copy sub's data. Returns a
reference to self.
RWWString&
operator+=(const wchar_t* cs);
Append the null-terminated character string pointed to by cs to self.
Returns a reference to self.
RWWString&
operator+=(const RWWString& str);
Append the string str to self. Returns a reference to self.
Indexing Operators
wchar_t&
operator[](size_t i);
wchar_t
operator[](size_t i) const;
Return the ith character. The first variant can be used as an lvalue.
The index i must be between 0 and the length of the string less one.
Bounds checking is performed -- if the index is out of range then an
exception of type RWBoundsErr will be thrown.
wchar_t&
operator()(size_t i);
wchar_t
operator()(size_t i) const;
Return the ith character. The first variant can be used as an lvalue.
The index i must be between 0 and the length of the string less one.
Bounds checking is performed if the pre-processor macro RWBOUNDS_CHECK
has been defined before including <rw/wstring.h>. In this case, if the
index is out of range, then an exception of type RWBoundsErr will be
thrown.
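
The checked form of indexing can be sketched as follows. std::out_of_range stands in for RWBoundsErr, and the names checkedAt and rejects are hypothetical; the sketch only illustrates the contract that an out-of-range index raises an exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical checked accessor: an out-of-range index raises an
// exception instead of reading memory outside the string.
wchar_t checkedAt(const std::wstring& s, std::size_t i) {
    if (i >= s.size())
        throw std::out_of_range("index out of range");
    return s[i];
}

// Returns true if checkedAt() rejects the index.
bool rejects(const std::wstring& s, std::size_t i) {
    try {
        checkedAt(s, i);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

Valid indices run from 0 to length less one, so for a three-character string index 2 is accepted and index 3 is rejected.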
RWWSubString
operator()(size_t start, size_t len);
const RWWSubString
operator()(size_t start, size_t len) const;
Substring operator. Returns an RWWSubString of self with length len,
starting at index start. The first variant can be used as an lvalue.
The sum of start plus len must be less than or equal to the string
length. If the library was built using the RWDEBUG flag, and start and
len are out of range, then an exception of type RWBoundsErr will be
thrown.
Public Member Functions
RWWString&
append(const wchar_t* cs);
Append a copy of the null-terminated wide character string pointed to by
cs to self. Returns a reference to self.
RWWString&
append(const wchar_t* cs, size_t N);
Append a copy of the wide character string cs to self. Exactly N wide
characters are copied, including any embedded nulls. Hence, the buffer
pointed to by cs must be at least N*sizeof(wchar_t) bytes long. Returns
a reference to self.
RWWString&
append(const RWWString& cstr);
Append a copy of the string cstr to self. Returns a reference to self.
RWWString&
append(const RWWString& cstr, size_t N);
Append the first N characters or the length of cstr (whichever is less)
of cstr to self. Returns a reference to self.
size_t
binaryStoreSize() const;
Returns the number of bytes necessary to store the object using the
global function:
RWFile& operator<<(RWFile&, const RWWString&);
size_t
capacity() const;
Return the current capacity of self. This is the number of characters
the string can hold without resizing.
size_t
capacity(size_t capac);
Hint to the implementation to change the capacity of self to capac.
Returns the actual capacity.
int
collate(const RWWString& str) const;
int
collate(const wchar_t* str) const;
Returns an int less than, greater than, or equal to zero, according to
the result of calling the POSIX function ::wscoll() on self and the
argument str. This supports locale-dependent collation.
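
Locale-dependent collation of this kind can be tried out with the Standard C function wcscoll(), used here as a stand-in for the ::wscoll() call named above. The helper name collateWide is hypothetical.

```cpp
#include <cwchar>
#include <string>

// Hypothetical helper: three-way, locale-dependent comparison via
// std::wcscoll(). The ordering follows the LC_COLLATE category of the
// current C locale; in the default "C" locale it matches code-point order.
int collateWide(const std::wstring& a, const std::wstring& b) {
    return std::wcscoll(a.c_str(), b.c_str());
}
```

Because wcscoll() works on null-terminated strings, this sketch compares only up to the first embedded null in either argument.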
int
compareTo(const RWWString& str,
caseCompare = RWWString::exact) const;
int
compareTo(const wchar_t* str,
caseCompare = RWWString::exact) const;
Returns an int less than, greater than, or equal to zero, according to
the result of calling the Standard C library function ::memcmp() on self
and the argument str. Case sensitivity is according to the caseCompare
argument, and may be RWWString::exact or RWWString::ignoreCase.
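
The effect of the caseCompare argument can be sketched with a small three-way comparison. The enum CaseMode and the function compareWide are hypothetical stand-ins, and case folding via towlower() is an assumption made for this example rather than a statement about the library's internals.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwctype>
#include <string>

enum CaseMode { Exact, IgnoreCase };  // hypothetical stand-in for caseCompare

// Hypothetical three-way comparison: negative, zero, or positive.
int compareWide(const std::wstring& a, const std::wstring& b,
                CaseMode mode = Exact) {
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        wchar_t x = a[i];
        wchar_t y = b[i];
        if (mode == IgnoreCase) {
            x = static_cast<wchar_t>(std::towlower(x));
            y = static_cast<wchar_t>(std::towlower(y));
        }
        if (x != y)
            return x < y ? -1 : 1;
    }
    // All shared positions are equal: the shorter string sorts first.
    if (a.size() == b.size())
        return 0;
    return a.size() < b.size() ? -1 : 1;
}
```

With these definitions, "Beantown" and "beantown" differ under Exact and compare equal under IgnoreCase.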
RWBoolean
contains(const RWWString& cs,
caseCompare = RWWString::exact) const;
RWBoolean
contains(const wchar_t* str,
caseCompare = RWWString::exact) const;
Pattern matching. Returns TRUE if cs occurs in self. Case sensitivity
is according to the caseCompare argument, and may be RWWString::exact or
RWWString::ignoreCase.
const wchar_t*
data() const;
Access to the RWWString's data as a null terminated string. This datum
is owned by the RWWString and may not be deleted or changed. If the
RWWString object itself changes or goes out of scope, the pointer value
previously returned will become invalid. While the string is null-
terminated, note that its length is still given by the member function
length(). That is, it may contain embedded nulls.
size_t
first(wchar_t c) const;
Returns the index of the first occurrence of the wide character c in
self. Returns RW_NPOS if there is no such character or if there is an
embedded null prior to finding c.
size_t
first(wchar_t c, size_t) const;
Returns the index of the first occurrence of the wide character c in
self. Continues to search past embedded nulls. Returns RW_NPOS if there
is no such character.
size_t
first(const wchar_t* str) const;
Returns the index of the first occurrence in self of any character in
str. Returns RW_NPOS if there is no match or if there is an embedded
null prior to finding any character from str.
size_t
first(const wchar_t* str, size_t N) const;
Returns the index of the first occurrence in self of any character in
str. Exactly N characters in str are checked including any embedded
nulls so str must point to a buffer containing at least N wide
characters. Returns RW_NPOS if there is no match.
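
The counted form of this search corresponds closely to std::wstring::find_first_of with an explicit count. The wrapper name firstOf is hypothetical, and std::wstring::npos plays the role of RW_NPOS.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: index of the first character of s that appears
// among the first n characters of set, or npos if none does. Because n is
// explicit, set may itself contain null characters.
std::size_t firstOf(const std::wstring& s, const wchar_t* set,
                    std::size_t n) {
    return s.find_first_of(set, 0, n);
}
```

For example, searching "There is no joy" for any of 'x', 'y', 'j' finds the 'j' at index 12.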
unsigned
hash(caseCompare = RWWString::exact) const;
Returns a suitable hash value.
size_t
index(const wchar_t* pat,size_t i=0,
caseCompare = RWWString::exact) const;
size_t
index(const RWWString& pat,size_t i=0,
caseCompare = RWWString::exact) const;
Pattern matching. Starting with index i, searches for the first
occurrence of pat in self and returns the index of the start of the
match. Returns RW_NPOS if there is no such pattern. Case sensitivity is
according to the caseCompare argument; it defaults to RWWString::exact.
size_t
index(const wchar_t* pat, size_t patlen,size_t i,
caseCompare) const;
size_t
index(const RWWString& pat, size_t patlen,size_t i,
caseCompare) const;
Pattern matching. Starting with index i, searches for the first
occurrence of the first patlen characters from pat in self and returns
the index of the start of the match. Returns RW_NPOS if there is no such
pattern. Case sensitivity is according to the caseCompare argument.
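
For exact matching, the pattern search with an explicit pattern length behaves much like std::wstring::find with a count. The wrapper name indexOf is hypothetical and std::wstring::npos stands in for RW_NPOS; case-insensitive matching is not covered by this sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: starting at index start, find the first occurrence
// of the first patlen characters of pat in s. Returns npos if not found.
std::size_t indexOf(const std::wstring& s, const wchar_t* pat,
                    std::size_t patlen, std::size_t start) {
    return s.find(pat, start, patlen);
}
```

In "There is no joy in Beantown" the pattern "Beantown" starts at index 19, and limiting patlen to 4 searches for "Bean" instead.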
RWWString&
insert(size_t pos, const wchar_t* cs);
Insert a copy of the null-terminated string cs into self at position pos.
Returns a reference to self.
RWWString&
insert(size_t pos, const wchar_t* cs, size_t N);
Insert a copy of the first N wide characters of cs into self at position
pos. Exactly N wide characters are copied, including any embedded nulls.
Hence, the buffer pointed to by cs must be at least N*sizeof(wchar_t)
bytes long. Returns a reference to self.
RWWString&
insert(size_t pos, const RWWString& str);
Insert a copy of the string str into self at position pos. Returns a
reference to self.
RWWString&
insert(size_t pos, const RWWString& str, size_t N);
Insert a copy of the first N wide characters or the length of str
(whichever is less) of str into self at position pos. Returns a
reference to self.
RWBoolean
isAscii() const;
Returns TRUE if it is safe to perform the conversion toAscii() (that is,
if all characters of self are ASCII characters).
RWBoolean
isNull() const;
Returns TRUE if this string has zero length (i.e., the null string).
size_t
last(wchar_t c) const;
Returns the index of the last occurrence in the string of the wide
character c. Returns RW_NPOS if there is no such character.
size_t
length() const;
Return the number of characters in self.
RWWString&
prepend(const wchar_t* cs);
Prepend a copy of the null-terminated wide character string pointed to by
cs to self. Returns a reference to self.
RWWString&
prepend(const wchar_t* cs, size_t N);
Prepend a copy of the character string cs to self. Exactly N characters
are copied, including any embedded nulls. Hence, the buffer pointed to
by cs must be at least N*sizeof(wchar_t) bytes long. Returns a
reference to self.
RWWString&
prepend(const RWWString& str);
Prepends a copy of the string str to self. Returns a reference to self.
RWWString&
prepend(const RWWString& cstr, size_t N);
Prepend the first N wide characters or the length of cstr (whichever is
less) of cstr to self. Returns a reference to self.
istream&
readFile(istream& s);
Reads characters from the input stream s, replacing the previous contents
of self, until EOF is reached. The input stream is treated as a sequence
of multibyte characters, each of which is converted to a wide character
(using the Standard C library function mbtowc()) before storing. Null
characters are treated the same as other characters.
istream&
readLine(istream& s, RWBoolean skipWhite = TRUE);
Reads characters from the input stream s, replacing the previous contents
of self, until a newline (or an EOF) is encountered. The newline is
removed from the input stream but is not stored. The input stream is
treated as a sequence of multibyte characters, each of which is converted
to a wide character (using the Standard C library function mbtowc())
before storing. Null characters are treated the same as other
characters. If the skipWhite argument is TRUE, then whitespace is
skipped (using the iostream library manipulator ws) before saving
characters.
istream&
readString(istream& s);
Reads characters from the input stream s, replacing the previous contents
of self, until an EOF or null terminator is encountered. The input
stream is treated as a sequence of multibyte characters, each of which is
converted to a wide character (using the Standard C library function
mbtowc()) before storing.
istream&
readToDelim(istream& s, wchar_t delim=(wchar_t)'\n');
Reads characters from the input stream s, replacing the previous contents
of self, until an EOF or the delimiting character delim is encountered.
The delimiter is removed from the input stream but is not stored. The
input stream is treated as a sequence of multibyte characters, each of
which is converted to a wide character (using the Standard C library
function mbtowc()) before storing. Null characters are treated the same
as other characters.
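
A simplified version of this read-and-convert behavior can be sketched with the standard library. The names readWideToDelim and firstLine are hypothetical. The sketch assumes a single-byte delimiter and the current C locale, and converts one character at a time with mbtowc(), so null bytes are kept like any other character.

```cpp
#include <cstddef>
#include <cstdlib>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read bytes up to (not including) delim, consume the
// delimiter, then convert the bytes to wide characters one at a time.
bool readWideToDelim(std::istream& in, std::wstring& out,
                     char delim = '\n') {
    out.clear();
    std::string bytes;
    if (!std::getline(in, bytes, delim))
        return false;
    std::mbtowc(nullptr, nullptr, 0);           // reset any shift state
    const char* p = bytes.data();
    const char* end = p + bytes.size();
    while (p < end) {
        wchar_t wc = L'\0';
        int len = std::mbtowc(&wc, p, static_cast<std::size_t>(end - p));
        if (len < 0)
            return false;                       // invalid multibyte sequence
        if (len == 0)
            len = 1;                            // a null byte: wc is L'\0'
        out.push_back(wc);
        p += len;
    }
    return true;
}

// Usage demonstration: first line of an in-memory text, as wide characters.
std::wstring firstLine(const std::string& text) {
    std::istringstream in(text);
    std::wstring line;
    readWideToDelim(in, line);
    return line;
}
```

The delimiter is removed from the stream by getline() and is not stored in the result, matching the behavior described above.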
istream&
readToken(istream& s);
Whitespace is skipped before storing characters into wide string.
Characters are then read from the input stream s, replacing previous
contents of self, until trailing whitespace or an EOF is encountered. The
trailing whitespace is left on the input stream. Only ASCII whitespace
characters are recognized, as defined by the standard C library function
isspace(). The input stream is treated as a sequence of multibyte
characters, each of which is converted to a wide character (using the
Standard C library function mbtowc()) before storing.
RWWString&
remove(size_t pos);
Removes the characters from the position pos, which must be no greater
than length(), to the end of string. Returns a reference to self.
RWWString&
remove(size_t pos, size_t N);
Removes N wide characters or to the end of string (whichever comes first)
starting at the position pos, which must be no greater than length().
Returns a reference to self.
RWWString&
replace(size_t pos, size_t N, const wchar_t* cs);
Replaces N wide characters or to the end of string (whichever comes
first) starting at position pos, which must be no greater than length(),
with a copy of the null-terminated string cs. Returns a reference to
self.
RWWString&
replace(size_t pos, size_t N1,const wchar_t* cs, size_t N2);
Replaces N1 characters or to the end of string (whichever comes first)
starting at position pos, which must be no greater than length(), with a
copy of the string cs. Exactly N2 characters are copied, including any
embedded nulls. Hence, the buffer pointed to by cs must be at least
N2*sizeof(wchar_t) bytes long. Returns a reference to self.
RWWString&
replace(size_t pos, size_t N, const RWWString& str);
Replaces N characters or to the end of string (whichever comes first)
starting at position pos, which must be no greater than length(), with a
copy of the string str. Returns a reference to self.
RWWString&
replace(size_t pos, size_t N1,
const RWWString& str, size_t N2);
Replaces N1 characters or to the end of string (whichever comes first)
starting at position pos, which must be no greater than length(), with a
copy of the first N2 characters, or the length of str (whichever is
less), from str. Returns a reference to self.
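
The "whichever comes first" clamping in the replace() family has a close analogue in std::wstring::replace, which the sketch below wraps. The name replaceRange is hypothetical and the code does not use RWWString.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper mirroring the four-argument form: replace up to n1
// characters starting at pos with the first n2 characters of repl (or all
// of repl if it is shorter). std::wstring::replace clamps n1 to the end of
// the string and throws std::out_of_range if pos exceeds the length.
std::wstring& replaceRange(std::wstring& s, std::size_t pos,
                           std::size_t n1, const std::wstring& repl,
                           std::size_t n2) {
    return s.replace(pos, n1, repl, 0, n2);
}
```

Replacing 100 characters at index 19 of "There is no joy in Beantown" with the first 7 characters of "Redmond, WA" therefore yields "There is no joy in Redmond".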
void
resize(size_t n);
Changes the length of self, adding blanks (i.e., L' ') or truncating as
necessary.
RWWSubString
strip(stripType s = RWWString::trailing, wchar_t c = L' ');
const RWWSubString
strip(stripType s = RWWString::trailing, wchar_t c = L' ') const;
Returns a substring of self where the character c has been stripped off
the beginning, end, or both ends of the string. The first variant can be
used as an lvalue. The enum stripType can take values:
stripType    Meaning
leading      Remove characters at beginning
trailing     Remove characters at end
both         Remove characters at both ends
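
The three stripType modes can be illustrated with a standard-library sketch. The enum StripType and the function stripCopy are hypothetical. Unlike strip(), which returns a substring of self, this sketch returns an independent copy.

```cpp
#include <cstddef>
#include <string>

enum StripType { Leading, Trailing, Both };  // hypothetical stand-in

// Return a copy of s with runs of c removed from the chosen end(s).
std::wstring stripCopy(const std::wstring& s, StripType how = Trailing,
                       wchar_t c = L' ') {
    std::size_t begin = 0;
    std::size_t end = s.size();
    if (how == Leading || how == Both)
        while (begin < end && s[begin] == c)
            ++begin;
    if (how == Trailing || how == Both)
        while (end > begin && s[end - 1] == c)
            --end;
    return s.substr(begin, end - begin);
}
```

With the defaults, only trailing blanks are removed, which mirrors the default arguments shown in the signature above.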
RWWSubString
subString(const wchar_t* cs, size_t start=0,
caseCompare = RWWString::exact);
const RWWSubString
subString(const wchar_t* cs, size_t start=0,
caseCompare = RWWString::exact) const;
Returns a substring representing the first occurrence of the null-
terminated string pointed to by "cs". Case sensitivity is according to
the caseCompare argument; it defaults to RWWString::exact. The first
variant can be used as an lvalue.
RWCString
toAscii() const;
Returns an RWCString object of the same length as self, containing only
ASCII characters. Any non-ASCII characters in self simply have the high
bits stripped off. Use isAscii() to determine whether this function is
safe to use.
RWCString
toMultiByte() const;
Returns an RWCString containing the result of applying the standard C
library function wcstombs() to self. This function is always safe to
use.
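
The wide-to-multibyte direction can be sketched with wcstombs() directly. The helper name narrowMultiByte is hypothetical, the encoding is that of the current C locale, and, unlike an RWWString, the sketch stops at the first embedded null.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: convert a wide string to multibyte form with
// std::wcstombs(). Returns an empty string if some character has no
// multibyte representation in the current locale.
std::string narrowMultiByte(const std::wstring& w) {
    // MB_CUR_MAX is the most bytes any one character needs in this locale.
    std::vector<char> buf(w.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(buf.data(), w.c_str(), buf.size());
    if (n == static_cast<std::size_t>(-1))
        return std::string();
    return std::string(buf.data(), n);
}
```

The buffer is sized from MB_CUR_MAX so that the conversion cannot run out of space regardless of the locale's encoding.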
void
toLower();
Changes all upper-case letters in self to lower-case. Uses the C library
function towlower().
void
toUpper();
Changes all lower-case letters in self to upper-case. Uses the C library
function towupper().
Static Public Member Functions
static unsigned
hash(const RWWString& wstr);
Returns the hash value of wstr as returned by
wstr.hash(RWWString::exact).
static size_t
initialCapacity(size_t ic = 15);
Sets the minimum initial capacity of an RWWString, and returns the old
value. The initial setting is 15 wide characters. Larger values will use
more memory, but result in fewer resizes when concatenating or reading
strings. Smaller values will waste less memory, but result in more
resizes.
static size_t
maxWaste(size_t mw = 15);
Sets the maximum amount of unused space allowed in a wide string should
it shrink, and returns the old value. The initial setting is 15 wide
characters. If more than mw characters are wasted, then excess space
will be reclaimed.
static size_t
resizeIncrement(size_t ri = 16);
Sets the resize increment when more memory is needed to grow a wide
string. Returns the old value. The initial setting is 16 wide
characters.
Related Global Operators
RWBoolean
operator==(const RWWString&, const wchar_t* );
RWBoolean
operator==(const wchar_t*, const RWWString&);
RWBoolean
operator==(const RWWString&, const RWWString&);
RWBoolean
operator!=(const RWWString&, const wchar_t* );
RWBoolean
operator!=(const wchar_t*, const RWWString&);
RWBoolean
operator!=(const RWWString&, const RWWString&);
Logical equality and inequality. Case sensitivity is exact.
RWBoolean
operator< (const RWWString&, const wchar_t* );
RWBoolean
operator< (const wchar_t*, const RWWString&);
RWBoolean
operator< (const RWWString&, const RWWString&);
RWBoolean
operator> (const RWWString&, const wchar_t* );
RWBoolean
operator> (const wchar_t*, const RWWString&);
RWBoolean
operator> (const RWWString&, const RWWString&);
RWBoolean
operator<=(const RWWString&, const wchar_t* );
RWBoolean
operator<=(const wchar_t*, const RWWString&);
RWBoolean
operator<=(const RWWString&, const RWWString&);
RWBoolean
operator>=(const RWWString&, const wchar_t* );
RWBoolean
operator>=(const wchar_t*, const RWWString&);
RWBoolean
operator>=(const RWWString&, const RWWString&);
Comparisons are done lexicographically, byte by byte. Case sensitivity
is exact. Use member collate() or strXForm() for locale sensitivity.
RWWString
operator+(const RWWString&, const RWWString&);
RWWString
operator+(const wchar_t*, const RWWString&);
RWWString
operator+(const RWWString&, const wchar_t* );
Concatenation operators.
ostream&
operator<<(ostream& s, const RWWString& str);
Output an RWWString on ostream s. Each character of str is first
converted to a multibyte character before being shifted out to s.
istream&
operator>>(istream& s, RWWString& str);
Calls str.readToken(s). That is, a token is read from the input stream
s.
RWvostream&
operator<<(RWvostream&, const RWWString& str);
RWFile&
operator<<(RWFile&, const RWWString& str);
Saves string str to a virtual stream or RWFile, respectively.
RWvistream&
operator>>(RWvistream&, RWWString& str);
RWFile&
operator>>(RWFile&, RWWString& str);
Restores a wide character string into str from a virtual stream or
RWFile, respectively, replacing the previous contents of str.
Related Global Functions
RWWString
strXForm(const RWWString&);
Returns a string transformed by ::wsxfrm(), to allow quicker collation
than RWWString::collate().
RWWString
toLower(const RWWString& str);
Returns a version of str where all upper-case characters have been
replaced with lower-case characters. Uses the C library function
towlower().
RWWString
toUpper(const RWWString& str);
Returns a version of str where all lower-case characters have been
replaced with upper-case characters. Uses the C library function
towupper().
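
The copy-returning case conversions can be sketched with towlower() and towupper() applied character by character. The names lowerCopy and upperCopy are hypothetical stand-ins for the global functions above, and the mapping used is that of the current C locale.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: return a lower-case copy; the argument is untouched.
std::wstring lowerCopy(std::wstring s) {
    std::transform(s.begin(), s.end(), s.begin(), [](wchar_t c) {
        return static_cast<wchar_t>(std::towlower(c));
    });
    return s;
}

// Hypothetical helper: return an upper-case copy; the argument is untouched.
std::wstring upperCopy(std::wstring s) {
    std::transform(s.begin(), s.end(), s.begin(), [](wchar_t c) {
        return static_cast<wchar_t>(std::towupper(c));
    });
    return s;
}
```

Both helpers take their argument by value, so the caller's string is left unchanged, just as with the global toLower() and toUpper().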