the complete guide to mozilla/string

by Scott Collins

last modified 8 April 2001

Abstract

This document provides an introduction to the design and use of the string classes in mozilla, detailed information on their implementation and how one may extend them, and answers to frequently asked questions about strings.

contents

A note to potential editors: don't even consider modifying this document with an HTML editor. That would destroy the internal formatting and make patches unmanageable.


user's guide

Strings in mozilla are a world apart from char*s. If you don't know why they are different, this section is the place for you to start. If you're already familiar with the hierarchy of string classes in mozilla, then you might want to skip ahead to the implementor's guide or the FAQ.

introduction

what is and what isn't a string?

A string is an opaque container holding a linear sequence of characters, possibly of zero length. Understanding the implications of this statement is the foundation for understanding all of mozilla's string classes.

readable and writable

promises

flat strings

encoding

sharing

using the string classes correctly; using the correct string class

basic string operations

comparison

concatenation

substrings

find and replace

conversions

calling a function that expects a different kind of string

converting between string classes

converting between encodings

selecting the right string class

user string classes

selecting the right string class for a parameter

selecting the right string class for a local variable

selecting the right string class for a member variable

selecting the right string class for a return value

selecting the right string class in IDL

don'ts

using string iterators

what is an iterator?

reading iterators and writing iterators

'chunky' iterating for efficiency

copy_string, character sources and sinks

encoding conversion iterators

summary


implementor's guide


frequently asked questions

is there any string doc?
Yes, you're soaking in it!
I have a string, how do I get a pointer to the characters?
...
What is the best way to return a string?

There are several reasonable ways to produce a string result from a function. If you are already holding the answer as a sharable string, you can simply return that string (pass-by-value). Otherwise, the most efficient and flexible way to return a string is to assign your result into a non-const reference parameter. Don't bother to create a sharable string from scratch with your generated result.

Why? The two things you want to minimize in string manipulation are, in order of importance, heap allocation and moving characters around.
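
As a rough illustration of the reference-parameter style, here is a minimal sketch. It uses std::string as a stand-in so that it compiles on its own; real mozilla code would take one of the abstract string types instead (for example, the nsAString that appears later in this FAQ). The names String, GetGreeting and LastGreeting are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>

// ASSUMPTION: std::string stands in for a mozilla string class, purely so
// that this sketch is self-contained.
using String = std::string;

// Hypothetical function: writes its answer into |aResult| instead of
// building a new string and returning it by value.
void GetGreeting(const String& aName, String& aResult)
{
  aResult.assign("hello, ");  // may reuse storage |aResult| already owns
  aResult.append(aName);
}

// Hypothetical caller: one destination string is reused across calls, so a
// later call can write into storage that an earlier call allocated.
String LastGreeting()
{
  String greeting;
  GetGreeting("mozilla", greeting);
  GetGreeting("string", greeting);   // overwrites the previous result
  assert(greeting == "hello, string");
  return greeting;
}
```

The caller decides where the characters end up, so the function never has to allocate a string of its own just to hand back a result.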

If I have a PRUnichar *aKey [or other representation of a wide] string, what can I use (easily :) to convert it to a printf() printable string? Just for debugging...
If it's just for debugging, you probably wouldn't care if something odd was printed in the case of a UCS2 character that didn't have an ASCII equivalent. The simplest thing in this case is to make a temporary conversion using NS_ConvertUCS2toUTF8. Remember not to hold onto the pointer you get out of this beyond the lifetime of the temporary.
const PRUnichar* aKey;

printf("%s\n", NS_ConvertUCS2toUTF8(aKey).get());       // GOOD
  // the simplest way to get a |printf|-able |const char*| out of a string

  // works just as well with a formal wide string type...
const nsAString& aString = ...;  // perhaps it's a parameter
printf("%s\n", NS_ConvertUCS2toUTF8(aString).get());


  // But don't hold onto the pointer longer than the lifetime of the temporary!
const char* cstring = NS_ConvertUCS2toUTF8(aKey).get(); // BAD!
printf("%s\n", cstring);                                // |cstring| is dangling
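
If you do need the pointer for more than one statement, one option is to give the converted string a name so that it outlives the pointer. The sketch below shows that lifetime rule with a self-contained stand-in. The class ConvertUCS2toUTF8 here is hypothetical, not the real NS_ConvertUCS2toUTF8; it assumes only that the converter owns its buffer and exposes it through |get()|, as in the examples above. DebugPrintKey is likewise a made-up helper.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for NS_ConvertUCS2toUTF8: it owns the converted
// bytes, and |get()| points into storage that dies with the object.
// ASSUMPTION: input is plain UCS2; surrogate pairs are not handled.
class ConvertUCS2toUTF8
{
public:
  explicit ConvertUCS2toUTF8(const char16_t* aSource)
  {
    for (; *aSource; ++aSource) {
      const unsigned c = *aSource;
      if (c < 0x80) {                       // 1 byte: ASCII
        mBuffer += static_cast<char>(c);
      } else if (c < 0x800) {               // 2 bytes
        mBuffer += static_cast<char>(0xC0 | (c >> 6));
        mBuffer += static_cast<char>(0x80 | (c & 0x3F));
      } else {                              // 3 bytes: rest of the UCS2 range
        mBuffer += static_cast<char>(0xE0 | (c >> 12));
        mBuffer += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        mBuffer += static_cast<char>(0x80 | (c & 0x3F));
      }
    }
  }

  const char* get() const { return mBuffer.c_str(); }

private:
  std::string mBuffer;
};

// Hypothetical debugging helper: the named local |utf8| lives until the end
// of the function, so the pointer from |get()| stays valid for every use.
void DebugPrintKey(const char16_t* aKey)
{
  assert(aKey);
  ConvertUCS2toUTF8 utf8(aKey);
  const char* cstring = utf8.get();         // OK: |utf8| outlives |cstring|
  std::printf("key:   %s\n", cstring);
  std::printf("again: %s\n", cstring);
}
```

The difference from the BAD line above is only where the converted string lives: a named local lasts until the end of its scope, while an unnamed temporary is destroyed at the end of the statement that created it.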