diff --git a/mozilla/string/doc/string-guide.html b/mozilla/string/doc/string-guide.html index 6d0d25831b1..47dc2d03a6e 100644 --- a/mozilla/string/doc/string-guide.html +++ b/mozilla/string/doc/string-guide.html @@ -540,6 +540,1937 @@ void PrintSomeStrings( const nsAString& aString, const PRUnichar* aKey, const ns + +
+ Here are the email answers I have yet to format into the FAQ. + Some of the URLs may be outdated or moved. + The messages are in order from oldest to newest. +
++Date: Thu, 13 Apr 2000 19:41:47 -0400 ++ +
Encoding Wars + +
This message is all about strings and the various encodings that might +be used to interpret their contents, the ramifications of that, and +where we're heading. The point of this message is to say what we're +currently thinking, and get feedback. I apologize in advance for the +rambling, and for the fact that this message may accidentally mix +discussion of how things are and how they will be. + +
There are many different possible encodings. Three in common use in +the Mozilla source base are: ASCII, UCS2, and UTF8. In ASCII, every +character fits in 7-bits and is typically stored in an 8-bit byte. We +usually represent ASCII strings with nsCStrings, nsXPIDLCStrings, +or char string literals. In UCS2, characters occupy 16 bits each. +We usually represent UCS2 strings as nsStrings, etc., i.e., two-byte +or `wide' strings. UTF8 is a multi-byte encoding. A character might +occupy one, two, or three bytes. It is easiest to store and +manipulate such a string within a single-byte or `narrow' string +implementation. + +
None of our current string implementations know the encoding of the +data they hold at any given moment. An nsCString might legitimately +hold data encoded in ASCII, UTF8, or even EBCDIC for that matter. + +
Operations that convert from one encoding to another, or operations +that are encoding sensitive (e.g., to_upper), rightly belong in +i18n. The fact that our current string interfaces automatically and +implicitly convert between wide and narrow strings is actually the +source of many errors in two particular categories: (1) unintended +extra work, (2) mistaken re-encoding, e.g., accidentally `converting' +a UTF8 string to UCS2 by pretending the UTF8 string is ASCII and then +padding with '\0's. + +
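To make error category (2) concrete, here is a minimal sketch using standard C++ types as stand-ins (`std::string` for a narrow string, `std::u16string` for a wide one). The names `widen_by_padding` and `is_ascii` are hypothetical and only imitate what an implicit narrow-to-wide conversion does; they are not part of the string classes discussed here.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an implicit narrow-to-wide conversion:
// every byte is zero-extended to one 16-bit unit, as if it were ASCII.
std::u16string widen_by_padding(const std::string& narrow)
{
    std::u16string wide;
    wide.reserve(narrow.size());
    for (unsigned char byte : narrow)
        wide.push_back(static_cast<char16_t>(byte));
    return wide;
}

// True only when every byte is 7-bit, i.e. only when padding really is
// a correct ASCII-to-UCS2 conversion.
bool is_ascii(const std::string& narrow)
{
    for (unsigned char byte : narrow)
        if (byte > 0x7F)
            return false;
    return true;
}
```

For example, U+00E9 is the two bytes C3 A9 in UTF8. Padding turns it into the two units 0x00C3 0x00A9 rather than the single unit 0x00E9, which is exactly the kind of silent mistake an explicit conversion name is meant to expose.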
We've known these were bad for a long time, and have been trying to +find the right way to fix them. The current thinking is to just byte +the bullet and eliminate implicit conversions. That has interesting +ramifications. + +
+void foo( const nsString& aUCS2string );
+
+foo("hello"); // works! constructs a temporary |nsString| by
+ // converting the ASCII literal with padding.
+ // Note: this requires an allocation
+
+Though we've always hated this form since it requires a heap +allocation. In current code, we recommend + +
+foo( nsAutoString("hello") );
+
+which still copy/converts, but at least it probably doesn't need to do +a heap allocation. In the best of all worlds, no conversion, copying, +or allocation would be necessary. To do that, you would need to be +able to directly specify a UCS2 string, e.g., with the L"hello" +notation, and wrap that in an interface that just held a pointer. +E.g., something like + +
+void foo( const nsAReadableString& aUCS2string ); + +foo( nsLiteralString(L"hello") ); ++
There are problems with this example, however. The L notation +specifically makes objects that are arrays of wchar_t, which under +GCC is a 4-byte element. This leads to incompatibility with JS, and +the annoyance of possibly bloated storage (I'm sort of minimizing the +situation here. It's worse than I make it sound). More about tricks +to get around this in a bit, but first, let me talk about what to do +in the meantime while we're just getting rid of implicit constructors. + Initially to get around this problem (what problem? The problem that +foo("hello") stopped compiling on my machine when I threw the +switch) I made a routine called NS_ConvertToString which looked like +this + +
+inline
+nsAutoString
+NS_ConvertToString( const char* anASCIIstring )
+ {
+ nsAutoString aUCS2string;
+ aUCS2string.AssignWithConversion(anASCIIstring);
+ return aUCS2string;
+ }
+
+Which lets me write + +
+foo( NS_ConvertToString("hello") );
+
+This was OK, but in discussion there were concerns about performance +on machines that didn't inline well, and issues about naming. In +that meeting we came up with an alternate naming strategy that we +think has room for growth and an implementation more likely to be +efficient on every platform. The implementation is to define a new +class that derives from nsAutoString, but allows construction from a +char* + +
+class NS_ConvertASCIItoUCS2 : public nsAutoString
+ {
+ public:
+ NS_ConvertASCIItoUCS2( const char* );
+ // ...
+ };
+
+Which gives identical (though renamed) notation for calling foo: + +
+foo( NS_ConvertASCIItoUCS2("hello") );
+
+It looks like a function call to an explicit encoding conversion. It +acts like a function call to an explicit encoding conversion. It is +a function call to an explicit encoding conversion. We think that +this naming pattern has room for growth. In the meeting, we concluded +that the best representation for encoding conversions is a family of +functions, and NS_ConvertASCIItoUCS2 fits right in. We think that +XPCOM probably can't live without the ASCII to UCS2 conversion (though +as explicit as possible) but that all others rightly belong in i18n +land. + +
You can probably deduce from the clues in NS_ConvertToString, above, +that constructors weren't the only thing that became explicit. +Assignment, appending, comparison, et al, got renamed so that when +assigning, appending, or comparing to a value in a different encoding +the `WithConversion' form must be used. E.g., + +
+nsString aUCS2string; +nsCString anASCIIstring; +// ... + +aUCS2string += anASCIIstring; // Currently legal, but not for long +aUCS2string.Append(anASCIIstring); // same + +aUCS2string.AppendWithConversion(anASCIIstring); // the new way + +if ( aUCS2string == anASCIIstring ) // Sorry, this is going away too + // ... + +if ( aUCS2string.EqualsWithConversion(anASCIIstring) ) + // ... ++
Yes, it's long and annoying. Just like the extra work you were +implicitly asking to have done, perhaps incorrectly. There are other +reasons to rename these functions. When nsString and nsCString +defined a ton of, e.g., Appends each there was no problem, because +nobody wanted to override Append. Now, with strings inheriting from +abstract base classes we immediately run into the problem that +overriding and overloading don't mix very well in C++. Because of a +feature of C++ called name hiding, it is problematic to override only +a single signature of a name overloaded in a base class. The base +nsAWritableString provides several Appends, all for objects of +(hopefully) the same encoding. nsString can't easily add a bunch of +new Appends (the converting ones) without running face first into +the name hiding problem. The discussion of the fix for this is mostly +unrelated to encoding issues, so I'll defer it to another post. + +
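The name hiding problem can be shown in a few lines. This is a sketch with hypothetical classes (`WritableBase`, `HidesOverloads`, `KeepsOverloads`) built on `std::string`, not the real hierarchy. The using-declaration shown is one standard C++ remedy; it is not necessarily the fix chosen for the string classes, which is left to a later post.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the hierarchy: the base offers two overloads.
struct WritableBase
{
    std::string data;
    virtual ~WritableBase() {}
    virtual void Append(const std::string& s) { data += s; }
    virtual void Append(char c) { data += c; }
};

// Overriding a single signature hides every other |Append| of the base.
// With this class, h.Append('x') fails to compile: the char overload is
// no longer found by name lookup on the derived type.
struct HidesOverloads : WritableBase
{
    void Append(const std::string& s) override { data += '[' + s + ']'; }
};

// A using-declaration brings the hidden base overloads back into scope.
struct KeepsOverloads : WritableBase
{
    using WritableBase::Append;
    void Append(const std::string& s) override { data += '[' + s + ']'; }
};
```

Note that the hidden overload is still reachable through a reference to the base class; the hiding only affects lookup on the derived type.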
In hindsight, after the meeting, it seemed clear that all the +`WithConversion' forms would be better named + +
+xxxConvertingASCIItoUCS2 +xxxConvertingUCS2toASCII ++
however, the real goal (probably) is to move most such conversions +into i18n. Just bringing attention to the previously implicit +conversions is a good first step. Renaming these conversions as just +suggested is probably the right thing to do, though it sort of +validates them, which I'm not sure we really want. This is a decision +we need to discuss further. + +
Now, back to the string literal problem above. One possible solution +is to use a macro. Imagine + +
+NS_LITERAL_STRING("Hello")
+
+which on a machine where the L trick works, turns into + +
+nsLiteralString(L"Hello") ++
but on a machine where there is trouble, turns into something less +appealing, but more likely to work, like + +
+NS_ConvertASCIItoUCS2("Hello")
+
+Another solution is to add a compilation step that fixes L strings +on bad platforms to be non-L strings, but padded with \0s. E.g., +L"Hello" gets preprocessed into "\000H\000e\000l\000l\000o\000". +This solution is more annoying to the developer, where the prior +solution is more annoying during the runtime. + +
Before we go to too much trouble on this specific feature, we will +probably want to do more measurement to see just how much and how +often we are converting constant literal strings, and why. + + +
I'm currently ripping through the tree fixing things to use the +`WithConversion' forms where appropriate. I was also converting +things to use NS_ConvertToString where appropriate; unless I get +talked out of it, I want to switch midstream to +NS_ConvertASCIItoUCS2, then go back and fix up the +NS_ConvertToString instances later. I've set things up so I can +check in as I go. After all these conversions have been done, I'll be +able to throw the switch (what switch? NEW_STRING_APIS) which will +make nsString inherit from nsAWritableString, etc. and allow us to +start exploiting these other opportunities (e.g., for literal strings, +shared strings, etc. See +http://bugzilla.mozilla.org/show_bug.cgi?id=28221 for details and +reasoning.) + +
I guess I'm expecting comments on: + +
So as not to jumble the discussion, I'll be separately posting other +requests for comments about specific features of the design of the new +string hierarchy. + +
I hope this helps keep everybody filled in on what we're thinking and +able to point out what we're forgetting or screwing up :-) + + + + + +
+Date: Wed, 19 Apr 2000 21:12:47 -0400 +Subject: more string info ++ +
news://news.mozilla.org/scc-705460.16423913042000@news.mozilla.org + + + + + +
+Date: Fri, 26 May 2000 15:31:37 -0400 +Subject: Re: Question on == ++ +
I would prefer you compare with Equals (which should really be named +IsEqualTo) rather than operator==() because of this: + +
+char* a; +char* b; + +// ... + +if ( a == b ) + // ... ++
Comparing two raw `string' pointers doesn't compare the characters +they point to, but instead compares the bits of the pointers. For +this reason, I may eventually make comparison of a string with a +pointer using operators just go away. + + + + + +
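A minimal sketch of the difference, in plain C++ with hypothetical helper names (`same_characters`, `same_address`):

```cpp
#include <cassert>
#include <cstring>

// Compares the characters held by two zero-terminated buffers.
bool same_characters(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}

// Compares only the addresses; the characters are never examined.
bool same_address(const char* a, const char* b)
{
    return a == b;
}
```

Two separate buffers that both hold "mozilla" satisfy the first test but not the second, which is why an explicitly named comparison is harder to misread than operator==() on pointers.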
+Date: Wed, 14 Jun 2000 14:38:55 -0400 +Subject: Re: Fix to XprtDefs.h ++ +
Yes, we're aware that turning off wchar_t support makes wchar_t be +a synonym for unsigned short under Metrowerks. We know that the +current version of VC++ also makes these types equivalent. In theory, +though, the types are distinct even when they are the same size and +shape. By using real wchar_t support, we are forced to recognize +the distinction and navigate it appropriately with reinterpret_cast +(via NS_REINTERPRET_CAST). The win here is that we aren't caught by +compiler changes that suddenly make some set of compilers compliant +and therefore break our code. We will add an autoconf test that lets +UNIX compilers opt in to our string scheme when they have an +appropriately shaped wchar_t. If these happen to be compliant +compilers, all will be well. If they don't, the casts don't hurt, +because they are type correct. We are writing our code to meet the +standard as we move forward. + +
The win for us is realized by the following macros + +
+#ifdef HAVE_CPP_2BYTE_WCHAR_T + #define NS_LITERAL_STRING(s) nsLiteralString(L##s, \ + (sizeof(L##s)/sizeof(wchar_t))-1) +#else + #define NS_LITERAL_STRING(s) NS_ConvertASCIItoUCS2(s, \ + sizeof(s)-1) +#endif ++
An nsLiteralString points directly to the literal characters. No +copying, no conversion, and the length calculation happens at compile +time. This has turned out to be as large a savings as 15% of code +space and 8% of data space, net, in our string test harness. It's +faster as well, again by eliminating the copying, conversion, and +length calculation. We don't know yet what those numbers translate +into in our real code base, but we have high hopes. + +
I don't want to be in the position to ask you to change your code. I +don't think it's appropriate for me to do so. The AIM application +that is your client is our client as well. They need to resolve this +difference between us in whatever way they think best. That may mean +asking you if changing your apis is the right thing to do. Or it may +mean applying the casts. Our code-base and yours, Justin, are more +like cousins. I don't think you should have to change just to conform +to us. You may think my arguments for using real wchar_t have +merit, and adopt similar usage just because you agree; but I think the +only obligation you have is to follow the technical solution you think +is right for your code. + +
If you decide to make this api change, it will mean shipping a new +binary (on Mac) for your library to clients who want to switch over to +the new api (since the name mangling will be different, and therefore, +the link requirements will change). + +
Hope this helps, + + + + + +
+Date: Thu, 15 Jun 2000 19:36:55 -0400 +Subject: Re: Checkin approval for bug 32336 ++ +
+S.Equals(NS_LITERAL_STRING("bar"), PR_TRUE, 3)
+
+doesn't compile because there is no three parameter form for Equals. + For all definitions of Equals on strings, see "nsAReadableString.h" + +
http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h + +
There is an EqualsWithConversion that takes three parameters. + +
http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.h#731 + +
It is ``EqualsWithConversion'' because it admits the possibility of an +encoding specific transformation, in this case to provide +case-insensitive comparison. This also wouldn't compile, however, +since, at the moment, an nsLiteralString doesn't provide an operator +to produce a const PRUnichar* (though perhaps it should), and it +doesn't satisfy the other interfaces that match this call, e.g., a +const nsString&. + +
Perhaps I need to move case-insensitive comparison up out of +nsString into a global encoding specific transformations and +algorithms file (which was on its way anyway as Waterson knows); this +use is one bit of evidence to support this. In the short term, this +can be fixed (if we think the current behavior is wrong) by providing +operator const CharT*() const on literal string. + +
If you can live without case-folding, the earlier form is preferred + +
+S == NS_LITERAL_STRING("bar")
+
+if you can't, then one of the fixes I mentioned is in order. + + + + + +
+Date: Thu, 15 Jun 2000 19:47:12 -0400 +Subject: Re: [Fwd: how to use nsString ?] ++ +
+ >I see these same examples time and again in the embedding + >samples/docs, but I can't compile them. ++ +
Apologies. Documentation mentioning strings is getting out of date. +Here are some specific answers. + + +
+ >nsString URLString("http://www.mozilla.org");
+
+
+...is now perhaps best expressed as + + nsString URLString( NS_LITERAL_STRING("http://www.mozilla.org") ); + +
since an nsString is a sequence of 2-byte wide characters, and the +routines that implicitly convert 1-byte sequences (like the literal +sequence you specified, "http:...") are now gone. + +
Up until not too long ago, one would have had to say + +
+nsString URLString;
+URLString.AssignWithConversion("http://www.mozilla.org");
+
+The NS_LITERAL_STRING construction is new machinery that has the +potential to make many operations much more efficient. + +
+ >nsString URLString;
+ >URLString.SetString("www.mozilla.org");
+
+
+SetString was a synonym for Assign or assignment with +operator=(); it too went away. The equivalent is the second +example I gave above, that is, the one with AssignWithConversion. + +
Assign still exists. AssignWithConversion takes on that +functionality for assignments that require encoding transformations +(e.g., from ASCII to UCS2). SetString is gone, since it was always +a synonym for Assign. + +
Learn more about the general APIs for strings that we are trying to +move to by examining + +http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h +http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h + +
Hope this helps, + + + + + +
+Date: Thu, 15 Jun 2000 21:26:51 -0400 +Subject: Re: Checkin approval for bug 32336 ++ +
+ >I *need* the count attribute, because I need to compare only the first + >chars (that's inherent to the logic). ++ +
This is what substrings are for. In that case, you could use + +
+Substring(S, 0, 3) == NS_LITERAL_STRING("bar")
+
+As for case-folding, it's best if you can case-fold everything up +front, instead of doing it repeatedly. I'll have to get back to you +on a general solution to that problem, or what my schedule for getting +it checked in would be. I'm sorry, I know that's not what you needed +to hear. If the source string is an nsString, you can continue to +exploit its implementation of these routines, e.g., ToLower all +up-front. + +
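The same two ideas (compare a sub-range, and fold case once up front) can be sketched with `std::string` as a stand-in. `range_equals` and `to_lower_ascii` are hypothetical helpers for illustration only, and the folding handles ASCII letters and nothing else.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical equivalent of |Substring(S, start, count) == expected|.
bool range_equals(const std::string& s, std::size_t start, std::size_t count,
                  const std::string& expected)
{
    if (start > s.size())
        return false;                       // nothing there to compare
    return s.compare(start, count, expected) == 0;
}

// Fold ASCII upper case to lower case once, before any comparisons.
std::string to_lower_ascii(std::string s)
{
    for (char& c : s)
        if ('A' <= c && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
    return s;
}
```

Folding the source string a single time and then doing plain comparisons avoids repeating the case conversion on every test.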
Hope this helps, + + + + + +
+Date: Mon, 19 Jun 2000 14:23:47 -0400 +Subject: Re: string fu ++ +
+ >It seems less convenient to have to first check path.IsEmpty, and + >then if false get path.Last and test it. ++ +
What would you prefer? That extracting a character not in the string +always return CharT(0)? Can't do it for two reasons: (1) 0 may be +a valid character in a particular encoding, so it can't be used in +general as a ``no character at that position'' marker; and (2) I can't +control what an individual string implementation does when asked to +get an out-of-bounds fragment, it's explicitly undefined. That means +the result of CharAt is explicitly undefined for indexes outside the +defined contents of the string. As a debugging convenience, I have +made this assert, but it has always been the case that retrieving such +a character had undefined results ... even in [the old] code. + +
OK, you might say, well at least let me ask for a character that is +only off the end by one. E.g., Last of an empty string. Reason (1) +from above still applies. How bad is it to say, for the case you gave + +
+PRBool needsDelim = PR_FALSE;
+if ( !path.IsEmpty() )
+ {
+ PRUnichar last = path.Last();
+ needsDelim = !(last == '/' || last == '\\');
+ }
+
+In general, you probably want to opt out of a whole lot of work when +the source string is empty. It is slightly less convenient, but it +doesn't tie us to a bunch of implementation specific mojo. + + +
+ >Can we fix GetUnicode in this case? ++ +
This is an annoying property of auto strings, e.g., that they always +have an allocated buffer. I'm happy to fix this bug, however, be +aware that GetUnicode and GetBuffer are artifacts of [the old] +implementation that we don't want to support. They are not part of +the abstract interface. We will keep them no longer than we have to. +They don't support our multi-fragment paradigm. People who require a +contiguous hunk of characters in the future, and are unwilling to +switch over to chunky-iterators, may be forced to copy the string to +their own buffer. There will be an implementation of narrow character +string that guarantees contiguous allocation and a zero-terminator, +much as nsCString does now, for compatibility with platform uses, +but this won't be the default string class. + + + + + +
+Date: Mon, 19 Jun 2000 17:22:31 -0400 ++ +
Clarifying String Semantics + +
Recently, I added an assert to the string operations that extract +characters, namely First(), Last(), CharAt(), and +operator[](). This assert fires when any of these routines are used +to access a character outside the defined contents of the string. For +First() and Last() that means whenever they are applied to an +empty string. For CharAt() and operator[](), that means whenever +they are used to access an index outside the range of +0..Length()-1. There have been some complaints; however, the +result was always undefined. What follows is extracted from an email +exchange between me and warren on this topic. I hope it clarifies +string semantics. + +
Warren writes: +
+ >I hit your funky CharAt assertion tonight in this piece of code:
+
+ >NS_IMETHODIMP
+ >nsIOService::ResolveRelativePath(
+ > const char *relativePath,
+ > const char* basePath,
+ > char **result )
+ > {
+ > nsCAutoString name;
+ > nsCAutoString path(basePath);
+ >
+ > PRUnichar last = path.Last();
+ > PRBool needsDelim = !(last == '/' || last == '\\' || last ==
+ > '\0');
+ > ...
+
+ >where basePath is null. It seems less convenient to have to first
+ >check path.IsEmpty, and then if false get path.Last and test it.
+
+
+I replied: +
+ >What would you prefer? That extracting a character not in the + >string always return CharT(0)? Can't do it for two reasons: + >(1) 0 may be a valid character in a particular encoding, so it + >can't be used in general as a ``no character at that position'' + >marker; and (2) I can't control what an individual string + >implementation does when asked to get an out-of-bounds fragment, + >it's explicitly undefined. That means the result of CharAt is + >explicitly undefined for indexes outside the defined contents of + >the string. As a debugging convenience, I have made this assert, + >but it has always been the case that retrieving such a character + >had undefined results ... even in [the old] code. + + >OK, you might say, well at least let me ask for a character that + >is only off the end by one. E.g., Last of an empty string. + >Reason (1) from above still applies. How bad is it to say, for the + >case you gave + + > PRBool needsDelim = PR_FALSE; + > if ( !path.IsEmpty() ) + > { + > PRUnichar last = path.Last(); + > needsDelim = !(last == '/' || last == '\\'); + > } + + >In general, you probably want to opt out of a whole lot of work + >when the source string is empty. It is slightly less convenient, + >but it doesn't tie us to a bunch of implementation specific mojo. ++ +
Warren also asks: +
+ >Here's another issue, perhaps more serious. If I say this:
+
+ > foo(const PRUnichar* s) {
+ > nsAutoString str(s);
+ > bar(str.GetUnicode());
+ > }
+
+ >where s is null, bar will get passed a zero-length PRUnichar
+ >sequence instead of null. This makes it so that you can't just
+ >test for the argument == null. You have to nsCRT::strlen(arg) == 0
+ >which is much less efficient. Can we fix GetUnicode in this case?
+
+
+And I reply: +
+ >This is an annoying property of auto strings, e.g., that they + >always have an allocated buffer. I'm happy to fix this bug, + >however, be aware that GetUnicode and GetBuffer are artifacts + >of [the old] implementation that we don't want to support. They + >are not part of the abstract interface. We will keep them no + >longer than we have to. They don't support our multi-fragment + >paradigm. People who require a contiguous hunk of characters in + >the future, and are unwilling to switch over to chunky-iterators, + >may be forced to copy the string to their own buffer. There will + >be an implementation of narrow character string that guarantees + >contiguous allocation and a zero-terminator, much as nsCString + >does now, for compatibility with platform uses, but this won't be + >the default string class. ++ +
In a later message, Chris Waterson asks a related question +
+ >scc: should we add operator PRUnichar*() to
+ >NS_ConvertASCIItoUCS2?
+
+
+And I reply: +
+ >It seems reasonable. A lot more reasonable than forcing people to + >call GetUnicode(). I alluded to platform specific classes in an + >earlier message to warren that you were cc'd on, Chris. I imagine + >that the ...Convert... routines would be required to produce + >contiguous allocation 0-terminated strings (though the as yet + >unimplemented ...Copy... forms, of course, wouldn't). So operator + >const PRUnichar*() const makes perfect sense to me here. ++ +
Hope this makes sense, + + + + +
+Date: Tue, 20 Jun 2000 04:05:31 -0400 +Subject: Re: NS_LITERAL_STRING is broken ++ +
The behavior you describe sounds exactly like when you say + +
+const char* foobar = "foobar"; + +... NS_LITERAL_STRING(foobar).GetUnicode() ... ++
because in this case, the thing passed in is a const char*. +NS_LITERAL_STRING is not meant to be used in this way. It is only +meant to be used around a " delimited string. The type of such is +const char[N] where N is the number of characters in the string + 1 +for the zero terminator it helpfully adds. sizeof such a type is +N. + +
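The array-versus-pointer distinction is easy to check in isolation. The sketch below uses a hypothetical macro `LITERAL_LENGTH` that mirrors the `sizeof(s)-1` length trick, plus a stricter template variant `literal_length` (also hypothetical, and requiring C++11) that refuses pointers at compile time. Neither is part of the actual string headers.

```cpp
#include <cassert>
#include <cstddef>

// Same length trick as the narrow branch of the macro: sizeof an array
// of N chars is N, and the -1 drops the zero terminator.
#define LITERAL_LENGTH(s) (sizeof(s) - 1)   // hypothetical name

// A stricter variant: it binds only to real arrays, so passing a
// |const char*| is a compile error instead of a silently wrong length.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N])
{
    return N - 1;
}

static_assert(literal_length("foobar") == 6, "length is known at compile time");
```

Applied to a quoted literal, the macro yields 6 for "foobar". Applied to a `const char*` variable, it yields the size of a pointer minus one, whatever the string contains, which matches the symptom described above.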
Are you sure you had the actual string as an argument, as in your +example to me? Or could the actual code have been like my sample, +above? + + + + + +
+Date: Thu, 29 Jun 2000 13:35:10 -0400 +Subject: Re: a fix ++ +
+ > + if (Length() == 0) { return nsnull; }
+
+
+
+Dave, + +
please read + + news://news.mozilla.org/scc-314ABF.14261619062000@news.mozilla.org + +
It's just plain wrong to let people try to index into a string outside +its defined contents. I can't just return '\0' or PRUnichar('\0') +there as that could be a legal value to have somewhere in your +string for some encodings ... and the encoding is not specified. So +your patch has the basic problem of defeating my plan to stop people +from doing this bad thing. + +
The second problem with your patch is that you use the symbolic +constant nsnull, which is ostensibly a pointer value; Last returns +a character. nsnull is not appropriate for that purpose. In fact, +C++ gurus pretty much eschew the use of symbolic constants for 0. +NULL is to be avoided. nsnull is wrong-headed in that it presumes +we could have some other application specific value for NULL. We +can't, it would never work. It's just wasted brain-print. Always use +0 for these situations, and if you want to communicate the fact that +something is a pointer type, either use a comment or a +(construction-style) cast, like so (graded examples from worst to +best:) + +
Don't let this discourage you; keep up the good work :-) + + + + + +
+Date: Tue, 8 Aug 2000 23:47:16 -0400 +Subject: Re: nsWritingIterator? ++ +
+ >Can you give me any pointers to examples, or docs, or just some + >general advice? ++ + http://ScottCollins.net/Journal/discussion/string_iterators.html + +
does this help? + +
I can personally walk you through any specific scenario you need. + + + + + +
+Date: Wed, 9 Aug 2000 02:35:03 -0400 +Subject: Re: nsWritingIterator? ++ +
You got it right... it's nsWritingIterator
Here are three examples of running through a string and modifying some +of the characters in it. All use nsWritingIterators. + + +
+ // inefficient, but works in a pinch:
+ // iterators can hide all details of chunks by acting like
+ // a raw character pointer
+
+nsWritingIterator<PRUnichar> s = S.BeginWriting();
+nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
+
+ // for each character in the string |S|
+while ( s != done_with_string )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+
+ ++s; // advance to the next character
+ }
+
+
+
+
+ // efficient
+ // iterators provide a mechanism by which you can process
+ // a chunk-at-a-time
+
+nsWritingIterator<PRUnichar> iter = S.BeginWriting();
+nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
+
+ // for each chunk of the string
+while ( iter != done_with_string )
+ {
+ size_t N = iter.size_forward(); // # of chars in this chunk
+ PRUnichar* s = iter.get();
+ PRUnichar* done_with_chunk = s + N;
+
+ // for each character in this chunk
+ for ( ; s < done_with_chunk; ++s )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+ }
+
+ // advance the iterator past characters
+ // we examined (and into the next chunk, if any)
+ iter += N;
+ }
+
+
+
+ // elegant
+ // pull your transformation into a `sink', and |copy_string|
+ // will efficiently pump any kind of string into it
+
+struct Capitalize
+ {
+ // inline
+ PRUint32
+ write( PRUnichar* s, PRUint32 N )
+ // processes one chunk, called repeatedly by |copy_string|
+ {
+ PRUnichar* done_with_chunk = s + N;
+
+ // for each character in this chunk
+ for ( ; s < done_with_chunk; ++s )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+ }
+
+ return N; // tell |copy_string| how many characters were consumed
+ }
+ };
+
+copy_string(S.BeginWriting(), S.EndWriting(), Capitalize());
+
+Does this show it better? + + + + + +
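For readers without the tree at hand, the sink idea can be tried out in standard C++. Everything here is a stand-in: `ChunkedString` is a hypothetical multi-fragment string modeled as a vector of chunks, and `pump_chunks` plays the role that |copy_string| plays above. It is a sketch of the pattern, not the real iterator machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a multi-fragment string: a list of chunks.
typedef std::vector<std::u16string> ChunkedString;

// Sink that capitalizes ASCII lower-case letters in place, one chunk per call.
struct Capitalize
{
    std::size_t write(char16_t* s, std::size_t n)
    {
        char16_t* done_with_chunk = s + n;
        for ( ; s < done_with_chunk; ++s)
            if (u'a' <= *s && *s <= u'z')
                *s = static_cast<char16_t>(*s - u'a' + u'A');
        return n;   // number of characters consumed
    }
};

// Pumps every non-empty chunk into the sink; the sink never needs to
// know where the chunk boundaries fall.
template <class Sink>
void pump_chunks(ChunkedString& str, Sink& sink)
{
    for (std::u16string& chunk : str)
        if (!chunk.empty())
            sink.write(&chunk[0], chunk.size());
}
```

The transformation lives entirely in the sink, so the same `Capitalize` works whether the string is one contiguous buffer or many fragments.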
+Date: Thu, 17 Aug 2000 18:23:22 -0400 ++ +
+ >I tried looking at the string header files but they + >are awfully complicated. ++ +
I'll explain things in a little more detail than you need, so +that some of the stuff you see in these headers will make more sense. +I'll also answer your questions out of order. + +
First: the string hierarchy looks like this + +http://ScottCollins.net/Journal/discussion/string_hierarchy.gif + +
The two most important headers are: + +http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h +http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h + +
These abstract classes, nsAReadable[C]String, and +nsAWritable[C]String are typically what you will want to use in the +interfaces of new code. If you write a piece of code that takes a +string for input, consider, e.g., + +
+void consumes_a_string( const nsAReadableString& aInput ); ++
If you write a piece of code that modifies a string, consider + +
+void modifies_a_string( nsAWritableString& aResult ); ++
When creating your own classes, member strings will typically be +nsStrings. When you can't avoid creating a short string that you +need only temporarily during a function, you will typically use +nsAutoString. When someone passes you a raw pointer, or a raw +pointer and a length, representing a buffer of characters that you may +examine, but won't own, you can treat it like a string by wrapping it +in an nsLiteralString, e.g., + +
+void
+reads_a_buffer( const PRUnichar* aInput, PRUint32 aInputLength )
+ {
+ nsLiteralString input(aInput, aInputLength);
+ // doesn't allocate or copy
+
+ // ...
+ }
+
+You will use nsLiteralString around quoted constant strings as well, +though typically through the NS_LITERAL_STRING macro, to avoid doing +a length calculation + +
+NS_LITERAL_STRING("x")
+
+expands to + +
+nsLiteralString(L"x", (sizeof(L"x")/sizeof(PRUnichar) - 1)) ++
if L notation works as needed on your platform. + +Those are the basics. Now onto your questions: + + +
+ >For example this won't compile. [...] + + >str1 += L"abc " + str2 + L"def"; ++ + +
L"abc " makes an object that is a const wchar_t[5], and none of +the string code knows about wchar_t. The main reason is that +wchar_t is not necessarily the right size (it can be 4 bytes under +gcc). If you wrap these constant expressions in NS_LITERAL_STRING, +as described above, you should get the right thing, e.g., + +
+str1 += NS_LITERAL_STRING("abc ") + str2 + NS_LITERAL_STRING("def");
+
++ >Another one is: + >function(const PRUnichar *foo); + >call function(L"abc " + str2); + + >It won't create a temporary nsString. ++ +
This one, I have a quick and easy explanation for. If function was +declared like this + +
+function( const nsAReadableString& ) ++
then, no problem, since a nsPromiseConcatenation (which was the +result of adding those two things together) is a readable string. +No other objects need to be created; no copying needs to be performed. + +
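Why a concatenation can be read without copying is easier to see in a toy version. `ConcatPromise` below is a hypothetical miniature built on `std::string`; it only illustrates the idea of an object that remembers its two operands, and is not how nsPromiseConcatenation is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical miniature of a promise: it only remembers its two operands.
class ConcatPromise
{
public:
    ConcatPromise(const std::string& left, const std::string& right)
        : mLeft(left), mRight(right) {}

    std::size_t Length() const { return mLeft.size() + mRight.size(); }

    char CharAt(std::size_t i) const
    {
        assert(i < Length());   // reading outside the contents is an error
        return i < mLeft.size() ? mLeft[i] : mRight[i - mLeft.size()];
    }

private:
    const std::string& mLeft;    // references, not copies
    const std::string& mRight;
};

// A consumer that reads through the promise without building a new string.
inline std::size_t count_char(const ConcatPromise& s, char c)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.Length(); ++i)
        if (s.CharAt(i) == c)
            ++n;
    return n;
}
```

Because the promise holds references, both operands must outlive it. That is fine for the duration of a single function call, which is the situation described above.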
In all cases, we want the creation of nsStrings et al, to be +explicit, since creation is unbelievably expensive, requiring heap +allocation, locks, copying, etc. + +
I hope this answers both your posts, + + + + + +
+Date: Thu, 17 Aug 2000 20:57:08 -0400 +Subject: re our conversation ++ + return ToNewUnicode( nsLiteralCString(buffer) ); + + + + + + +
+Date: Fri, 18 Aug 2000 02:52:45 -0400 +Subject: Re: More questions and new string API ++ +
+ >1) How do I return a static string?
+
+ >const nsAReadableString& foo() {return NS_LITERAL_STRING("x");}
+ >errors on taking the address of a temporary variable.
+
+
+Unfortunately, NS_LITERAL_STRINGs definition is not particularly +amenable to this use. Instead, you would have to say something like +this: + +
+const nsAReadableString&
+foo()
+ {
+#ifdef HAVE_CPP_2BYTE_WCHAR_T
+ static nsLiteralString static_foo(L"x", 1);
+#else
+ static nsLiteralString static_foo;
+ static PRBool initialized = PR_FALSE;
+ if ( !initialized )
+ {
+ static_foo.AssignWithConversion("x", 1);
+ initialized = PR_TRUE;
+ }
+#endif
+ return static_foo;
+ }
+
++ >2) I'm using these with the STL library in an XPCOM component. + >What type should I use with map? This doesn't work... + + >typedef map+ +mapStringMyType; + >mapStringMyType foo; + >foo.find(nsAReadableString); - I want to find on a ReadableString +
I don't know what errors you are getting; but it probably doesn't work +because a reference isn't an assignable type. This is just a guess. +You may need to use + +
+map++
If you actually want the map to manage ownership of the keys, then +you'll want to use a concrete type, e.g., + +
+map++
or perhaps + +
+map++
Or maybe there's something else wrong. Send me the error messages. +If you end up using a pointer, then of course you'll have to supply a +comparison function to the map template. You won't be satisfied +with the default comparison of pointers :-) Sorry I couldn't answer +this one more completely. + + +
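A minimal sketch of the pointer-key case mentioned above, under stated assumptions: std::u16string stands in for a concrete Mozilla string class, and DerefLess, StringCountMap, and LookUp are hypothetical names invented for illustration. The comparison functor orders keys by the strings they point to rather than by address.

```cpp
#include <map>
#include <string>

// Stand-in for a concrete string type; real code would use a Mozilla string.
typedef std::u16string StandInString;

// Orders two pointer keys by the strings they point to,
// not by the addresses themselves.
struct DerefLess
{
  bool operator()(const StandInString* a, const StandInString* b) const
  {
    return *a < *b;
  }
};

// The map does not own its keys; the pointed-to strings must outlive it.
typedef std::map<const StandInString*, int, DerefLess> StringCountMap;

// Looks a key up by content; returns -1 if no key with that content exists.
inline int
LookUp(const StringCountMap& aMap, const StandInString& aKey)
{
  StringCountMap::const_iterator it = aMap.find(&aKey);
  return it == aMap.end() ? -1 : it->second;
}
```

Because the functor dereferences its arguments, a lookup with a different string object that has the same contents still finds the entry.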
+ >3) How do a get a raw PRUnichar pointer out of nsAReadableString + >when I need to call something that wants 'unsigned short *'? ++ +
The problem with this scenario is that an nsAReadableString doesn't +promise that all its data is contiguous, nor that it is +zero-terminated, which is what I suspect you want in this case. If +the function you want to call can take {pointer, length} tuples, and +can consume the string in hunks without zero termination ... then you +can use copy_string to pump the string into your function, see + + http://ScottCollins.net/Journal/discussion/string_iterators.html + +
If not, and you absolutely have to have a contiguous zero-terminated +buffer, then there is a new facility (part of the DOMAPI branch) that +does what you need. It's not checked in on the trunk; it should +be in early next week. It is nsPromiseFlatString. This class +promises a contiguous zero-terminated buffer; and has an operator +PRUnichar* to produce a pointer to that buffer automatically. If the +underlying class is one that happens to be a single fragment and +zero-terminated, then, like nsPromiseSubstring and +nsPromiseConcatenation, this class merely holds a reference into the +original data. If, however, the underlying string is multi-fragment +or not zero-terminated, then nsPromiseFlatString allocates a +contiguous buffer of appropriate size and copies the fragmented string +data to it. So given + +
+void ReadBuffer( PRUnichar* ); ++
You can call this as efficiently as possible with an arbitrary string +like so + +
+ReadBuffer( nsPromiseFlatString(aString) ); ++
If the function you are calling needs to take ownership of the buffer +you hand it, then you will probably call ToNewUnicode like so + +
+void ConsumeBuffer( PRUnichar* ); + +ConsumeBuffer( ToNewUnicode(aString) ); ++
The global function ToNewUnicode is declared in "nsReadableUtils.h", +and was only recently added to the build. It is currently being used +in the DOMAPI branch. It is part of the build, but the file +"dlldeps.c" in XPCOM may need to be modified to ensure it is exported +on your platform if you are building the tip. + +Needless to say, you want to avoid functions that require bare +pointers for several reasons: (a) they typically assume +zero-termination, which is not guaranteed by the normal encodings; (b) +they require contiguous allocation, which may not be possible; (c) +they scan for the end of the string, at linear cost (if the encoding +makes it possible at all), when the length could be known in advance. +If you have to do it, the above mechanisms work, but be aware of the +cost and the potential need to copy. + + +
+ >4) How do I declare a local variable to hold a nsAReadableString? + >and a member variable? ++ +
nsAReadableString is an abstract type. So you can't have a concrete +instance of it. All strings in the hierarchy are readable strings. +If you just want a reference to a readable string, you can say, e.g., + +
+struct foo
+ {
+ const nsAReadableString& mString;
+ // ...
+
+ foo( const nsAReadableString& aString ) : mString(aString) { }
+ };
+
+...similarly with pointers; but I suspect you are looking for +something more concrete. An nsString is a nsAReadableString, and +is the typical thing you want as a member variable. An nsAutoString +is also an nsAReadableString and is typically what you would use for +a short (in length) temporary (in lifetime) local variable, as I +mentioned in my previous post. + + +
+ >5) If I call a function that returns a PRUnichar* and I want t + >use it as a nsAReadableString should I wrap it in a + >nsLiteralString? ++ +
Yes, though remember, an nsLiteralString assumes the lifetime of the +underlying data is under someone else's control. If the called +function gives you a buffer that you need to delete, you will have +to manage that yourself. Currently, people often use nsXPIDLString +to handle that. XPIDL strings are not part of the hierarchy. They +are only used as a sort of string-auto_ptr. However, I'm +integrating their functionality into nsString. There is no problem +in wrapping the same pointer in both as two separate local variables, +one to give you the readable interface, and one to manage the +lifetime. + +
If it's OK with you, I'd like to post this reply (including your +quoted questions) to n.p.m.xpcom and also put a copy near the string +iterator discussion I provided a link to above, so that other people +with similar questions can see these answers. + +
Hope this helps, + + + + + +
+Date: Sun, 3 Sep 2000 03:52:17 -0400 ++ +
In article <8nu9m2$eo14@secnews.netscape.com>, "Jon Smirl"
+ Thanks, and I appreciate your comments and insights.
+
+
+>
+> 1) Should there be a nsSegmentedString derived from nsString instead
+> of building segment support into nsString? None of my strings are
+> segmented but
+> I keep executing code that is supports it. nsPromiseFlatString would
+> be trivial in the non-segmented case.
+
+ The general case is that a string does not promise to have contiguous
+data. A specific case is that, for some implementations, it does.
+You couldn't do it the other way around, because a segmented string
+couldn't satisfy all the promises of a flat string. However, through
+the use of chunky iterators, operating on strings that happen to be
+flat is very efficient. In fact, nsPromiseFlatString is trivial in
+the non-segmented case. In addition, I'll be adding an abstract flat
+class into the hierarchy, which will present additional interface ...
+in your local routines where you actually have declared a concrete
+string instance that happens to be flat, the compiler will give you
+the benefit of using the flat specific routines (e.g., a substring
+object over a flat string is simpler than the general purpose
+substring). I need to be cautious about this, though, since I don't
+automatically want people propagating the flat type through their
+interfaces. That would put us in the same boat we're in right now ...
+where routines only work on a specific kind of string, which denies
+other parts of the code the opportunity to use an implementation
+beneficial to its specific needs, and typically for no good reason.
+
+>
+> 2) Should nsAWritableString have a way to get the buffer and then
+> return it?
+> I need to get the buffer to pass it to OS calls. I'm doing this now
+> by passing around nsStrings instead of the interface. If I just use
+> the interface I encur an extra copy since I have to use a temporary
+> buffer.
+
+ A specific string implementation could promise this, but in general, a
+writable could not. After all, a writable doesn't even guarantee
+contiguous storage. To some degree, this is what
+nsPromiseFlatString is for. However, this is a readable promise
+only. It will also be the case that ns[C]Strings, in the very near
+future, will be able to just assume ownership of an arbitrary buffer
+allocated on the free store with the XPCOM allocators ... getting one
+to give up its buffer, on the other hand, presents some problems. Do
+you have a lot of places where the system writes into your string
+buffer space? Or do you have a lot of system routines that return you
+new buffers? I can imagine using nsPromiseFlatString for this, but
+what happens when the OS alters the underlying data? If the promise
+had generated that flat data on behalf of a multi-fragment string,
+should it now put the changes back? It's possible to do, I just want
+to know if it's correct to allow this situation to happen.
+
+
+
+>
+> 3) There needs to be a NS_LITERAL_CHAR() to go along with
+> NS_LITERAL_STRING().
+
+ OK.
+
+
+
+> Having NS_LITERAL_STRING() all over the code clutters
+> it up and makes it hard to tell what the code is doing, could we
+> have a standard short alias for this?
+
+ Yes, I'll try to think of something ... perhaps NS_LSTR?
+
+
+> 4) nsLiteralString should support n.ToInteger(&error);
+
+ ToInteger is actually a bad interface. It's only good if your
+entire string is the number; this encourages you to edit your string
+until it is one, or perhaps copy the numeric part to another string.
+Better if you just sscanf a string (don't know if I can provide
+that in the general case, but I'm thinking about it), or else use
+regular C++ extractors (which wouldn't be too hard for me to
+provide), or else I could give you a ToInteger that works on a pair
+of iterators, extracting the integer from the digits between them.
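The iterator-pair idea can be sketched roughly as follows. ExtractInteger is a hypothetical helper, not an existing Mozilla routine, and it assumes the digits '0' through '9' are encoded contiguously and that the value fits in a long (overflow is not checked).

```cpp
// Hypothetical helper: parses an optional sign and the decimal digits in
// [first, last) and stores the value in result. Returns an iterator to the
// first character that was not consumed. If no digit was found, the return
// value equals first and result is left untouched. Overflow is not checked.
template <class Iterator>
Iterator
ExtractInteger(Iterator first, Iterator last, long& result)
{
  Iterator it = first;
  bool negative = false;
  if (it != last && (*it == '-' || *it == '+')) {
    negative = (*it == '-');
    ++it;
  }
  Iterator digitsBegin = it;
  long value = 0;
  while (it != last && *it >= '0' && *it <= '9') {
    value = value * 10 + (*it - '0');
    ++it;
  }
  if (it == digitsBegin)
    return first;              // no digits: consume nothing
  result = negative ? -value : value;
  return it;
}
```

The caller points the iterators at the numeric part of a larger string, so nothing has to be edited or copied first.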
+
+>
+> 5) There should be a global define for an interface to a readonly
+> empty string.
+
+ Yes, there will be.
+
+
+>
+> 6) Something is wrong with concatenation....
+
+ Hopefully I've fixed this now.
+
+
+
+> 8) A forward definition is missing in the h files
+
+ I'll check it out.
+
+
+
+ My understanding is that you have already found the answers to your
+other questions.
+
+ I hope this helps,
+
+
+
+
+ nsMemory::Free
+
+
+
+
+
+ You use several NS_ConvertASCIItoUCS2("...").GetUnicode(), these should be
+
+ NS_LITERAL_STRING("...").get()
+
+ Don't do this to the very first case where you aren't wrapping an actual literal string.
+The first instance should exploit NS_LITERAL_STRING technology as well,
+around the initial declarations of the strings ... probably want to do this with
+NS_NAMED_LITERAL_STRING.
+
+
+
+ You can see from the line of code that you're on, that this should
+have been fine. nsMemory::Alloc would be asked to allocate a 1 byte
+object. But it failed trying to allocate that. Which suggests that
+the allocator was busy and non-reentrant and the debugger tried to
+misuse it. Yes?
+
+ Of course, this doesn't solve your problem. Perhaps we need to go
+back to the idea of a function that returns a pointer to the first
+hunk of the string.
+
+ This code should work regardless of what the allocator is doing. The
+downsides are (a) it only returns the first hunk of the string, in the
+case of a multi-fragment string; and (b) that hunk might not be
+zero-terminated.
+
+ Hope this helps,
+
+
+
+
+
+ At 3:04 PM -0400 10/11/00, Mike Shaver wrote:
+ Macro ugliness makes NS_LITERAL_STRING inappropriate for use over
+other macros. In other words:
+
+ is good.
+
+ is bad. Why? Because it turns into
+
+ and there is no LFOO. Sorry. If you have to do this to a
+macro-ized string, do the magic by hand, e.g.,
+
+ or else if you don't care that nsLiteralString will scan for the
+length, just say
+
+ Hope this helps,
+
+
+
+
+
+ Actually, I'm not even sure you can do it by hand, since you didn't
+
+ and can't do that cross-platform. The other way around this is to
+define a global instead of a macro, that is, instead of saying
+
+ at the top of your file, say
+
+ or else, if the macro was used only in one spot ... perhaps you could
+just eliminate the macro in favor of NS_NAMED_LITERAL in situ.
+
+ Arghh. In this case, you may be stuck with the extra work of
+AssignWithConversion.
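The LFOO problem described in the two messages above comes from how the preprocessor pastes tokens. The sketch below uses hypothetical stand-in macros (WIDEN_DIRECT, WIDEN_PASTE, WIDEN_EXPANDED), not the real Mozilla ones, and plain wchar_t, so it says nothing about the 2-byte requirement. The two-level form is a general preprocessor idiom and is not something the messages recommend.

```cpp
// Hypothetical stand-ins, not the real Mozilla macros.
// One-level pasting: the argument is pasted before it is macro-expanded,
// so WIDEN_DIRECT(FOO) would become the unknown identifier LFOO.
#define WIDEN_DIRECT(s) L##s

// Two-level pasting: the outer macro expands its argument first,
// then the inner macro pastes L onto the resulting literal.
#define WIDEN_PASTE(s) L##s
#define WIDEN_EXPANDED(s) WIDEN_PASTE(s)

#define FOO "foo"

// Fine: the argument is an actual quoted literal.
static const wchar_t kDirect[] = WIDEN_DIRECT("foo");

// Fine: FOO is expanded to "foo" before the paste happens.
static const wchar_t kViaMacro[] = WIDEN_EXPANDED(FOO);

// static const wchar_t kBroken[] = WIDEN_DIRECT(FOO);  // error: no LFOO
```

This only addresses the pasting step; on platforms where wchar_t is not 16 bits wide, the conversion concerns raised above still apply.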
+
+
+
+
+
+ No, there isn't. But you could move such special processing into the
+destructor of the sink. Remember, the sink is passed by reference, so
+you can exactly control its lifetime.
+
+ Hope this helps,
+
+
+
+
+
+ This is explicitly allowed. That's why I'm proposing to change the
+names of those classes to nsLocal[C]String.
+
+
+>2) Should nsString2x.h and nsString2x.cpp go away? They look like a
+>never-completed rewrite or something...
+
+ Yes. They should go away. They are uncompleted [old] bullshit,
+exactly as you diagnosed.
+
+ I'll look into the other two questions.
+
+
+
+
+
+ We've been removing implicit conversion operators because they
+_always_ lead to trouble. Usually they make it harder to pick the
+right function when overloading is involved and in the past they have
+led to huge performance suckage because we ended up doing conversions
+when we didn't need to because the implicit operator made us pick the
+wrong function.
+
+ It's borderline when the class implements something that is so
+close, as with a guaranteed flat string or an nsCOMPtr ... but the
+general recommendation is to avoid implicit conversions.
+
+ See bug #53057.
+
+
+
+
+
+ bug:
+ http://bugzilla.mozilla.org/show_bug.cgi?id=57087
+
+ patch:
+ http://bugzilla.mozilla.org/showattachment.cgi?attach_id=24576
+
+ This patch is supposed to add the ability to define very long literal
+strings more easily by breaking lines, e.g.,
+
+ The main danger in this scheme is callers who omit the inner NS_L
+wrapping, though I believe this will be caught at compile time as the
+wrong type initializer.
+
+ Seeking input from everybody, and waterson in particular.
+
+
+
+
+
+ There are some utilities in "xpcom/ds/nsReadableUtils.h". In
+particular, if you want to get back a new heap-allocated ASCII string
+with the minimal work, you would say
+
+ It's more efficient if you happen to already know the length. If you
+don't, don't bother counting, that's what I'll do in the constructor
+for nsLiteralString. If you do, then call like this
+
+ Other routines in that file will help you if, for instance, you wanted
+to translate into a buffer you had already allocated.
+
+ Hope this helps,
+
+
+
+
+
+ Here you go, Mike:
+
+ http://scottcollins.net/journal/discussion/mjudge-scratch.cpp
+
+
+
+
+
+
+ If you get an iterator into a string and you advance it all the way to
+the end of the string, and then keep trying to advance it, you hit
+this assert. This could happen, for example if you tried to copy 10
+characters out of a 9 character string. I've tried to make this
+impossible to get to. As far as I know, all my routines trim requests
+in advance of manipulating iterators. When you see this, you should
+get the stack. That will take you right to the bad spot.
+
+
+
+
+
+ You do know you are comparing two pointers now? It seems unlikely
+those two pointers would ever be the same pointer. You probably want
+to say something like
+
+ ...so that you compare the contents of two strings. Right now,
+you're just testing to see if two pointers both point to the same
+location in memory. A lot of people make this mistake. I would like
+to make it obvious to people that comparing two pointers does not
+compare strings. Can you tell me what gave you that impression so
+that I can figure out how to better educate people not to do this? By
+the way, it's not that I don't want to make this compare two
+strings; it's that in C++, you can't override operations for built-in
+types. And pointers are built-in types. So I can't make
+operator==(const PRUnichar*, const PRUnichar*) do anything different
+than it already does, which is the same thing it does for any other
+pointer.
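The difference between the two comparisons can be shown in a short sketch. char16_t stands in for PRUnichar here, and SameContents and SamePointer are hypothetical helper names; real code would use the string classes' own comparison routines.

```cpp
#include <cstddef>
#include <string>

// Compares the characters of two zero-terminated 16-bit strings.
// Hypothetical helper; char16_t stands in for PRUnichar.
inline bool
SameContents(const char16_t* a, const char16_t* b)
{
  typedef std::char_traits<char16_t> traits;
  const std::size_t lengthA = traits::length(a);
  return lengthA == traits::length(b) && traits::compare(a, b, lengthA) == 0;
}

// Compares only the addresses, which is all operator== does for pointers.
inline bool
SamePointer(const char16_t* a, const char16_t* b)
{
  return a == b;
}
```

Two separate buffers holding the same characters are equal under SameContents but not under SamePointer.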
+
+
+
+
+
+
diff --git a/mozilla/xpcom/string/doc/string-guide.html b/mozilla/xpcom/string/doc/string-guide.html
index 6d0d25831b1..47dc2d03a6e 100644
--- a/mozilla/xpcom/string/doc/string-guide.html
+++ b/mozilla/xpcom/string/doc/string-guide.html
@@ -540,6 +540,1937 @@ void PrintSomeStrings( const nsAString& aString, const PRUnichar* aKey, const ns
+
+
+ Here are the email answers I have yet to format into the FAQ.
+ Some of the URLs may be out-dated or moved.
+ The messages are in order from oldest to newest.
+ Encoding Wars
+
+ This message is all about strings and the various encodings that might
+be used to interpret their contents, the ramifications of that, and
+where we're heading. The point of this message is to say what we're
+currently thinking, and get feedback. I apologize in advance for the
+rambling, and for the fact that this message may accidentally mix
+discussion of how things are and how they will be.
+
+ There are many different possible encodings. Three in common use in
+the Mozilla source base are: ASCII, UCS2, and UTF8. In ASCII, every
+character fits in 7-bits and is typically stored in an 8-bit byte. We
+usually represent ASCII strings with nsCStrings, nsXPIDLCStrings,
+or char string literals. In UCS2, characters occupy 16 bits each.
+We usually represent UCS2 strings as nsStrings, etc., i.e., two-byte
+or `wide' strings. UTF8 is a multi-byte encoding. A character might
+occupy one, two, or three bytes. It is easiest to store and
+manipulate such a string within a single-byte or `narrow' string
+implementation.
+
+ None of our current string implementations know the encoding of the
+data they hold at any given moment. An nsCString might legitimately
+hold data encoded in ASCII, UTF8, or even EBCDIC for that matter.
+
+ Operations that convert from one encoding to another, or operations
+that are encoding sensitive (e.g., to_upper), rightly belong in
+i18n. The fact that our current string interfaces automatically and
+implicitly convert between wide and narrow strings is actually the
+source of many errors in two particular categories: (1) unintended
+extra work, (2) mistaken re-encoding, e.g., accidentally `converting'
+a UTF8 string to UCS2 by pretending the UTF8 string is ASCII and then
+padding with '\0's.
+
+ We've known these were bad for a long time, and have been trying to
+find the right way to fix them. The current thinking is to just bite
+the bullet and eliminate implicit conversions. That has interesting
+ramifications.
+
+ Though we've always hated this form since it requires a heap
+allocation. In current code, we recommend
+
+ which still copy/converts, but at least it probably doesn't need to do
+a heap allocation. In the best of all worlds, no conversion, copying,
+or allocation would be necessary. To do that, you would need to be
+able to directly specify a UCS2 string, e.g., with the L"hello"
+notation, and wrap that in an interface that just held a pointer.
+E.g., something like
+
+ There are problems with this example, however. The L notation
+specifically makes objects that are arrays of wchar_t, which under
+GCC is a 4-byte element. This leads to incompatibility with JS, and
+the annoyance of possibly bloated storage (I'm sort of minimizing the
+situation here. It's worse than I make it sound). More about tricks
+to get around this in a bit, but first, let me talk about what to do
+in the meantime while we're just getting rid of implicit constructors.
+ Initially to get around this problem (what problem? The problem that
+foo("hello") stopped compiling on my machine when I threw the
+switch) I made a routine called NS_ConvertToString which looked like
+this
+
+ Which lets me write
+
+ This was OK, but in discussion there were concerns about performance
+on machines that didn't inline well, and issues about naming. In
+that meeting we came up with an alternate naming strategy that we
+think has room for growth and an implementation more likely to be
+efficient on every platform. The implementation is to define a new
+class that derives from nsAutoString, but allows construction from a
+char*
+
+ Which gives identical (though renamed) notation for calling foo:
+
+ It looks like a function call to an explicit encoding conversion. It
+acts like a function call to an explicit encoding conversion. It is
+a function call to an explicit encoding conversion. We think that
+this naming pattern has room for growth. In the meeting, we concluded
+that the best representation for encoding conversions is a family of
+functions, and NS_ConvertASCIItoUCS2 fits right in. We think that
+XPCOM probably can't live without the ASCII to UCS2 conversion (though
+as explicit as possible) but that all others rightly belong in i18n
+land.
+
+ You can probably deduce from the clues in NS_ConvertToString, above,
+that constructors weren't the only thing that became explicit.
+Assignment, appending, comparison, et al, got renamed so that when
+assigning, appending, or comparing to a value in a different encoding
+the `WithConversion' form must be used. E.g.,
+
+ Yes, it's long and annoying. Just like the extra work you were
+implicitly asking to have done, perhaps incorrectly. There are other
+reasons to rename these functions. When nsString and nsCString
+defined a ton of, e.g., Appends each, there was no problem, because
+nobody wanted to override Append. Now, with strings inheriting from
+abstract base classes we immediately run into the problem that
+overriding and overloading don't mix very well in C++. Because of a
+feature of C++ called name hiding, it is problematic to override only
+a single signature of a name overloaded in a base class. The base
+nsAWritableString provides several Appends, all for objects of
+(hopefully) the same encoding. nsString can't easily add a bunch of
+new Appends (the converting ones) without running face first into
+the name hiding problem. The discussion of the fix for this is mostly
+unrelated to encoding issues, so I'll defer it to another post.
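Name hiding itself can be demonstrated with a small sketch. The classes below are hypothetical and only mimic the shape of the hierarchy; the using-declaration shown is one well-known C++ remedy and is not necessarily the fix deferred to the later post.

```cpp
#include <string>

// Hypothetical classes; they only mimic the shape of the real hierarchy.
struct WritableBase
{
  std::string mData;
  virtual ~WritableBase() { }
  virtual void Append(const std::string& s) { mData += s; }
  virtual void Append(char c) { mData += c; }
};

// Declaring any Append here hides BOTH base overloads by name, so
// Append('x') silently promotes the char to int.
struct HidingString : WritableBase
{
  void Append(int n) { mData += std::to_string(n); }
};

// A using-declaration brings the base overloads back into scope.
struct FriendlyString : WritableBase
{
  using WritableBase::Append;
  void Append(int n) { mData += std::to_string(n); }
};
```

With HidingString, a call such as Append(std::string("yz")) does not compile at all, because the only visible Append takes an int.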
+
+ In hindsight, after the meeting, it seemed clear that all the
+`WithConversion' forms would be better named
+
+ however, the real goal (probably) is to move most such conversions
+into i18n. Just bringing attention to the previously implicit
+conversions is a good first step. Renaming these conversions as just
+suggested is probably the right thing to do, though it sort of
+validates them, which I'm not sure we really want. This is a decision
+we need to discuss further.
+
+ Now, back to the string literal problem above. One possible solution
+is to use a macro. Imagine
+
+ which on a machine where the L trick works, turns into
+
+ but on a machine where there is trouble, turns into something less
+appealing, but more likely to work, like
+
+ Another solution is to add a compilation step that fixes L strings
+on bad platforms to be non-L strings, but padded with \0s. E.g.,
+L"Hello" gets preprocessed into "\000H\000e\000l\000l\000o\000".
+This solution is more annoying to the developer, where the prior
+solution is more annoying during the runtime.
+
+ Before we go to too much trouble on this specific feature, we will
+probably want to do more measurement to see just how much and how
+often we are converting constant literal strings, and why.
+
+
+ I'm currently ripping through the tree fixing things to use the
+`WithConversion' forms where appropriate. I was also converting
+things to use NS_ConvertToString where appropriate; unless I get
+talked out of it, I want to switch midstream to
+NS_ConvertASCIItoUCS2, then go back and fix up the
+NS_ConvertToString instances later. I've set things up so I can
+check in as I go. After all these conversions have been done, I'll be
+able to throw the switch (what switch? NEW_STRING_APIS) which will
+make nsString inherit from nsAWritableString, etc. and allow us to
+start exploiting these other opportunities (e.g., for literal strings,
+shared strings, etc. See
+http://bugzilla.mozilla.org/show_bug.cgi?id=28221 for details and
+reasoning.)
+
+ I guess I'm expecting comments on:
+
+ So as not to jumble the discussion, I'll be separately posting other
+requests for comments about specific features of the design of the new
+string hierarchy.
+
+ I hope this helps keep everybody filled in on what we're thinking and
+able to point out what we're forgetting or screwing up :-)
+
+
+
+
+
+ news://news.mozilla.org/scc-705460.16423913042000@news.mozilla.org
+
+
+
+
+
+ I would prefer you compare with Equals (which should really be named
+IsEqualTo) rather than operator==() because of this:
+
+ Comparing two raw `string' pointers doesn't compare the characters
+they point to, but instead compares the bits of the pointers. For
+this reason, I may eventually make comparison of a string with a
+pointer using operators just go away.
+
+
+
+
+
+ Yes, we're aware that turning off wchar_t support makes wchar_t be
+a synonym for unsigned short under Metrowerks. We know that the
+current version of VC++ also makes these types equivalent. In theory,
+though, the types are distinct even when they are the same size and
+shape. By using real wchar_t support, we are forced to recognize
+the distinction and navigate it appropriately with reinterpret_cast
+(via NS_REINTERPRET_CAST). The win here is that we aren't caught by
+compiler changes that suddenly make some set of compilers compliant
+and therefore break our code. We will add an autoconf test that lets
+UNIX compilers opt in to our string scheme when they have an
+appropriately shaped wchar_t. If these happen to be compliant
+compilers, all will be well. If they don't, the casts don't hurt,
+because they are type correct. We are writing our code to meet the
+standard as we move forward.
+
+ The win for us is realized by the following macros
+
+ An nsLiteralString points directly to the literal characters. No
+copying, no conversion, and the length calculation happens at compile
+time. This has turned out to be as large a savings as 15% of code
+space and 8% of data space, net, in our string test harness. It's
+faster as well, again by eliminating the copying, conversion, and
+length calculation. We don't know yet what those numbers translate
+into in our real code base, but we have high hopes.
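A rough sketch of why such a literal wrapper is cheap, under stated assumptions: LiteralView and LITERAL_VIEW are hypothetical stand-ins for nsLiteralString and NS_LITERAL_STRING, and the C++11 u"..." prefix is used here in place of L"..." so the element size is 16 bits on every platform.

```cpp
#include <cstddef>

// Hypothetical stand-in for nsLiteralString: it stores only a pointer and
// a length, so constructing it neither copies nor converts the characters.
struct LiteralView
{
  const char16_t* mData;
  std::size_t     mLength;

  LiteralView(const char16_t* aData, std::size_t aLength)
    : mData(aData), mLength(aLength) { }
};

// Hypothetical macro in the spirit of NS_LITERAL_STRING. The adjacent
// literals u"" and s are concatenated into one 16-bit literal. sizeof is
// evaluated at compile time; subtracting 1 drops the zero terminator.
#define LITERAL_VIEW(s) \
  LiteralView(u"" s, sizeof(u"" s) / sizeof(char16_t) - 1)
```

Because the macro relies on literal concatenation, passing anything other than a quoted string fails to compile instead of producing a wrong length.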
+
+ I don't want to be in the position to ask you to change your code. I
+don't think it's appropriate for me to do so. The AIM application
+that is your client is our client as well. They need to resolve this
+difference between us in whatever way they think best. That may mean
+asking you if changing your apis is the right thing to do. Or it may
+mean applying the casts. Our code-base and yours, Justin, are more
+like cousins. I don't think you should have to change just to conform
+to us. You may think my arguments for using real wchar_t have
+merit, and adopt similar usage just because you agree; but I think the
+only obligation you have is to follow the technical solution you think
+is right for your code.
+
+ If you decide to make this api change, it will mean shipping a new
+binary (on Mac) for your library to clients who want to switch over to
+the new api (since the name mangling will be different, and therefore,
+the link requirements will change).
+
+ Hope this helps,
+
+
+
+
+
+ doesn't compile because there is no three parameter form for Equals.
+ For all definitions of Equals on strings, see "nsAReadableString.h"
+
+ http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h
+
+ There is an EqualsWithConversion that takes three parameters.
+
+ http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsString2.h#731
+
+ It is ``EqualsWithConversion'' because it admits the possibility of an
+encoding specific transformation, in this case to provide
+case-insensitive comparison. This also wouldn't compile, however,
+since, at the moment, an nsLiteralString doesn't provide an operator
+to produce a const PRUnichar* (though perhaps it should), and it
+doesn't satisfy the other interfaces that match this call, e.g., a
+const nsString&.
+
+ Perhaps I need to move case-insensitive comparison up out of
+nsString into a global encoding specific transformations and
+algorithms file (which was on its way anyway as Waterson knows); this
+use is one bit of evidence to support this. In the short term, this
+can be fixed (if we think the current behavior is wrong) by providing
+operator const CharT*() const on literal string.
+
+ If you can live without case-folding, the earlier form is preferred
+
+ if you can't, then one of the fixes I mentioned is in order.
+
+
+
+
+
+ Apologies. Documentation mentioning strings is getting out of date.
+Here are some specific answers.
+
+
+ ...is now perhaps best expressed as
+
+ nsString URLString( NS_LITERAL_STRING("http://www.mozilla.org") );
+
+ since an nsString is a sequence of 2-byte wide characters, and the
+routines that implicitly convert 1-byte sequences (like the literal
+sequence you specified, "http:...") are now gone.
+
+ Up until not too long ago, one would have had to say
+
+ The NS_LITERAL_STRING construction is new machinery that has the
+potential to make many operations much more efficient.
+
+ SetString was a synonym for Assign or assignment with
+operator=(), it too went away. The equivalent is the second
+example I gave above, that is, the one with AssignWithConversion.
+
+ Assign still exists. AssignWithConversion takes on that
+functionality for assignments that require encoding transformations
+(e.g., from ASCII to UCS2). SetString is gone, since it was always
+a synonym for Assign.
+
+ Learn more about the general APIs for strings that we are trying to
+move to by examining
+
+http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h
+http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h
+
+ Hope this helps,
+
+
+
+
+
+ This is what substrings are for. In that case, you could use
+
+ As for case-folding, it's best if you can case-fold everything up
+front, instead of doing it repeatedly. I'll have to get back to you
+on a general solution to that problem, or what my schedule for getting
+it checked in would be. I'm sorry, I know that's not what you needed
+to hear. If the source string is an nsString, you can continue to
+exploit its implementation of these routines, e.g., ToLower all
+up-front.
+
+ Hope this helps,
+
+
+
+
+
+ What would you prefer? That extracting a character not in the string
+always return CharT(0)? Can't do it for two reasons: (1) 0 may be
+a valid character in a particular encoding, so it can't be used in
+general as a ``no character at that position'' marker; and (2) I can't
+control what an individual string implementation does when asked to
+get an out-of-bounds fragment, it's explicitly undefined. That means
+the result of CharAt is explicitly undefined for indexes outside the
+defined contents of the string. As a debugging convenience, I have
+made this assert, but it has always been the case that retrieving such
+a character had undefined results ... even in [the old] code.
+
+ OK, you might say, well at least let me ask for a character that is
+only off the end by one. E.g., Last of an empty string. Reason (1)
+from above still applies. How bad is it to say, for the case you gave
+
+ In general, you probably want to opt out of a whole lot of work when
+the source string is empty. It is slightly less convenient, but it
+doesn't tie us to a bunch of implementation specific mojo.
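A minimal sketch of opting out when the string is empty, for a hypothetical case (testing for a trailing slash). std::u16string stands in for a Mozilla string, and EndsWithSlash is an invented name.

```cpp
#include <string>

// Hypothetical helper; std::u16string stands in for a Mozilla string.
// Checks for the empty case first, so the last character is only read
// when it is actually part of the string.
inline bool
EndsWithSlash(const std::u16string& aPath)
{
  return !aPath.empty() && aPath[aPath.length() - 1] == u'/';
}
```

The empty check costs one comparison and keeps the character access inside the defined contents of the string.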
+
+
+ This is an annoying property of auto strings, e.g., that they always
+have an allocated buffer. I'm happy to fix this bug, however, be
+aware that GetUnicode and GetBuffer are artifacts of [the old]
+implementation that we don't want to support. They are not part of
+the abstract interface. We will keep them no longer than we have to.
+They don't support our multi-fragment paradigm. People who require a
+contiguous hunk of characters in the future, and are unwilling to
+switch over to chunky-iterators, may be forced to copy the string to
+their own buffer. There will be an implementation of narrow character
+string that guarantees contiguous allocation and a zero-terminator,
+much as nsCString does now, for compatibility with platform uses,
+but this won't be the default string class.
+
+
+
+
+
+ Clarifying String Semantics
+
+ Recently, I added an assert to the string operations that extract
+characters, namely First(), Last(), CharAt(), and
+operator[](). This assert fires when any of these routines are used
+to access a character outside the defined contents of the string. For
+First() and Last() that means whenever they are applied to an
+empty string. For CharAt() and operator[](), that means whenever
+they are used to access an index outside the range of
+0..Length()-1. There have been some complaints; however, the
+result of such an access was always undefined. What follows is extracted from an email
+exchange between me and warren on this topic. I hope it clarifies
+string semantics.
+
+ Warren writes:
+ I replied:
+ Warren also asks:
+ And I reply:
+ In a later message, Chris Waterson asks a related question
+ And I reply:
+ Hope this makes sense,
+
+
+
+
+ The behavior you describe sounds exactly like when you say
+
+ because in this case, the thing passed in is a const char*.
+NS_LITERAL_STRING is not meant to be used in this way. It is only
+meant to be used around a " delimited string. The type of such is
+const char[N] where N is the number of characters in the string + 1
+for the zero terminator it helpfully adds. sizeof such a type is
+N.
+
+ Are you sure you had the actual string as an argument, as in your
+example to me? Or could the actual code have been like my sample,
+above?
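+
+ The difference between a quoted literal and a const char* can be seen in plain C++, without any of the string classes. In this sketch literal_length, kArrayLength and pointer_size are made-up names for illustration; this is not how NS_LITERAL_STRING is written.

```cpp
#include <cassert>
#include <cstddef>

// Length of a quoted literal, computed at compile time from its array type.
// Only an array binds to this parameter; a |const char*| will not compile.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N])
{
    return N - 1;   // drop the zero terminator the compiler appended
}

constexpr std::size_t kArrayLength = literal_length("foobar");

// By contrast, sizeof applied to a pointer measures the pointer itself.
inline std::size_t pointer_size()
{
    const char* foobar = "foobar";
    return sizeof(foobar);      // same as sizeof(const char*), not 7
}
```

+ A sizeof-based macro handed a pointer therefore computes a length from the pointer size, which has nothing to do with the text it points at.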
+
+
+
+
+
+ Dave,
+
+ please read
+
+ news://news.mozilla.org/scc-314ABF.14261619062000@news.mozilla.org
+
+ It's just plain wrong to let people try to index into a string outside
+its defined contents. I can't just return '\0' or PRUnichar('\0')
+there as that could be a legal value to have somewhere in your
+string for some encodings ... and the encoding is not specified. So
+your patch has the basic problem of defeating my plan to stop people
+from doing this bad thing.
+
+ The second problem with your patch is that you use the symbolic
+constant nsnull, which is ostensibly a pointer value; Last returns
+a character. nsnull is not appropriate for that purpose. In fact,
+C++ gurus pretty much eschew the use of symbolic constants for 0.
+NULL is to be avoided. nsnull is wrong-headed in that it presumes
+we could have some other application specific value for NULL. We
+can't, it would never work. It's just wasted brain-print. Always use
+0 for these situations, and if you want to communicate the fact that
+something is a pointer type, either use a comment or a
+(construction-style) cast, like so (graded examples from worst to
+best:)
+
+ Don't let this discourage you; keep up the good work :-)
+
+
+
+
+
+ does this help?
+
+ I can personally walk you through any specific scenario you need.
+
+
+
+
+
+ You got it right... it's nsWritingIterator. Here are three examples of running through a string and modifying some
+of the characters in it. All use nsWritingIterators.
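+
+ The chunk-at-a-time idea can also be sketched without the Mozilla classes. Here a std::vector of std::u16string plays the part of a multi-fragment string; Fragments, capitalize_chunk and capitalize are hypothetical names, and real code would use nsWritingIterator rather than this stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical multi-fragment string: each element is one contiguous chunk.
using Fragments = std::vector<std::u16string>;

// Capitalize ASCII lower-case letters in one contiguous chunk.
// Returns the number of characters processed.
std::size_t capitalize_chunk(char16_t* s, std::size_t n)
{
    assert(s != nullptr);
    char16_t* const done_with_chunk = s + n;
    for (; s < done_with_chunk; ++s)
        if (u'a' <= *s && *s <= u'z')
            *s = static_cast<char16_t>(*s - u'a' + u'A');
    return n;
}

// Visit the string a chunk at a time; the per-character loop runs over
// plain pointers, so it costs the same as walking a flat buffer.
void capitalize(Fragments& str)
{
    for (std::u16string& chunk : str)
        capitalize_chunk(&chunk[0], chunk.size());
}
```

+ The inner loop never needs to know whether the string is flat or fragmented; only the outer loop does.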
+
+
+ Does this show it better?
+
+
+
+
+
+ I'll explain things in a little more detail than you need, so
+that some of the stuff you see in these headers will make more sense.
+I'll also answer your questions out of order.
+
+ First: the string hierarchy looks like this
+
+http://ScottCollins.net/Journal/discussion/string_hierarchy.gif
+
+ The two most important headers are:
+
+http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAReadableString.h
+http://lxr.mozilla.org/seamonkey/source/xpcom/ds/nsAWritableString.h
+
+ These abstract classes, nsAReadable[C]String, and
+nsAWritable[C]String are typically what you will want to use in the
+interfaces of new code. If you write a piece of code that takes a
+string for input, consider, e.g.,
+
+ If you write a piece of code that modifies a string, consider
+
+ When creating your own classes, member strings will typically be
+nsStrings. When you can't avoid creating a short string that you
+need only temporarily during a function, you will typically use
+nsAutoString. When someone passes you a raw pointer, or a raw
+pointer and a length, representing a buffer of characters that you may
+examine, but won't own, you can treat it like a string by wrapping it
+in an nsLiteralString, e.g.,
+
+ You will use nsLiteralString around quoted constant strings as well,
+though typically through the NS_LITERAL_STRING macro, to avoid doing
+a length calculation
+
+ expands to
+
+ if L notation works as needed on your platform.
+
+Those are the basics. Now onto your questions:
+
+
+ L"abc " makes an object that is a const wchar_t[5], and none of
+the string code knows about wchar_t. The main reason is that
+wchar_t is not necessarily the right size (it can be 4 bytes under
+gcc). If you wrap these constant expressions in NS_LITERAL_STRING,
+as described above, you should get the right thing, e.g.,
+
+ This one, I have a quick and easy explanation for. If function was
+declared like this
+
+ then, no problem, since a nsPromiseConcatenation (which was the
+result of adding those two things together) is a readable string.
+No other objects need to be created; no copying needs to be performed.
+
+ In all cases, we want the creation of nsStrings et al, to be
+explicit, since creation is unbelievably expensive, requiring heap
+allocation, locks, copying, etc.
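+
+ To make the "no copying" point concrete, here is a rough standalone sketch of a concatenation promise. It is not the real nsPromiseConcatenation: ConcatPromise, Flatten and count_spaces are hypothetical, std::u16string stands in for the operands, and both operands are assumed to outlive the promise.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical promise: remembers its two operands by reference and
// copies nothing until a concrete string is explicitly requested.
class ConcatPromise
{
public:
    ConcatPromise(const std::u16string& left, const std::u16string& right)
        : mLeft(left), mRight(right) {}

    std::size_t Length() const { return mLeft.size() + mRight.size(); }

    char16_t CharAt(std::size_t i) const
    {
        assert(i < Length());
        return i < mLeft.size() ? mLeft[i] : mRight[i - mLeft.size()];
    }

    // The expensive step (allocation plus copy) is spelled out by name.
    std::u16string Flatten() const { return mLeft + mRight; }

private:
    const std::u16string& mLeft;
    const std::u16string& mRight;
};

// A consumer that only reads needs no temporary string at all.
std::size_t count_spaces(const ConcatPromise& s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.Length(); ++i)
        if (s.CharAt(i) == u' ')
            ++n;
    return n;
}
```

+ Only a caller that asks for Flatten() pays for a heap allocation, which is the sense in which creating a concrete string is kept explicit.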
+
+ I hope this answers both your posts,
+
+
+
+
+
+ Unfortunately, NS_LITERAL_STRING's definition is not particularly
+amenable to this use. Instead, you would have to say something like
+this:
+
+ I don't know what errors you are getting; but it probably doesn't work
+because a reference isn't an assignable type. This is just a guess.
+You may need to use
+
+ If you actually want the map to manage ownership of the keys, then
+you'll want to use a concrete type, e.g.,
+
+ or perhaps
+
+ Or maybe there's something else wrong. Send me the error messages.
+If you end up using a pointer, then of course you'll have to supply a
+comparison function to the map template. You won't be satisfied
+with the default comparison of pointers :-) Sorry I couldn't answer
+this one more completely.
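+
+ For the pointer-key case, the comparison function might look like the following. This sketch uses std::u16string in place of the Mozilla classes, and CompareByContents and PointerKeyedMap are illustrative names only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Orders keys by the strings they point at, not by their addresses.
struct CompareByContents
{
    bool operator()(const std::u16string* a, const std::u16string* b) const
    {
        assert(a && b);
        return *a < *b;
    }
};

// The map does not own its keys; the pointed-to strings must outlive it.
using PointerKeyedMap =
    std::map<const std::u16string*, int, CompareByContents>;
```

+ With this comparator, two distinct string objects holding the same text land on the same map entry, which the default pointer ordering would not give you.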
+
+
+ The problem with this scenario is that an nsAReadableString doesn't
+promise that all its data is contiguous, nor that it is
+zero-terminated, which is what I suspect you want in this case. If
+the function you want to call can take {pointer, length} tuples, and
+can consume the string in hunks without zero termination ... then you
+can use copy_string to pump the string into your function, see
+
+ http://ScottCollins.net/Journal/discussion/string_iterators.html
+
+ If not, and you absolutely have to have a contiguous zero-terminated
+buffer, then there is a new facility (part of the DOMAPI branch) that
+does what you need. It's not checked in on the trunk; it should
+be in early next week. It is nsPromiseFlatString. This class
+promises a contiguous zero-terminated buffer; and has an operator
+PRUnichar* to produce a pointer to that buffer automatically. If the
+underlying class is one that happens to be a single fragment and
+zero-terminated, then, like nsPromiseSubstring and
+nsPromiseConcatenation, this class merely holds a reference into the
+original data. If, however, the underlying string is multi-fragment
+or not zero-terminated, then nsPromiseFlatString allocates a
+contiguous buffer of appropriate size and copies the fragmented string
+data to it. So given
+
+ You can call this as efficiently as possible with an arbitrary string
+like so
+
+ If the function you are calling needs to take ownership of the buffer
+you hand it, then you will probably call ToNewUnicode like so
+
+ The global function ToNewUnicode is declared in "nsReadableUtils.h",
+and was only recently added to the build. It is currently being used
+in the DOMAPI branch. It is part of the build, but the file
+"dlldeps.c" in XPCOM may need to be modified to ensure it is exported
+on your platform if you are building the tip.
+
+Needless to say, you want to avoid functions that require bare
+pointers for several reasons: (a) they typically assume
+zero-termination, which is not guaranteed by the normal encodings; (b)
+they require contiguous allocation, which may not be possible; (c)
+they scan for the end of the string, at linear cost (if the encoding
+makes it possible at all), when the length could be known in advance.
+If you have to do it, the above mechanisms work, but be aware of the
+cost and the potential need to copy.
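+
+ The borrow-or-copy behavior described for nsPromiseFlatString can be sketched as follows. This is a simplified model, not the real class: Fragments is a hypothetical multi-fragment string built from std::u16string chunks, and FlatPromise and DidCopy are names invented for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical multi-fragment string: each element is one contiguous chunk.
using Fragments = std::vector<std::u16string>;

// Hypothetical flat promise: borrows when the source is already a single
// zero-terminated fragment, otherwise copies into a buffer of its own.
class FlatPromise
{
public:
    explicit FlatPromise(const Fragments& source)
        : mCopied(source.size() != 1)
    {
        if (!mCopied) {
            mData = source[0].c_str();          // borrow; no allocation
        } else {
            for (const std::u16string& chunk : source)
                mOwned += chunk;                // gather the pieces
            mData = mOwned.c_str();
        }
    }

    // Copying would leave |mData| pointing into the wrong buffer.
    FlatPromise(const FlatPromise&) = delete;
    FlatPromise& operator=(const FlatPromise&) = delete;

    const char16_t* get() const { return mData; }
    bool DidCopy() const { return mCopied; }

private:
    std::u16string mOwned;
    const char16_t* mData;
    bool mCopied;
};
```

+ The caller sees the same get() either way; the copy happens only when the source is not already flat.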
+
+
+ nsAReadableString is an abstract type. So you can't have a concrete
+instance of it. All strings in the hierarchy are readable strings.
+If you just want a reference to a readable string, you can say, e.g.,
+
+ ...similarly with pointers; but I suspect you are looking for
+something more concrete. An nsString is a nsAReadableString, and
+is the typical thing you want as a member variable. An nsAutoString
+is also an nsAReadableString and is typically what you would use for
+a short (in length) temporary (in lifetime) local variable, as I
+mentioned in my previous post.
+
+
+ Yes, though remember, an nsLiteralString assumes the lifetime of the
+underlying data is under someone else's control. If the called
+function gives you a buffer that you need to delete, you will have
+to manage that yourself. Currently, people often use nsXPIDLString
+to handle that. XPIDL strings are not part of the hierarchy. They
+are only used as a sort of string-auto_ptr. However, I'm
+integrating their functionality into nsString. There is no problem
+in wrapping the same pointer in both as two separate local variables,
+one to give you the readable interface, and one to manage the
+lifetime.
+
+ If it's OK with you, I'd like to post this reply (including your
+quoted questions) to n.p.m.xpcom and also put a copy near the string
+iterator discussion I provided a link to above, so that other people
+with similar questions can see these answers.
+
+ Hope this helps,
+
+
+
+
+
+ In article <8nu9m2$eo14@secnews.netscape.com>, "Jon Smirl"
+ Thanks, and I appreciate your comments and insights.
+
+
+>
+> 1) Should there be a nsSegmentedString derived from nsString instead
+> of building segment support into nsString? None of my strings are
+> segmented but
+> I keep executing code that is supports it. nsPromiseFlatString would
+> be trivial in the non-segmented case.
+
+ The general case is that a string does not promise to have contiguous
+data. A specific case is that, for some implementations, it does.
+You couldn't do it the other way around, because a segmented string
+couldn't satisfy all the promises of a flat string. However, through
+the use of chunky iterators, operating on strings that happen to be
+flat is very efficient. In fact, nsPromiseFlatString is trivial in
+the non-segmented case. In addition, I'll be adding an abstract flat
+class into the hierarchy, which will present additional interface ...
+in your local routines where you actually have declared a concrete
+string instance that happens to be flat, the compiler will give you
+the benefit of using the flat specific routines (e.g., a substring
+object over a flat string is simpler than the general purpose
+substring). I need to be cautious about this, though, since I don't
+automatically want people propagating the flat type through their
+interfaces. That would put us in the same boat we're in right now ...
+where routines only work on a specific kind of string, which denies
+other parts of the code the opportunity to use an implementation
+beneficial to its specific needs, and typically for no good reason.
+
+>
+> 2) Should nsAWritableString have a way to get the buffer and then
+> return it?
+> I need to get the buffer to pass it to OS calls. I'm doing this now
+> by passing around nsStrings instead of the interface. If I just use
+> the interface I encur an extra copy since I have to use a temporary
+> buffer.
+
+ A specific string implementation could promise this, but in general, a
+writable could not. After all, a writable doesn't even guarantee
+contiguous storage. To some degree, this is what
+nsPromiseFlatString is for. However, this is a readable promise
+only. It will also be the case that ns[C]Strings, in the very near
+future will be able to just assume ownership of an arbitrary buffer
+allocated on the free store with the XPCOM allocators ... getting one
+to give up its buffer, on the other hand, presents some problems. Do
+you have a lot of places where the system writes into your string
+buffer space? Or do you have a lot of system routines that return you
+new buffers? I can imagine using nsPromiseFlatString for this, but
+what happens when the OS alters the underlying data? If the promise
+had generated that flat data on behalf of a multi-fragment string,
+should it now put the changes back? It's possible to do, I just want
+to know if it's correct to allow this situation to happen.
+
+
+
+>
+> 3) There needs to be a NS_LITERAL_CHAR() to go along with
+> NS_LITERAL_STRING().
+
+ OK.
+
+
+
+> Having NS_LITERAL_STRING() all over the code clutters
+> it up and makes it hard to tell what the code is doing, could we
+> have a standard short alias for this?
+
+ Yes, I'll try to think of something ... perhaps NS_LSTR?
+
+
+> 4) nsLiteralString should support n.ToInteger(&error);
+
+ ToInteger is actually a bad interface. It's only good if your
+entire string is the number; this encourages you to edit your string
+until it is one, or perhaps copy the numeric part to another string.
+Better if you just sscanf a string (don't know if I can provide
+that in the general case, but I'm thinking about it), or else use
+regular C++ extractors (which wouldn't be too hard for me to
+provide), or else I could give you a ToInteger that works on a pair
+of iterators, extracting the integer from the digits between them.
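+
+ The last option, extracting an integer from between a pair of iterators, might look roughly like this. parse_unsigned is a hypothetical helper, not an existing string API; it handles unsigned decimal digits only and makes no attempt at overflow checking.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reads decimal digits starting at |first|, stopping
// at |last| or at the first non-digit. On return |first| points just past
// the digits consumed. Returns false when no digit was found.
// Overflow is not checked in this sketch.
template <class Iterator>
bool parse_unsigned(Iterator& first, Iterator last, unsigned long& result)
{
    const Iterator start = first;
    unsigned long value = 0;
    for (; first != last && u'0' <= *first && *first <= u'9'; ++first)
        value = value * 10 + static_cast<unsigned long>(*first - u'0');

    if (first == start)
        return false;
    result = value;
    return true;
}
```

+ Because the caller chooses where to start, there is no need to trim or copy the string down to just the number first.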
+
+>
+> 5) There should be a global define for an interface to a readonly
+> empty string.
+
+ Yes, there will be.
+
+
+>
+> 6) Something is wrong with concatenation....
+
+ Hopefully I've fixed this now.
+
+
+
+> 8) A forward definition is missing in the h files
+
+ I'll check it out.
+
+
+
+ My understanding is that you have already found the answers to your
+other questions.
+
+ I hope this helps,
+
+
+
+
+ nsMemory::Free
+
+
+
+
+
+ You use several NS_ConvertASCIItoUCS2("...").GetUnicode() calls; these should be
+
+ NS_LITERAL_STRING("...").get()
+
+ Don't do this to the very first case where you aren't wrapping an actual literal string.
+The first instance should exploit NS_LITERAL_STRING technology as well,
+around the initial declarations of the strings ... probably want to do this with
+NS_NAMED_LITERAL_STRING.
+
+
+
+ You can see from the line of code that you're on, that this should
+have been fine. nsMemory::Alloc would be asked to allocate a 1 byte
+object. But it failed trying to allocate that. Which suggests that
+the allocator was busy and non-reentrant and the debugger tried to
+misuse it. Yes?
+
+ Of course, this doesn't solve your problem. Perhaps we need to go
+back to the idea of a function that returns a pointer to the first
+hunk of the string.
+
+ This code should work regardless of what the allocator is doing. The
+downsides are (a) it only returns the first hunk of the string, in the
+case of a multi-fragment string; and (b) that hunk might not be
+zero-terminated.
+
+ Hope this helps,
+
+
+
+
+
+ At 3:04 PM -0400 10/11/00, Mike Shaver wrote:
+ Macro ugliness makes NS_LITERAL_STRING inappropriate for use over
+other macros. In other words:
+
+ is good.
+
+ is bad. Why? Because it turns into
+
+ and there is no LFOO. Sorry. If you have to do this to a
+macro-ized string, do the magic by hand, e.g.,
+
+ or else if you don't care that nsLiteralString will scan for the
+length, just say
+
+ Hope this helps,
+
+
+
+
+
+ Actually, I'm not even sure you can do it by hand, since you didn't
+
+ and can't do that cross-platform. The other way around this is to
+define a global instead of a macro, that is, instead of saying
+
+ at the top of your file, say
+
+ or else, if the macro was used only in one spot ... perhaps you could
+just eliminate the macro in favor of NS_NAMED_LITERAL in situ.
+
+ Arghh. In this case, you may be stuck with the extra work of
+AssignWithConversion.
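+
+ The token-pasting behavior behind the LFOO problem can be reproduced in standard C++. The sketch below uses made-up macro names (WIDEN_RAW, WIDEN) and shows the common two-level macro idiom, which this thread does not propose. It also does nothing about wchar_t being the wrong size on some platforms, so treat it as an illustration of the preprocessor rule only.

```cpp
#include <cassert>
#include <cstddef>

#define FOO "foo"

// One level: |s| is pasted before it is expanded, so WIDEN_RAW(FOO)
// would become the unknown identifier LFOO and fail to compile.
#define WIDEN_RAW(s) L##s

// Two levels: the argument is fully expanded on the way through WIDEN,
// so the paste inside WIDEN_RAW sees "foo" and yields L"foo".
#define WIDEN(s) WIDEN_RAW(s)

const wchar_t* const kWideFoo = WIDEN(FOO);
const std::size_t kWideFooLength = sizeof(WIDEN(FOO)) / sizeof(wchar_t) - 1;
```

+ Whether the resulting wchar_t data is usable as PRUnichar data is a separate, platform-dependent question.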
+
+
+
+
+
+ No, there isn't. But you could move such special processing into the
+destructor of the sink. Remember, the sink is passed by reference, so
+you can exactly control its lifetime.
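+
+ As a standalone illustration of doing the end-of-string work in the sink's destructor, consider the sketch below. Here pump is a hypothetical stand-in for copy_string, Fragments is a made-up multi-fragment string, and CountingSink is an example sink; none of these are the real APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical multi-fragment string: each element is one contiguous chunk.
using Fragments = std::vector<std::u16string>;

// Hypothetical stand-in for copy_string: hands each chunk to the sink.
template <class Sink>
void pump(const Fragments& source, Sink& sink)
{
    for (const std::u16string& chunk : source)
        sink.write(chunk.data(), chunk.size());
}

// Counts characters per chunk; the total is published only when the sink
// is destroyed, i.e. after the final chunk has been written.
class CountingSink
{
public:
    explicit CountingSink(std::size_t& total) : mTotal(total), mSoFar(0) {}
    ~CountingSink() { mTotal = mSoFar; }     // end-of-string processing

    std::size_t write(const char16_t*, std::size_t n)
    {
        mSoFar += n;
        return n;
    }

private:
    std::size_t& mTotal;
    std::size_t mSoFar;
};

std::size_t total_length(const Fragments& source)
{
    std::size_t total = 0;
    {
        CountingSink sink(total);
        pump(source, sink);
    }   // |sink| destructor runs here
    return total;
}
```

+ The enclosing braces are what pin down when the final step runs, since the sink is passed by reference and destroyed only at the end of its own scope.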
+
+ Hope this helps,
+
+
+
+
+
+ This is explicitly allowed. That's why I'm proposing to change the
+names of those classes to nsLocal[C]String.
+
+
+>2) Should nsString2x.h and nsString2x.cpp go away? They look like a
+>never-completed rewrite or something...
+
+ Yes. They should go away. They are uncompleted [old] bullshit,
+exactly as you diagnosed.
+
+ I'll look into the other two questions.
+
+
+
+
+
+ We've been removing implicit conversion operators because they
+_always_ lead to trouble. Usually they make it harder to pick the
+right function when overloading is involved and in the past they have
+led to huge performance suckage because we ended up doing conversions
+when we didn't need to because the implicit operator made us pick the
+wrong function.
+
+ It's borderline when the class implements something that is so
+close, as with a guaranteed flat string or an nsCOMPtr ... but the
+general recommendation is to avoid implicit conversions.
+
+ See bug #53057.
+
+
+
+
+
+ bug:
+ http://bugzilla.mozilla.org/show_bug.cgi?id=57087
+
+ patch:
+ http://bugzilla.mozilla.org/showattachment.cgi?attach_id=24576
+
+ This patch is supposed to add the ability to define very long literal
+strings more easily by breaking lines, e.g.,
+
+ The main danger in this scheme is callers who omit the inner NS_L
+wrapping. Though I believe this will be caught at compile time as the
+wrong type initializer.
+
+ Seeking input from everybody, and waterson in particular.
+
+
+
+
+
+ There are some utilities in "xpcom/ds/nsReadableUtils.h". In
+particular, if you want to get back a new heap-allocated ASCII string
+with the minimal work, you would say
+
+ It's more efficient if you happen to already know the length. If you
+don't, don't bother counting, that's what I'll do in the constructor
+for nsLiteralString. If you do, then call like this
+
+ Other routines in that file will help you if, for instance, you wanted
+to translate into a buffer you had already allocated.
+
+ Hope this helps,
+
+
+
+
+
+ Here you go, Mike:
+
+ http://scottcollins.net/journal/discussion/mjudge-scratch.cpp
+
+
+
+
+
+
+ If you get an iterator into a string and you advance it all the way to
+the end of the string, and then keep trying to advance it, you hit
+this assert. This could happen, for example if you tried to copy 10
+characters out of a 9 character string. I've tried to make this
+impossible to get to. As far as I know, all my routines trim requests
+in advance of manipulating iterators. When you see this, you should
+get the stack. That will take you right to the bad spot.
+
+
+
+
+
+ You do know you are comparing two pointers now? It seems unlikely
+those two pointers would ever be the same pointer. You probably want
+to say something like
+
+ ...so that you compare the contents of two strings. Right now,
+you're just testing to see if two pointers both point to the same
+location in memory. A lot of people make this mistake. I would like
+to make it obvious to people that comparing two pointers does not
+compare strings. Can you tell me what gave you that impression so
+that I can figure out how to better educate people not to do this? By
+the way, it's not that I don't want to make this compare two
+strings; it's that in C++, you can't override operations for built-in
+types. And pointers are built-in types. So I can't make
+operator==(const PRUnichar*, const PRUnichar*) do anything different
+than it already does, which is the same thing it does for any other
+pointer.
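+
+ The distinction is easy to demonstrate in plain C++. In this sketch same_pointer and same_contents are illustrative helper names, and std::char_traits stands in for whatever comparison the string classes provide.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Compares addresses only: true just when both point at the same memory.
bool same_pointer(const char16_t* a, const char16_t* b)
{
    return a == b;
}

// Compares the characters of two zero-terminated strings.
bool same_contents(const char16_t* a, const char16_t* b)
{
    assert(a && b);
    using Traits = std::char_traits<char16_t>;
    const std::size_t la = Traits::length(a);
    const std::size_t lb = Traits::length(b);
    return la == lb && Traits::compare(a, b, la) == 0;
}
```

+ Two buffers holding identical text fail the first test and pass the second, which is exactly the surprise described above.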
+
+
+
+
+
+
+
+Date: Wed, 20 Sep 2000 17:32:13 -0400
+Subject: Re: how to free an nsString::ToNewCString
+
+
+
+ >What's the current approved way to free an nsString::ToNewCString?
+
+
+
+
+
+
+Date: Thu, 12 Oct 2000 00:57:28 -0400
+Subject: string answers
+
+
+
+nsresult
+DoSomething( nsAWritableString& answer )
+ {
+ nsresult rv;
+
+ nsXPIDLString registry_data;
+ Fetch("key", getter_Shares(registry_data));
+
+ nsLiteralString path(not_my_string);
+
+ PRInt32 first_colon = path.FindChar(PRUnichar(':'));
+ if ( first_colon != -1 )
+ {
+ // convert ... extract path from |path|
+ nsCOMPtr
+
+
+Date: Thu, 12 Oct 2000 02:03:49 -0400
+Subject: Re: and the answer is ...
+
+
+
+const char*
+debug_string( const nsAReadableCString& aCString )
+ {
+ nsReadingIterator<char> iter;
+ aCString.BeginReading(iter);
+ return aCString.IsEmpty() ? "" : iter.get();
+ }
+
+
+
+Date: Thu, 12 Oct 2000 08:30:32 -0400
+Subject: Re: Self healing the cache :-)
+
+
+
+ >NS_LITERAL_STRING(NS_XPCOM_SHUTDOWN_OBSERVER_ID);
+
+
+
+NS_LITERAL_STRING("foo")
+
+
+#define FOO "foo"
+NS_LITERAL_STRING(FOO)
+
+
+nsLiteralString(LFOO, sizeof(LFOO)...
+
+
+nsLiteralString(FOO, sizeof(FOO)/sizeof(PRUnichar)
+ + sizeof(PRUnichar('\0')))
+
+
+nsLiteralString(FOO)
+
+
+
+Date: Thu, 12 Oct 2000 08:36:14 -0400
+Subject: Re: Self healing the cache :-)
+
+
+
+#define FOO L"foo"
+
+
+#define FOO "foo"
+
+
+NS_NAMED_LITERAL_STRING(FOO, "foo")
+
+
+
+Date: Sun, 3 Dec 2000 16:38:07 -0400
+Subject: Re: another copy_string question
+
+
+
+ >Is there a way to tell, inside the write() sink, if one is in the
+ >final hunk? I need to do some special processing at the end.
+
+
+
+{
+ MySink sink;
+ nsReadingIterator<PRUnichar> sourceStart = aStr.BeginReading();
+ nsReadingIterator<PRUnichar> sourceEnd = aStr.EndReading();
+ copy_string(sourceStart, sourceEnd, sink);
+ // |sink| destructor executed here
+}
+
+
+
+Date: Fri, 15 Dec 2000 20:02:08 -0400
+Subject: fragment of code
+
+
+
+nsPromiseFlatString flatKey(aReadable);
+
+flatKey.get()
+
+
+
+Date: Tue, 16 Jan 2001 16:47:37 -0400
+Subject: Re: a few string questions...
+
+
+>I've accumulated a few questions I've been wanting to ask you, mostly
+>about string stuff. Nothing urgent, but I want to ask them before I
+>forget. So here goes...:
+>
+>1) Is it acceptable to use nsLiteralCString or nsLiteralString on
+>something that's not a literal? This can be useful in some places,
+>for example, to convert a char* to PRUnichar*:
+>
+>PRUnichar* new = ToNewUnicode(nsLiteralCString(myCharPtr));
+
+
+
+Date: Thu, 1 Feb 2001 15:12:41 -0400
+Subject: Re: [Fwd: bad string, bad string]
+
+
+
+
+Date: Tue, 6 Feb 2001 18:52:23 -0400
+Subject: seeking review for bug #57087
+
+
+
+NS_MULTILINE_LITERAL( NS_L("This is the start of a very long line")
+ NS_L(" which actually continues across")
+ NS_L(" a couple more.") )
+
+
+
+Date: Wed, 14 Feb 2001 16:09:10 -0400
+Subject: Re: Question...
+
+
+
+PRUnichar* sourceChars = ...;
+
+char* destChars = ToNewCString(nsLiteralString(sourceChars));
+
+
+destChars = ToNewCString( nsLiteralString(sourceChars, length) );
+
+
+
+Date: Fri, 23 Feb 2001 03:12:58 -0400
+Subject: string snippet
+
+
+
+nsCString aInput;
+
+
+
+nsReadingIterator<char> search_start;
+aInput.BeginReading(search_start);
+
+nsReadingIterator<char> search_end;
+aInput.EndReading(search_end);
+
+if ( FindCharInReadable(':', search_start, search_end) )
+ {
+ ++search_start;
+ return ToNewCString( Substring(aInput, search_start, search_end) );
+ }
+
+
+
+Date: Wed, 7 Mar 2001 19:44:08 -0400
+Subject: string help
+
+
+
+
+Date: Fri, 9 Mar 2001 20:56:07 -0400
+Subject: Re: string assertions
+
+
+
+
+Date: Sat, 31 Mar 2001 11:04:03 -0400
+Subject: Re: Sun bustage and string advice
+
+
+
+NS_LITERAL_STRING("foo").Equals(aTopic) // or
+
+NS_LITERAL_STRING("foo") == nsLiteralString(aTopic)
+
+
+
+Date: Thu, 13 Apr 2000 19:41:47 -0400
+
+
+
+void foo( const nsString& aUCS2string );
+
+foo("hello"); // works! constructs a temporary |nsString| by
+ // converting the ASCII literal with padding.
+ // Note: this requires an allocation
+
+
+foo( nsAutoString("hello") );
+
+
+void foo( const nsAReadableString& aUCS2string );
+
+foo( nsLiteralString(L"hello") );
+
+
+inline
+nsAutoString
+NS_ConvertToString( const char* anASCIIstring )
+ {
+ nsAutoString aUCS2string;
+ aUCS2string.AssignWithConversion(anASCIIstring);
+ return aUCS2string;
+ }
+
+
+foo( NS_ConvertToString("hello") );
+
+
+class NS_ConvertASCIItoUCS2 : public nsAutoString
+ {
+ public:
+ NS_ConvertASCIItoUCS2( const char* );
+ // ...
+ };
+
+
+foo( NS_ConvertASCIItoUCS2("hello") );
+
+
+nsString aUCS2string;
+nsCString anASCIIstring;
+// ...
+
+aUCS2string += anASCIIstring; // Currently legal, but not for long
+aUCS2string.Append(anASCIIstring); // same
+
+aUCS2string.AppendWithConversion(anASCIIstring); // the new way
+
+if ( aUCS2string == anASCIIstring ) // Sorry, this is going away too
+ // ...
+
+if ( aUCS2string.EqualsWithConversion(anASCIIstring) )
+ // ...
+
+
+xxxConvertingASCIItoUCS2
+xxxConvertingUCS2toASCII
+
+
+NS_LITERAL_STRING("Hello")
+
+
+nsLiteralString(L"Hello")
+
+
+NS_ConvertASCIItoUCS2("Hello")
+
+
+
+
+
+
+Date: Wed, 19 Apr 2000 21:12:47 -0400
+Subject: more string info
+
+
+
+
+Date: Fri, 26 May 2000 15:31:37 -0400
+Subject: Re: Question on ==
+
+
+
+char* a;
+char* b;
+
+// ...
+
+if ( a == b )
+ // ...
+
+
+
+Date: Wed, 14 Jun 2000 14:38:55 -0400
+Subject: Re: Fix to XprtDefs.h
+
+
+
+#ifdef HAVE_CPP_2BYTE_WCHAR_T
+ #define NS_LITERAL_STRING(s) nsLiteralString(L##s, \
+ (sizeof(L##s)/sizeof(wchar_t))-1)
+#else
+ #define NS_LITERAL_STRING(s) NS_ConvertASCIItoUCS2(s, \
+ sizeof(s)-1)
+#endif
+
+
+
+Date: Thu, 15 Jun 2000 19:36:55 -0400
+Subject: Re: Checkin approval for bug 32336
+
+
+
+S.Equals(NS_LITERAL_STRING("bar"), PR_TRUE, 3)
+
+
+S == NS_LITERAL_STRING("bar")
+
+
+
+Date: Thu, 15 Jun 2000 19:47:12 -0400
+Subject: Re: [Fwd: how to use nsString ?]
+
+
+
+ >I see these same examples time and again in the embedding
+ >samples/docs, but I can't compile them.
+
+
+
+ >nsString URLString("http://www.mozilla.org");
+
+
+
+nsString URLString;
+URLString.AssignWithConversion("http://www.mozilla.org");
+
+
+ >nsString URLString;
+ >URLString.SetString("www.mozilla.org");
+
+
+
+
+Date: Thu, 15 Jun 2000 21:26:51 -0400
+Subject: Re: Checkin approval for bug 32336
+
+
+
+ >I *need* the count attribute, because I need to compare only the first
+ >chars (that's inherent to the logic).
+
+
+
+Substring(S, 0, 3) == NS_LITERAL_STRING("bar")
+
+
+
+Date: Mon, 19 Jun 2000 14:23:47 -0400
+Subject: Re: string fu
+
+
+
+ >It seems less convenient to have to first check path.IsEmpty, and
+ >then if false get path.Last and test it.
+
+
+
+PRBool needsDelim = PR_FALSE;
+if ( !path.IsEmpty() )
+ {
+ PRUnichar last = path.Last();
+ needsDelim = !(last == '/' || last == '\\');
+ }
+
+
+ >Can we fix GetUnicode in this case?
+
+
+
+
+Date: Mon, 19 Jun 2000 17:22:31 -0400
+
+
+
+ >I hit your funky CharAt assertion tonight in this piece of code:
+
+ >NS_IMETHODIMP
+ >nsIOService::ResolveRelativePath(
+ > const char *relativePath,
+ > const char* basePath,
+ > char **result )
+ > {
+ > nsCAutoString name;
+ > nsCAutoString path(basePath);
+ >
+ > PRUnichar last = path.Last();
+ > PRBool needsDelim = !(last == '/' || last == '\\' || last ==
+ > '\0');
+ > ...
+
+ >where basePath is null. It seems less convenient to have to first
+ >check path.IsEmpty, and then if false get path.Last and test it.
+
+
+
+ >What would you prefer? That extracting a character not in the
+ >string always return CharT(0)? Can't do it for two reasons:
+ >(1) 0 may be a valid character in a particular encoding, so it
+ >can't be used in general as a ``no character at that position''
+ >marker; and (2) I can't control what an individual string
+ >implementation does when asked to get an out-of-bounds fragment,
+ >it's explicitly undefined. That means the result of CharAt is
+ >explicitly undefined for indexes outside the defined contents of
+ >the string. As a debugging convenience, I have made this assert,
+ >but it has always been the case that retrieving such a character
+ >had undefined results ... even in [the old] code.
+
+ >OK, you might say, well at least let me ask for a character that
+ >is only off the end by one. E.g., Last of an empty string.
+ >Reason (1) from above still applies. How bad is it to say, for the
+ >case you gave
+
+ > PRBool needsDelim = PR_FALSE;
+ > if ( !path.IsEmpty() )
+ > {
+ > PRUnichar last = path.Last();
+ > needsDelim = !(last == '/' || last == '\\');
+ > }
+
+ >In general, you probably want to opt out of a whole lot of work
+ >when the source string is empty. It is slightly less convenient,
+ >but it doesn't tie us to a bunch of implementation specific mojo.
+
+
+
+ >Here's another issue, perhaps more serious. If I say this:
+
+ > foo(const PRUnichar* s) {
+ > nsAutoString str(s);
+ > bar(str.GetUnicode());
+ > }
+
+ >where s is null, bar will get passed a zero-length PRUnichar
+ >sequence instead of null. This makes it so that you can't just
+ >test for the argument == null. You have to nsCRT::strlen(arg) == 0
+ >which is much less efficient. Can we fix GetUnicode in this case?
+
+
+
+ >This is an annoying property of auto strings, e.g., that they
+ >always have an allocated buffer. I'm happy to fix this bug,
+ >however, be aware that GetUnicode and GetBuffer are artifacts
+ >of [the old] implementation that we don't want to support. They
+ >are not part of the abstract interface. We will keep them no
+ >longer than we have to. They don't support our multi-fragment
+ >paradigm. People who require a contiguous hunk of characters in
+ >the future, and are unwilling to switch over to chunky-iterators,
+ >may be forced to copy the string to their own buffer. There will
+ >be an implementation of narrow character string that guarantees
+ >contiguous allocation and a zero-terminator, much as nsCString
+ >does now, for compatibility with platform uses, but this won't be
+ >the default string class.
+
+
+
+ >scc: should we add operator PRUnichar*() to
+ >NS_ConvertASCIItoUCS2?
+
+
+
+ >It seems reasonable. A lot more reasonable than forcing people to
+ >call GetUnicode(). I alluded to platform specific classes in an
+ >earlier message to warren that you were cc'd on, Chris. I imagine
+ >that the ...Convert... routines would be required to produce
+ >contiguous allocation 0-terminated strings (though the as yet
+ >unimplemented ...Copy... forms, of course, wouldn't). So operator
+ >const PRUnichar*() const makes perfect sense to me here.
+
+
+
+
+Date: Tue, 20 Jun 2000 04:05:31 -0400
+Subject: Re: NS_LITERAL_STRING is broken
+
+
+
+const char* foobar = "foobar";
+
+... NS_LITERAL_STRING(foobar).GetUnicode() ...
+
+
+
+Date: Thu, 29 Jun 2000 13:35:10 -0400
+Subject: Re: a fix
+
+
+
+ > + if (Length() == 0) { return nsnull; }
+
+
+
+
+
+
+
+
+Date: Tue, 8 Aug 2000 23:47:16 -0400
+Subject: Re: nsWritingIterator?
+
+
+
+ >Can you give me any pointers to examples, or docs, or just some
+ >general advice?
+
+
+ http://ScottCollins.net/Journal/discussion/string_iterators.html
+
+
+
+Date: Wed, 9 Aug 2000 02:35:03 -0400
+Subject: Re: nsWritingIterator?
+
+
+
+ // inefficient, but works in a pinch:
+ // iterators can hide all details of chunks by acting like
+ // a raw character pointer
+
+nsWritingIterator<PRUnichar> s = S.BeginWriting();
+nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
+
+ // for each character in the string |S|
+while ( s != done_with_string )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+
+ // move on to the next character
+ ++s;
+ }
+
+
+
+
+ // efficient
+ // iterators provide a mechanism by which you can process
+ // a chunk-at-a-time
+
+nsWritingIterator<PRUnichar> iter = S.BeginWriting();
+nsWritingIterator<PRUnichar> done_with_string = S.EndWriting();
+
+ // for each chunk of the string
+while ( iter != done_with_string )
+ {
+ size_t N = iter.size_forward(); // # of chars in this chunk
+ PRUnichar* s = iter.get();
+ PRUnichar* done_with_chunk = s + N;
+
+ // for each character in this chunk
+ for ( ; s < done_with_chunk; ++s )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+ }
+
+ // advance the iterator past characters
+ // we examined (and into the next chunk, if any)
+ iter += N;
+ }
+
+
+
+ // elegant
+ // pull your transformation into a `sink', and |copy_string|
+ // will efficiently pump any kind of string into it
+
+struct Capitalize
+ {
+ // inline
+ PRUint32
+ write( PRUnichar* s, PRUint32 N )
+ // processes one chunk, called repeatedly by |copy_string|
+ {
+ PRUnichar* done_with_chunk = s + N;
+
+ // for each character in this chunk
+ for ( ; s < done_with_chunk; ++s )
+ {
+ // if the character is lower case, capitalize it
+ if ( 'a' <= *s && *s <= 'z' )
+ *s = *s - 'a' + 'A';
+ }
+
+ // report how many characters this call processed
+ return N;
+ }
+ };
+
+copy_string(S.BeginWriting(), S.EndWriting(), Capitalize());
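The sink idea can be tried outside the tree. What follows is only a minimal sketch in standard C++, not the Mozilla implementation: `ChunkedString`, `CapitalizeSink`, `pump_chunks`, and `flatten` are hypothetical names invented for illustration, and `pump_chunks` merely plays the role that |copy_string| plays above by handing each chunk to the sink in turn.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a fragmented string: each element is one chunk.
typedef std::vector<std::u16string> ChunkedString;

// A sink in the style of |Capitalize|: processes one chunk per call and
// reports how many characters it processed.
struct CapitalizeSink
  {
    std::size_t
    write( char16_t* s, std::size_t N )
      {
        char16_t* done_with_chunk = s + N;

          // for each character in this chunk
        for ( ; s < done_with_chunk; ++s )
          if ( u'a' <= *s && *s <= u'z' )
            *s = static_cast<char16_t>(*s - u'a' + u'A');

        return N;
      }
  };

// Hypothetical driver (an assumption, not |copy_string| itself): hands each
// non-empty chunk to the sink in order and returns the total processed.
template <class Sink>
std::size_t
pump_chunks( ChunkedString& aString, Sink& aSink )
  {
    std::size_t total = 0;
    for ( std::size_t i = 0; i < aString.size(); ++i )
      if ( !aString[i].empty() )
        total += aSink.write(&aString[i][0], aString[i].size());
    return total;
  }

// Helper for experiments: glue the chunks back together.
std::u16string
flatten( const ChunkedString& aString )
  {
    std::u16string result;
    for ( std::size_t i = 0; i < aString.size(); ++i )
      result += aString[i];
    return result;
  }
```

The point of the design is that the transformation never sees chunk boundaries as a special case: it is written once against a raw pointer and a length, and the driver decides how many times to call it.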
+
+
+
+Date: Thu, 17 Aug 2000 18:23:22 -0400
+
+
+
+ >I tried looking at the string header files but they
+ >are awfully complicated.
+
+
+
+void consumes_a_string( const nsAReadableString& aInput );
+
+
+void modifies_a_string( nsAWritableString& aResult );
+
+
+void
+reads_a_buffer( const PRUnichar* aInput, PRUint32 aInputLength )
+ {
+ nsLiteralString input(aInput, aInputLength);
+ // doesn't allocate or copy
+
+ // ...
+ }
+
+
+NS_LITERAL_STRING("x")
+
+
+nsLiteralString(L"x", (sizeof(L"x")/sizeof(PRUnichar) - 1))
+
+
+ >For example this won't compile. [...]
+
+ >str1 += L"abc " + str2 + L"def";
+
+
+
+
+str1 += NS_LITERAL_STRING("abc ") + str2 + NS_LITERAL_STRING("def");
+
+
+ >Another one is:
+ >function(const PRUnichar *foo);
+ >call function(L"abc " + str2);
+
+ >It won't create a temporary nsString.
+
+
+
+function( const nsAReadableString& )
+
+
+
+Date: Thu, 17 Aug 2000 20:57:08 -0400
+Subject: re our conversation
+
+
+ return ToNewUnicode( nsLiteralCString(buffer) );
+
+
+
+
+
+
+
+
+Date: Fri, 18 Aug 2000 02:52:45 -0400
+Subject: Re: More questions and new string API
+
+
+
+ >1) How do I return a static string?
+
+ >const nsAReadableString& foo() {return NS_LITERAL_STRING("x");}
+ >errors on taking the address of a temporary variable.
+
+
+
+const nsAReadableString&
+foo()
+ {
+#ifdef HAVE_CPP_2BYTE_WCHAR_T
+ static nsLiteralString static_foo(L"x", 1);
+#else
+ static nsLiteralString static_foo;
+ static PRBool initialized = PR_FALSE;
+ if ( !initialized )
+ {
+ static_foo.AssignWithConversion("x", 1);
+ initialized = PR_TRUE;
+ }
+#endif
+ return static_foo;
+ }
+
+
+ >2) I'm using these with the STL library in an XPCOM component.
+ >What type should I use with map? This doesn't work...
+
+ >typedef map
+
+
+map
+
+map
+
+map
+
+ >3) How do I get a raw PRUnichar pointer out of nsAReadableString
+ >when I need to call something that wants 'unsigned short *'?
+
+
+
+void ReadBuffer( const PRUnichar* );
+
+
+ReadBuffer( nsPromiseFlatString(aString) );
+
+
+void ConsumeBuffer( PRUnichar* );
+
+ConsumeBuffer( ToNewUnicode(aString) );
+
+
+ >4) How do I declare a local variable to hold a nsAReadableString?
+ >and a member variable?
+
+
+
+struct foo
+ {
+ const nsAReadableString& mString;
+ // ...
+
+ foo( const nsAReadableString& aString ) : mString(aString) { }
+ };
+
+
+ >5) If I call a function that returns a PRUnichar* and I want to
+ >use it as a nsAReadableString should I wrap it in a
+ >nsLiteralString?
+
+
+
+
+Date: Sun, 3 Sep 2000 03:52:17 -0400
+
+
+
+
+Date: Wed, 20 Sep 2000 17:32:13 -0400
+Subject: Re: how to free an nsString::ToNewCString
+
+
+
+ >What's the current approved way to free an nsString::ToNewCString?
+
+
+
+
+
+
+Date: Thu, 12 Oct 2000 00:57:28 -0400
+Subject: string answers
+
+
+
+nsresult
+DoSomething( nsAWritableString& answer )
+ {
+ nsresult rv;
+
+ nsXPIDLString registry_data;
+ Fetch("key", getter_Shares(registry_data));
+
+ nsLiteralString path(not_my_string);
+
+ PRInt32 first_colon = path.FindChar(PRUnichar(':'));
+ if ( first_colon != -1 )
+ {
+ // convert ... extract path from |path|
+ nsCOMPtr
+
+
+Date: Thu, 12 Oct 2000 02:03:49 -0400
+Subject: Re: and the answer is ...
+
+
+
+const char*
+debug_string( const nsAReadableCString& aCString )
+ {
+ nsReadingIterator<char> iter;
+ aCString.BeginReading(iter);
+ return aCString.IsEmpty() ? "" : iter.get();
+ }
+
+
+
+Date: Thu, 12 Oct 2000 08:30:32 -0400
+Subject: Re: Self healing the cache :-)
+
+
+
+ >NS_LITERAL_STRING(NS_XPCOM_SHUTDOWN_OBSERVER_ID);
+
+
+
+NS_LITERAL_STRING("foo")
+
+
+#define FOO "foo"
+NS_LITERAL_STRING(FOO)
+
+
+nsLiteralString(LFOO, sizeof(LFOO)...
+
+
+nsLiteralString(FOO, sizeof(FOO)/sizeof(PRUnichar)
+ + sizeof(PRUnichar('\0')))
+
+
+nsLiteralString(FOO)
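The |LFOO| expansion shown above is what one would expect if the macro pastes |L| directly onto its argument. The real macro definition is not reproduced in this message, so the following is only a sketch in plain C++ of the general preprocessor behavior; `WIDEN_DIRECT`, `WIDEN`, and `WIDE_LENGTH` are hypothetical names, and `wchar_t` stands in for PRUnichar.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// One-level version: |L| is pasted onto the argument *before* the argument
// is macro-expanded, so WIDEN_DIRECT(FOO) becomes the identifier |LFOO|.
#define WIDEN_DIRECT(s) L##s

// Two-level version: the outer macro expands its argument first, so
// WIDEN(FOO) becomes WIDEN_DIRECT("foo") and then L"foo".
#define WIDEN(s) WIDEN_DIRECT(s)

// Length of the widened literal in characters, excluding the terminator.
#define WIDE_LENGTH(s) (sizeof(WIDEN(s)) / sizeof(wchar_t) - 1)

#define FOO "foo"

const wchar_t* const kDirect   = WIDEN_DIRECT("foo"); // fine: literal argument
const wchar_t* const kViaMacro = WIDEN(FOO);          // fine: FOO expands first
// const wchar_t* const kBroken = WIDEN_DIRECT(FOO);  // error: |LFOO| undeclared
const std::size_t    kFooLength = WIDE_LENGTH(FOO);
```

The extra level of indirection is the usual idiom whenever `##` (or `#`) has to operate on the expansion of a macro argument rather than on its spelling.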
+
+
+
+Date: Thu, 12 Oct 2000 08:36:14 -0400
+Subject: Re: Self healing the cache :-)
+
+
+
+#define FOO L"foo"
+
+
+#define FOO "foo"
+
+
+NS_NAMED_LITERAL_STRING(FOO, "foo")
+
+
+
+Date: Sun, 3 Dec 2000 16:38:07 -0400
+Subject: Re: another copy_string question
+
+
+
+ >Is there a way to tell, inside the write() sink, if one is in the
+ >final hunk? I need to do some special processing at the end.
+
+
+
+{
+ MySink sink;
+ nsReadingIterator<PRUnichar> sourceStart = aStr.BeginReading();
+ nsReadingIterator<PRUnichar> sourceEnd = aStr.EndReading();
+ copy_string(sourceStart, sourceEnd, sink);
+ // |sink| destructor executed here
+}
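To make the destructor trick concrete, here is a minimal sketch in standard C++. It assumes nothing about the real |copy_string|; `LineCountingSink` and `count_lines` are hypothetical names, and the two explicit `write` calls stand in for whatever chunks the copy routine would deliver.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sink: counts lines across chunks. |write| cannot know
// which chunk is the last one, so the end-of-input work lives in the
// destructor.
class LineCountingSink
  {
    public:
      explicit LineCountingSink( std::size_t& aLineCount )
        : mLineCount(aLineCount), mInLine(false) { }

      ~LineCountingSink()
        {
            // final processing: count a last line with no trailing '\n'
          if ( mInLine )
            ++mLineCount;
        }

      std::size_t
      write( const char16_t* s, std::size_t N )
        {
          for ( std::size_t i = 0; i < N; ++i )
            {
              if ( s[i] == u'\n' )
                {
                  ++mLineCount;
                  mInLine = false;
                }
              else
                mInLine = true;
            }
          return N;
        }

    private:
      std::size_t& mLineCount;
      bool         mInLine;
  };

std::size_t
count_lines( const std::u16string& aFirstChunk,
             const std::u16string& aSecondChunk )
  {
    std::size_t lines = 0;
    {
      LineCountingSink sink(lines);
      sink.write(aFirstChunk.data(), aFirstChunk.size());
      sink.write(aSecondChunk.data(), aSecondChunk.size());
        // |sink| destructor executed here
    }
    return lines;
  }
```

Note that a line split across two chunks is still counted once, because the only state carried between calls is whether a line is currently open.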
+
+
+
+Date: Fri, 15 Dec 2000 20:02:08 -0400
+Subject: fragment of code
+
+
+
+nsPromiseFlatString flatKey(aReadable);
+
+flatKey.get()
+
+
+
+Date: Tue, 16 Jan 2001 16:47:37 -0400
+Subject: Re: a few string questions...
+
+
+>I've accumulated a few questions I've been wanting to ask you, mostly
+>about string stuff. Nothing urgent, but I want to ask them before I
+>forget. So here goes...:
+>
+>1) Is it acceptable to use nsLiteralCString or nsLiteralString on
+>something that's not a literal? This can be useful in some places,
+>for example, to convert a char* to PRUnichar*:
+>
+>PRUnichar* newString = ToNewUnicode(nsLiteralCString(myCharPtr));
+
+
+
+Date: Thu, 1 Feb 2001 15:12:41 -0400
+Subject: Re: [Fwd: bad string, bad string]
+
+
+
+
+Date: Tue, 6 Feb 2001 18:52:23 -0400
+Subject: seeking review for bug #57087
+
+
+
+NS_MULTILINE_LITERAL( NS_L("This is the start of a very long line")
+ NS_L(" which actually continues across")
+ NS_L(" a couple more.") )
+
+
+
+Date: Wed, 14 Feb 2001 16:09:10 -0400
+Subject: Re: Question...
+
+
+
+PRUnichar* sourceChars = ...;
+
+char* destChars = ToNewCString(nsLiteralString(sourceChars));
+
+
+destChars = ToNewCString( nsLiteralString(sourceChars, length) );
+
+
+
+Date: Fri, 23 Feb 2001 03:12:58 -0400
+Subject: string snippet
+
+
+
+nsCString aInput;
+
+
+
+nsReadingIterator<char> search_start;
+aInput.BeginReading(search_start);
+
+nsReadingIterator<char> search_end;
+aInput.EndReading(search_end);
+
+if ( FindCharInReadable(':', search_start, search_end) )
+ {
+ ++search_start;
+ return ToNewCString( Substring(aInput, search_start, search_end) );
+ }
+
+
+
+Date: Wed, 7 Mar 2001 19:44:08 -0400
+Subject: string help
+
+
+
+
+Date: Fri, 9 Mar 2001 20:56:07 -0400
+Subject: Re: string assertions
+
+
+
+
+Date: Sat, 31 Mar 2001 11:04:03 -0400
+Subject: Re: Sun bustage and string advice
+
+
+
+NS_LITERAL_STRING("foo").Equals(aTopic) // or
+
+NS_LITERAL_STRING("foo") == nsLiteralString(aTopic)
+
+