diff --git a/mozilla/content/html/document/src/nsHTMLDocument.cpp b/mozilla/content/html/document/src/nsHTMLDocument.cpp index 2724a8009b2..3a8dc377e93 100644 --- a/mozilla/content/html/document/src/nsHTMLDocument.cpp +++ b/mozilla/content/html/document/src/nsHTMLDocument.cpp @@ -1475,7 +1475,6 @@ nsHTMLDocument::CreateEntityReference(const nsAReadableString& aName, return NS_ERROR_DOM_NOT_SUPPORTED_ERR; } - NS_IMETHODIMP nsHTMLDocument::GetDoctype(nsIDOMDocumentType** aDocumentType) { @@ -3162,53 +3161,6 @@ nsHTMLDocument::Resolve(JSContext *aContext, JSObject *aObj, jsval aID) return ret; } -//---------------------------- -static PRBool IsInline(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_a: - case eHTMLTag_address: - case eHTMLTag_big: - case eHTMLTag_blink: - case eHTMLTag_b: - case eHTMLTag_br: - case eHTMLTag_cite: - case eHTMLTag_code: - case eHTMLTag_dfn: - case eHTMLTag_em: - case eHTMLTag_font: - case eHTMLTag_img: - case eHTMLTag_i: - case eHTMLTag_kbd: - case eHTMLTag_keygen: - case eHTMLTag_nobr: - case eHTMLTag_samp: - case eHTMLTag_small: - case eHTMLTag_spacer: - case eHTMLTag_span: - case eHTMLTag_strike: - case eHTMLTag_strong: - case eHTMLTag_sub: - case eHTMLTag_sup: - case eHTMLTag_textarea: - case eHTMLTag_tt: - case eHTMLTag_u: - case eHTMLTag_var: - case eHTMLTag_wbr: - - result = PR_TRUE; - break; - - default: - break; - - } - return result; -} - //---------------------------- class SubText { public: @@ -3427,22 +3379,11 @@ PRBool nsHTMLDocument::SearchBlock(BlockText & aBlockText, return found; } -/////////////////////////////////////////////////////// -// Check to see if a Content node is a block tag. -// We need to treat pre nodes as inline for selection -// purposes even though they're really block nodes. 
-/////////////////////////////////////////////////////// -PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock) const +//////////////////////////////////////////////////////////////// +// Methods to see if a Content node is a block or an inline tag. +//////////////////////////////////////////////////////////////// +PRInt32 nsHTMLDocument::GetTagID(nsString& aName) const { - nsIDOMElement* domElement; - nsresult rv = aNode->QueryInterface(kIDOMElementIID,(void **)&domElement); - if (NS_FAILED(rv)) - return PR_FALSE; - - nsAutoString tagName; - domElement->GetTagName(tagName); - NS_RELEASE(domElement); - if (!mParserService) { nsIParserService* parserService; @@ -3457,12 +3398,35 @@ PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock) const } PRInt32 id; - mParserService->HTMLStringTagToId(tagName, &id); + mParserService->HTMLStringTagToId(aName, &id); + return id; +} - if (id == eHTMLTag_pre) - return aPreIsBlock; +PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode* aNode) const +{ + if (!aNode) + return NS_ERROR_INVALID_ARG; - return !IsInline(nsHTMLTag(id)); + // Get the id of the tag itself: + nsAutoString tagName; + aNode->GetNodeName(tagName); + PRInt32 ID = GetTagID(tagName); + + // Get the parent + nsCOMPtr parentNode; + nsresult rv = aNode->GetParentNode(getter_AddRefs(parentNode)); + if (NS_FAILED(rv)) return rv; + + // and the parent's id + parentNode->GetNodeName(tagName); + PRInt32 parentID = GetTagID(tagName); + + // Now we can get the inline status from the DTD: + nsCOMPtr dtd; + rv = GetDTD(getter_AddRefs(dtd)); + if (NS_FAILED(rv) || !dtd) + return PR_FALSE; + return dtd->IsBlockElement(ID, parentID); } ///////////////////////////////////////////// @@ -4180,26 +4144,33 @@ nsHTMLDocument::IsInSelection(nsIDOMSelection* aSelection, const nsIContent* aContent) const { // HTML document has to include body in the selection, - // so that output can see style nodes on the body. 
-#if 0 //this was here to pass the wrap col around. this is NOT necessary any more + // so that output can see style nodes on the body + // in case the caller doesn't know to specify wrap column + // or preformatted or similar styles. nsIAtom* tag; nsresult rv = aContent->GetTag(tag); PRBool retval = (NS_SUCCEEDED(rv) && tag == nsHTMLAtoms::body); - NS_IF_RELEASE(tag); if (retval) - return retval; -#endif + { + NS_IF_RELEASE(tag); + return PR_TRUE; + } // If it's a block node, return true if the node itself // is in the selection. If it's inline, return true if // the node or any of its children is in the selection. - PRBool retval; nsCOMPtr node (do_QueryInterface((nsIContent*)aContent)); - if (NodeIsBlock(node, PR_FALSE)) - aSelection->ContainsNode(node, PR_FALSE, &retval); - else - aSelection->ContainsNode(node, PR_TRUE, &retval); + PRBool nodeIsBlock = (tag != nsHTMLAtoms::pre + && tag != nsHTMLAtoms::h1 + && tag != nsHTMLAtoms::h2 + && tag != nsHTMLAtoms::h3 + && tag != nsHTMLAtoms::h4 + && tag != nsHTMLAtoms::h5 + && tag != nsHTMLAtoms::h6 + && NodeIsBlock(node)); + aSelection->ContainsNode(node, !nodeIsBlock, &retval); + NS_IF_RELEASE(tag); return retval; } diff --git a/mozilla/content/html/document/src/nsHTMLDocument.h b/mozilla/content/html/document/src/nsHTMLDocument.h index dd7fc1d2fd6..1f21a911565 100644 --- a/mozilla/content/html/document/src/nsHTMLDocument.h +++ b/mozilla/content/html/document/src/nsHTMLDocument.h @@ -158,7 +158,8 @@ protected: nsString & aStr, nsIDOMNode * aCurrentBlock); - PRBool NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock = PR_TRUE) const; + PRInt32 GetTagID(nsString& aName) const; + PRBool NodeIsBlock(nsIDOMNode * aNode) const; nsIDOMNode * FindBlockParent(nsIDOMNode * aNode, PRBool aSkipThisContent = PR_FALSE); diff --git a/mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp b/mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp index 95f057fcd26..1bf5aa101fb 100644 --- a/mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp +++ 
b/mozilla/htmlparser/src/nsHTMLContentSinkStream.cpp @@ -50,20 +50,15 @@ static NS_DEFINE_CID(kSaveAsCharsetCID, NS_SAVEASCHARSET_CID); static NS_DEFINE_CID(kEntityConverterCID, NS_ENTITYCONVERTER_CID); +static NS_DEFINE_IID(kCParserIID, NS_IPARSER_IID); +static NS_DEFINE_IID(kCParserCID, NS_PARSER_IID); + static char* gHeaderComment = ""; static char* gDocTypeHeader = ""; const int gTabSize=2; static const nsString gMozDirty = NS_ConvertToString("_moz_dirty"); -static PRBool IsInline(eHTMLTags aTag); -static PRBool IsBlockLevel(eHTMLTags aTag); -static PRInt32 BreakBeforeOpen(eHTMLTags aTag); -static PRInt32 BreakAfterOpen(eHTMLTags aTag); -static PRInt32 BreakBeforeClose(eHTMLTags aTag); -static PRInt32 BreakAfterClose(eHTMLTags aTag); -static PRBool IndentChildren(eHTMLTags aTag); - /** * This method gets called as part of our COM-like interfaces. * Its purpose is to create an interface to parser object @@ -115,6 +110,7 @@ nsHTMLContentSinkStream::nsHTMLContentSinkStream() mLowerCaseTags = PR_TRUE; memset(mHTMLTagStack,0,sizeof(mHTMLTagStack)); memset(mDirtyStack,0,sizeof(mDirtyStack)); + mDTD = 0; mHTMLStackPos = 0; mColPos = 0; mIndent = 0; @@ -166,8 +162,10 @@ nsHTMLContentSinkStream::Initialize(nsIOutputStream* aOutStream, nsHTMLContentSinkStream::~nsHTMLContentSinkStream() { - if (mBuffer) - nsMemory::Free(mBuffer); + NS_IF_RELEASE(mDTD); + + if (mBuffer) + nsMemory::Free(mBuffer); } /** @@ -1111,6 +1109,159 @@ nsHTMLContentSinkStream::CloseContainer(const nsIParserNode& aNode){ return NS_OK; } +/** + * Find out from the parser whether a node is a block node. 
+ */ +PRBool nsHTMLContentSinkStream::IsBlockLevel(eHTMLTags aTag) +{ + if (!mDTD) + { + nsCOMPtr parser; + nsresult rv = nsComponentManager::CreateInstance(kCParserCID, + nsnull, + kCParserIID, + (void **)&parser); + if (NS_FAILED(rv)) return rv; + if (!parser) return NS_ERROR_FAILURE; + + nsAutoString htmlmime (NS_LITERAL_STRING("text/html")); + rv = parser->CreateCompatibleDTD(&mDTD, 0, eViewNormal, + &htmlmime, eDTDMode_transitional); + /* XXX Note: We output linebreaks for blocks. + I.e. we output linebreaks for "unknown" inline tags. + I just hunted such a bug for , same for , etc.. + Better fallback to inline. /BenB */ + if (NS_FAILED(rv) || !mDTD) + return PR_FALSE; + } + + // Now we can get the inline status from the DTD: + return mDTD->IsBlockElement(aTag, eHTMLTag_unknown); +} + +/** + * **** Pretty Printing Methods ****** + * + */ + +/** + * Desired line break state before the open tag. + */ +PRBool nsHTMLContentSinkStream::BreakBeforeOpen(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + switch (aTag) + { + case eHTMLTag_html: + result = PR_FALSE; + break; + + default: + result = IsBlockLevel(aTag); + } + return result; +} + +/** + * Desired line break state after the open tag. + */ +PRBool nsHTMLContentSinkStream::BreakAfterOpen(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_body: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_table: + case eHTMLTag_tbody: + case eHTMLTag_style: + case eHTMLTag_br: + result = PR_TRUE; + break; + + default: + break; + } + return result; +} + +/** + * Desired line break state before the close tag. 
+ */ +PRBool nsHTMLContentSinkStream::BreakBeforeClose(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_head: + case eHTMLTag_body: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_table: + case eHTMLTag_tbody: + case eHTMLTag_style: + result = PR_TRUE; + break; + + default: + break; + } + return result; +} + +/** + * Desired line break state after the close tag. + */ +PRBool nsHTMLContentSinkStream::BreakAfterClose(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_tr: + case eHTMLTag_th: + case eHTMLTag_td: + case eHTMLTag_pre: + result = PR_TRUE; + break; + + default: + result = IsBlockLevel(aTag); + } + return result; +} + +/** + * Indent/outdent when the open/close tags are encountered. + * This implies that BreakAfterOpen() and BreakBeforeClose() + * are true no matter what those methods return. + */ +PRBool nsHTMLContentSinkStream::IndentChildren(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_table: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_tbody: + case eHTMLTag_form: + case eHTMLTag_frameset: + result = PR_TRUE; + break; + + default: + result = PR_FALSE; + break; + } + return result; +} /** * This method gets called when the parser begins the process @@ -1176,181 +1327,3 @@ nsHTMLContentSinkStream::NotifyError(const nsParserError* aError) return NS_OK; } -///////////////////////////////////////////////////////////// -//// Useful static methods -///////////////////////////////////////////////////////////// - -static PRBool IsInline(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_a: - case eHTMLTag_address: - case eHTMLTag_big: - case eHTMLTag_blink: - case eHTMLTag_b: - case eHTMLTag_br: - case eHTMLTag_cite: - case eHTMLTag_code: - case eHTMLTag_dfn: - case eHTMLTag_em: - case eHTMLTag_font: - case eHTMLTag_img: - case eHTMLTag_i: - case 
eHTMLTag_kbd: - case eHTMLTag_keygen: - case eHTMLTag_nobr: - case eHTMLTag_samp: - case eHTMLTag_small: - case eHTMLTag_spacer: - case eHTMLTag_span: - case eHTMLTag_strike: - case eHTMLTag_strong: - case eHTMLTag_sub: - case eHTMLTag_sup: - case eHTMLTag_textarea: - case eHTMLTag_tt: - case eHTMLTag_u: - case eHTMLTag_var: - case eHTMLTag_wbr: - result = PR_TRUE; - break; - - default: - break; - - } - return result; -} - -static PRBool IsBlockLevel(eHTMLTags aTag) -{ - return !IsInline(aTag); -} - -/** - * **** Pretty Printing Methods ****** - * - */ - -/** - * Desired line break state before the open tag. - */ -static PRBool BreakBeforeOpen(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - switch (aTag) - { - case eHTMLTag_html: - result = PR_FALSE; - break; - - default: - result = IsBlockLevel(aTag); - } - return result; -} - -/** - * Desired line break state after the open tag. - */ -static PRBool BreakAfterOpen(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - switch (aTag) - { - case eHTMLTag_html: - case eHTMLTag_body: - case eHTMLTag_ul: - case eHTMLTag_ol: - case eHTMLTag_table: - case eHTMLTag_tbody: - case eHTMLTag_style: - case eHTMLTag_br: - result = PR_TRUE; - break; - - default: - break; - } - return result; -} - -/** - * Desired line break state before the close tag. - */ -static PRBool BreakBeforeClose(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_html: - case eHTMLTag_head: - case eHTMLTag_body: - case eHTMLTag_ul: - case eHTMLTag_ol: - case eHTMLTag_table: - case eHTMLTag_tbody: - case eHTMLTag_style: - result = PR_TRUE; - break; - - default: - break; - } - return result; -} - -/** - * Desired line break state after the close tag. 
- */ -static PRBool BreakAfterClose(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_html: - case eHTMLTag_tr: - case eHTMLTag_th: - case eHTMLTag_td: - case eHTMLTag_pre: - result = PR_TRUE; - break; - - default: - result = IsBlockLevel(aTag); - } - return result; -} - -/** - * Indent/outdent when the open/close tags are encountered. - * This implies that BreakAfterOpen() and BreakBeforeClose() - * are true no matter what those methods return. - */ -static PRBool IndentChildren(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_table: - case eHTMLTag_ul: - case eHTMLTag_ol: - case eHTMLTag_tbody: - case eHTMLTag_form: - case eHTMLTag_frameset: - result = PR_TRUE; - break; - - default: - result = PR_FALSE; - break; - } - return result; -} - diff --git a/mozilla/htmlparser/src/nsHTMLContentSinkStream.h b/mozilla/htmlparser/src/nsHTMLContentSinkStream.h index cf44cab7ac0..8f921e4009e 100644 --- a/mozilla/htmlparser/src/nsHTMLContentSinkStream.h +++ b/mozilla/htmlparser/src/nsHTMLContentSinkStream.h @@ -62,6 +62,7 @@ class ostream; class nsIParserNode; class nsIOutputStream; +class nsIDTD; class nsIHTMLContentSinkStream : public nsIHTMLContentSink { public: @@ -102,7 +103,6 @@ class nsHTMLContentSinkStream : public nsIHTMLContentSinkStream const nsAReadableString* aCharsetOverride, PRUint32 aFlags); - /******************************************************************* * The following methods are inherited from nsIContentSink. * Please see that file for details. 
@@ -147,6 +147,13 @@ public: protected: + PRBool IsBlockLevel(eHTMLTags aTag); + PRInt32 BreakBeforeOpen(eHTMLTags aTag); + PRInt32 BreakAfterOpen(eHTMLTags aTag); + PRInt32 BreakBeforeClose(eHTMLTags aTag); + PRInt32 BreakAfterClose(eHTMLTags aTag); + PRBool IndentChildren(eHTMLTags aTag); + void WriteAttributes(const nsIParserNode& aNode); void AddStartTag(const nsIParserNode& aNode); void AddEndTag(const nsIParserNode& aNode); @@ -170,6 +177,8 @@ protected: nsIOutputStream* mStream; nsAWritableString* mString; + nsIDTD* mDTD; + int mTabLevel; char* mBuffer; PRInt32 mBufferSize; diff --git a/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.cpp b/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.cpp index 9b38d704361..42d4ab664f9 100644 --- a/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.cpp +++ b/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.cpp @@ -56,6 +56,8 @@ static NS_DEFINE_CID(kCharsetConverterManagerCID, NS_ICHARSETCONVERTERMANAGER_CID); static NS_DEFINE_CID(kLWBrkCID, NS_LWBRK_CID); static NS_DEFINE_CID(kPrefServiceCID, NS_PREF_CID); +static NS_DEFINE_IID(kCParserIID, NS_IPARSER_IID); +static NS_DEFINE_IID(kCParserCID, NS_PARSER_IID); #define PREF_STRUCTS "converter.html2txt.structs" #define PREF_HEADER_STRATEGY "converter.html2txt.header_strategy" @@ -73,8 +75,6 @@ const PRInt32 gIndentSizeList = (gTabSize > gOLNumberWidth+3) ? gTabSize: gOLNu // Indention of non-first lines of ul and ol const PRInt32 gIndentSizeDD = gTabSize; // Indention of
-static PRBool IsInline(eHTMLTags aTag); -static PRBool IsBlockLevel(eHTMLTags aTag); static PRInt32 HeaderLevel(eHTMLTags aTag); static PRInt32 unicharwidth(PRUnichar ucs); static PRInt32 unicharwidth(const PRUnichar* pwcs, PRInt32 n); @@ -186,6 +186,7 @@ static const PRUint32 OLStackSize = 100; nsHTMLToTXTSinkStream::nsHTMLToTXTSinkStream() { NS_INIT_REFCNT(); + mDTD = 0; mColPos = 0; mIndent = 0; mCiteQuoteLevel = 0; @@ -236,6 +237,7 @@ nsHTMLToTXTSinkStream::~nsHTMLToTXTSinkStream() delete[] mBuffer; delete[] mTagStack; delete[] mOLStack; + NS_IF_RELEASE(mDTD); NS_IF_RELEASE(mUnicodeEncoder); NS_IF_RELEASE(mLineBreaker); } @@ -416,6 +418,7 @@ nsHTMLToTXTSinkStream::AddProcessingInstruction(const nsIParserNode& aNode){ NS_IMETHODIMP nsHTMLToTXTSinkStream::AddDocTypeDecl(const nsIParserNode& aNode, PRInt32 aMode) { + // Should probably set DTD return NS_OK; } @@ -1603,62 +1606,31 @@ nsHTMLToTXTSinkStream::NotifyError(const nsParserError* aError) return NS_OK; } -PRBool IsInline(eHTMLTags aTag) +PRBool nsHTMLToTXTSinkStream::IsBlockLevel(eHTMLTags aTag) { - PRBool result = PR_FALSE; - - switch (aTag) + if (!mDTD) { - case eHTMLTag_a: - case eHTMLTag_address: - case eHTMLTag_b: - case eHTMLTag_big: - case eHTMLTag_blink: - case eHTMLTag_br: - case eHTMLTag_cite: - case eHTMLTag_code: - case eHTMLTag_dfn: - case eHTMLTag_del: - case eHTMLTag_em: - case eHTMLTag_font: - case eHTMLTag_i: - case eHTMLTag_img: - case eHTMLTag_ins: - case eHTMLTag_kbd: - case eHTMLTag_keygen: - case eHTMLTag_nobr: - case eHTMLTag_q: - case eHTMLTag_samp: - case eHTMLTag_small: - case eHTMLTag_spacer: - case eHTMLTag_span: - case eHTMLTag_strike: - case eHTMLTag_strong: - case eHTMLTag_sub: - case eHTMLTag_sup: - case eHTMLTag_td: - case eHTMLTag_textarea: - case eHTMLTag_th: - case eHTMLTag_tt: - case eHTMLTag_u: - case eHTMLTag_var: - case eHTMLTag_wbr: - result = PR_TRUE; - break; + nsCOMPtr parser; + nsresult rv = nsComponentManager::CreateInstance(kCParserCID, + nsnull, + 
kCParserIID, + (void **)&parser); + if (NS_FAILED(rv)) return rv; + if (!parser) return NS_ERROR_FAILURE; - default: - break; - } - return result; -} - -PRBool IsBlockLevel(eHTMLTags aTag) -{ - return !IsInline(aTag); + nsAutoString htmlmime (NS_LITERAL_STRING("text/html")); + rv = parser->CreateCompatibleDTD(&mDTD, 0, eViewNormal, + &htmlmime, eDTDMode_transitional); /* XXX Note: We output linebreaks for blocks. I.e. we output linebreaks for "unknown" inline tags. I just hunted such a bug for , same for , etc.. Better fallback to inline. /BenB */ + if (NS_FAILED(rv) || !mDTD) + return PR_FALSE; + } + + // Now we can get the inline status from the DTD: + return mDTD->IsBlockElement(aTag, eHTMLTag_unknown); } /* diff --git a/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.h b/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.h index 2c5b631428e..b608bb7e8e4 100644 --- a/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.h +++ b/mozilla/htmlparser/src/nsHTMLToTXTSinkStream.h @@ -46,6 +46,9 @@ #include "nsHTMLTags.h" #include "nsParserCIID.h" #include "nsCOMPtr.h" +#include "nsHTMLTokens.h" // for eHTMLTags + +class nsIDTD; #define NS_IHTMLTOTEXTSINKSTREAM_IID \ {0xa39c6bff, 0x15f0, 0x11d2, \ @@ -157,6 +160,7 @@ protected: PRBool DoOutput(); PRBool MayWrap(); + PRBool IsBlockLevel(eHTMLTags aTag); protected: nsIOutputStream* mStream; @@ -166,6 +170,8 @@ protected: nsAWritableString* mString; nsString mCurrentLine; + nsIDTD* mDTD; + PRInt32 mIndent; // mInIndentString keeps a header that has to be written in the indent. // That could be, for instance, the bullet in a bulleted list. 
diff --git a/mozilla/layout/html/document/src/nsHTMLDocument.cpp b/mozilla/layout/html/document/src/nsHTMLDocument.cpp index 2724a8009b2..3a8dc377e93 100644 --- a/mozilla/layout/html/document/src/nsHTMLDocument.cpp +++ b/mozilla/layout/html/document/src/nsHTMLDocument.cpp @@ -1475,7 +1475,6 @@ nsHTMLDocument::CreateEntityReference(const nsAReadableString& aName, return NS_ERROR_DOM_NOT_SUPPORTED_ERR; } - NS_IMETHODIMP nsHTMLDocument::GetDoctype(nsIDOMDocumentType** aDocumentType) { @@ -3162,53 +3161,6 @@ nsHTMLDocument::Resolve(JSContext *aContext, JSObject *aObj, jsval aID) return ret; } -//---------------------------- -static PRBool IsInline(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_a: - case eHTMLTag_address: - case eHTMLTag_big: - case eHTMLTag_blink: - case eHTMLTag_b: - case eHTMLTag_br: - case eHTMLTag_cite: - case eHTMLTag_code: - case eHTMLTag_dfn: - case eHTMLTag_em: - case eHTMLTag_font: - case eHTMLTag_img: - case eHTMLTag_i: - case eHTMLTag_kbd: - case eHTMLTag_keygen: - case eHTMLTag_nobr: - case eHTMLTag_samp: - case eHTMLTag_small: - case eHTMLTag_spacer: - case eHTMLTag_span: - case eHTMLTag_strike: - case eHTMLTag_strong: - case eHTMLTag_sub: - case eHTMLTag_sup: - case eHTMLTag_textarea: - case eHTMLTag_tt: - case eHTMLTag_u: - case eHTMLTag_var: - case eHTMLTag_wbr: - - result = PR_TRUE; - break; - - default: - break; - - } - return result; -} - //---------------------------- class SubText { public: @@ -3427,22 +3379,11 @@ PRBool nsHTMLDocument::SearchBlock(BlockText & aBlockText, return found; } -/////////////////////////////////////////////////////// -// Check to see if a Content node is a block tag. -// We need to treat pre nodes as inline for selection -// purposes even though they're really block nodes. 
-/////////////////////////////////////////////////////// -PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock) const +//////////////////////////////////////////////////////////////// +// Methods to see if a Content node is a block or an inline tag. +//////////////////////////////////////////////////////////////// +PRInt32 nsHTMLDocument::GetTagID(nsString& aName) const { - nsIDOMElement* domElement; - nsresult rv = aNode->QueryInterface(kIDOMElementIID,(void **)&domElement); - if (NS_FAILED(rv)) - return PR_FALSE; - - nsAutoString tagName; - domElement->GetTagName(tagName); - NS_RELEASE(domElement); - if (!mParserService) { nsIParserService* parserService; @@ -3457,12 +3398,35 @@ PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock) const } PRInt32 id; - mParserService->HTMLStringTagToId(tagName, &id); + mParserService->HTMLStringTagToId(aName, &id); + return id; +} - if (id == eHTMLTag_pre) - return aPreIsBlock; +PRBool nsHTMLDocument::NodeIsBlock(nsIDOMNode* aNode) const +{ + if (!aNode) + return NS_ERROR_INVALID_ARG; - return !IsInline(nsHTMLTag(id)); + // Get the id of the tag itself: + nsAutoString tagName; + aNode->GetNodeName(tagName); + PRInt32 ID = GetTagID(tagName); + + // Get the parent + nsCOMPtr parentNode; + nsresult rv = aNode->GetParentNode(getter_AddRefs(parentNode)); + if (NS_FAILED(rv)) return rv; + + // and the parent's id + parentNode->GetNodeName(tagName); + PRInt32 parentID = GetTagID(tagName); + + // Now we can get the inline status from the DTD: + nsCOMPtr dtd; + rv = GetDTD(getter_AddRefs(dtd)); + if (NS_FAILED(rv) || !dtd) + return PR_FALSE; + return dtd->IsBlockElement(ID, parentID); } ///////////////////////////////////////////// @@ -4180,26 +4144,33 @@ nsHTMLDocument::IsInSelection(nsIDOMSelection* aSelection, const nsIContent* aContent) const { // HTML document has to include body in the selection, - // so that output can see style nodes on the body. 
-#if 0 //this was here to pass the wrap col around. this is NOT necessary any more + // so that output can see style nodes on the body + // in case the caller doesn't know to specify wrap column + // or preformatted or similar styles. nsIAtom* tag; nsresult rv = aContent->GetTag(tag); PRBool retval = (NS_SUCCEEDED(rv) && tag == nsHTMLAtoms::body); - NS_IF_RELEASE(tag); if (retval) - return retval; -#endif + { + NS_IF_RELEASE(tag); + return PR_TRUE; + } // If it's a block node, return true if the node itself // is in the selection. If it's inline, return true if // the node or any of its children is in the selection. - PRBool retval; nsCOMPtr node (do_QueryInterface((nsIContent*)aContent)); - if (NodeIsBlock(node, PR_FALSE)) - aSelection->ContainsNode(node, PR_FALSE, &retval); - else - aSelection->ContainsNode(node, PR_TRUE, &retval); + PRBool nodeIsBlock = (tag != nsHTMLAtoms::pre + && tag != nsHTMLAtoms::h1 + && tag != nsHTMLAtoms::h2 + && tag != nsHTMLAtoms::h3 + && tag != nsHTMLAtoms::h4 + && tag != nsHTMLAtoms::h5 + && tag != nsHTMLAtoms::h6 + && NodeIsBlock(node)); + aSelection->ContainsNode(node, !nodeIsBlock, &retval); + NS_IF_RELEASE(tag); return retval; } diff --git a/mozilla/layout/html/document/src/nsHTMLDocument.h b/mozilla/layout/html/document/src/nsHTMLDocument.h index dd7fc1d2fd6..1f21a911565 100644 --- a/mozilla/layout/html/document/src/nsHTMLDocument.h +++ b/mozilla/layout/html/document/src/nsHTMLDocument.h @@ -158,7 +158,8 @@ protected: nsString & aStr, nsIDOMNode * aCurrentBlock); - PRBool NodeIsBlock(nsIDOMNode * aNode, PRBool aPreIsBlock = PR_TRUE) const; + PRInt32 GetTagID(nsString& aName) const; + PRBool NodeIsBlock(nsIDOMNode * aNode) const; nsIDOMNode * FindBlockParent(nsIDOMNode * aNode, PRBool aSkipThisContent = PR_FALSE); diff --git a/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.cpp b/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.cpp index 95f057fcd26..1bf5aa101fb 100644 --- 
a/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.cpp +++ b/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.cpp @@ -50,20 +50,15 @@ static NS_DEFINE_CID(kSaveAsCharsetCID, NS_SAVEASCHARSET_CID); static NS_DEFINE_CID(kEntityConverterCID, NS_ENTITYCONVERTER_CID); +static NS_DEFINE_IID(kCParserIID, NS_IPARSER_IID); +static NS_DEFINE_IID(kCParserCID, NS_PARSER_IID); + static char* gHeaderComment = ""; static char* gDocTypeHeader = ""; const int gTabSize=2; static const nsString gMozDirty = NS_ConvertToString("_moz_dirty"); -static PRBool IsInline(eHTMLTags aTag); -static PRBool IsBlockLevel(eHTMLTags aTag); -static PRInt32 BreakBeforeOpen(eHTMLTags aTag); -static PRInt32 BreakAfterOpen(eHTMLTags aTag); -static PRInt32 BreakBeforeClose(eHTMLTags aTag); -static PRInt32 BreakAfterClose(eHTMLTags aTag); -static PRBool IndentChildren(eHTMLTags aTag); - /** * This method gets called as part of our COM-like interfaces. * Its purpose is to create an interface to parser object @@ -115,6 +110,7 @@ nsHTMLContentSinkStream::nsHTMLContentSinkStream() mLowerCaseTags = PR_TRUE; memset(mHTMLTagStack,0,sizeof(mHTMLTagStack)); memset(mDirtyStack,0,sizeof(mDirtyStack)); + mDTD = 0; mHTMLStackPos = 0; mColPos = 0; mIndent = 0; @@ -166,8 +162,10 @@ nsHTMLContentSinkStream::Initialize(nsIOutputStream* aOutStream, nsHTMLContentSinkStream::~nsHTMLContentSinkStream() { - if (mBuffer) - nsMemory::Free(mBuffer); + NS_IF_RELEASE(mDTD); + + if (mBuffer) + nsMemory::Free(mBuffer); } /** @@ -1111,6 +1109,159 @@ nsHTMLContentSinkStream::CloseContainer(const nsIParserNode& aNode){ return NS_OK; } +/** + * Find out from the parser whether a node is a block node. 
+ */ +PRBool nsHTMLContentSinkStream::IsBlockLevel(eHTMLTags aTag) +{ + if (!mDTD) + { + nsCOMPtr parser; + nsresult rv = nsComponentManager::CreateInstance(kCParserCID, + nsnull, + kCParserIID, + (void **)&parser); + if (NS_FAILED(rv)) return rv; + if (!parser) return NS_ERROR_FAILURE; + + nsAutoString htmlmime (NS_LITERAL_STRING("text/html")); + rv = parser->CreateCompatibleDTD(&mDTD, 0, eViewNormal, + &htmlmime, eDTDMode_transitional); + /* XXX Note: We output linebreaks for blocks. + I.e. we output linebreaks for "unknown" inline tags. + I just hunted such a bug for , same for , etc.. + Better fallback to inline. /BenB */ + if (NS_FAILED(rv) || !mDTD) + return PR_FALSE; + } + + // Now we can get the inline status from the DTD: + return mDTD->IsBlockElement(aTag, eHTMLTag_unknown); +} + +/** + * **** Pretty Printing Methods ****** + * + */ + +/** + * Desired line break state before the open tag. + */ +PRBool nsHTMLContentSinkStream::BreakBeforeOpen(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + switch (aTag) + { + case eHTMLTag_html: + result = PR_FALSE; + break; + + default: + result = IsBlockLevel(aTag); + } + return result; +} + +/** + * Desired line break state after the open tag. + */ +PRBool nsHTMLContentSinkStream::BreakAfterOpen(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_body: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_table: + case eHTMLTag_tbody: + case eHTMLTag_style: + case eHTMLTag_br: + result = PR_TRUE; + break; + + default: + break; + } + return result; +} + +/** + * Desired line break state before the close tag. 
+ */ +PRBool nsHTMLContentSinkStream::BreakBeforeClose(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_head: + case eHTMLTag_body: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_table: + case eHTMLTag_tbody: + case eHTMLTag_style: + result = PR_TRUE; + break; + + default: + break; + } + return result; +} + +/** + * Desired line break state after the close tag. + */ +PRBool nsHTMLContentSinkStream::BreakAfterClose(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_html: + case eHTMLTag_tr: + case eHTMLTag_th: + case eHTMLTag_td: + case eHTMLTag_pre: + result = PR_TRUE; + break; + + default: + result = IsBlockLevel(aTag); + } + return result; +} + +/** + * Indent/outdent when the open/close tags are encountered. + * This implies that BreakAfterOpen() and BreakBeforeClose() + * are true no matter what those methods return. + */ +PRBool nsHTMLContentSinkStream::IndentChildren(eHTMLTags aTag) +{ + PRBool result = PR_FALSE; + + switch (aTag) + { + case eHTMLTag_table: + case eHTMLTag_ul: + case eHTMLTag_ol: + case eHTMLTag_tbody: + case eHTMLTag_form: + case eHTMLTag_frameset: + result = PR_TRUE; + break; + + default: + result = PR_FALSE; + break; + } + return result; +} /** * This method gets called when the parser begins the process @@ -1176,181 +1327,3 @@ nsHTMLContentSinkStream::NotifyError(const nsParserError* aError) return NS_OK; } -///////////////////////////////////////////////////////////// -//// Useful static methods -///////////////////////////////////////////////////////////// - -static PRBool IsInline(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_a: - case eHTMLTag_address: - case eHTMLTag_big: - case eHTMLTag_blink: - case eHTMLTag_b: - case eHTMLTag_br: - case eHTMLTag_cite: - case eHTMLTag_code: - case eHTMLTag_dfn: - case eHTMLTag_em: - case eHTMLTag_font: - case eHTMLTag_img: - case eHTMLTag_i: - case 
eHTMLTag_kbd: - case eHTMLTag_keygen: - case eHTMLTag_nobr: - case eHTMLTag_samp: - case eHTMLTag_small: - case eHTMLTag_spacer: - case eHTMLTag_span: - case eHTMLTag_strike: - case eHTMLTag_strong: - case eHTMLTag_sub: - case eHTMLTag_sup: - case eHTMLTag_textarea: - case eHTMLTag_tt: - case eHTMLTag_u: - case eHTMLTag_var: - case eHTMLTag_wbr: - result = PR_TRUE; - break; - - default: - break; - - } - return result; -} - -static PRBool IsBlockLevel(eHTMLTags aTag) -{ - return !IsInline(aTag); -} - -/** - * **** Pretty Printing Methods ****** - * - */ - -/** - * Desired line break state before the open tag. - */ -static PRBool BreakBeforeOpen(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - switch (aTag) - { - case eHTMLTag_html: - result = PR_FALSE; - break; - - default: - result = IsBlockLevel(aTag); - } - return result; -} - -/** - * Desired line break state after the open tag. - */ -static PRBool BreakAfterOpen(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - switch (aTag) - { - case eHTMLTag_html: - case eHTMLTag_body: - case eHTMLTag_ul: - case eHTMLTag_ol: - case eHTMLTag_table: - case eHTMLTag_tbody: - case eHTMLTag_style: - case eHTMLTag_br: - result = PR_TRUE; - break; - - default: - break; - } - return result; -} - -/** - * Desired line break state before the close tag. - */ -static PRBool BreakBeforeClose(eHTMLTags aTag) -{ - PRBool result = PR_FALSE; - - switch (aTag) - { - case eHTMLTag_html: - case eHTMLTag_head: - case eHTMLTag_body: - case eHTMLTag_ul: - case eHTMLTag_ol: - case eHTMLTag_table: - case eHTMLTag_tbody: - case eHTMLTag_style: - result = PR_TRUE; - break; - - default: - break; - } - return result; -} - -/** - * Desired line break state after the close tag. 
- */
-static PRBool BreakAfterClose(eHTMLTags aTag)
-{
-  PRBool result = PR_FALSE;
-
-  switch (aTag)
-  {
-    case eHTMLTag_html:
-    case eHTMLTag_tr:
-    case eHTMLTag_th:
-    case eHTMLTag_td:
-    case eHTMLTag_pre:
-      result = PR_TRUE;
-      break;
-
-    default:
-      result = IsBlockLevel(aTag);
-  }
-  return result;
-}
-
-/**
- * Indent/outdent when the open/close tags are encountered.
- * This implies that BreakAfterOpen() and BreakBeforeClose()
- * are true no matter what those methods return.
- */
-static PRBool IndentChildren(eHTMLTags aTag)
-{
-  PRBool result = PR_FALSE;
-
-  switch (aTag)
-  {
-    case eHTMLTag_table:
-    case eHTMLTag_ul:
-    case eHTMLTag_ol:
-    case eHTMLTag_tbody:
-    case eHTMLTag_form:
-    case eHTMLTag_frameset:
-      result = PR_TRUE;
-      break;
-
-    default:
-      result = PR_FALSE;
-      break;
-  }
-  return result;
-}
-
diff --git a/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.h b/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.h
index cf44cab7ac0..8f921e4009e 100644
--- a/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.h
+++ b/mozilla/parser/htmlparser/src/nsHTMLContentSinkStream.h
@@ -62,6 +62,7 @@ class ostream;
 
 class nsIParserNode;
 class nsIOutputStream;
+class nsIDTD;
 
 class nsIHTMLContentSinkStream : public nsIHTMLContentSink {
 public:
@@ -102,7 +103,6 @@ class nsHTMLContentSinkStream : public nsIHTMLContentSinkStream
                         const nsAReadableString* aCharsetOverride,
                         PRUint32 aFlags);
 
-
   /*******************************************************************
    * The following methods are inherited from nsIContentSink.
    * Please see that file for details.
@@ -147,6 +147,13 @@ public:
 
 protected:
 
+  PRBool IsBlockLevel(eHTMLTags aTag);
+  PRInt32 BreakBeforeOpen(eHTMLTags aTag);
+  PRInt32 BreakAfterOpen(eHTMLTags aTag);
+  PRInt32 BreakBeforeClose(eHTMLTags aTag);
+  PRInt32 BreakAfterClose(eHTMLTags aTag);
+  PRBool IndentChildren(eHTMLTags aTag);
+
   void WriteAttributes(const nsIParserNode& aNode);
   void AddStartTag(const nsIParserNode& aNode);
   void AddEndTag(const nsIParserNode& aNode);
@@ -170,6 +177,8 @@ protected:
   nsIOutputStream* mStream;
   nsAWritableString* mString;
 
+  nsIDTD* mDTD;
+
   int mTabLevel;
   char* mBuffer;
   PRInt32 mBufferSize;
diff --git a/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.cpp b/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.cpp
index 9b38d704361..42d4ab664f9 100644
--- a/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.cpp
+++ b/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.cpp
@@ -56,6 +56,8 @@ static NS_DEFINE_CID(kCharsetConverterManagerCID,
                      NS_ICHARSETCONVERTERMANAGER_CID);
 static NS_DEFINE_CID(kLWBrkCID, NS_LWBRK_CID);
 static NS_DEFINE_CID(kPrefServiceCID, NS_PREF_CID);
+static NS_DEFINE_IID(kCParserIID, NS_IPARSER_IID);
+static NS_DEFINE_IID(kCParserCID, NS_PARSER_IID);
 
 #define PREF_STRUCTS "converter.html2txt.structs"
 #define PREF_HEADER_STRATEGY "converter.html2txt.header_strategy"
@@ -73,8 +75,6 @@ const PRInt32 gIndentSizeList = (gTabSize > gOLNumberWidth+3) ? gTabSize: gOLNu
                                // Indention of non-first lines of ul and ol
 const PRInt32 gIndentSizeDD = gTabSize; // Indention of
 
-static PRBool IsInline(eHTMLTags aTag);
-static PRBool IsBlockLevel(eHTMLTags aTag);
 static PRInt32 HeaderLevel(eHTMLTags aTag);
 static PRInt32 unicharwidth(PRUnichar ucs);
 static PRInt32 unicharwidth(const PRUnichar* pwcs, PRInt32 n);
@@ -186,6 +186,7 @@ static const PRUint32 OLStackSize = 100;
 nsHTMLToTXTSinkStream::nsHTMLToTXTSinkStream()
 {
   NS_INIT_REFCNT();
+  mDTD = 0;
   mColPos = 0;
   mIndent = 0;
   mCiteQuoteLevel = 0;
@@ -236,6 +237,7 @@ nsHTMLToTXTSinkStream::~nsHTMLToTXTSinkStream()
   delete[] mBuffer;
   delete[] mTagStack;
   delete[] mOLStack;
+  NS_IF_RELEASE(mDTD);
   NS_IF_RELEASE(mUnicodeEncoder);
   NS_IF_RELEASE(mLineBreaker);
 }
@@ -416,6 +418,7 @@ nsHTMLToTXTSinkStream::AddProcessingInstruction(const nsIParserNode& aNode){
 NS_IMETHODIMP
 nsHTMLToTXTSinkStream::AddDocTypeDecl(const nsIParserNode& aNode, PRInt32 aMode)
 {
+  // Should probably set DTD
   return NS_OK;
 }
 
@@ -1603,62 +1606,31 @@ nsHTMLToTXTSinkStream::NotifyError(const nsParserError* aError)
   return NS_OK;
 }
 
-PRBool IsInline(eHTMLTags aTag)
+PRBool nsHTMLToTXTSinkStream::IsBlockLevel(eHTMLTags aTag)
 {
-  PRBool result = PR_FALSE;
-
-  switch (aTag)
+  if (!mDTD)
   {
-    case eHTMLTag_a:
-    case eHTMLTag_address:
-    case eHTMLTag_b:
-    case eHTMLTag_big:
-    case eHTMLTag_blink:
-    case eHTMLTag_br:
-    case eHTMLTag_cite:
-    case eHTMLTag_code:
-    case eHTMLTag_dfn:
-    case eHTMLTag_del:
-    case eHTMLTag_em:
-    case eHTMLTag_font:
-    case eHTMLTag_i:
-    case eHTMLTag_img:
-    case eHTMLTag_ins:
-    case eHTMLTag_kbd:
-    case eHTMLTag_keygen:
-    case eHTMLTag_nobr:
-    case eHTMLTag_q:
-    case eHTMLTag_samp:
-    case eHTMLTag_small:
-    case eHTMLTag_spacer:
-    case eHTMLTag_span:
-    case eHTMLTag_strike:
-    case eHTMLTag_strong:
-    case eHTMLTag_sub:
-    case eHTMLTag_sup:
-    case eHTMLTag_td:
-    case eHTMLTag_textarea:
-    case eHTMLTag_th:
-    case eHTMLTag_tt:
-    case eHTMLTag_u:
-    case eHTMLTag_var:
-    case eHTMLTag_wbr:
-      result = PR_TRUE;
-      break;
+    nsCOMPtr<nsIParser> parser;
+    nsresult rv = nsComponentManager::CreateInstance(kCParserCID,
+                                                     nsnull,
+                                                     kCParserIID,
+                                                     (void **)&parser);
+    if (NS_FAILED(rv)) return rv;
+    if (!parser) return NS_ERROR_FAILURE;
 
-    default:
-      break;
-  }
-  return result;
-}
-
-PRBool IsBlockLevel(eHTMLTags aTag)
-{
-  return !IsInline(aTag);
+    nsAutoString htmlmime (NS_LITERAL_STRING("text/html"));
+    rv = parser->CreateCompatibleDTD(&mDTD, 0, eViewNormal,
+                                     &htmlmime, eDTDMode_transitional);
   /* XXX Note: We output linebreaks for blocks.
      I.e. we output linebreaks for "unknown" inline tags.
      I just hunted such a bug for , same for , etc..
      Better fallback to inline. /BenB */
+    if (NS_FAILED(rv) || !mDTD)
+      return PR_FALSE;
+  }
+
+  // Now we can get the inline status from the DTD:
+  return mDTD->IsBlockElement(aTag, eHTMLTag_unknown);
 }
 
 /*
diff --git a/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.h b/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.h
index 2c5b631428e..b608bb7e8e4 100644
--- a/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.h
+++ b/mozilla/parser/htmlparser/src/nsHTMLToTXTSinkStream.h
@@ -46,6 +46,9 @@
 #include "nsHTMLTags.h"
 #include "nsParserCIID.h"
 #include "nsCOMPtr.h"
+#include "nsHTMLTokens.h" // for eHTMLTags
+
+class nsIDTD;
 
 #define NS_IHTMLTOTEXTSINKSTREAM_IID \
   {0xa39c6bff, 0x15f0, 0x11d2, \
@@ -157,6 +160,7 @@ protected:
 
   PRBool DoOutput();
   PRBool MayWrap();
+  PRBool IsBlockLevel(eHTMLTags aTag);
 
 protected:
   nsIOutputStream* mStream;
@@ -166,6 +170,8 @@ protected:
   nsAWritableString* mString;
   nsString mCurrentLine;
 
+  nsIDTD* mDTD;
+
   PRInt32 mIndent;
   // mInIndentString keeps a header that has to be written in the indent.
   // That could be, for instance, the bullet in a bulleted list.