I am using the GetAllText function contained in RVGetTextW to retrieve just the text of a document (from a DBRichViewEdit). In almost all cases this works without problems, but for a few documents the returned string contains nothing but formatting codes, not the text.
I cannot find anything different about those documents. I have also tried RVGetText instead of RVGetTextW, but I get the same result in both cases.
GetAllText Problem
-
- Site Admin
- Posts: 17557
- Joined: Sat Aug 27, 2005 10:28 am
- Contact:
Sergey,
I was preparing a document to send to you (I cannot send an original because of the sensitive nature of its content).
I noticed that when I removed the graphic from one of the documents that was producing the error, the error no longer occurred. I thought GetAllText returned only the text of a document, with all formatting and graphics removed.
How do I manually remove the graphics from the document? What I want is a string containing nothing but the text, which I can pass to my regular expression parser.
Received.
The problem is not in GetAllText; the problem is an error that occurs when this document is loaded from the DB record.
When loading a document, TDBRichView first tries to load it as RVF. If that fails, it tries to load it as RTF. If that also fails, it loads it as plain text.
In your case, the record contains an RVF document, but DBRichView fails to read it, so it loads the document as plain text. As a result, DBRichView displays all the RVF codes, and GetAllText returns them.
Why does RVF reading fail? The document contains an image of the TPNGObject class.
To load it, TPNGObject must be registered.
Call RegisterClass(TPNGObject) once, before the first read from the database.
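One possible way to make sure the registration happens before any record is read is to put it in the initialization section of a small unit that the project uses. This is only a sketch: the unit name PngRegistration is hypothetical, and it assumes that TPNGObject is declared in a unit named pngimage in your installation (adjust the uses clause if your PNG library declares it elsewhere).

```pascal
unit PngRegistration; // hypothetical unit name, used only for this example

interface

implementation

uses
  Classes,   // declares RegisterClass
  pngimage;  // assumption: the unit that declares TPNGObject

initialization
  // Runs once at program startup, before any dataset is opened,
  // so that TPNGObject can be found by its class name when an
  // RVF document containing a PNG image is read.
  RegisterClass(TPNGObject);

end.
```

Adding this unit to the project's uses clause should be enough. Alternatively, the same RegisterClass call could be placed in the main form's OnCreate handler, as long as it runs before the dataset is opened.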