trichview.com

trichview.support




Re: Word Count Example?



Author: DavidRM

Posted: 01/10/2004 8:26:51


I came up with the following. It counts characters (with and without spaces),

words, and paragraphs. Not sure how to get a line count yet.


procedure GetStatistics(richViewEdit:TRichViewEdit; countSelection:boolean;

var charCnt, charNoSpaceCnt, wordCnt, lineCnt, paraCnt:integer);

   const

      WhiteSpace: set of char = [#9,#10,#13,#32];

      WordBreak: set of char = [#9,#10,#13,#32,'-'];

      Punctuation: set of char = ['!','@','#','$','%','^','&','*','(',')','-','=','+','`','~',';',':','''','"',',','.','<','>','/','?','\','|','[',']','{','}'];

   var

      statText: string;

      mStream: TMemoryStream;

      wordStarted, paraStarted, lastHyphenated,

         isWordBreak: boolean;

      ch: char;

      cc: integer;

   begin

   mStream:=TMemoryStream.Create;

   try

      if richViewEdit.SaveTextToStream('',mStream,0,countSelection,true) then

         begin

         SetLength(statText,mStream.Size);

         mStream.Position:=0;

         mStream.ReadBuffer(PChar(statText)^,mStream.Size);

         end

      else

         statText:='';

   finally

      mStream.Free;

   end; // try

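   // initialize the counters first so the var parameters are defined even when there is no text

   charCnt:=Length(statText);

   charNoSpaceCnt:=0;

   wordCnt:=0;

   lineCnt:=0; // not computed yet, always returned as 0

   paraCnt:=0;

   if statText<>'' then

      begin

      // initialize flags

      wordStarted:=false;

      paraStarted:=false;

      lastHyphenated:=false; // no hyphen seen yet, so the first hyphenated word counts once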

      for cc:=1 to Length(statText) do

         begin

         ch:=statText[cc];

         isWordBreak:=(ch in WordBreak);

         if isWordBreak then

            begin

            if wordStarted then

               begin

               // hyphenated words count as 1 word

               if (ch='-') and (not lastHyphenated) then

                  lastHyphenated:=true

               else

                  begin

                  inc(wordCnt);

                  wordStarted:=false;

                  lastHyphenated:=false;

                  end;

               end;

            // check for end of paragraph

            if (ch in WhiteSpace) then

               begin

               if (ch=#13) then

                  begin

                  if paraStarted then

                     begin

                     inc(paraCnt);

                     paraStarted:=false;

                     end;

                  dec(charCnt);

                  end

               else if (ch=#9) or (ch=#10) then

                  dec(charCnt);

               end

            else

               inc(charNoSpaceCnt);

            end

         else

            begin

            if not (ch in Punctuation) then

               wordStarted:=true;

            paraStarted:=true;

            inc(charNoSpaceCnt);

            end;

         end;

      if wordStarted then

         inc(wordCnt);

      if paraStarted then

         inc(paraCnt);

      end;

   end;


This will actually come very close to duplicating the word counts given by

MS Word. For English, anyway. Hard-wiring the white space, word breaks, and

punctuation bugs me. I'd prefer a way to get language-specific sets for

those (or a Win32 API call that classifies a particular character). Maybe

I'll find that someday (or someone can slip me the answers... ;) ).
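
On the Win32 question: one candidate is IsCharAlphaNumeric, which classifies a character according to the current user's language settings. GetStringTypeEx with CT_CTYPE1 reports finer-grained flags such as C1_SPACE and C1_PUNCT. Below is a minimal sketch built on the first call. IsWordChar and CountWordsLocale are made-up names for illustration, the ASCII fallback for non-Windows builds is an assumption, and the word rule is simplified compared to the procedure above (a hyphen inside a word never ends it).

```pascal
program LocaleWordCount;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

{$IFDEF MSWINDOWS}
uses
  Windows; // declares IsCharAlphaNumeric
{$ENDIF}

// Hypothetical helper, not from the original post: True when ch is a
// letter or digit. On Windows the answer comes from IsCharAlphaNumeric,
// which follows the current user's language settings.
function IsWordChar(ch: Char): Boolean;
begin
{$IFDEF MSWINDOWS}
  Result := IsCharAlphaNumeric(ch);
{$ELSE}
  // Assumption: plain ASCII fallback so the sketch also builds elsewhere.
  Result := ch in ['0'..'9', 'A'..'Z', 'a'..'z'];
{$ENDIF}
end;

// Hypothetical helper: counts runs of word characters. A hyphen inside a
// word does not end it, so "well-known" is one word.
function CountWordsLocale(const s: string): Integer;
var
  i: Integer;
  inWord: Boolean;
begin
  Result := 0;
  inWord := False;
  for i := 1 to Length(s) do
    if IsWordChar(s[i]) then
    begin
      if not inWord then
        Inc(Result); // first character of a new word
      inWord := True;
    end
    else if s[i] <> '-' then
      inWord := False; // anything except a hyphen closes the current word
end;

begin
  WriteLn(CountWordsLocale('Hello, well-known world!')); // prints 3
end.
```

Something like IsWordChar could stand in for the hard-wired Punctuation test in the loop, though whitespace and paragraph detection would still need their own checks.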


-David

http://www.davidrm.com


"DavidRM" <[email protected]> wrote:

>

>Is there an example somewhere of how to count the words of a TRichView/Edit?

>I'd also like to count paragraphs, lines, and characters.

>

>Thanks.

>

>-David

>http://www.davidrm.com




