Word in the Office

 

David Shank
Microsoft Corporation

November 6, 2000

Microsoft Word is probably the most widely used of all the Office applications, and it often plays a significant role in custom Office solutions. Developers use Word in many different ways; some are simple, and some are extremely sophisticated. No matter what kind of custom solution is involved, the basics of working with Word documents using Visual Basic for Applications (VBA) are the same. In this month's column, I'm going to give a brief overview of how to work with Word. I'll also provide some detail on how to work with the content of a Word document by using the Range object.

Understanding the Basics

In Word, almost everything you do will involve the Document object itself or the contents of a Document object. When you are using VBA to work with Word, a Document object represents an open document, and all Document objects are part of the Application object's Documents collection.

A document is a collection of characters arranged into words, words arranged into sentences, sentences arranged into paragraphs, and so on. Therefore, each Document object has a Characters collection, a Words collection, a Sentences collection, and a Paragraphs collection. Furthermore, each document has a Sections collection of one or more sections, and each section has a Headers property and a Footers property, each of which returns a HeadersFooters collection that contains the headers or footers for that section.
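
As a rough illustration of these collections, the following sketch prints how many items each one holds for the active document and then reads the primary header of the first section. The procedure name ShowDocumentCounts is made up for this example, and the code assumes it runs inside Word.

```vba
Sub ShowDocumentCounts()
   ' Hypothetical helper for illustration only.
   ' ActiveDocument raises an error when no document is open.
   If Documents.Count = 0 Then Exit Sub

   With ActiveDocument
      Debug.Print "Characters: " & .Characters.Count
      Debug.Print "Words:      " & .Words.Count
      Debug.Print "Sentences:  " & .Sentences.Count
      Debug.Print "Paragraphs: " & .Paragraphs.Count
      Debug.Print "Sections:   " & .Sections.Count

      ' Headers and footers are reached through a section.
      Debug.Print "Primary header text: " & _
         .Sections(1).Headers(wdHeaderFooterPrimary).Range.Text
   End With
End Sub
```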

Note: You can see the complete Word object model in the Microsoft Office 2000 Developer Object Model Guide. In addition, you can use the Object Browser and Microsoft Word Visual Basic Reference Help to learn more about individual objects, properties, methods, and events.

The Document object is at the heart of working with Word using VBA. When you open a document or create a new document, you create a new Document object. Each document you open or create is added to the Documents collection. The document that has the focus is called the active document and is represented by the ActiveDocument property.

You can reference a Document object as a member of the Documents collection by using either its index value (where index value is the Document object's location in the Documents collection and where 1 is the first document in the collection) or its name. In addition, you can use the ActiveDocument property to return a reference to the document that currently has the focus. For example, if a document named Policies.doc is the only open document, the following three object variables will all point to Policies.doc:

Dim docOne As Word.Document
Dim docTwo As Word.Document
Dim docThree As Word.Document

Set docOne = Documents(1)
Set docTwo = Documents("Policies.doc")
Set docThree = ActiveDocument

You will rarely refer to a document by using its index value in the Documents collection because this value can change for a given document as other documents are opened and closed. Typically, you will use the ActiveDocument property or a Document object variable created by using the Documents collection's Add method or Open method. The following example shows how you can use the ActiveDocument property to add an envelope, with a delivery address and a return address, to the document that currently has the focus:

With ActiveDocument
   .Envelope.Insert Address:="Office Talk" _
      & vbCrLf & "One Microsoft Way" & vbCrLf _
      & "Redmond, WA 98052", ReturnAddress:= _
      "David Shank" & vbCrLf & _
      "77 First Street" & vbCrLf & _
      "Any Town, USA 12345"
End With

The next example illustrates how to instantiate a Document object variable by using the Documents collection's Open method.

Dim docPolicy As Word.Document
Set docPolicy = Documents.Open("c:\my documents\policies.doc")

The final example shows how to instantiate a Document object for a new, blank document using the Add method.

Dim docPolicy As Word.Document
Set docPolicy = Documents.Add

The document opened by using the Open method or the document created by using the Add method will also be the currently active document represented by the ActiveDocument property. If you want to make some other document in the Documents collection the active document, use the Document object's Activate method.
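
One possible way to switch the active document is sketched below. The function ActivateDocumentByName is a hypothetical helper, not part of Word; it looks through the Documents collection for a document with the given name and calls Activate on it if one is found.

```vba
Function ActivateDocumentByName(strName As String) As Boolean
   ' Hypothetical helper: activates the open document whose Name
   ' matches strName and returns True, or returns False if no
   ' open document has that name.
   Dim docItem As Word.Document

   For Each docItem In Documents
      If StrComp(docItem.Name, strName, vbTextCompare) = 0 Then
         docItem.Activate
         ActivateDocumentByName = True
         Exit Function
      End If
   Next docItem
End Function
```

For example, calling ActivateDocumentByName("Policies.doc") would give Policies.doc the focus if it is open, and would return False without raising an error if it is not.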

Once you've got a Document object to work with, most of what you'll want to do with VBA will involve working with text. The starting point will be to specify a part of the document and then to do something to it—for example, adding or removing text or formatting words or characters. The two objects you will use to accomplish much of this work are the Range object and the Selection object. I'm only going to talk about the Range object in this column. We will get into more detail on the Selection object next month.

Understanding Word's Paragraph Marks

When you work with text programmatically, it is important to understand how Word handles paragraph marks. At its most basic level, a Word document is nothing more than a vast stream of characters. We tend to think of documents as collections of words, sentences, and paragraphs, but basically all you really have are characters. Each character has a specific job to do. Some characters are letters, spaces, or tabs. Some characters are paragraph marks or page breaks.

Paragraph marks play a unique and sometimes misunderstood role in Word documents. A paragraph consists of a paragraph mark and all text that precedes the mark up to, but not including, a previous paragraph mark. In addition—and this is the important part—the paragraph mark itself contains all the information about how the paragraph is formatted.

When you copy a word, sentence, or paragraph and you include a paragraph mark, all the formatting information contained in the paragraph mark is also copied and applied to the paragraph when it is pasted in another location.

If you want to copy text from within one paragraph and paste it into another paragraph but do not want to copy the paragraph formatting as well, be sure that you do not copy the paragraph mark adjacent to the text you copy.

Every blank Word document contains a single paragraph mark, and that one mark counts as a character, a word, a sentence, and a paragraph all at the same time: it is the only item in the document's Characters, Words, Sentences, and Paragraphs collections. However, the Statistics tab of the Properties dialog box (File menu) reports that there are no characters, words, sentences, or paragraphs in a blank document. This difference highlights an important aspect of Word that you will need to consider when manipulating these objects programmatically.
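
The following sketch is one way to see this difference for yourself. It creates a new document, prints the collection counts next to the values reported by the Document object's ComputeStatistics method, and then closes the document without saving. The procedure name is made up for this example, and the code assumes that the default template contains no text. Going by the description above, the collection counts would be expected to show one item each while the statistics show none, but it is worth checking the output on your own system.

```vba
Sub BlankDocumentCounts()
   ' Hypothetical demonstration procedure.
   Dim docBlank As Word.Document

   Set docBlank = Documents.Add
   With docBlank
      ' What the object model counts.
      Debug.Print "Characters.Count: " & .Characters.Count
      Debug.Print "Words.Count:      " & .Words.Count
      Debug.Print "Sentences.Count:  " & .Sentences.Count
      Debug.Print "Paragraphs.Count: " & .Paragraphs.Count

      ' What the document statistics report.
      Debug.Print "Statistic, characters: " & _
         .ComputeStatistics(wdStatisticCharacters)
      Debug.Print "Statistic, words:      " & _
         .ComputeStatistics(wdStatisticWords)
      Debug.Print "Statistic, paragraphs: " & _
         .ComputeStatistics(wdStatisticParagraphs)

      ' Discard the scratch document.
      .Close SaveChanges:=wdDoNotSaveChanges
   End With
End Sub
```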

The Range Object

A Range object represents a contiguous area in a document, defined by a starting character position and an ending character position. The contiguous area can be as small as the insertion point or as large as the entire document. It can also be, but does not have to be, the area represented by the current selection. You can define a Range object that represents a different area than the current selection. You can also define multiple Range objects in a single document. The characters in a Range object include nonprinting characters, such as spaces, carriage returns, and paragraph marks.

Working with the Range Object

You typically create a Range object by declaring an object variable of type Range and then instantiating that variable by using either the Document object's Range method or a property or collection that returns a Range object, such as an item in the Characters, Words, or Sentences collection or the Range property of a Paragraph or Selection object. For example, the following code creates two Range objects that both represent the second sentence in the active document.

Dim rngRangeMethod As Word.Range
Dim rngRangeProperty As Word.Range

With ActiveDocument
   If .Sentences.Count >= 2 Then
      Set rngRangeMethod = .Range(.Sentences(2).Start, _
         .Sentences(2).End)
      Set rngRangeProperty = .Sentences(2)
   End If
End With

When you use the Range method to define an area of a document, the Start argument specifies the character position where the range begins and the End argument specifies where it ends. The range covers the characters from Start up to, but not including, the position given by End. The first character in a document is at character position 0. The last character position is equal to the total number of characters in the document. You can determine the number of characters in a document by using the Characters collection's Count property. As shown in the preceding example, you can also use the Start and End properties of a Bookmark, Selection, or Range object to specify the Range method's Start and End arguments. You can set the Start and End arguments to the same number, which creates a range that does not include any characters.
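
A minimal sketch of these character positions follows. The procedure name and the choice of ten characters are arbitrary; the code only assumes that a document is open.

```vba
Sub RangePositionsSketch()
   ' Hypothetical demonstration procedure.
   Dim rngFirstTen As Word.Range
   Dim rngEmpty As Word.Range

   With ActiveDocument
      ' Build the 10-character range only if the document is
      ' long enough to contain positions 0 through 10.
      If .Content.End >= 10 Then
         Set rngFirstTen = .Range(Start:=0, End:=10)
         Debug.Print "First ten characters: " & rngFirstTen.Text
      End If

      ' Start and End are equal, so the range holds no characters.
      Set rngEmpty = .Range(Start:=0, End:=0)
      Debug.Print "Empty range length: " & _
         (rngEmpty.End - rngEmpty.Start)
   End With
End Sub
```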

You can set or redefine the contents of a Range object by using the Range object's SetRange method. You can specify or redefine the start of a range by using the Range object's Start property or its MoveStart method. Likewise, you can specify or redefine the end of a range by using the Range object's End property or its MoveEnd method.

The following example begins by using the Content property to create a Range object that covers the entire content of a document. It then changes the End property to specify that the end of the range will be at the end of the first sentence in the document. Next, it uses the SetRange method to redefine the range to cover the first paragraph in the document. Finally, it uses the MoveEnd method to extend the end of the range to the end of the second paragraph in the document. At each step in the example, the number of characters contained in the range is printed to the Immediate window.

Sub RangeExample()
   Dim rngSample As Range

   Set rngSample = ActiveDocument.Content

   With rngSample
      Debug.Print "The range now contains " & .Characters.Count _
         & " characters."
      .End = ActiveDocument.Sentences(1).End
      Debug.Print "The range now contains " & .Characters.Count _
         & " characters."
      .SetRange Start:=0, _
         End:=ActiveDocument.Paragraphs(1).Range.End
      Debug.Print "The range now contains " & .Characters.Count _
         & " characters."
      .MoveEnd Unit:=wdParagraph, Count:=1
      Debug.Print "The range now contains " & .Characters.Count _
         & " characters."
   End With
End Sub

You can also redefine a Range object by using the object's Find property to return a Find object. The following example illustrates the use of the Find property to locate text within the active document. If the text is found, the Range object is automatically redefined to contain the text that matched the search criteria.

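Dim rngRangeText As Word.Range

' Start with a range that covers the whole document.
Set rngRangeText = ActiveDocument.Content
With rngRangeText.Find
   .ClearFormatting
   If .Execute(FindText:="Find Me!") Then
      ' rngRangeText is redefined.
   End If
End With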
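
Because a successful search redefines the range, the same Find object can be run repeatedly to visit every match. The sketch below counts the occurrences of a string this way. The function name CountMatches is made up for this example, and collapsing the range after each hit is one common way to continue the search, not the only one.

```vba
Function CountMatches(strFind As String) As Long
   ' Hypothetical helper: returns the number of times strFind
   ' occurs in the main text of the active document.
   Dim rngSearch As Word.Range
   Dim lngCount As Long

   Set rngSearch = ActiveDocument.Content
   With rngSearch.Find
      .ClearFormatting
      .Text = strFind
      .Forward = True
      .Wrap = wdFindStop   ' Stop at the end of the document.
      Do While .Execute
         lngCount = lngCount + 1
         ' Collapse to the end of the match so the next search
         ' continues from there.
         rngSearch.Collapse Direction:=wdCollapseEnd
      Loop
   End With

   CountMatches = lngCount
End Function
```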

Many Word objects have a Range property that returns a Range object. You use an object's Range property to return a Range object under circumstances where you need to work with properties or methods of the Range object that are not available from the object itself. For example, the following code uses the Range property of a Paragraph object to return a Range object that is used to format the text in the first paragraph in a document:

Dim rngPara As Range

Set rngPara = ActiveDocument.Paragraphs(1).Range
With rngPara
   .Bold = True
   .ParagraphFormat.Alignment = wdAlignParagraphCenter
   .Font.Name = "Arial"
End With

After you identify the Range object, you can apply methods and properties of the object to modify the contents of the range or get information about the range. For example, you can use the Range object's StoryType property to determine where in the document the Range is located.

Working with the Text in a Range Object

You use a Range object's Text property to specify or determine the text the range contains. For example, the following code first displays the text within a Range object, then changes it and displays the new text, and finally restores the original text. The sample illustrates how to use the Range object's Text property to copy or paste text into a document while maintaining the original paragraph structure. Notice how the new text in the strNewText variable includes a paragraph mark (vbCrLf) to replace the paragraph mark that is included when the original paragraph is selected.

Public Sub ChangeTextSample()
   Dim rngText As Range
   Dim strOriginalText As String
   Dim strNewText As String

   strNewText = "This text is replacing the original" _
      & " text in the first paragraph of the active" _
      & " document. This is all done using only the" _
      & " Text property of the Range object!" & vbCrLf

   Set rngText = ActiveDocument.Paragraphs(1).Range
   With rngText
      MsgBox .Text, vbOKOnly, "This is the original text."
      strOriginalText = .Text
      .Text = strNewText
      MsgBox .Text, vbOKOnly, "This is the new text" _
         & " inserted in paragraph 1."
      .Text = strOriginalText
      MsgBox "The original text is restored."
   End With
End Sub

You can use the Range object's StoryType property to determine where the range is located in the document. Stories are distinct areas of a document that contain text. You can have up to 11 story type areas in a document, representing areas such as document text, headers, footers, footnotes, comments, and more. You use the StoryRanges property to return a StoryRanges collection. The StoryRanges collection contains Range objects representing each story in a document.

A new Word document contains a single story, called the Main Text story, which represents the text in the main part of the document. Even a blank document contains a character, a word, a sentence, and a paragraph.

You do not expressly add new stories to a document; rather, Word adds them for you when you add text to a portion of the document represented by one of the 11 story types. For example, if you add footnotes, Word adds a Footnotes story. If you add comments, Word adds a Comments story to the StoryRanges collection of the document.

You use the StoryRanges collection to return a Range object representing each story in a document. For example, the following code prints the text associated with the Main Text story and the Comments story:

Dim rngMainText As Word.Range
Dim rngComments As Word.Range

Set rngMainText = ActiveDocument.StoryRanges(wdMainTextStory)
Set rngComments = ActiveDocument.StoryRanges(wdCommentsStory)
Debug.Print rngMainText.Text
Debug.Print rngComments.Text
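
Asking the StoryRanges collection for a story that the document does not contain (for example, the Comments story in a document with no comments) can cause a run-time error. One way around this, sketched below under that assumption, is to loop through the collection so that only the stories that actually exist are visited. The procedure name ListStories is made up for this example.

```vba
Sub ListStories()
   ' Hypothetical demonstration procedure.
   Dim rngStory As Word.Range

   ' For Each visits only the stories present in the document.
   For Each rngStory In ActiveDocument.StoryRanges
      Debug.Print "Story type " & rngStory.StoryType & _
         " contains " & Len(rngStory.Text) & " characters."
   Next rngStory
End Sub
```

The StoryType values printed here are the numeric values of constants such as wdMainTextStory and wdCommentsStory.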

You use the Range object's InsertBefore or InsertAfter methods to add text to an existing Range object. In fact, there is an entire class of methods, with names that begin with "Insert," that you can use to manipulate a Range object.
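
As a small sketch of these two methods, the following procedure puts a bracket on each side of the first word in the active document. The procedure name is made up for this example. Keep in mind that an item in the Words collection usually includes the space that follows the word, so the closing bracket lands after that space.

```vba
Sub BracketFirstWord()
   ' Hypothetical demonstration procedure.
   Dim rngWord As Word.Range

   Set rngWord = ActiveDocument.Words(1)

   ' Each call also expands rngWord to include the new text.
   rngWord.InsertBefore "["
   rngWord.InsertAfter "]"

   Debug.Print "Range now contains: " & rngWord.Text
End Sub
```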

It's useful to have a procedure that combines the Range object's InsertBefore and InsertAfter methods with the Text property. Having such a procedure creates a single place to handle much of the work you will do when manipulating text programmatically. The InsertTextInRange procedure shown below is just such a procedure. You can call the InsertTextInRange procedure any time you need to add text to a Range object. In other words, the procedure is useful any time you want to programmatically make any changes to existing text in a Word document.

The InsertTextInRange procedure uses two required arguments and one optional argument. The strNewText argument contains the text you want to add to the Range object specified in the rngRange argument. The intInsertMode optional argument specifies how the new text will be added to the range. The values for this argument are one of three custom enumerated constants that specify whether to use the InsertBefore method, the InsertAfter method, or the Text property to replace the existing range text.

Public Enum opgTextInsertMode
    Before
    After
    Replace
End Enum

Function InsertTextInRange(strNewText As String, _
         rngRange As Word.Range, _
         Optional intInsertMode As opgTextInsertMode = _
         Replace) As Boolean
   ' This procedure inserts text specified by the strNewText
   ' argument into the Range object specified by the rngRange
   ' argument. It calls the IsLastCharParagraph procedure to
   ' strip off trailing paragraph marks from the rngRange object.
 
   Call IsLastCharParagraph(rngRange, True)

   With rngRange
      InsertTextInRange = True
      Select Case intInsertMode
         Case Before  ' Insert before text in range.
            .InsertBefore strNewText
         Case After   ' Insert after text in range.
            .InsertAfter strNewText
         Case Replace ' Replace text in range.
            .Text = strNewText
         Case Else    ' Unrecognized mode; nothing is inserted.
            InsertTextInRange = False
      End Select
   End With
End Function

Note that the IsLastCharParagraph procedure is used to strip off any final paragraph marks before inserting text in the range. The example uses the Chr$() function with character code 13 to represent a paragraph mark.

Function IsLastCharParagraph(ByRef rngTextRange As Word.Range, _
         Optional blnTrimParaMark As Boolean = False) As Boolean
   ' This procedure accepts a character, word, sentence, or
   ' paragraph Range as the first argument and returns True
   ' if the last character in the range is a paragraph mark, and
   ' False if it is not. The procedure also accepts an optional
   ' Boolean argument that specifies whether the Range object
   ' should be changed to eliminate the paragraph mark if it
   ' exists. When the blnTrimParaMark argument is True, this
   ' procedure calls itself to strip off all trailing
   ' paragraph marks.

   Dim strLastChar As String

   strLastChar = Right$(rngTextRange.Text, 1)
   If strLastChar <> Chr$(13) Then
      IsLastCharParagraph = False
      Exit Function
   End If

   IsLastCharParagraph = True
   If blnTrimParaMark Then
      ' Shrink the range by one character to drop the mark.
      rngTextRange.SetRange rngTextRange.Start, _
         rngTextRange.Start + _
         rngTextRange.Characters.Count - 1
      ' Recurse to strip any further trailing paragraph marks.
      ' Only the last character is tested, so paragraph marks
      ' inside the range are left alone.
      Call IsLastCharParagraph(rngTextRange, True)
   End If
End Function

In this example, the Count property of the Range object's Characters collection is used to redefine the Range object's end point.
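
The following sketch shows how the two procedures might be called. It assumes that InsertTextInRange, IsLastCharParagraph, and the opgTextInsertMode enumeration are all available to the calling module; the procedure name and the inserted strings are made up for this example.

```vba
Sub InsertTextInRangeDemo()
   ' Hypothetical demonstration procedure.
   Dim rngPara As Word.Range

   Set rngPara = ActiveDocument.Paragraphs(1).Range

   ' Add a label in front of the first paragraph's text.
   If InsertTextInRange("NOTE: ", rngPara, Before) Then
      Debug.Print "Label inserted."
   End If

   ' rngPara no longer includes the paragraph mark, so this text
   ' is added inside the first paragraph, not after it.
   InsertTextInRange " (reviewed)", rngPara, After
End Sub
```

Because the trailing paragraph mark is trimmed from the range before anything is inserted, both pieces of text stay in the first paragraph and the paragraph's formatting is left alone.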

More About Working with Paragraph Marks

In the ChangeTextSample procedure discussed earlier, note how the text in the strNewText variable uses the vbCrLf built-in constant to create a paragraph mark at the end of the text used to replace the existing text in paragraph 1 of the active document. This is done to prevent the new text from becoming part of the second paragraph.

When you create a Range object that represents a Character, Word, or Sentence object and that object falls at the end of a paragraph, the paragraph mark is automatically included within the range. Moreover, the Range object will include all additional subsequent empty paragraph marks. For example, in a two-paragraph document where the first paragraph consists of three sentences and the second paragraph is empty, the following code creates a Range object that represents the last sentence in the first paragraph:

Set rngCurrentSentence = ActiveDocument.Sentences(3)

Because the rngCurrentSentence Range object refers to the last sentence in the first paragraph, that paragraph mark (and any additional empty paragraph marks) will be automatically included in the range. If you then set the Text property of this object to a text string that didn't end with a paragraph mark, the second paragraph in the document would be deleted.

When you write VBA code that manipulates text in a Word document, you need to account for the presence of a paragraph mark in your text. There are two basic techniques you can use to account for paragraph marks when you are cutting and pasting text in Range objects:

  • Include a new paragraph mark (represented by the vbCrLf constant) in the text to be inserted in the document (as shown in the ChangeTextSample procedure).
  • Exclude the final paragraph mark from a Range object (as shown in the InsertTextInRange procedure's use of the IsLastCharParagraph function).
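
The second technique can also be sketched without a helper function by moving the end of the range back one character at a time. The procedure below is only an illustration: its name and the replacement text are made up, and it assumes that the active document's first paragraph is the one you want to change.

```vba
Sub ReplaceLastSentenceSafely()
   ' Hypothetical demonstration procedure.
   Dim rngSentence As Word.Range

   ' Get the last sentence in the first paragraph.
   With ActiveDocument.Paragraphs(1).Range
      Set rngSentence = .Sentences(.Sentences.Count)
   End With

   ' Back the end of the range up past any trailing paragraph
   ' marks so that they are not overwritten.
   Do While Right$(rngSentence.Text, 1) = vbCr
      rngSentence.MoveEnd Unit:=wdCharacter, Count:=-1
   Loop

   rngSentence.Text = "This sentence replaces the original one."
End Sub
```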

Where to Get More Info

The techniques and technologies discussed here should give you lots of ideas for working with Word documents using VBA. For additional information, check out the following resources:

  • For more information on working with the Word object model, see the technical articles in the MSDN Online Library.
  • As always, check in regularly at the Office Developer Center for information and technical articles on Office solution development.

David Shank is a programmer/writer on the Office team specializing in developer documentation. Rumor has it he lives high in the mountains to the east of Redmond and is one of the few native Northwesterners who still lives in the Northwest.