Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 2 of 2)

Summary: The Welcome to the Open XML Format SDK 1.0 is a library for accessing Open XML Format files. This article, the second in a series of two, describes the Open XML object model code that you can use to access and manipulate Microsoft Office Excel 2007 and Microsoft Office PowerPoint 2007 files. (17 printed pages)

Frank Rice, Microsoft Corporation

September 2007 (Revised August 2008)

Applies to: Microsoft Office Excel 2007, Microsoft Office PowerPoint 2007

Contents

  • Overview

  • Retrieving Comments from Worksheets as XML

  • Retrieving Worksheet Names

  • Writing to Cells in Worksheets

  • Retrieving Slide Titles in Presentations

  • Deleting Slides in Presentations

  • Counting Slides in Presentations

  • Changing Titles of Slides

  • Changing Slide Order in Presentations

  • Replacing Images on Slides

  • Conclusion

  • Additional Resources

Overview

The 2007 Microsoft Office system introduces new file formats that are based on XML called Open XML Formats. Microsoft Office Word 2007, Microsoft Office Excel 2007, and Microsoft Office PowerPoint 2007 all use these formats as the default file format. Open XML formats are useful because they are an open standard and are based on well-known technologies: ZIP and XML. Microsoft provides a library for accessing these files as part of the .NET Framework 3.0 technologies in the DocumentFormat.OpenXml namespace in the Welcome to the Open XML Format SDK 1.0. The Open XML Format members are contained in theDocumentFormat.OpenXml API and provide strongly-typed part classes to manipulate Open XML documents. The SDK simplifies the task of manipulating Open XML packages. The Open XML Format API encapsulates many common tasks that developers perform on Open XML Format packages, so you can perform complex operations with just a few lines of code.

Retrieving Comments from Worksheets as XML

The following code retrieves the comments from a worksheet and adds them to an XML document.

Public Function XLGetCommentsAsXML(ByVal fileName As String) As XmlDocument
   Const workbookSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   Const relationSchema As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
   Dim resultDoc As XmlDocument = New XmlDocument
   resultDoc.LoadXml("<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?><AllComments />")
   Dim resultDocElement As XmlNode = resultDoc.DocumentElement
   Dim xlDoc As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
   '  Get the main document part (workbook.xml).
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(xlDoc.WorkbookPart.GetStream)
   '  Manage namespaces to perform XML XPath queries.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("w", workbookSchema)
   nsManager.AddNamespace("r", relationSchema)
   Dim search As String = String.Format("//w:sheets/w:sheet")
   Dim nodes As XmlNodeList = doc.SelectNodes(search, nsManager)
   For Each sheetNode As XmlNode In nodes
      Dim attr As XmlAttribute = sheetNode.Attributes("name")
      Dim sheetName As String = attr.Value
      attr = sheetNode.Attributes("r:id")
      Dim sheetID As String = attr.Value
      Dim sheetPart As WorksheetPart = CType(xlDoc.WorkbookPart.GetPartById(sheetID), WorksheetPart)
      If ((Not (sheetPart) Is Nothing) AndAlso (Not (sheetPart.WorksheetCommentsPart) Is Nothing)) Then
          Dim commentDoc As XmlDocument = New XmlDocument
          commentDoc.Load(sheetPart.WorksheetCommentsPart.GetStream)
          Dim newSheetNode As XmlNode = resultDoc.CreateNode(XmlNodeType.Element, "sheet", "")
          Dim nameAttr As XmlAttribute = newSheetNode.Attributes.Append(resultDoc.CreateAttribute("name"))
          nameAttr.Value = sheetName
          newSheetNode.InnerXml = commentDoc.DocumentElement.OuterXml
          resultDocElement.AppendChild(newSheetNode)
      End If
   Next
   Return resultDoc
End Function
public static XmlDocument XLGetCommentsAsXML(string fileName)
{
   const string workbookSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
   const string relationSchema = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

   XmlDocument resultDoc = new XmlDocument();
   resultDoc.LoadXml(@"<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?><AllComments />");
   XmlNode resultDocElement = resultDoc.DocumentElement;

   using (SpreadsheetDocument xlDoc = SpreadsheetDocument.Open(fileName, false))
   {
      //  Get the main document part (workbook.xml).
      XmlDocument doc = new XmlDocument();
      doc.Load(xlDoc.WorkbookPart.GetStream());

      //  Manage namespaces to perform XML XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("w", workbookSchema);
      nsManager.AddNamespace("r", relationSchema);

      string search = string.Format("//w:sheets/w:sheet");
      XmlNodeList nodes = doc.SelectNodes(search, nsManager);
      foreach (XmlNode sheetNode in nodes)
      {
         XmlAttribute attr = sheetNode.Attributes["name"];
         string sheetName = attr.Value;
         attr = sheetNode.Attributes["r:id"];
         string sheetID = attr.Value;
         WorksheetPart sheetPart = xlDoc.WorkbookPart.GetPartById(sheetID) as WorksheetPart;
         if (sheetPart != null && sheetPart.WorksheetCommentsPart != null)
         {
            XmlDocument commentDoc = new XmlDocument();
            commentDoc.Load(sheetPart.WorksheetCommentsPart.GetStream());
            XmlNode newSheetNode = resultDoc.CreateNode(XmlNodeType.Element, "sheet", "");
            XmlAttribute nameAttr = newSheetNode.Attributes.Append(resultDoc.CreateAttribute("name"));
            nameAttr.Value = sheetName;
            newSheetNode.InnerXml = commentDoc.DocumentElement.OuterXml;
            resultDocElement.AppendChild(newSheetNode);

         }
      }
   }
   return resultDoc;
}

In this procedure, you pass in one parameter: the full path to the workbook. Next, you create the shell of the XML document that contains the comments you find. The following code example shows the output.

<AllComments>
   <sheet name="Sheet1">
      <comments>
         <!-- The contents from the sheet's related comments part -->
      </comments>
   </sheet>
   ...
</AllComments>

Then you open the input file as an Open XML package by using the Open method of the SpreadsheetDocument object and load the data into an XML document. Next, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default workBookSchema namespace with the w qualifier and the relationSchema with the r qualifier. You then set up the search XPath expression //w:sheets/w:sheet to loop through each worksheet in the package.

Next, you retrieve the name and relationship id of each worksheet in the package. If there is a worksheetPart part with the id and a worksheetCommentsPart part, the worksheetCommentsPart is loaded into an XML document. Then, the code example inserts the comments from the worksheetPart part into the output document and returns the final XML document to the calling procedure.

Retrieving Worksheet Names

The following code retrieves a list of all worksheet names.

Public Function XLGetSheetInfo(ByVal fileName As String) As List(Of String)
   Dim sheets As List(Of String) = New List(Of String)
   Dim xlPackage As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, False)
   Dim workbook As WorkbookPart = xlPackage.WorkbookPart
   Dim workbookstr As Stream = workbook.GetStream
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(workbookstr)
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
   nsManager.AddNamespace("default", doc.DocumentElement.NamespaceURI)
   Dim nodelist As XmlNodeList = doc.SelectNodes("//default:sheets/default:sheet", nsManager)
   For Each node As XmlNode In nodelist
      Dim sheetName As String = String.Empty
      sheetName = node.Attributes("name").Value
      sheets.Add(sheetName)
   Next
   Return sheets
End Function
public static List<string> XLGetSheetInfo(string fileName)
{
   List<string> sheets = new List<string>();

   using (SpreadsheetDocument xlPackage = SpreadsheetDocument.Open(fileName, false))
   {
      WorkbookPart workbook = xlPackage.WorkbookPart;
      Stream workbookstr = workbook.GetStream();
      XmlDocument doc = new XmlDocument();
      doc.Load(workbookstr);

      XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
      nsManager.AddNamespace("default", doc.DocumentElement.NamespaceURI);
      XmlNodeList nodelist = doc.SelectNodes("//default:sheets/default:sheet", nsManager);

      foreach (XmlNode node in nodelist)
      {
         string sheetName = string.Empty;
         sheetName = node.Attributes["name"].Value;
         sheets.Add(sheetName);
      }
   }
   return sheets;
}

First, you open the input file as an Open XML package by using the Open method of the SpreadsheetDocument object and load the data into an XML document. Next, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default workBookSchema namespace with the w qualifier and the relationSchema with the r qualifier. You then set up the search XPath expression //default:sheets/default:sheet to loop through each node in the workbook part to retrieve the name of each worksheet in the package.

Finally, the code example returns the list of worksheet names to the calling procedure.

Writing to Cells in Worksheets

The following code writes a value into a cell in a worksheet.

Public Sub XLWrite2Cell(ByVal fileName As String, ByVal sheetName As String, ByVal addressName As String)
   Const worksheetNamespace As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   Dim xlPackage As SpreadsheetDocument = SpreadsheetDocument.Open(fileName, True)
   '  Get the main document part (workbook.xml).
   Dim workbookPart As WorkbookPart = xlPackage.WorkbookPart
   '  Load the contents of the workbook.
   Dim doc As XmlDocument = New XmlDocument
   doc.Load(workbookPart.GetStream)
   '  Create a NamespaceManager to handle the default namespace, 
   '  and create a prefix for the default namespace.
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
   nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI)
   nsManager.AddNamespace("w", worksheetNamespace)
   Dim searchString As String = String.Format("//d:sheet[@name='{0}']", sheetName)
   Dim sheetNode As XmlNode = doc.SelectSingleNode(searchString, nsManager)
   If (Not (sheetNode) Is Nothing) Then
      '  Get the relationship id attribute.
      Dim relationAttribute As XmlAttribute = sheetNode.Attributes("r:id")
      If (Not (relationAttribute) Is Nothing) Then
         Dim relId As String = relationAttribute.Value
         Dim worksheetPart As WorksheetPart = CType(workbookPart.GetPartById(relId), WorksheetPart)
         Dim sheetDoc As XmlDocument = New XmlDocument
         sheetDoc.Load(worksheetPart.GetStream)
         Dim cellNode As XmlNode = sheetDoc.SelectSingleNode(String.Format("//w:sheetData/w:row/w:c[@r='{0}']", addressName), nsManager)
         If (cellNode Is Nothing) Then
            cellNode = sheetDoc.CreateElement("c", worksheetNamespace)
            cellNode.Attributes.Append(sheetDoc.CreateAttribute("r"))
            cellNode.Attributes("r").Value = addressName
            Dim r As System.Text.RegularExpressions.Regex = New System.Text.RegularExpressions.Regex("(?<col>\D+)(?<row>\d+)")
            Dim rowNumber As String = r.Match(addressName).Result("${row}")
            cellNode.Attributes.Append(sheetDoc.CreateAttribute("s"))
            cellNode.Attributes("s").Value = "0"
            Dim rowNode As XmlNode = sheetDoc.SelectSingleNode(String.Format("//w:sheetData/w:row[@r='{0}']", rowNumber), nsManager)
            If (rowNode Is Nothing) Then
               Dim sheetDataNode As XmlNode = sheetDoc.SelectSingleNode("//w:sheetData", nsManager)
               If (Not (sheetDataNode) Is Nothing) Then
                  rowNode = sheetDoc.CreateElement("row", worksheetNamespace)
                  rowNode.Attributes.Append(sheetDoc.CreateAttribute("r"))
                  rowNode.Attributes("r").Value = rowNumber
                  rowNode.AppendChild(cellNode)
                  sheetDataNode.AppendChild(rowNode)
               End If
            Else
               Dim styleAttr As XmlAttribute = rowNode.Attributes("s")
               If (Not (styleAttr) Is Nothing) Then
                  cellNode.Attributes("s").Value = styleAttr.Value
               End If
               Dim biggerNode As XmlNode = Nothing
               For Each node As System.Xml.XmlNode In rowNode.ChildNodes
                  If (Double.Parse(node.Attributes("r").Value) > Double.Parse(addressName)) Then
                     biggerNode = node
                  End If
               Next
               If (biggerNode Is Nothing) Then
                  rowNode.AppendChild(cellNode)
               Else
                  rowNode.InsertBefore(cellNode, biggerNode)
               End If
          End If
      End If
      '  =============
      '  Insert code to work with cellNode here.
      '  =============
      '  For example:
      Dim valueNode As XmlNode = cellNode.SelectSingleNode(".//w:v", nsManager)
      If (valueNode Is Nothing) Then
         valueNode = sheetDoc.CreateElement("v", worksheetNamespace)
         cellNode.AppendChild(valueNode)
      End If
      valueNode.InnerText = "12345"
      '  Note that this will not display string values--those must
      '  be retrieved from the shared strings part.
      '  See the code example that demonstrates retrieving
      '  a value from an Excel cell for the complete code.
      MessageBox.Show(("Value = " + valueNode.InnerText))
      '  Save the worksheet part.
      Dim st As Stream = worksheetPart.GetStream(FileMode.Create)
      sheetDoc.Save(st)
      'xlPackage.Save();
      'doc.Save(workbookPart.GetStream(FileMode.Create));
      '  ============
      '  End example code.
      '  ============
      End If
   End If
End Sub
public static void XLWrite2Cell(string fileName, string sheetName, string addressName)
{
   const string worksheetNamespace = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

   using (SpreadsheetDocument xlPackage = SpreadsheetDocument.Open(fileName, true))
   {
      //  Get the main document part (workbook.xml).
      WorkbookPart workbookPart = xlPackage.WorkbookPart;
      //  Load the contents of the workbook.
      XmlDocument doc = new XmlDocument();
      doc.Load(workbookPart.GetStream());

      //  Create a NamespaceManager to handle the default namespace, 
      //  and create a prefix for the default namespace.
      XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
      nsManager.AddNamespace("d", doc.DocumentElement.NamespaceURI);
      nsManager.AddNamespace("w", worksheetNamespace);

      string searchString = string.Format("//d:sheet[@name='{0}']", sheetName);
      XmlNode sheetNode = doc.SelectSingleNode(searchString, nsManager);
      if (sheetNode != null)
      {
         //  Get the relationship id attribute.
         XmlAttribute relationAttribute = sheetNode.Attributes["r:id"];
         if (relationAttribute != null)
         {
            string relId = relationAttribute.Value;

            WorksheetPart worksheetPart = (WorksheetPart)workbookPart.GetPartById(relId);

            XmlDocument sheetDoc = new XmlDocument();
            sheetDoc.Load(worksheetPart.GetStream());

            XmlNode cellNode = sheetDoc.SelectSingleNode(string.Format("//w:sheetData/w:row/w:c[@r='{0}']", addressName), nsManager);
            if (cellNode == null)
            {
               cellNode = sheetDoc.CreateElement("c", worksheetNamespace);
               cellNode.Attributes.Append(sheetDoc.CreateAttribute("r"));
               cellNode.Attributes["r"].Value = addressName;
               System.Text.RegularExpressions.Regex r = new System.Text.RegularExpressions.Regex(@"(?<col>\D+)(?<row>\d+)");
               string rowNumber = r.Match(addressName).Result("${row}");

               cellNode.Attributes.Append(sheetDoc.CreateAttribute("s"));
               cellNode.Attributes["s"].Value = "0";

               XmlNode rowNode = sheetDoc.SelectSingleNode(string.Format("//w:sheetData/w:row[@r='{0}']", rowNumber), nsManager);
               if (rowNode == null)
               {
                  XmlNode sheetDataNode = sheetDoc.SelectSingleNode("//w:sheetData", nsManager);
                  if (sheetDataNode != null)
                  {
                     rowNode = sheetDoc.CreateElement("row", worksheetNamespace);
                     rowNode.Attributes.Append(sheetDoc.CreateAttribute("r"));
                     rowNode.Attributes["r"].Value = rowNumber;
                     rowNode.AppendChild(cellNode);
                     sheetDataNode.AppendChild(rowNode);
                  }
               }
               else
               {
                  XmlAttribute styleAttr = rowNode.Attributes["s"];
                  if (styleAttr != null)
                  {
                     cellNode.Attributes["s"].Value = styleAttr.Value;
                  }
                  XmlNode biggerNode = null;
                  foreach (System.Xml.XmlNode node in rowNode.ChildNodes)
                  {
                     if (double.Parse(node.Attributes["r"].Value) > double.Parse(addressName))
                     {
                        biggerNode = node;
                        break; /* TRANSWARNING: check that break is in correct scope */
                     }
                  }
                  if (biggerNode == null)
                  {
                     rowNode.AppendChild(cellNode);
                  }
                  else
                  {
                     rowNode.InsertBefore(cellNode, biggerNode);
                  }
               }
            }
            //  =============
            //  Insert code to work with cellNode here.
            //  =============
            //  For example:
            XmlNode valueNode = cellNode.SelectSingleNode(".//w:v", nsManager);
            if (valueNode == null)
            {
               valueNode = sheetDoc.CreateElement("v", worksheetNamespace);
               cellNode.AppendChild(valueNode);
            }
            valueNode.InnerText = "12345";
            //  Note that this won't display string values--those must
            //  be retrieved from the shared strings part.
            //  See the code example that demonstrates retrieving
            //  a value from an Excel cell for the complete code.
            MessageBox.Show("Value = " + valueNode.InnerText);

            //  Save the worksheet part.
            Stream st = worksheetPart.GetStream(FileMode.Create);
            sheetDoc.Save(st);
            //xlPackage.Save();
            //doc.Save(workbookPart.GetStream(FileMode.Create));
            //  ============
            //  End example code.
            //  ============
         }
      }
   }
}

In this procedure, you pass in three parameters: the full path to the workbook, the name of the worksheet, and the address of the cell to which you want to write. Then you open the input file as an Open XML package by using the Open method of the SpreadsheetDocument object and load the data into an XML document. Next, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default NamespaceURI namespace with the d qualifier and the worksheetNamespace with the w qualifier.

Next, you retrieve the node that represents the specified worksheet in the main workbook part by selecting the nameattribute of the //d:sheet node. If it finds the node, it retrieves the relationship id of the worksheet and uses it to load the worksheet into an XML document. The next few statements find the cell to write to. If it does not find the cell, then you look for the row. If it does not find the cell, you need to create it. Likewise, if it does not find the row, you need to create the node for it and then create the cell in the correct location.

After the row and cell exist, you need the code to work with the cell. The procedure contains sample code that writes a string value to a cell.

Retrieving Slide Titles in Presentations

The following code retrieves a list of the slide titles in a PowerPoint 2007 presentation.

Public Function PPTGetSlideTitles(ByVal fileName As String) As List(Of String)
   Const presentationmlNamespace As String = "http://schemas.openxmlformats.org/presentationml/2006/main"
   Dim titles As List(Of String) = New System.Collections.Generic.List(Of String)
   Dim pptDoc As PresentationDocument = PresentationDocument.Open(fileName, False)
   '  Manage namespaces to perform XML XPath queries.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("p", presentationmlNamespace)
   For Each slidePart As SlidePart In pptDoc.PresentationPart.SlideParts
      Dim slideDoc As XmlDocument = New XmlDocument(nt)
      slideDoc.Load(slidePart.GetStream)
      Dim titleNode As XmlNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager)
      If (Not (titleNode) Is Nothing) Then
         titles.Add(titleNode.ParentNode.ParentNode.ParentNode.InnerText)
      End If
   Next
   Return titles
End Function
public static List<string> PPTGetSlideTitles(string fileName)
{
   const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";

   List<string> titles = new System.Collections.Generic.List<string>();

   using (PresentationDocument pptDoc = PresentationDocument.Open(fileName, false))
   {
      //  Manage namespaces to perform Xml XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("p", presentationmlNamespace);

      foreach (SlidePart slidePart in pptDoc.PresentationPart.SlideParts)
      {
         XmlDocument slideDoc = new XmlDocument(nt);
         slideDoc.Load(slidePart.GetStream());
         XmlNode titleNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager);

         if (titleNode != null)
         {
             titles.Add(titleNode.ParentNode.ParentNode.ParentNode.InnerText);
         }
      }
   }
   return titles;
}

In this procedure, you pass in one parameter: the full path to the PowerPoint 2007 presentation document. Then, you open the document by using the Open method of the PresentationDocument object. Next, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default presentationML namespace, using the p qualifier. You then loop through the slide parts and load each into an XML document. Then, you retrieve the title of each slide by using the title or ctrTitle attribute of the //p:sp//p:ph node. Finally, the procedure returns the list of slide titles.

Deleting Slides in Presentations

The following code deletes a slide in a presentation deck.

Public Function PPTDeleteSlide(ByVal fileName As String, ByVal title As String) As Boolean
   Const presentationmlNamespace As String = "http://schemas.openxmlformats.org/presentationml/2006/main"
   Const relationshipNamespace As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
   Dim returnValue As Boolean = False
   Dim pptDoc As PresentationDocument = PresentationDocument.Open(fileName, True)
   '  Manage namespaces to perform Xml XPath queries.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("p", presentationmlNamespace)
   nsManager.AddNamespace("rel", relationshipNamespace)
   Dim xDoc As XmlDocument = New XmlDocument(nt)
   xDoc.Load(pptDoc.PresentationPart.GetStream)
   Dim partToDelete As SlidePart = Nothing
   Dim partId As String = Nothing
   For Each slidePart As SlidePart In pptDoc.PresentationPart.SlideParts
      Dim slideDoc As XmlDocument = New XmlDocument(nt)
      slideDoc.Load(slidePart.GetStream)
      Dim titleNode As XmlNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager)
      If (Not (titleNode) Is Nothing) Then
         Dim titleText As String = titleNode.ParentNode.ParentNode.ParentNode.InnerText
         If (String.Compare(titleText, title, True) = 0) Then
            partToDelete = slidePart
            partId = pptDoc.PresentationPart.GetIdOfPart(slidePart)
         End If
      End If
   Next
   If (Not (partToDelete) Is Nothing) Then
      pptDoc.PresentationPart.DeletePart(partToDelete)
      Dim searchString As String = String.Format("//p:sldId[@rel:id='{0}']", partId)
      Dim slideRelNode As XmlNode = xDoc.SelectSingleNode(searchString, nsManager)
      If (Not (slideRelNode) Is Nothing) Then
         slideRelNode.ParentNode.RemoveChild(slideRelNode)
      End If
      xDoc.Save(pptDoc.PresentationPart.GetStream(FileMode.Create))
      returnValue = True
   End If
   Return returnValue
End Function
public static bool PPTDeleteSlide(string fileName, string title)
{
   const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";
   const string relationshipNamespace = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

   bool returnValue = false;

   using (PresentationDocument pptDoc = PresentationDocument.Open(fileName, true))
   {
      //  Manage namespaces to perform Xml XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("p", presentationmlNamespace);
      nsManager.AddNamespace("rel", relationshipNamespace);
      XmlDocument xDoc = new XmlDocument(nt);
      xDoc.Load(pptDoc.PresentationPart.GetStream());

      SlidePart partToDelete = null;
      string partId = null;
      foreach (SlidePart slidePart in pptDoc.PresentationPart.SlideParts)
      {
         XmlDocument slideDoc = new XmlDocument(nt);
         slideDoc.Load(slidePart.GetStream());
         XmlNode titleNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager);
         if (titleNode != null)
         {
            string titleText = titleNode.ParentNode.ParentNode.ParentNode.InnerText;
            if (string.Compare(titleText, title, true) == 0)
            {
               partToDelete = slidePart;
               partId = pptDoc.PresentationPart.GetIdOfPart(slidePart);
               break;
             }
          }
        }
        if (partToDelete != null)
        {
           pptDoc.PresentationPart.DeletePart(partToDelete);
           string searchString = string.Format("//p:sldId[@rel:id='{0}']", partId);
           XmlNode slideRelNode = xDoc.SelectSingleNode(searchString, nsManager);
           if (slideRelNode != null)
           {
              slideRelNode.ParentNode.RemoveChild(slideRelNode);
           }
           xDoc.Save(pptDoc.PresentationPart.GetStream(FileMode.Create));
           returnValue = true;
      }
   }
   return returnValue;
}

In this procedure, you pass in two parameters: the full path to the PowerPoint 2007 document and the name of the slide you want to delete. You then open the document by using the Open method of the PresentationDocument object. Next, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default presentationML namespace, using the p qualifier and the relationshipNamespace namespace with the rel qualifier. You then loop through the slide parts and load each into an XML document. Then you retrieve the title of each slide by using the title or ctrTitle attribute of the //p:sp//p:ph node.

Next, you compare the title you retrieved from the slide part and the slide title argument that you passed into the procedure. If it finds the specified slide, you delete the slide part. You also delete the relationship node in the main presentation part. Finally, the procedure returns a Boolean value indicating whether you successfully deleted the slide.

Counting Slides in Presentations

The following code returns a count of the number of slides in a presentation deck.

Public Function PPTGetSlideCount(ByVal fileName As String) As Integer
   Dim returnValue As Integer = 0
   Dim pptDoc As PresentationDocument = PresentationDocument.Open(fileName, False)
   Using (pptDoc)
      returnValue = pptDoc.PresentationPart.GetPartsCountOfType(Of SlidePart)()
   End Using
   Return returnValue
End Function
public static int PPTGetSlideCount(string fileName)
{
   int returnValue = 0;
   using (PresentationDocument pptDoc = PresentationDocument.Open(fileName, false))
   {
      returnValue = pptDoc.PresentationPart.GetPartsCountOfType<SlidePart>();
   }
   return returnValue;
}

In this procedure, you pass in one parameter: the full path to the PowerPoint 2007 document. You then open the document by using the Open method of the PresentationDocument object. Next, you retrieve the value of the GetPartsCountOfType property from the presentation part. The procedure returns that value.

Changing Titles of Slides

The following code changes the title of a slide.

Public Function PPTModifySlideTitle(ByVal fileName As String, ByVal oldSlideTitle As String, ByVal newSlideTitle As String) As Boolean
   Const presentationmlNamespace As String = "http://schemas.openxmlformats.org/presentationml/2006/main"
   Const drawingmlNamespace As String = "http://schemas.openxmlformats.org/drawingml/2006/main"
   Dim returnValue As Boolean = False
   Dim pptPackage As PresentationDocument = PresentationDocument.Open(fileName, True)
   Dim presentationPart As PresentationPart = pptPackage.PresentationPart
   Dim slideParts As IEnumerable = presentationPart.SlideParts
   '  Manage namespaces to perform Xml XPath queries.
   Dim nt As NameTable = New NameTable
   Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
   nsManager.AddNamespace("p", presentationmlNamespace)
   nsManager.AddNamespace("a", drawingmlNamespace)
   For Each slidePart As SlidePart In slideParts
      Dim doc As XmlDocument = New XmlDocument(nt)
      doc.Load(slidePart.GetStream)
      Dim xNode As XmlNode = doc.SelectSingleNode("//a:t", nsManager)
      If (Not (xNode) Is Nothing) Then
         If (String.Compare(xNode.InnerText, oldSlideTitle, True) = 0) Then
            xNode.InnerText = newSlideTitle
            doc.Save(slidePart.GetStream)
            'pptPackage.Save();
            returnValue = True
         End If
      End If
   Next
   Return returnValue
End Function
public static bool PPTModifySlideTitle(string fileName, string oldSlideTitle, string newSlideTitle)
{
   const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";
   const string drawingmlNamespace = "http://schemas.openxmlformats.org/drawingml/2006/main";

   bool returnValue = false;

   using (PresentationDocument pptPackage = PresentationDocument.Open(fileName, true))
   {
      PresentationPart presentationPart = pptPackage.PresentationPart;

      IEnumerable<SlidePart> slideParts = presentationPart.SlideParts;

      //  Manage namespaces to perform Xml XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("p", presentationmlNamespace);
      nsManager.AddNamespace("a", drawingmlNamespace);

      foreach (SlidePart slidePart in slideParts)
      {
         XmlDocument doc = new XmlDocument(nt);
         doc.Load(slidePart.GetStream());

         XmlNode xNode = doc.SelectSingleNode("//a:t", nsManager);
         if (xNode != null)
         {
            if (string.Compare(xNode.InnerText, oldSlideTitle, true) == 0)
            {
                xNode.InnerText = newSlideTitle;

                doc.Save(slidePart.GetStream());
                //pptPackage.Save();
                returnValue = true;
                break;
            }
         }
      }
   }
   return returnValue;
}

In this procedure, you pass in three parameters: the full path to the PowerPoint 2007 document, the title of the target slide, and new title of the slide. You then open the document by using the Open method of the PresentationDocument object. Next, you iterate over each slide to find the one with the title you passed into the procedure. To do this, you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default presentationml namespace, using the p qualifier and the drawingmlNamespace namespace with the a qualifier. You then loop through the slide parts and load each into an XML document. Next, you select each slide part from the //a:t node.

NoteNote

This code assumes that the first text found is the title. In addition, if the title contains more than one font or is anything other than plain text, PowerPoint 2007 breaks the title into multiple elements. This code does not find a match in those cases.

Then you compare the title you retrieved from the slide part with the slide title that you passed into the procedure. If you find the specified slide, you write the new title back to the slide part. Finally, the procedure returns a Boolean value indicating whether you successfully deleted the slide.

Changing Slide Order in Presentations

The following code changes the order in which slides appear in a presentation deck of slides.

Public Function PPTReorderSlides(ByVal fileName As String, ByVal originalPosition As Integer, ByVal newPosition As Integer) As Integer
   Const presentationmlNamespace As String = "http://schemas.openxmlformats.org/presentationml/2006/main"
   Dim returnValue As Integer = -1
   Dim pptPackage As PresentationDocument = PresentationDocument.Open(fileName, True)
   Dim presentationPart As PresentationPart = pptPackage.PresentationPart
   If (Not (presentationPart) Is Nothing) Then
      Dim doc As XmlDocument = New XmlDocument
      doc.Load(presentationPart.GetStream)
      '  Manage namespaces to perform XPath queries.
      Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(doc.NameTable)
      nsManager.AddNamespace("p", presentationmlNamespace)
      Dim nodes As XmlNodeList = doc.SelectNodes("//p:sldId", nsManager)
      If (Not (nodes) Is Nothing) Then
         Dim slideCount As Integer = nodes.Count
         Dim maxPosition As Integer = slideCount
         If (slideCount > 0) Then
            If (originalPosition = -1) Then
               originalPosition = maxPosition
         End If
         If (newPosition = -1) Then
            newPosition = maxPosition
         End If
         If (originalPosition > maxPosition) Then
            originalPosition = maxPosition
         End If
         If (newPosition > maxPosition) Then
            newPosition = maxPosition
         End If
         If ((originalPosition < 0) OrElse (newPosition < 0)) Then
            '  Something is wrong. Raise an exception.
            Throw New ArgumentException("Invalid parameter. Both original and new positions must be integers between -1 and the total number of slides.")
         End If
         If (originalPosition <> newPosition) Then
            Dim parentNode As XmlNode = doc.SelectSingleNode("//p:sldIdLst", nsManager)
              If (Not (parentNode) Is Nothing) Then
                 '  Get references to both slide nodes.
                 Dim originalNode As XmlNode = Nothing
                 Dim newNode As XmlNode = Nothing
                 Dim nodeSearch As String = Nothing
                 Dim attr As XmlAttribute = nodes((originalPosition - 1)).Attributes("id")
                 If (Not (attr) Is Nothing) Then
                    nodeSearch = String.Format("./p:sldId[@id='{0}']", attr.Value)
                    originalNode = parentNode.SelectSingleNode(nodeSearch, nsManager)
                 End If
                 attr = nodes((newPosition - 1)).Attributes("id")
                 If (Not (attr) Is Nothing) Then
                     nodeSearch = String.Format("./p:sldId[@id='{0}']", attr.Value)
                     newNode = parentNode.SelectSingleNode(nodeSearch, nsManager)
                  End If
                  If ((Not (originalNode) Is Nothing) AndAlso (Not (newNode) Is Nothing)) Then
                     If (newPosition > originalPosition) Then
                        parentNode.InsertAfter(originalNode, newNode)
                     Else
                        parentNode.InsertBefore(originalNode, newNode)
                     End If
                     returnValue = newPosition
                     '  Save the presentation XML back to its part.
                     doc.Save(presentationPart.GetStream)
                  End If
               End If
            End If
         End If
       End If
   End If
   Return returnValue
End Function
public static int PPTReorderSlides(string fileName, int originalPosition, int newPosition)
{
   const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";

   int returnValue = -1;

   using (PresentationDocument pptPackage = PresentationDocument.Open(fileName, true))
   {
      PresentationPart presentationPart = pptPackage.PresentationPart;

      if (presentationPart != null)
      {
         XmlDocument doc = new XmlDocument();
         doc.Load(presentationPart.GetStream());

         //  Manage namespaces to perform XPath queries.
         XmlNamespaceManager nsManager = new XmlNamespaceManager(doc.NameTable);
         nsManager.AddNamespace("p", presentationmlNamespace);

         XmlNodeList nodes = doc.SelectNodes("//p:sldId", nsManager);
         if (nodes != null)
         {
            int slideCount = nodes.Count;
            int maxPosition = slideCount;
            if (slideCount > 0)
            {
               if (originalPosition == -1)
               {
                  originalPosition = maxPosition;
               }
               if (newPosition == -1)
               {
                  newPosition = maxPosition;
               }
               if (originalPosition > maxPosition)
               {
                  originalPosition = maxPosition;
               }
               if (newPosition > maxPosition)
               {
                  newPosition = maxPosition;
               }
               if (originalPosition < 0 || newPosition < 0)
               {
                  //  Something is wrong. Raise an exception.
                  throw new ArgumentException("Invalid parameter. Both original and new positions must be integers between -1 and the total number of slides.");
                }

                if (originalPosition != newPosition)
                {
                   XmlNode parentNode = doc.SelectSingleNode("//p:sldIdLst", nsManager);
                   if (parentNode != null)
                   {
                      XmlNode originalNode = null;
                      XmlNode newNode = null;
                      string nodeSearch = null;

                      XmlAttribute attr = nodes[originalPosition - 1].Attributes["id"];
                      if (attr != null)
                      {
                         nodeSearch = string.Format("./p:sldId[@id='{0}']", attr.Value);
                         originalNode = parentNode.SelectSingleNode(nodeSearch, nsManager);
                      }
                      attr = nodes[newPosition - 1].Attributes["id"];
                      if (attr != null)
                      {
                          nodeSearch = string.Format("./p:sldId[@id='{0}']", attr.Value);
                          newNode = parentNode.SelectSingleNode(nodeSearch, nsManager);
                      }
                      if (originalNode != null && newNode != null)
                      {
                         if (newPosition > originalPosition)
                         {
                            parentNode.InsertAfter(originalNode, newNode);
                         }
                         else
                         {
                            parentNode.InsertBefore(originalNode, newNode);
                         }
                         returnValue = newPosition;

                        doc.Save(presentationPart.GetStream());
                      }
                   }
                }
            }
         }
      }
   }
   return returnValue;
}

In this procedure, you pass in three parameters: the full path to the PowerPoint 2007 document, an ordinal value indicating the current position of the target slide, and the new position of the slide. You then open the document by using the Open method of the PresentationDocument object. Next, you retrieve the contents of the main presentation part. Then you set up a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default presentationml namespace, using the p qualifier. You then get the main presentation part and select the //p:sldId nodes that contain the references to the presentation slides. Then you get the target position of the slide.

NoteNote

If the specified original position of the slide is outside the range of the number of slides in the deck, use the last slide. If the new position is outside the range of slides in the deck, put the selected slide at the end of the deck.

Next, you determine what the caller wanted to do by first retrieving the number of slides in the deck. If the value of the original position or the target position is larger than the number of slides in the deck, set one or both equal to the last slide position. Otherwise, find the nodes for the slides and switch them. Finally, save the presentation XML back to its part.

Replacing Images on Slides

The following code replaces the first image on a slide, if it exists, with an image from an external file.

Public Sub PPTReplaceImageOnSlide(ByVal fileName As String, ByVal slideTitle As String, ByVal imagePath As String)
   Const presentationmlNamespace As String = "http://schemas.openxmlformats.org/presentationml/2006/main"
   Const drawingmlNamespace As String = "http://schemas.openxmlformats.org/drawingml/2006/main"
   Const relationshipNamespace As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
   Dim pptPackage As PresentationDocument = PresentationDocument.Open(fileName, True)
   Using (pptPackage)
      Dim presentationPart As PresentationPart = pptPackage.PresentationPart
      '  Manage namespaces to perform Xml XPath queries.
      Dim nt As NameTable = New NameTable
      Dim nsManager As XmlNamespaceManager = New XmlNamespaceManager(nt)
      nsManager.AddNamespace("p", presentationmlNamespace)
      nsManager.AddNamespace("a", drawingmlNamespace)
      nsManager.AddNamespace("r", relationshipNamespace)
      Dim slideParts As List(Of SlidePart) = New List(Of SlidePart)()
      presentationPart.GetPartsOfType(Of SlidePart)(slideParts)
      For Each slidePart As SlidePart In slideParts
         Dim slideDoc As XmlDocument = New XmlDocument(nt)
         slideDoc.Load(slidePart.GetStream)
         Dim titleNode As XmlNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager)
         If (Not (titleNode) Is Nothing) Then
            Dim titleText As String = titleNode.ParentNode.ParentNode.ParentNode.InnerText
             If (String.Compare(titleText, slideTitle, True) = 0) Then
               Dim imageParts As List(Of ImagePart) = New List(Of ImagePart)()
               slidePart.GetPartsOfType(Of ImagePart)(imageParts)
               For Each imagePart As ImagePart In imageParts
                  If (Not (imagePart) Is Nothing) Then
                     Dim oldRelID As String = slidePart.GetIdOfPart(imagePart)
                    Dim imageFile As String = Path.GetFileName(imagePath)
                    Dim newImageUri As Uri = New Uri(("/ppt/media/" + imageFile), UriKind.Relative)
                    Dim newImagePart As ImagePart = slidePart.AddImagePart(ImagePartType.Jpeg)
                    Dim outputStream As Stream = newImagePart.GetStream
                    Dim inputStream As FileStream = New FileStream(imagePath, FileMode.Open, FileAccess.Read)
                    Dim len As Integer = Convert.ToInt32(inputStream.Length)
                    Dim bytes() As Byte = New Byte((len) - 1) {}
                    Dim bytesRead As Integer = inputStream.Read(bytes, 0, len)
                    If (bytesRead = len) Then
                       outputStream.Write(bytes, 0, len)
                    End If
                    Dim searchString As String = String.Format("//p:pic//a:blip[@r:embed='{0}']", oldRelID)
                    Dim relNode As XmlNode = slideDoc.SelectSingleNode(searchString, nsManager)
                    If (Not (relNode) Is Nothing) Then
                       relNode.Attributes("r:embed").Value = slidePart.GetIdOfPart(newImagePart)
                    End If
                    slidePart.DeletePart(imagePart)

                   End If
               Next
               '  Reset the stream, and save the slide XML back to its part.
               slideDoc.Save(slidePart.GetStream)

            End If
         End If
      Next
   End Using
End Sub
public static void PPTReplaceImageOnSlide(string fileName, string slideTitle, string imagePath)
{
   const string presentationmlNamespace = "http://schemas.openxmlformats.org/presentationml/2006/main";
   const string drawingmlNamespace = "http://schemas.openxmlformats.org/drawingml/2006/main";
   const string relationshipNamespace = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

   using (PresentationDocument pptPackage = PresentationDocument.Open(fileName, true))
   {
      PresentationPart presentationPart = pptPackage.PresentationPart;

      //  Manage namespaces to perform Xml XPath queries.
      NameTable nt = new NameTable();
      XmlNamespaceManager nsManager = new XmlNamespaceManager(nt);
      nsManager.AddNamespace("p", presentationmlNamespace);
      nsManager.AddNamespace("a", drawingmlNamespace);
      nsManager.AddNamespace("r", relationshipNamespace);

      List<SlidePart> slideParts = new List<SlidePart>();
      presentationPart.GetPartsOfType<SlidePart>(slideParts);

      foreach (SlidePart slidePart in slideParts)
      {
         XmlDocument slideDoc = new XmlDocument(nt);
         slideDoc.Load(slidePart.GetStream());

         XmlNode titleNode = slideDoc.SelectSingleNode("//p:sp//p:ph[@type='title' or @type='ctrTitle']", nsManager);
         if (titleNode != null)
         {
            string titleText = titleNode.ParentNode.ParentNode.ParentNode.InnerText;

            if (string.Compare(titleText, slideTitle, true) == 0)
            {
               List<ImagePart> imageParts = new List<ImagePart>();

               slidePart.GetPartsOfType<ImagePart>(imageParts);
               foreach (ImagePart imagePart in imageParts)
               {
                  if (imagePart != null)
                  {
                     string oldRelID = slidePart.GetIdOfPart(imagePart);
                     string imageFile = Path.GetFileName(imagePath);
                     Uri newImageUri = new Uri("/ppt/media/" + imageFile, UriKind.Relative);
                     ImagePart newImagePart = slidePart.AddImagePart(ImagePartType.Jpeg);

                     using (Stream outputStream = newImagePart.GetStream())
                     {
                        using (FileStream inputStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read))
                        {
                           int len = Convert.ToInt32(inputStream.Length);
                           byte[] bytes = new byte[len];
                           int bytesRead = inputStream.Read(bytes, 0, len);
                           if (bytesRead == len)
                           {
                              outputStream.Write(bytes, 0, len);
                           }
                        }
                        }
                        string searchString = string.Format("//p:pic//a:blip[@r:embed='{0}']", oldRelID);
                        XmlNode relNode = slideDoc.SelectSingleNode(searchString, nsManager);
                        if (relNode != null)
                        {
                           relNode.Attributes["r:embed"].Value = slidePart.GetIdOfPart(newImagePart);
                        }
                        slidePart.DeletePart(imagePart);
                        break;
                  }
               }

              slideDoc.Save(slidePart.GetStream());
               break;
            }
         }
      }
   }
}

In this procedure, you pass in three parameters: the full path to the PowerPoint 2007 document, the title of the target slide, and the full path of the file containing the new image. You then open the document by using the Open method of the PresentationDocument object. Next, you retrieve the contents of the main presentation part. Then you setup a namespace manager by using the XmlNamespaceManager object and by setting a reference to the default presentationml namespace using the p qualifier, the drawingmlNamespace namespace with the a qualifier, and the relationshipNamespace namespace with the r qualifier. Then you retrieve the title of each slide by using the title or ctrTitle attribute of the //p:sp//p:ph node.

Next, you compare the title you retrieved from the slide part with the title of the slide passed into the procedure. If you find the specified slide, you need to modify the slide part. First, you look through the slide's relationships to see if there is an image. If so, you create a part for the new image and copy the image into the new part. Then update the relationship in the slide part and delete the old relationship.

NoteNote

PowerPoint cleans up the orphaned image part the next time you save the package file.

Finally, save the slide XML back to its part.

Conclusion

As this article demonstrates, working with Excel 2007 and PowerPoint 2007 files is much easier with the Welcome to the Open XML Format SDK 1.0. You can use the procedures in this article to experiment and adapt them for your own use. For more ideas about how to work programmatically with Excel and PowerPoint files, see Manipulating Excel 2007 and PowerPoint 2007 Files with the Open XML Format API (Part 1 of 2).

Additional Resources

For more information, see the following resources: