PPTX Open XML Cheat Sheet for Editing Agents

This document explains how a .pptx file works internally so you can parse and edit PowerPoint presentations via XML: change text, swap images, tweak layouts, and safely add or remove slides.


1. What a PPTX File Actually Is

A .pptx file is a ZIP archive that follows the Open Packaging Conventions (OPC) and the Office Open XML (OOXML) PresentationML standard.

Inside the ZIP you’ll find:

  • XML “parts” (content)
  • Binary assets (images, media, embedded files)
  • Relationship files (*.rels)
  • A content-type manifest ([Content_Types].xml)

You must:

  1. Treat the PPTX as a ZIP.
  2. Use the relationships to navigate, not just filenames.
  3. Preserve content types and relationship integrity when editing.
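As a minimal sketch of "treat the PPTX as a ZIP", the following lists the parts in a package using Python's standard `zipfile` module. The function name `list_parts` is illustrative and not part of any library.

```python
import io
import zipfile

def list_parts(pptx_bytes):
    """Return the part names stored in a PPTX package, in archive order."""
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as pkg:
        # Skip directory entries; only files are parts.
        return [name for name in pkg.namelist() if not name.endswith("/")]
```

Reading the file into memory first (`open(path, "rb").read()`) keeps the function independent of where the package comes from.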

2. High-Level Directory Layout

Typical PPTX structure (paths are inside the ZIP):

  • /_rels/.rels
    • Root relationships (e.g., to ppt/presentation.xml).
  • /[Content_Types].xml
    • Declares MIME types for each part type.
  • /ppt/presentation.xml
    • Main presentation part (list of slide references, masters, etc.).
  • /ppt/_rels/presentation.xml.rels
    • Relationships from the presentation to slide parts, slide masters, theme, etc.
  • /ppt/slides/slide1.xml, slide2.xml, ...
    • Individual slide parts.
  • /ppt/slides/_rels/slide1.xml.rels, ...
    • Relationships from slides to layouts, images, charts, etc.
  • /ppt/slideMasters/slideMaster1.xml, ...
    • Slide master parts.
  • /ppt/slideLayouts/slideLayout1.xml, ...
    • Slide layout parts.
  • /ppt/notesSlides/notesSlide1.xml, ...
    • Notes for slides.
  • /ppt/theme/theme1.xml, ...
    • Theme (colors, fonts).
  • /ppt/media/image1.png, image2.jpeg, ...
    • Images and other media.

Other optional parts: charts, embedded objects, custom XML, etc.


3. Core OPC Concepts: Parts & Relationships

3.1 Parts

A part is a file inside the package, such as:

  • ppt/slides/slide3.xml (XML)
  • ppt/media/image5.png (PNG)

Each part has:

  • A path inside the ZIP
  • A content type (from [Content_Types].xml)
  • Zero or more relationships to other parts or external URIs

3.2 Relationships (*.rels)

Relationships are stored in .rels XML files in a _rels folder next to their “source” part (for example, ppt/slides/_rels/slide1.xml.rels belongs to ppt/slides/slide1.xml).

Each <Relationship> has:

  • Id – local identifier like rId1
  • Type – URI describing the relationship type
  • Target – path to the target part, relative to the source part’s folder (or a URI when TargetMode is External)
  • TargetMode – Internal (default) or External

Important:

  • rId values are only unique within their source part.
  • You must resolve targets via the appropriate .rels file, not by guessing filenames.
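A rough sketch of relationship resolution, assuming the .rels XML has already been read from the ZIP as a string. The helper names `rels_path_for` and `read_relationships` are hypothetical.

```python
import posixpath
import xml.etree.ElementTree as ET

REL_NS = "{http://schemas.openxmlformats.org/package/2006/relationships}"

def rels_path_for(part):
    """ppt/slides/slide1.xml -> ppt/slides/_rels/slide1.xml.rels"""
    folder, name = posixpath.split(part)
    return posixpath.join(folder, "_rels", name + ".rels")

def read_relationships(rels_xml, source_part):
    """Map each rId to its type, mode, and resolved target."""
    base = posixpath.dirname(source_part)
    rels = {}
    for rel in ET.fromstring(rels_xml).iter(REL_NS + "Relationship"):
        mode = rel.get("TargetMode", "Internal")
        target = rel.get("Target")
        if mode == "Internal":
            # Internal targets are relative to the source part's folder;
            # normalize "../" segments and drop any leading "/".
            target = posixpath.normpath(posixpath.join(base, target)).lstrip("/")
        rels[rel.get("Id")] = {"type": rel.get("Type"), "target": target, "mode": mode}
    return rels
```

External targets (hyperlinks, linked media) are returned untouched because they are URIs, not package paths.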

4. Presentation Structure: How Slides Are Wired

The entry point is ppt/presentation.xml.

4.1 Main Presentation XML

ppt/presentation.xml root is typically <p:presentation>.

Key elements:

  • <p:sldIdLst> – ordered list of slide instances
    • Child <p:sldId> elements:
      • id: unique numeric ID within this list
      • r:id: references a relationship in presentation.xml.rels
  • <p:sldMasterIdLst> – slide masters
  • <p:notesMasterIdLst> – notes master
  • <p:handoutMasterIdLst> – handout master
  • <p:sldSz> – slide size
  • <p:defaultTextStyle> – default text styles

Slide order is determined by the order of <p:sldId> elements in <p:sldIdLst>, not by the numeric suffix in slideN.xml.

4.2 How to Find All Slides

Algorithm:

  1. Open ppt/presentation.xml.
  2. Read all <p:sldId> elements in <p:sldIdLst> in order.
  3. For each <p:sldId>, get its r:id.
  4. In ppt/_rels/presentation.xml.rels, find the <Relationship> with Id="<that r:id>".
  5. Its Target is a slide part path relative to ppt/, e.g. slides/slide3.xml, which resolves to ppt/slides/slide3.xml.

Use this relationship-based mapping instead of assuming slideN.xml is slide number N.
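The algorithm above could be sketched as follows, assuming both XML files have been read from the ZIP as strings. `slide_parts_in_order` is an illustrative name.

```python
import posixpath
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "rel": "http://schemas.openxmlformats.org/package/2006/relationships",
}
# r:id is a namespaced attribute, so ElementTree needs its expanded name.
R_ID = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id"

def slide_parts_in_order(presentation_xml, presentation_rels_xml):
    """Return slide part paths in presentation order."""
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(presentation_rels_xml).findall("rel:Relationship", NS)
    }
    parts = []
    for sld_id in ET.fromstring(presentation_xml).findall("p:sldIdLst/p:sldId", NS):
        target = targets[sld_id.get(R_ID)]
        # Targets are relative to the folder holding presentation.xml (ppt/).
        parts.append(posixpath.normpath(posixpath.join("ppt", target)).lstrip("/"))
    return parts
```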


5. Slide XML (PresentationML Basics)

A slide part (e.g. ppt/slides/slide1.xml) typically looks like:

<p:sld xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
       xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
       xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <p:cSld>
    <p:spTree>
      <!-- shapes, pictures, groups, etc. -->
    </p:spTree>
  </p:cSld>
  <p:clrMapOvr>...</p:clrMapOvr>
</p:sld>

Inside <p:spTree> you’ll see:

  • <p:sp> – shapes (rectangles, text boxes, titles, etc.)
  • <p:pic> – images
  • <p:grpSp> – group shapes
  • <p:cxnSp> – connectors
  • <p:graphicFrame> – charts, tables, SmartArt, etc.

A typical <p:sp> (shape) has:

  • <p:nvSpPr> – non-visual properties (ID, name, placeholder info)
  • <p:spPr> – shape properties (geometry, line, fill, transform)
  • <p:txBody> – text content (if the shape holds text)

6. Text Model: Finding and Editing Text

Text is stored using DrawingML (a: namespace) inside <p:txBody>.

6.1 Text Body Structure

Typical text-bearing shape:

<p:sp>
  <p:nvSpPr>...</p:nvSpPr>
  <p:spPr>...</p:spPr>
  <p:txBody>
    <a:bodyPr/>                <!-- text box properties -->
    <a:lstStyle/>              <!-- optional paragraph style list -->
    <a:p>                      <!-- paragraph -->
      <a:pPr>...</a:pPr>       <!-- paragraph properties -->
      <a:r>                    <!-- run -->
        <a:rPr .../>           <!-- run properties: font, size, color, etc. -->
        <a:t>Some text</a:t>   <!-- actual text -->
      </a:r>
      <a:br/>                  <!-- line break -->
      <a:r>
        <a:rPr .../>
        <a:t>More text</a:t>
      </a:r>
    </a:p>
    <a:p>...</a:p>             <!-- another paragraph -->
  </p:txBody>
</p:sp>

Elements:

  • <a:p> – paragraph
  • <a:pPr> – paragraph properties (bullets, level, alignment)
  • <a:r> – run (a sequence of uniformly formatted text)
  • <a:rPr> – run properties (size, font, color, bold, etc.)
  • <a:t> – literal text
  • <a:br/> – manual line break
  • <a:fld> – field (date, slide number, etc.), also containing <a:t>

6.2 Practical Rules for Text Editing

  • To collect text from a slide:
    • Traverse p:sld → p:cSld → p:spTree → p:sp with a p:txBody.
    • For each a:p:
      • Concatenate all a:t text nodes in order.
      • Insert line breaks when encountering a:br/.
  • For search/replace:
    • Operate at the a:t level where possible.
    • Preserve the structure of a:r and a:rPr to keep formatting.
    • Handle text that may be split across multiple runs, even mid-word.
  • Placeholders (title, body, footer, etc.) are identified by <p:ph> in the shape’s non-visual properties, not by the text itself.
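A minimal sketch of the text-collection rule, assuming the slide XML is available as a string. The function names are illustrative. It joins runs before any matching is done, which is what makes text split across runs searchable.

```python
import xml.etree.ElementTree as ET

A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"

def paragraph_text(a_p):
    """Join a:t text from runs and fields in order; a:br becomes a newline."""
    pieces = []
    for child in a_p:
        if child.tag in (A + "r", A + "fld"):
            t = child.find(A + "t")
            if t is not None and t.text:
                pieces.append(t.text)
        elif child.tag == A + "br":
            pieces.append("\n")
    return "".join(pieces)

def slide_text(slide_xml):
    """Return one list of paragraph strings per text-bearing p:sp."""
    root = ET.fromstring(slide_xml)
    shapes = []
    # iter() also reaches shapes nested inside p:grpSp groups.
    for sp in root.iter(P + "sp"):
        tx_body = sp.find(P + "txBody")
        if tx_body is not None:
            shapes.append([paragraph_text(p) for p in tx_body.findall(A + "p")])
    return shapes
```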

7. Layouts, Masters, and Placeholders

7.1 Slide Masters and Layouts

Hierarchy:

  • Presentation → Slide Masters (/ppt/slideMasters/slideMasterN.xml)
  • Each Slide Master → Slide Layouts (/ppt/slideLayouts/slideLayoutN.xml) via relationships and <p:sldLayoutIdLst>
  • Each Slide → a Slide Layout via its slide .rels file

Slide master/layout parts look similar to slides:

  • <p:sldMaster> / <p:sldLayout> root
  • <p:cSld>/<p:spTree> – shapes and placeholders
  • <p:txStyles> – default text styles
  • References to theme (/ppt/theme/themeN.xml)

Slides inherit positioning and styles from their layout and master.

7.2 Placeholders (<p:ph>)

Placeholders designate semantic roles:

<p:sp>
  <p:nvSpPr>
    <p:cNvPr id="2" name="Title 1"/>
    <p:cNvSpPr/>
    <p:nvPr>
      <p:ph type="title" idx="0"/>
    </p:nvPr>
  </p:nvSpPr>
  ...
</p:sp>

Common type values:

  • title, ctrTitle – title, centered title
  • subTitle – subtitle
  • body – main content
  • pic – picture placeholder
  • dt – date
  • sldNum – slide number
  • ftr – footer

Slides override placeholder text by providing shapes with matching placeholder metadata (type, idx).
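One way to list the placeholders on a slide, as a sketch. `placeholder_shapes` is a hypothetical helper, and it only looks at `<p:sp>` shapes; picture placeholders stored as `<p:pic>` would need the analogous `p:nvPicPr` path.

```python
import xml.etree.ElementTree as ET

P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"

def placeholder_shapes(slide_xml):
    """Return (type, idx, name) for every placeholder p:sp on a slide."""
    root = ET.fromstring(slide_xml)
    found = []
    for sp in root.iter(P + "sp"):
        ph = sp.find(f"{P}nvSpPr/{P}nvPr/{P}ph")
        if ph is None:
            continue  # ordinary shape, not a placeholder
        name = sp.find(f"{P}nvSpPr/{P}cNvPr").get("name")
        # Omitted attributes fall back to the schema defaults: type "obj", idx 0.
        found.append((ph.get("type", "obj"), int(ph.get("idx", "0")), name))
    return found
```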

7.3 Changing Layout of a Slide

Conceptually:

  1. In the slide’s .rels file, change the relationship of type slideLayout to target a different slideLayoutX.xml.
  2. Ensure the new layout’s placeholders are compatible (same logical roles), or be prepared to reposition or re-create shapes.

8. Images, Graphics, and Other Media

8.1 Images (<p:pic> and <a:blip>)

Typical picture shape:

<p:pic>
  <p:nvPicPr>...</p:nvPicPr>
  <p:blipFill>
    <a:blip r:embed="rId5"/>
    ...
  </p:blipFill>
  <p:spPr>...</p:spPr> <!-- transform, size, etc. -->
</p:pic>

  • r:embed="rId5" refers to a relationship in ppt/slides/_rels/slideX.xml.rels.
  • That relationship’s Target is something like ../media/image3.png.
  • The actual image is at ppt/media/image3.png.

To swap an image while preserving position and size:

  1. Find the <a:blip> with r:embed="rIdX".
  2. Resolve rIdX in the slide’s .rels file.
  3. Overwrite the binary file at the Target path with the new image bytes (preferably same format).
  4. Do not change XML unless pointing to a new media part.
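Steps 1 and 2 can be sketched as a lookup from each `r:embed` id to the media part it resolves to. `image_parts` is an illustrative name; linked (non-embedded) images that use `r:link` are ignored here.

```python
import posixpath
import xml.etree.ElementTree as ET

A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
REL = "{http://schemas.openxmlformats.org/package/2006/relationships}"
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"

def image_parts(slide_part, slide_xml, slide_rels_xml):
    """Map each r:embed id used by an a:blip to the media part it points at."""
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(slide_rels_xml).iter(REL + "Relationship")
    }
    base = posixpath.dirname(slide_part)
    result = {}
    for blip in ET.fromstring(slide_xml).iter(A + "blip"):
        rid = blip.get(R_EMBED)
        if rid in targets:
            # "../media/image3.png" relative to ppt/slides -> ppt/media/image3.png
            result[rid] = posixpath.normpath(posixpath.join(base, targets[rid]))
    return result
```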

8.2 Shapes and Coordinates

Shapes use EMUs (English Metric Units):

  • 1 inch = 914400 EMUs
  • 1 cm = 360000 EMUs

Positions and sizes are defined in <a:xfrm>:

<a:xfrm>
  <a:off x="914400" y="914400"/>        <!-- position -->
  <a:ext cx="4572000" cy="3200400"/>    <!-- width/height -->
</a:xfrm>

Adjust these attributes to move/resize shapes.
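A small sketch of the unit conversion and of nudging a shape, assuming you already hold the `<a:xfrm>` element. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
EMU_PER_INCH = 914400
EMU_PER_CM = 360000

def inches_to_emu(inches):
    return round(inches * EMU_PER_INCH)

def cm_to_emu(cm):
    return round(cm * EMU_PER_CM)

def move_shape(xfrm, dx_emu, dy_emu):
    """Shift a shape by editing a:off in place; attribute values stay integers."""
    off = xfrm.find(A + "off")
    off.set("x", str(int(off.get("x")) + dx_emu))
    off.set("y", str(int(off.get("y")) + dy_emu))
```

For example, `move_shape(xfrm, inches_to_emu(0.5), 0)` moves a shape half an inch to the right.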

8.3 Charts, Tables, SmartArt

These reside in <p:graphicFrame> with <a:graphic> inside. Charts often reference:

  • /ppt/charts/chartN.xml
  • Possibly embedded Excel parts under /xl/

For a generic editing agent:

  • Avoid restructuring chart/table XML unless you fully know the schema.
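  • For simple text edits (titles, labels), locate <a:t> text nodes using similar traversal: table cell text lives in the slide’s own XML inside the <p:graphicFrame>, while chart text lives in the separate chart part.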

9. Themes, Colors, and Fonts

Themes live under /ppt/theme/themeN.xml.

They define:

  • Color schemes
  • Font schemes
  • Effects and other defaults

Slide masters reference themes and define the color mapping with <p:clrMap>; slides and layouts can override that mapping with <p:clrMapOvr>.

For text-only edits, you normally do not need to touch theme parts. Just keep them intact.


10. [Content_Types].xml: Do Not Break It

/[Content_Types].xml declares MIME types for parts.

Examples:

<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/ppt/presentation.xml" ContentType="application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml"/>
  <Override PartName="/ppt/slides/slide1.xml" ContentType="application/vnd.openxmlformats-officedocument.presentationml.slide+xml"/>
  <Default Extension="png" ContentType="image/png"/>
</Types>

Rules:

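  • When adding a new slide slideN.xml, add an <Override> for /ppt/slides/slideN.xml with the slide content type, unless that exact part name is already listed. Overrides are keyed by part name, so there is no generic slide entry, and the <Default> for .xml only declares application/xml.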
  • When introducing a totally new extension (e.g., .foo), add a <Default> or <Override> entry.
  • Do not remove or corrupt existing entries.

Breaking this file can make the PPTX unreadable.
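A sketch of an idempotent manifest update, assuming the file content is passed in as a string. `ensure_override` is a hypothetical helper; it returns the input unchanged when the part is already declared.

```python
import xml.etree.ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
SLIDE_CT = "application/vnd.openxmlformats-officedocument.presentationml.slide+xml"

# Keep the default (prefix-less) namespace when serializing.
ET.register_namespace("", CT_NS)

def ensure_override(content_types_xml, part_name, content_type=SLIDE_CT):
    """Return the manifest with an Override for part_name, adding one if missing."""
    root = ET.fromstring(content_types_xml)
    for override in root.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == part_name:
            return content_types_xml  # already declared; leave the text untouched
    ET.SubElement(root, f"{{{CT_NS}}}Override",
                  {"PartName": part_name, "ContentType": content_type})
    return ET.tostring(root, encoding="unicode")
```

Note that `ET.tostring(..., encoding="unicode")` omits the XML declaration, so you may want to prepend one when writing the part back.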


11. Common Editing Tasks: Recommended Algorithms

11.1 Get All Text in All Slides

  1. From ppt/presentation.xml, list all <p:sldId> in order.
  2. For each r:id, resolve to slides/slideX.xml via presentation.xml.rels.
  3. For each slide:
    • Parse ppt/slides/slideX.xml.
    • Traverse p:cSld → p:spTree → p:sp.
    • For each p:sp that has p:txBody:
      • For each a:p:
        • Collect a:t values, respecting a:br/ as line breaks.

Optional: use <p:ph> info to categorize text (title, body, footer, etc.).

11.2 Replace Text with New Text (Within a Shape)

Scenario: Replace entire text of a title or content placeholder.

Simplest robust approach:

  1. Identify the target shape by placeholder (p:ph type="title", body, etc.) or by existing text.
  2. Inside its p:txBody, remove existing a:p children.
  3. Insert new structure:
<a:p>
  <a:r>
    <a:rPr/>   <!-- can clone from existing run or leave minimal -->
    <a:t>New text here</a:t>
  </a:r>
</a:p>

For multiple paragraphs, add multiple a:p elements.

Formatting-preserving approach:

  • Keep a:pPr and a:rPr nodes.
  • Only modify the content of a:t, preserving run structure.
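A sketch of the "simplest robust approach" that still borrows formatting from the first existing paragraph and run. `set_shape_text` is an illustrative helper operating on a parsed `<p:sp>` element; anything else inside the old paragraphs (extra runs, fields, end-of-paragraph properties) is discarded.

```python
import copy
import xml.etree.ElementTree as ET

A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"

def set_shape_text(sp, paragraphs):
    """Replace all paragraphs in a p:sp with one single-run paragraph per string."""
    tx_body = sp.find(P + "txBody")
    if tx_body is None:
        raise ValueError("shape has no p:txBody")
    # Remember the first paragraph's and run's formatting so new text keeps it.
    first_ppr = tx_body.find(f"{A}p/{A}pPr")
    first_rpr = tx_body.find(f"{A}p/{A}r/{A}rPr")
    for old in tx_body.findall(A + "p"):
        tx_body.remove(old)
    for text in paragraphs:
        a_p = ET.SubElement(tx_body, A + "p")
        if first_ppr is not None:
            a_p.append(copy.deepcopy(first_ppr))
        a_r = ET.SubElement(a_p, A + "r")
        if first_rpr is not None:
            a_r.append(copy.deepcopy(first_rpr))
        ET.SubElement(a_r, A + "t").text = text
```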

11.3 Replace an Image

  1. Identify which p:pic to edit (by placeholder type pic, by name, or by its current image).
  2. Find <a:blip r:embed="rIdX"> inside its p:blipFill.
  3. In ppt/slides/_rels/slideX.xml.rels, find Relationship Id="rIdX".
  4. Resolve its Target, e.g., ../media/image5.png → ppt/media/image5.png.
  5. Overwrite that file with your new image bytes (matching format, e.g., PNG → PNG).

No changes needed in XML or [Content_Types].xml if you reuse the same part.
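Step 5 needs a rewritten archive, because Python's `zipfile` cannot overwrite an entry in place. A minimal sketch, with `replace_part` as a hypothetical helper that works on in-memory bytes:

```python
import io
import zipfile

def replace_part(pptx_bytes, part_name, new_data):
    """Return a copy of the package with one part's bytes swapped out."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as src, \
         zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        if part_name not in src.namelist():
            raise KeyError(f"no such part: {part_name}")
        for name in src.namelist():
            # Copy every part across, substituting the new bytes for the target.
            data = new_data if name == part_name else src.read(name)
            dst.writestr(name, data)
    return out.getvalue()
```

The same helper also works for writing back an edited XML part, since it treats every part as opaque bytes.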

11.4 Add a New Slide Based on an Existing Layout

  1. Choose a layout:

    • From slide masters and their <p:sldLayoutIdLst>, or
    • From an existing slide’s slideLayout relationship.
  2. Create a new slide part:

    • Option 1: clone an existing slide and clear the content you want to reset.
    • Option 2: construct a minimal valid p:sld referencing the chosen layout.
  3. Save it as ppt/slides/slideN.xml (unique filename).

  4. In ppt/_rels/presentation.xml.rels, add a relationship:

<Relationship Id="rIdNew" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide" Target="slides/slideN.xml"/>
  5. In ppt/presentation.xml, add a new <p:sldId> under <p:sldIdLst>:
<p:sldId id="uniqueNumericId" r:id="rIdNew"/>
  • id must be a unique integer within <p:sldIdLst>; the schema requires it to be 256 or greater.
  6. In ppt/slides/_rels/slideN.xml.rels, add a slideLayout relationship:
<Relationship Id="rIdLayout" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/slideLayout" Target="../slideLayouts/slideLayoutX.xml"/>
  7. In [Content_Types].xml, add an <Override> for /ppt/slides/slideN.xml with the slide content type (application/vnd.openxmlformats-officedocument.presentationml.slide+xml). Overrides are keyed by part name, so each slide part needs its own entry.
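The presentation-level wiring (the new relationship plus the new `<p:sldId>`) could look like the sketch below. It assumes both files are already parsed into ElementTree roots and that `<p:sldIdLst>` exists; `next_rid` and `register_slide` are hypothetical names. Creating the slide part itself and its own .rels file is not shown.

```python
import xml.etree.ElementTree as ET

P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"
REL = "{http://schemas.openxmlformats.org/package/2006/relationships}"
R_ID = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}id"
SLIDE_REL_TYPE = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/slide"

def next_rid(rels_root):
    """Pick the lowest rIdN not yet used in this .rels file."""
    used = {rel.get("Id") for rel in rels_root}
    n = 1
    while f"rId{n}" in used:
        n += 1
    return f"rId{n}"

def register_slide(presentation_root, rels_root, target):
    """Append a slide relationship and a matching p:sldId; return (rId, id)."""
    rid = next_rid(rels_root)
    ET.SubElement(rels_root, REL + "Relationship",
                  {"Id": rid, "Type": SLIDE_REL_TYPE, "Target": target})
    sld_id_lst = presentation_root.find(P + "sldIdLst")
    existing = [int(s.get("id")) for s in sld_id_lst]
    new_id = max([255] + existing) + 1  # slide ids start at 256
    ET.SubElement(sld_id_lst, P + "sldId", {"id": str(new_id), R_ID: rid})
    return rid, new_id
```

Appending the `<p:sldId>` puts the new slide last; insert it at another index to place it elsewhere.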

11.5 Reorder Slides

  1. In ppt/presentation.xml, locate <p:sldIdLst>.
  2. Reorder the <p:sldId> elements themselves.

Don’t change their id or r:id; slide order is determined solely by element order.
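Reordering reduces to moving one child element, as in this sketch. `move_slide` is an illustrative helper that takes the parsed root of ppt/presentation.xml and zero-based positions.

```python
import xml.etree.ElementTree as ET

P = "{http://schemas.openxmlformats.org/presentationml/2006/main}"

def move_slide(presentation_root, from_index, to_index):
    """Move one p:sldId to a new zero-based position within p:sldIdLst."""
    sld_id_lst = presentation_root.find(P + "sldIdLst")
    sld_id = list(sld_id_lst)[from_index]
    # The element keeps its id and r:id; only its position changes.
    sld_id_lst.remove(sld_id)
    sld_id_lst.insert(to_index, sld_id)
```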


12. Robustness Guidelines for an Editing Agent

To make safe, reliable changes:

  1. Always use relationships:

    • Never infer master/layout/media links purely from filenames or numeric IDs.
  2. Preserve unknown XML:

    • If you don’t need to modify it, copy elements and attributes through unchanged.
  3. Preserve namespaces:

    • Keep existing xmlns:p, xmlns:a, xmlns:r declarations and prefixes as-is.
  4. Make minimal edits:

    • Focus on local text/shape/media changes instead of global restructures.
  5. Keep [Content_Types].xml consistent:

    • Only add entries when introducing new part types or extensions.
    • Avoid changing or removing existing entries.
  6. Maintain well-formed XML:

    • Ensure all tags are properly closed and nesting is valid.
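One practical pitfall for guideline 3 if you use Python's `xml.etree.ElementTree`: unregistered namespaces are serialized with generated prefixes such as `ns0`. A sketch of registering the usual prefixes before round-tripping (the `roundtrip` helper is only for illustration):

```python
import xml.etree.ElementTree as ET

OOXML_PREFIXES = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
}

# Registration is global to the module, so do it once at import time.
for prefix, uri in OOXML_PREFIXES.items():
    ET.register_namespace(prefix, uri)

def roundtrip(xml_text):
    """Parse and re-serialize, keeping the registered prefixes."""
    return ET.tostring(ET.fromstring(xml_text), encoding="unicode")
```

ElementTree only writes declarations for namespaces that are actually used by an element or attribute, so files that rely on otherwise unused prefixes may need a different parser or extra care.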

13. Minimal Parsing Strategy Summary

For most real-world editing tasks, the following subset is sufficient:

  • Understand the ZIP and locate:

    • ppt/presentation.xml
    • ppt/_rels/presentation.xml.rels
    • ppt/slides/slide*.xml and their .rels
    • ppt/media/*
  • Use:

    • <p:sldIdLst> + presentation.xml.rels to enumerate slides in order.
    • p:cSld → p:spTree → p:sp → p:txBody → a:p → a:r → a:t to read/write text.
    • p:pic + a:blip r:embed="..." + slide .rels to handle images.
    • p:ph metadata to recognize placeholders (title, body, footer, etc.).
    • a:xfrm (a:off, a:ext) for shape coordinates (optional for text-only edits).

Staying within this structure lets you:

  • Extract and modify text (titles, bullet lists, body text).
  • Replace images without disturbing layouts.
  • Add, remove, and reorder slides.
  • Respect existing themes, masters, and layouts without needing to deeply modify them.