Content Best Practices


Overview

When adding content to InSkill, you have two primary options depending on how you want that content to be used:

Resources are static content such as documents, websites/links, videos, and other reference materials. Use resources when you want to provide readable, contextual information that users can browse or that InSkill can reference directly.

Agents are designed for interactive or dynamic use cases, where the system needs to process, query, or act on data.

  • Excel Agent – Used for structured data (CSV/Excel) that needs to be searched, filtered, or analyzed (e.g., lookups, tables, large datasets).
  • Script Agent – Used for custom logic or workflows, enabling more advanced or dynamic behavior beyond static content.

In general:

  • Use Resources for content you want to present or reference as-is
  • Use Agents when you need the system to interact with or compute on the data

Choosing a Format

Prefer the original format.
If your content already exists in a specific format, keep it as-is.

Choosing a Format for New Content:

Tabular Data

  1. Standalone Resource (CSV/Excel)
    • Tables are small to medium sized
    • The tabular data is the primary content
    • Users will ask questions about the data (not query it directly)

    Note: Currently, only tables are extracted; surrounding text is ignored.

  2. Tables Inside Documents (PDF/Word)
    • Tables are part of a larger narrative
    • Context matters (e.g., explanation + table together)
    • Tables are small to medium sized
  3. Agents (CSV/Excel)
    • Data is large or frequently queried
    • Users need to:
      • Filter
      • Lookup values
      • Aggregate data

    Examples:
      • “Find the value in column C where column F = X”
      • “What’s the average pressure for pumps in region A?”
  4. Agents (Scripts)
    • Data is reached through APIs or needs custom logic or workflows
  5. Other Data Sources
    If your data lives in:
    • Databases
    • Other structured systems

👉 Contact us to discuss integration options.
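The three query types listed for Agents above (filter, lookup, aggregate) can be illustrated with a small sketch. This is not how an InSkill Excel Agent is implemented; it only shows, in plain Python, the kind of operation each term refers to. The sample data, column names, and function names are all hypothetical.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data; column names and values are invented for illustration.
SAMPLE_CSV = """pump_id,region,pressure_psi
P-100,A,120
P-101,A,130
P-102,B,95
P-103,A,125
"""

def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

def filter_rows(rows, column, value):
    """Filter: keep only rows where `column` equals `value`."""
    return [row for row in rows if row[column] == value]

def lookup(rows, key_column, key_value, target_column):
    """Lookup: return `target_column` from the first matching row, else None."""
    for row in rows:
        if row[key_column] == key_value:
            return row[target_column]
    return None

def average(rows, column):
    """Aggregate: mean of a numeric column."""
    return mean(float(row[column]) for row in rows)

rows = load_rows(SAMPLE_CSV)
region_a = filter_rows(rows, "region", "A")
print(lookup(rows, "pump_id", "P-102", "pressure_psi"))  # 95
print(average(region_a, "pressure_psi"))                 # 125.0
```

Questions that reduce to operations like these ("average pressure for pumps in region A") are the ones that fit an Agent better than a static resource.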

Decision Guide

Scenario | Recommended Option
Table with supporting context | Table within original document
Small/medium standalone table | CSV / Excel
Large dataset or query use case | Agent

Frequently Changing Content

Dynamic resources (HTML/Markdown)

Best for:

  • Content that changes frequently
  • Unstructured or miscellaneous content
  • Examples: FAQs, employee phone extensions

Benefits:

  • Renders cleanly in the browser
  • Easy to edit and maintain

Everything Else

PDF, Word (.docx), Text (.txt), Markdown (.md), etc.

Note: PDF files open in the browser. Other file types download when opened.

Document Best Practices

These practices help structure your content so that InSkill can parse, understand, and use it effectively.

Word

  • Use predefined styles (Heading 1, Heading 2, etc.)

HTML

  • Use semantic HTML
    • Headings (<h1>, <h2>, etc.)
    • Navigation elements (<nav>), so that menus can be excluded from parsing
  • Add title attributes to images and links
  • Ensure tables include a header row
  • Avoid using tables for layout
  • Use inskill-skip (attribute or class) to exclude irrelevant content (e.g. cookie banners, shopping carts, language selectors)
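As a rough illustration, the HTML practices above can be checked before uploading a page. The sketch below uses Python's standard html.parser and is not how InSkill itself parses content. The sample page, the PracticeChecker class, and the check_html function are hypothetical, and treating <th> cells as the "header row" is an assumption. The exact inskill-skip syntax is also assumed: the checker accepts either a bare attribute or a class name.

```python
from html.parser import HTMLParser

# Hypothetical page fragment written to follow the practices above.
SAMPLE_HTML = """
<nav><a href="/home" title="Home page">Home</a></nav>
<div class="banner inskill-skip">We use cookies.</div>
<h1>Pump Maintenance</h1>
<img src="pump.png" title="Cutaway view of the pump">
<table>
  <tr><th>Part</th><th>Interval</th></tr>
  <tr><td>Seal</td><td>6 months</td></tr>
</table>
"""

class PracticeChecker(HTMLParser):
    """Collects simple warnings; an illustrative pre-upload check only."""

    def __init__(self):
        super().__init__()
        self.warnings = []
        self.heading_count = 0
        self.skipped = 0         # elements marked with inskill-skip
        self._table_has_th = []  # one flag per currently open <table>

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        # Assumption: inskill-skip may appear as an attribute or a class name.
        if "inskill-skip" in attr_map or "inskill-skip" in classes:
            self.skipped += 1
        if tag in ("h1", "h2", "h3", "h4", "h5", "h6"):
            self.heading_count += 1
        if tag in ("img", "a") and not attr_map.get("title"):
            self.warnings.append(f"<{tag}> is missing a title attribute")
        if tag == "table":
            self._table_has_th.append(False)
        if tag == "th" and self._table_has_th:
            self._table_has_th[-1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._table_has_th:
            if not self._table_has_th.pop():
                self.warnings.append("<table> has no header cells (<th>)")

def check_html(text):
    """Run the checker over an HTML string and return it with its findings."""
    checker = PracticeChecker()
    checker.feed(text)
    checker.close()
    if checker.heading_count == 0:
        checker.warnings.append("no heading elements found")
    return checker

result = check_html(SAMPLE_HTML)
print(result.warnings)  # []
print(result.skipped)   # 1
```

A page that uses a table without header cells, or an image without a title, would show up in result.warnings instead of an empty list.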

PDF

  • Prefer tables with visible borders/lines for better detection

Excel

  • Use named tables or filtered ranges to improve table detection
  • Put the headers in the top row when possible
  • Avoid vertical headers
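If a sheet already uses vertical headers (labels running down the first column), one way to fix it before upload is to transpose it so the labels land in the top row. The sketch below does this for CSV text with Python's standard csv module. The sample data and the to_top_row_headers function are hypothetical, and the sketch assumes every row has the same number of cells.

```python
import csv
import io

# Hypothetical export with vertical headers: labels run down the first column.
VERTICAL = """Pump,P-100,P-101
Region,A,B
Pressure,120,95
"""

def to_top_row_headers(text):
    """Transpose rows and columns so the labels end up in the top row."""
    rows = list(csv.reader(io.StringIO(text)))
    # zip stops at the shortest row, so ragged rows would lose cells.
    transposed = zip(*rows)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(transposed)
    return out.getvalue()

print(to_top_row_headers(VERTICAL))
# Pump,Region,Pressure
# P-100,A,120
# P-101,B,95
```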