POML (Prompt Orchestration Markup Language) is a markup language developed by Microsoft to address the challenges of advanced prompt engineering for large language models (LLMs). In practice, complex prompts are often poorly organized, make it hard to integrate multiple data formats, and are fragile because LLMs are sensitive to how a prompt is formatted. POML separates the content of a prompt from its presentation by providing an HTML-like structured syntax and a CSS-like style system.

This design improves the readability, maintainability, and reusability of prompts, and lets developers embed external data sources such as text, tables, and images through dedicated tags. POML also has a built-in template engine that supports variables, loops, and conditionals for generating complex prompts dynamically. In addition, it ships with development tools, including a VS Code extension and SDKs for multiple languages, to help developers build and test complex LLM applications more efficiently.

Features

  • Structured markup syntax: Uses HTML-like tags (such as <role>, <task>, and <example>) to organize prompts into modules, which improves readability and reusability.
  • Diverse data handling: Dedicated components such as <document>, <table>, and <img> embed or reference external data directly and allow its formatting to be customized.
  • Separation of content and style: A CSS-like style system lets you adjust the output format (e.g., verbosity, syntax format) with <stylesheet> or inline attributes, without modifying the core logic.
  • Built-in template engine: Supports {{ }} variables, for loops, if conditionals, and <let> variable definitions for generating data-driven prompts dynamically.
  • Rich development tools: Provides a Visual Studio Code extension with syntax highlighting, auto-completion, real-time preview, and error diagnostics.
  • Multi-language SDKs: Provides software development kits for Node.js (JavaScript/TypeScript) and Python for easy integration into existing workflows and LLM frameworks.

Usage Guide

POML simplifies creating, testing, and maintaining complex prompts by providing a markup language, development tools, and SDKs. The sections below describe how to install and use POML.

1. Installation

POML offers a variety of installation methods, and developers can choose the most suitable one according to their working environment.

Visual Studio Code plug-in

This is the recommended way to get started, since the extension offers many features that speed up development.

  1. Open Visual Studio Code.
  2. Click on the "Extensions" icon in the Activity Bar on the left.
  3. Type "POML" in the search box.
  4. Find the officially released plugin and click "Install".

After installing the extension, you get syntax highlighting, auto-completion, hover documentation, live preview, and error checking.

Node.js (NPM)

If your project is based on a Node.js environment, you can install POML's JavaScript/TypeScript libraries via npm.

npm install pomljs

Python (PyPI)

For Python developers, the POML library can be installed via pip.

pip install poml

If you wish to do a local development installation from a cloned GitHub repository, you can use the following command:

pip install -e .

2. Configuring the LLM

To test prompts in the VS Code extension, you need to configure the large language model API it will use.

  1. In VS Code, open the Settings screen from the menu bar via File > Preferences > Settings.
  2. Type "POML" in the search box to find the relevant configuration item.
  3. Set the following information according to your model provider (e.g. OpenAI, Azure, Google, etc.):
    • Model Provider: Select your model provider.
    • API Key: Fill in your API key.
    • Endpoint URL: Fill in the address of the model's API endpoint.

You can also add this configuration directly to your settings.json file.

3. Writing your first POML file

POML's syntax is intuitive and similar to HTML. Here is a basic example; save it as example.poml.

<poml>
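<role>You are a patient teacher explaining concepts to a 10-year-old child.</role>
<task>Using the provided image, explain the concept of photosynthesis.</task>
<img src="photosynthesis_diagram.png" alt="Diagram of photosynthesis" />
<output-format>
Keep the explanation simple and fun, and no longer than 100 words.
Begin with "Hello, future scientist!".
</output-format>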
</poml>

Code explanation:

  • <poml>: The root tag for all content.
  • <role>: Defines the role the LLM is expected to play.
  • <task>: Describes the specific task the LLM needs to accomplish.
  • <img>: Embeds an image as a contextual reference. The src attribute points to the local image file path.
  • <output-format>: Explicitly specifies the formatting requirements for the output.
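
Because the example above is well-formed, XML-style markup, its structure can be inspected with ordinary XML tooling. The following is a minimal sketch that uses Python's standard xml.etree.ElementTree module, not the POML SDK, to list each top-level tag and what it carries. The inline document and the helper name outline are assumptions made for illustration; this is not how POML itself renders a prompt.

```python
import xml.etree.ElementTree as ET

# Inline copy of a small POML-style document (illustrative content).
SOURCE = """
<poml>
  <role>You are a patient teacher explaining concepts to a 10-year-old.</role>
  <task>Using the provided image, explain the concept of photosynthesis.</task>
  <img src="photosynthesis_diagram.png" alt="Diagram of photosynthesis" />
  <output-format>Keep it simple and under 100 words.</output-format>
</poml>
"""

def outline(markup: str) -> list[tuple[str, str]]:
    """Return (tag, summary) pairs for each child of the root element."""
    root = ET.fromstring(markup.strip())
    rows = []
    for child in root:
        # Use the text content if present, otherwise fall back to the attributes.
        text = (child.text or "").strip()
        summary = text if text else ", ".join(f"{k}={v}" for k, v in child.attrib.items())
        rows.append((child.tag, summary))
    return rows

for tag, summary in outline(SOURCE):
    print(f"{tag}: {summary}")
```

Each printed line pairs a tag with its content, which mirrors the role/task/data/format breakdown described in the list above.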

4. Core syntax and features

Data Embedding

One of the most powerful features of POML is the ability to easily integrate different types of data.

  • Documents: Use the <document> tag to embed an external text file.
<document src="./report.txt" />
  • Tables: Use the <table> tag to define table data directly or to reference a CSV file.
<table src="./data.csv" />
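
To make "embedding tabular data" more concrete, the sketch below shows one plausible way the contents of a CSV file could end up as prompt text: it reads CSV with Python's standard csv module and emits a Markdown table. This is only an illustration of the idea, not POML's renderer; the function name csv_to_markdown and the sample data are assumptions.

```python
import csv
import io

def csv_to_markdown(csv_text: str) -> str:
    """Render CSV text as a Markdown table (first row is the header)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)

# Hypothetical stand-in for the contents of ./data.csv.
sample = "name,score\nAda,95\nLin,88\n"
print(csv_to_markdown(sample))
```

Keeping the data in its own file and converting it at render time means the prompt text never has to be edited by hand when the data changes.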

Template Engine

POML has a built-in template engine for generating prompt content dynamically.

  • Variables: Use <let> to define a variable and {{ }} to reference it in text.
<let name="concept">photosynthesis</let>
<task>Explain what {{concept}} is.</task>
  • Loops: Use the for attribute to iterate over data.
<let name="topics" type="json">["photosynthesis", "cellular respiration", "genetic inheritance"]</let>
<task>
Explain the following concepts in order:
<list>
<item for="topic in topics">{{topic}}</item>
</list>
</task>
  • Conditionals: Use the if attribute to include content only when a condition holds.
<let name="is_simple" type="boolean">true</let>
<task if="is_simple">Explain this concept in simple language.</task>
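
The three template features (variable substitution, looping, and conditional inclusion) can be illustrated with a few lines of plain Python. This is a rough sketch of the semantics only, under the assumption that a template is rendered against a dictionary of values; it is not POML's template engine, and the names substitute, render, and context are hypothetical.

```python
import re

def substitute(text: str, context: dict) -> str:
    """Replace each {{ name }} placeholder with the matching context value."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(context[m.group(1)]), text)

def render(context: dict) -> str:
    """Build a prompt from variables, a loop, and a condition."""
    lines = [substitute("Please explain the following concepts for a {{audience}}:", context)]
    # Loop: one bullet per topic, like iterating over a list variable.
    for item in context["topics"]:
        lines.append(substitute("- {{item}}", {"item": item}))
    # Conditional: include the extra instruction only when the flag is true.
    if context["is_simple"]:
        lines.append("Use simple language.")
    return "\n".join(lines)

context = {
    "audience": "10-year-old",
    "topics": ["photosynthesis", "cellular respiration"],
    "is_simple": True,
}
print(render(context))
```

Changing only the context dictionary (for example, setting is_simple to False or adding topics) produces a different prompt from the same template, which is the point of a data-driven template engine.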

Styling

The <stylesheet> tag defines the "style" of a prompt in a CSS-like way, separating content from formatting. This helps accommodate different LLMs' preferences for specific formats.

<stylesheet>
task {
format: "markdown";
verbosity: "high";
}
</stylesheet>
<task>Explain how black holes form.</task>

In this example, the content of the <task> tag is presented to the LLM in Markdown format with a high level of detail, as defined by the stylesheet.
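
The idea of separating content from style can be sketched in plain Python: the task text stays fixed while a separate style dictionary decides how it is presented. This is an illustrative sketch under assumed semantics, not POML's stylesheet implementation; render_task and the way "format" and "verbosity" are interpreted here are hypothetical.

```python
def render_task(content: str, style: dict) -> str:
    """Format the same task content according to a style dictionary."""
    fmt = style.get("format", "plain")
    if fmt == "markdown":
        text = f"## Task\n\n{content}"
    else:
        text = f"Task: {content}"
    # A hypothetical 'verbosity' setting appends an instruction
    # instead of changing the content itself.
    if style.get("verbosity") == "high":
        text += "\n\nAnswer in detail."
    return text

content = "Explain how black holes form."
print(render_task(content, {"format": "markdown", "verbosity": "high"}))
print(render_task(content, {"format": "plain"}))
```

Because the style lives outside the content, switching formats for a different model means swapping the style dictionary rather than rewriting the prompt.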

Application Scenarios

  1. Building complex conversational agents
    When developing customer service bots or personal assistants, you need to deal with multiple rounds of dialog, external knowledge bases, and variable output formats. Dialogue logic, user information, document data and output requirements can be modularized using POML to make the prompts well-structured and easy to maintain and extend.
  2. Automated content generation
    In scenarios such as report generation, code writing or marketing copy creation, highly customized content can be generated in bulk by dynamically combining data (e.g. CSV files, JSON data) with fixed text structures using POML's template engine.
  3. Education and training tools
    A dynamic learning aid can be created. For example, the language style and level of detail used to explain the same scientific concept (e.g., photosynthesis) can be dynamically adapted according to the age of the students (as a variable) and combined with a variety of media such as pictures and tables.
  4. Multimodal Application Development
    For LLM applications that need to handle multiple input types such as text and images, POML provides a unified interface. For example, when developing an image analysis tool, you can describe the analysis task with <task>, pass in the image to be analyzed with <img>, and use <output-format> to request the results in JSON format.

FAQ

  1. What is POML? What problem does it solve?
    POML is a markup language designed for large language models (LLMs). It addresses the problems commonly encountered when writing complex prompts: confusing structure, difficult data integration, changing formatting requirements, and a lack of professional tool support. The result is more systematic and efficient prompt development and maintenance.
  2. What is the difference between POML and writing plain-text prompts directly?
    Writing plain-text prompts directly is like writing code in Notepad: simple, but unmanageable once the logic becomes complex. POML provides web-like structure (HTML), style (CSS), and dynamic capabilities (a template engine) to decouple the different parts of a prompt (e.g., roles, tasks, data, formatting requirements), making it easier to read, modify, and reuse.
  3. Do I need to learn a whole new language to use POML?
    No. POML's syntax is borrowed from HTML and is intuitive. If you have experience with any markup language (such as HTML or XML), you will be able to get started with POML quickly. The official VS Code extension also flattens the learning curve considerably.
  4. Does using POML affect the performance of interactions with the LLM?
    No. POML itself does not interact directly with the LLM. When used with the SDK or tools, a POML file is "rendered" into a final plain-text or multimodal request before it is sent to the LLM, so it improves efficiency during development without adding overhead to the final API request.