
Formatting

To implement formatting for your language, derive a class from pegium::AbstractFormatter.

The formatter is a regular class whose instance is stored in services->lsp.formatter. It operates on the CST (concrete syntax tree), so formatting rules are much easier to write and keep working once your grammar and AST shape are reasonably stable.

Creating a formatter

Typical header:

class DomainModelFormatter : public pegium::AbstractFormatter {
public:
  explicit DomainModelFormatter(const pegium::Services &services);

protected:
  virtual void formatEntity(pegium::FormattingBuilder &builder,
                            const ast::Entity *entity) const;
  virtual void formatLineComment(HiddenNodeFormatter &comment) const;
};

Typical implementation:

DomainModelFormatter::DomainModelFormatter(
    const pegium::Services &services)
    : AbstractFormatter(services) {
  on<ast::Entity>(&DomainModelFormatter::formatEntity);
  onHidden("SL_COMMENT", &DomainModelFormatter::formatLineComment);
}

The constructor is where you declare which formatting method runs for each exact AST type or hidden terminal rule.

Formatting one node

Inside a formatting method:

  1. get a node-scoped formatter from the builder
  2. select CST-backed regions
  3. attach whitespace actions to those regions

Example:

void DomainModelFormatter::formatEntity(
    pegium::FormattingBuilder &builder,
    const ast::Entity *entity) const {
  auto formatter = builder.getNodeFormatter(entity);
  formatter.keyword("entity").append(oneSpace);

  if (entity->superType.has_value()) {
    formatter.keyword("extends").prepend(oneSpace).append(oneSpace);
  }

  const auto openBrace = formatter.keyword("{");
  const auto closeBrace = formatter.keyword("}");
  formatBlock(openBrace, closeBrace, formatter.interior(openBrace, closeBrace));
}

This does three things:

  • forces one space after entity
  • normalizes spacing around extends
  • formats the { ... } block with the generic block helper

This is the basic pattern of the whole formatting DSL: select a region, then attach spacing, line-break, or indentation actions to it.
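
To make the select-then-attach pattern concrete outside of pegium, here is a minimal toy model. Everything in it (ToyLayout, Gap, the token vector, the one-space default) is hypothetical and invented for illustration; it is not how pegium represents regions or actions internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy model only: these names are hypothetical and are not pegium types.
enum class Gap { NoSpace, OneSpace, NewLine };

class ToyLayout {
public:
  explicit ToyLayout(std::vector<std::string> tokens)
      : tokens_(std::move(tokens)) {}

  // "Select a region": find the first token with the given text.
  std::size_t keyword(const std::string &text) const {
    for (std::size_t i = 0; i < tokens_.size(); ++i) {
      if (tokens_[i] == text) return i;
    }
    assert(false && "keyword not found");
    return tokens_.size();
  }

  // "Attach an action": record what the gap before or after a token becomes.
  void prepend(std::size_t token, Gap gap) {
    if (token > 0) gaps_[token - 1] = gap;
  }
  void append(std::size_t token, Gap gap) { gaps_[token] = gap; }

  // Join the tokens; gaps without an action default to one space in this toy.
  std::string render() const {
    std::string out;
    for (std::size_t i = 0; i < tokens_.size(); ++i) {
      out += tokens_[i];
      if (i + 1 == tokens_.size()) break;
      const auto it = gaps_.find(i);
      const Gap gap = (it == gaps_.end()) ? Gap::OneSpace : it->second;
      if (gap == Gap::OneSpace) out += ' ';
      if (gap == Gap::NewLine) out += '\n';
    }
    return out;
  }

private:
  std::vector<std::string> tokens_;
  std::map<std::size_t, Gap> gaps_;  // key i = gap between token i and i + 1
};
```

With the tokens name, : and String, selecting ":" and then attaching Gap::NoSpace before it and Gap::OneSpace after it makes render() return "name: String". The point of the split is that selection and action stay independent: the same action can be attached to any selected region.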

Registering several rules

The preferred style is one method per exact AST type:

on<ast::DomainModel>(&MyFormatter::formatDomainModel);
on<ast::PackageDeclaration>(&MyFormatter::formatPackageDeclaration);
on<ast::Entity>(&MyFormatter::formatEntity);
on<ast::Feature>(&MyFormatter::formatFeature);

The formatter engine walks the AST and dispatches to the registered method when it encounters that exact node type.
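
As a rough illustration of what dispatching on the exact node type means, the sketch below keys handlers by std::type_index. It is a toy built on assumptions: Node, Entity, Feature, SpecialEntity, ToyDispatcher and visitSample are hypothetical names, and nothing here is taken from pegium's engine.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for AST node types; these are not pegium classes.
struct Node { virtual ~Node() = default; };
struct Entity : Node {};
struct Feature : Node {};
struct SpecialEntity : Entity {};  // derived type with no handler of its own

class ToyDispatcher {
public:
  // Register a handler for the exact type T.
  template <typename T>
  void on(std::function<void(const T &)> handler) {
    assert(handler && "handler must be callable");
    handlers_[std::type_index(typeid(T))] =
        [h = std::move(handler)](const Node &node) {
          h(static_cast<const T &>(node));
        };
  }

  // Look up the node's dynamic type; returns false if nothing is registered.
  bool dispatch(const Node &node) const {
    const auto it = handlers_.find(std::type_index(typeid(node)));
    if (it == handlers_.end()) return false;
    it->second(node);
    return true;
  }

private:
  std::unordered_map<std::type_index, std::function<void(const Node &)>>
      handlers_;
};

// Visit three nodes and record which handlers ran.
std::vector<std::string> visitSample() {
  std::vector<std::string> log;
  ToyDispatcher dispatcher;
  dispatcher.on<Entity>([&log](const Entity &) { log.push_back("entity"); });
  dispatcher.on<Feature>([&log](const Feature &) { log.push_back("feature"); });

  Entity entity;
  Feature feature;
  SpecialEntity special;
  const Node *nodes[] = {&entity, &feature, &special};
  for (const Node *node : nodes) dispatcher.dispatch(*node);
  return log;
}
```

In this toy, visitSample() records the Entity and Feature handlers only. The SpecialEntity node is skipped because the lookup uses the dynamic type itself rather than any base class, which is the sense of "exact" used above.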

Use onHidden("RULE_NAME", ...) for hidden tokens such as comments:

onHidden("ML_COMMENT", &MyFormatter::formatComment);
onHidden("SL_COMMENT", &MyFormatter::formatLineComment);

Selecting regions

builder.getNodeFormatter(node) returns a NodeFormatter<T> scoped to the CST subtree of that AST node.

Common selections:

  • property<&T::member>()
  • property<&T::vectorMember>(index)
  • properties<&T::member...>()
  • keyword("...")
  • keywords("...", "...")
  • hidden("RULE")
  • hiddens("RULE")
  • interior(start, end)

Example:

auto formatter = builder.getNodeFormatter(feature);
formatter.keyword(":").prepend(noSpace).append(oneSpace);

Built-in actions

Inside AbstractFormatter, the main whitespace actions are:

  • noSpace
  • oneSpace
  • spaces(count)
  • newLine
  • newLines(count)
  • indent
  • noIndent
  • fit(...)

These are protected members, so they can be used directly inside your formatter methods:

formatter.keyword("entity").append(oneSpace);
formatter.keyword(":").prepend(noSpace).append(oneSpace);

Generic helpers

AbstractFormatter also provides higher-level helpers for recurring layout patterns:

  • formatBlock(...)
  • formatSeparatedList(...)
  • formatLineComment(...)
  • formatMultilineComment(...)

Use them whenever a rule amounts to a standard block, a comma-separated list, or comment normalization. That keeps the formatter small and consistent.

Formatting hidden nodes

Hidden nodes are handled through HiddenNodeFormatter.

Typical comment formatting method:

void MyFormatter::formatLineComment(HiddenNodeFormatter &comment) const {
  comment.replace(AbstractFormatter::formatLineComment(comment));
}

This is the right place to:

  • normalize line comments
  • reflow multiline comments
  • keep documentation tags such as @param ... consistent
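
To give a feel for what normalizing a line comment can involve, here is a small standalone sketch. The function normalizeLineComment is hypothetical and its rules (exactly one space after //, no trailing blanks) are an assumption chosen for the example; the behavior of AbstractFormatter::formatLineComment may well differ.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not pegium's formatLineComment.
// Rewrites "//text   " as "// text". `///` doc comments are not
// special-cased in this sketch.
std::string normalizeLineComment(const std::string &raw) {
  // Leave anything that is not a `//` comment untouched.
  if (raw.compare(0, 2, "//") != 0) return raw;

  const char *blanks = " \t\r";
  // First non-blank character after the `//` marker.
  const std::size_t first = raw.find_first_not_of(blanks, 2);
  if (first == std::string::npos) return "//";  // empty comment
  // Last non-blank character; it exists because `first` does.
  const std::size_t last = raw.find_last_not_of(blanks);
  return "// " + raw.substr(first, last - first + 1);
}
```

A rewrite function of this shape returns the new comment text, which is the kind of value a hidden-node method would hand to comment.replace(...).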

Wiring the formatter into services

Creating the formatter class is not enough. You must also install it into the language services:

auto services = pegium::services::makeDefaultServices(
    sharedServices, "domain-model");

services->parser =
    std::make_unique<const domainmodel::parser::DomainModelParser>(*services);

services->lsp.formatter =
    std::make_unique<lsp::DomainModelFormatter>(*services);

makeDefaultServices(...) creates the service container and installs the common defaults, but you still assign your language parser explicitly.

Without the formatter assignment, the formatter slot stays empty and formatting requests do nothing.
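
The effect of an empty slot can be pictured with a toy service container. All names below (ToyFormatter, TrimTrailingFormatter, ToyServices, handleFormatRequest) are hypothetical, and returning the text unchanged is an assumption about what "do nothing" looks like; pegium's request handling is not shown here.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-ins; not pegium's service or formatter types.
struct ToyFormatter {
  virtual ~ToyFormatter() = default;
  virtual std::string format(const std::string &text) const = 0;
};

// Example formatter: strips trailing spaces and tabs.
struct TrimTrailingFormatter : ToyFormatter {
  std::string format(const std::string &text) const override {
    const std::size_t last = text.find_last_not_of(" \t");
    return last == std::string::npos ? std::string()
                                     : text.substr(0, last + 1);
  }
};

struct ToyServices {
  std::unique_ptr<const ToyFormatter> formatter;  // empty until assigned
};

// With an empty slot the request is a no-op: the text comes back unchanged.
std::string handleFormatRequest(const ToyServices &services,
                                const std::string &text) {
  if (!services.formatter) return text;
  return services.formatter->format(text);
}
```

Before the slot is assigned, handleFormatRequest returns its input as-is; after assigning std::make_unique<TrimTrailingFormatter>() to services.formatter, the same call returns the trimmed text.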

Practical advice

  • start with one or two node types
  • keep one formatting method per exact AST type
  • prefer formatBlock(...) and formatSeparatedList(...) over repeating the same brace or comma logic everywhere
  • use hidden-node formatting only when comment text itself needs rewriting

See the formatter DSL reference for the full API surface.