A piece of paper on a typewriter with the words "AngularJS" written on it. Image by Markus Winkler from Pixabay.

Removing Angular-specific data from HTML

Clean Angular HTML easily: remove framework-specific attributes, comments, and whitespace for pure markup.

When working with Angular applications, the rendered HTML often contains framework‑specific attributes such as _ngcontent-*, _nghost-*, ng-*, or data-ng-*.

These attributes are essential for Angular’s internal mechanics, but they can clutter the markup when you want to:

  • Export or snapshot HTML for testing
  • Compare DOM structures
  • Generate framework‑agnostic reports
  • Ensure clean output for accessibility or SEO audits

To solve this, we’ll use a utility method that parses HTML, removes Angular-specific attributes, strips comments, and cleans up insignificant whitespace – leaving only meaningful content.

The utility method

Cleaning up HTML from Angular-specific data (TypeScript version)
/**
 * Removes Angular-specific attributes, HTML comments, and insignificant whitespace
 * from HTML string while preserving meaningful content.
 *
 * @param html Valid HTML string. Passing null/undefined throws TypeError.
 * @throws {TypeError} If html is not a string.
 * @returns Cleaned HTML string.
 */
public static getCleanHtml(html: string): string {
  if (typeof html !== 'string') {
    throw new TypeError('html must be a string');
  }

  if (html.trim().length === 0) {
    return '';
  }

  const parser: DOMParser = new DOMParser();
  const doc: Document = parser.parseFromString(html, 'text/html');
  const container: HTMLElement = doc.body;

  if (!container.firstChild) {
    return '';
  }

  const walker: TreeWalker = doc.createTreeWalker(
    container,
    NodeFilter.SHOW_ELEMENT | NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT
  );

  // Nodes are collected and removed after traversal to avoid invalidating the TreeWalker while iterating.
  const nodesToRemove: Node[] = [];

  while (true) {
    const node: Node | null = walker.nextNode();
    if (node === null) break;

    switch (node.nodeType) {
      case Node.COMMENT_NODE:
        nodesToRemove.push(node);
        break;

      case Node.ELEMENT_NODE: {
        const element = node as Element;
        const attrsToRemove: Attr[] = [];

        for (const attr of Array.from(element.attributes)) {
          if (/^(?:_ng(?:content|host)|ng-|data-ng-)/.test(attr.name)) {
            attrsToRemove.push(attr);
          }
        }

        for (const attr of attrsToRemove) {
          element.removeAttributeNode(attr);
        }
        break;
      }

      case Node.TEXT_NODE: {
        const textNode = node as Text;
        const content = textNode.textContent;

        // Whitespace is only removed at the top level to avoid collapsing meaningful spacing inside inline elements.
        if (content && !content.trim() && textNode.parentNode === container) {
          nodesToRemove.push(textNode);
        }
        break;
      }
    }
  }

  for (const node of nodesToRemove) {
    if (node.parentNode) {
      node.parentNode.removeChild(node);
    }
  }

  let cleanedHtml = container.innerHTML;

  // Insert a space between directly adjacent tags (e.g., </div><span>) so they stay visually separated.
  cleanedHtml = cleanedHtml.replace(/></g, '> <');

  return cleanedHtml.trim();
}
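
The attribute check inside the loop can be pulled out into a small pure function, which makes the matching rule easy to verify without a DOM. The following is a minimal sketch, assuming the same regular expression as in the method above; the names isAngularAttribute and ANGULAR_ATTRIBUTE_PATTERN are hypothetical and not part of the original utility.

```typescript
// Same pattern as in getCleanHtml: _ngcontent*, _nghost*, ng-*, data-ng-*.
const ANGULAR_ATTRIBUTE_PATTERN: RegExp = /^(?:_ng(?:content|host)|ng-|data-ng-)/;

// Hypothetical helper (not part of the original utility).
// Returns true when the attribute name looks Angular-specific.
function isAngularAttribute(attributeName: string): boolean {
  return ANGULAR_ATTRIBUTE_PATTERN.test(attributeName);
}

// Attribute names as an HTML parser reports them (lowercased).
const sampleNames: string[] = [
  '_ngcontent-c0',
  '_nghost-c1',
  'ng-reflect-name',
  'data-ng-model',
  'class',
  'aria-label',
  'ngmodel', // no hyphen after "ng", so the pattern does not match it
];

// Keep only the names that the pattern does not match.
const keptNames: string[] = sampleNames.filter(
  (attributeName) => !isAngularAttribute(attributeName)
);
// keptNames is ['class', 'aria-label', 'ngmodel']
```

Note that the pattern only matches names that start with one of the listed prefixes, so an attribute such as ngmodel is left in place.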

Example

Input HTML:

<div _ngcontent-c0 ng-reflect-name="example">
  <!-- Angular comment -->
  <span> Hello World </span>
</div>

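Output after cleaning (a whitespace-only line remains where the comment was, because only top-level whitespace is removed):

<div>
  
  <span> Hello World </span>
</div>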
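
The tags in the example above are separated by whitespace, so the method's last step, replace(/></g, '> <'), leaves it unchanged. A short sketch of what that step does to markup where tags touch directly; the helper name separateAdjacentTags is hypothetical and only mirrors that one line of the method.

```typescript
// Hypothetical helper mirroring the final step of getCleanHtml:
// put a single space between a closing ">" and the next "<".
function separateAdjacentTags(html: string): string {
  return html.replace(/></g, '> <');
}

const blockLevel: string = separateAdjacentTags('<div>one</div><p>two</p>');
// blockLevel is '<div>one</div> <p>two</p>'

const inlineLevel: string = separateAdjacentTags('<b>bold</b><i>italic</i>');
// inlineLevel is '<b>bold</b> <i>italic</i>', which renders with a space
// that the original markup did not have.

const alreadySpaced: string = separateAdjacentTags('<li>a</li>\n<li>b</li>');
// alreadySpaced is unchanged because a newline already separates the tags.
```

For snapshot comparison the extra space is usually harmless, but it does change how adjacent inline elements render, so the cleaned string may not be a pixel-identical replacement for the original markup.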

Why this matters

  • Cleaner snapshots: useful for testing frameworks like Jest or Cypress.
  • Accessibility audits: removes noise so tools can focus on meaningful content.
  • SEO and reporting: produces framework‑agnostic HTML for analysis.
  • Portability: makes markup easier to reuse outside Angular.

Node.js and Jest limitations note

This implementation uses DOMParser, which is a browser-only API and is not available in Node.js environments.

What this means:

  • Tests running in Jest's node test environment will throw ReferenceError: DOMParser is not defined. Run them in a jsdom environment instead, which provides a browser-like DOM implementation. Note that node has been the default test environment since Jest 27, and since Jest 28 the jest-environment-jsdom package must be installed separately and enabled with testEnvironment: "jsdom".
  • Server-side rendering (SSR) scenarios will fail.
  • Any non-browser JavaScript runtime will encounter this issue.
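
One way to make this failure easier to diagnose is to check for DOMParser up front and fail with a descriptive message. The following is a minimal sketch; isDomParserAvailable and assertDomParserAvailable are hypothetical helpers that are not part of the original utility, and the scope parameter exists only so the check can be exercised without a real browser.

```typescript
// Hypothetical guard (not part of the original utility).
// `scope` defaults to the global object; passing it explicitly keeps the check testable.
function isDomParserAvailable(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): boolean {
  return typeof scope['DOMParser'] === 'function';
}

// Hypothetical wrapper: fail early with a descriptive message
// instead of a bare ReferenceError from deep inside the method.
function assertDomParserAvailable(
  scope: Record<string, unknown> = globalThis as unknown as Record<string, unknown>
): void {
  if (!isDomParserAvailable(scope)) {
    throw new Error(
      'DOMParser is not available in this runtime. ' +
        'Run in a browser or in a jsdom-based test environment.'
    );
  }
}
```

Calling assertDomParserAvailable() as the first statement of getCleanHtml would be one possible place for it; whether to throw or to fall back to returning the input unchanged is a design choice that depends on how the utility is used.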

Security note

This method is not a security sanitizer and provides no protection against XSS attacks. It is safe only when applied to trusted, Angular-generated HTML, where Angular has already sanitized template bindings by default. Never apply it to arbitrary user input and treat the result as safe.

What it does not do:

  • Remove malicious script tags.
  • Sanitize dangerous URL protocols (javascript:, data:).
  • Remove event handlers (onclick, onerror, etc.).
  • Protect against SVG-based XSS or other advanced injection techniques.
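
The points above follow directly from the matching rule: only attribute names are checked, and only against the Angular prefixes. A small sketch, assuming the same regular expression as in the method and a hypothetical list of attribute names from untrusted markup:

```typescript
// Same pattern as in getCleanHtml.
const angularAttributePattern: RegExp = /^(?:_ng(?:content|host)|ng-|data-ng-)/;

// Attribute names from a hypothetical hostile snippet such as:
//   <img _ngcontent-c0 src="x" onerror="alert(1)">
//   <a ng-reflect-href="#" href="javascript:alert(1)">link</a>
const hostileAttributeNames: string[] = [
  '_ngcontent-c0',
  'src',
  'onerror',
  'ng-reflect-href',
  'href',
];

// Only names matching the Angular pattern are dropped; everything else is kept as-is.
const survivingNames: string[] = hostileAttributeNames.filter(
  (attributeName) => !angularAttributePattern.test(attributeName)
);
// survivingNames is ['src', 'onerror', 'href']
```

The onerror handler and the href attribute survive, and attribute values are never inspected, so a javascript: URL passes through untouched.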

Intended only for:

  • Cleaning trusted Angular framework output.
  • Removing Angular-specific attributes from controlled HTML.
  • Use cases where HTML source is already sanitized by Angular.

Closing thought

Angular’s attributes are powerful inside the framework, but they don’t belong in exported or analyzed HTML. With a simple utility like getCleanHtml, you can strip away the noise and focus on what really matters: the content.
