---
title: "How to Write a G-Code Parser in JavaScript: The Sane Core"
description: "A G-code parser in JavaScript is a line splitter, a word tokenizer, and a modal-state tracker. The 40-line core, the dialect traps, and what to build on top."
url: https://gcodepractice.com/journal/how-to-write-a-g-code-parser-in-javascript/
canonical: https://gcodepractice.com/journal/how-to-write-a-g-code-parser-in-javascript/
author: "Lawrence Arya"
authorUrl: https://www.linkedin.com/in/vibecoding/
published: 2026-06-05
updated: 2026-06-05
category: "Practice"
tags: ["javascript", "parser", "g-code", "development"]
lang: en
---

# How to Write a G-Code Parser in JavaScript: The Sane Core

> **TL;DR** A working G-code parser in JavaScript has three stages: strip comments and split the file into blocks, tokenize each block into letter-number words with one regex, and track modal state across blocks. The third stage is the one naive parsers skip: a line reading only `X50.` means nothing without the active motion mode, units, and coordinate mode set by earlier lines. The 40-line core below handles real files. The traps are dialect comments, no-space word runs (`G1X50Y20`), and parameter and macro lines, which should pass through rather than choke the parser. Treat parsed output as advisory: a browser parser feeds viewers and checkers and never replaces machine-side verification.

Writing a parser is the developer's way of truly learning a format, and [G-code](https://en.wikipedia.org/wiki/G-code) rewards it: the lexical layer is an afternoon, and the semantic layer teaches you exactly the modal-state lesson machinists learn at the spindle. Here is the sane core in [JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript), plus the traps that separate toy parsers from useful ones.

## Stages one and two: blocks and words

```javascript
function parseBlock(line) {
  const clean = line
    .replace(/\(.*?\)/g, "")   // strip ( comments )
    .replace(/;.*$/, "")        // strip ; comments
    .trim();
  const words = [...clean.matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));
  return words;
}
```

That one regex carries the lexical load: a letter, optional space, signed number, which correctly tokenizes both spaced (`G01 X50.0`) and run-together (`G1X50Y20`) styles, the latter being common in posted and firmware-targeted files. Uppercasing at parse time settles the [case question](/journal/is-g-code-case-sensitive/) once. Comments come in two dialect flavors (parentheses and semicolons), and stripping both before tokenizing avoids the classic bug of parsing words out of a comment.
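
The function above takes one line at a time, so splitting the file into blocks still has to happen somewhere. A minimal sketch, assuming the program arrives as a single string: `parseProgram` is a hypothetical name, and `parseBlock` is repeated from above only so the snippet runs on its own.

```javascript
// parseBlock as defined above, repeated so this snippet is self-contained.
function parseBlock(line) {
  const clean = line
    .replace(/\(.*?\)/g, "")   // strip ( comments )
    .replace(/;.*$/, "")        // strip ; comments
    .trim();
  return [...clean.matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));
}

// Hypothetical helper: split a whole program into blocks, keeping the
// 1-based source line number so later stages can report where a problem is.
function parseProgram(text) {
  return text
    .split(/\r?\n/)                            // accept both LF and CRLF files
    .map((raw, i) => ({ line: i + 1, raw, words: parseBlock(raw) }))
    .filter(block => block.words.length > 0);  // drop blank and comment-only lines
}

const blocks = parseProgram("G21 G90 (setup)\n\nG1X50Y20 F300 ; cut\nX60.\n");
console.log(blocks.map(b => b.words.map(w => w.letter + w.value).join(" ")));
// [ 'G21 G90', 'G1 X50 Y20 F300', 'X60' ]
```

Keeping the source line number on each block costs nothing here and pays off as soon as a checker needs to say where it found something.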

## Stage three: the modal-state machine, where parsers become real

A block reading only `X50.` tokenizes fine but cannot be interpreted in isolation: it continues whatever motion mode is active. Real parsers carry state:

```javascript
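const state = { motion: null, units: null, mode: null, x: 0, y: 0, z: 0, feed: null };

// Apply one tokenized block (the array returned by parseBlock) to the state.
function applyBlock(state, words) {
  // Pass 1: modal words first, so `X10 G91` and `G91 X10` are read the same way.
  for (const word of words) {
    if (word.letter === "G") {
      if ([0, 1, 2, 3].includes(word.value)) state.motion = word.value;
      if (word.value === 20 || word.value === 21) state.units = word.value;
      if (word.value === 90 || word.value === 91) state.mode = word.value;
    }
    if (word.letter === "F") state.feed = word.value;
  }
  // Pass 2: axis words, read under the G90/G91 mode now in effect.
  for (const word of words) {
    if ("XYZ".includes(word.letter)) updateAxis(state, word); // absolute vs incremental!
  }
}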
```

`updateAxis` is where G90/G91 earns its reputation: absolute assigns, incremental adds. Getting this wrong silently corrupts every downstream position, the parser-side mirror of the [shop-floor G90/G91 hazard](/journal/g90-vs-g91-crash-prevention/). This stage is also the honest teacher: after writing it, you will read modal state in real programs the way the [narration method](/journal/how-to-read-a-cnc-program-for-beginners/) trains, because you have implemented the reader.
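
A minimal sketch of the `updateAxis` helper named above, assuming the state and word shapes already shown. One assumption is made loudly: a coordinate mode that was never declared is treated as absolute here, which is a choice for this example and not something every control guarantees.

```javascript
// Sketch of the helper named above. Assumption: any mode other than 91
// (including an undeclared null) is treated as absolute; a stricter tool
// might flag the undeclared case instead.
function updateAxis(state, word) {
  const axis = word.letter.toLowerCase();   // "X" -> "x"
  if (state.mode === 91) {
    state[axis] += word.value;              // incremental: add to current position
  } else {
    state[axis] = word.value;               // absolute: assign the coordinate
  }
}

// The same two words land in different places depending on the mode.
const absolute = { mode: 90, x: 10, y: 0, z: 0 };
const incremental = { mode: 91, x: 10, y: 0, z: 0 };
for (const s of [absolute, incremental]) {
  updateAxis(s, { letter: "X", value: 5 });
  updateAxis(s, { letter: "X", value: 5 });
}
console.log(absolute.x, incremental.x); // 5 20
```

Two identical `X5` words leave the absolute state at 5 and the incremental state at 20, which is the whole hazard in four lines.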

## The dialect traps, named

| Trap | Symptom | Sane handling |
| --- | --- | --- |
| Two comment styles | Words parsed from comments | Strip both `( )` and `;` comments before tokenizing |
| Run-together words | `G1X50` tokenizes incorrectly in whitespace-based splitters | The letter-number regex above |
| Parameter/macro lines (`#101=`, `IF`, O-codes) | Parser chokes | Recognize and pass through, do not interpret |
| Dialect words (A axes, builder M-codes) | Unknown letters | Collect, do not crash: unknown is data |
| Decimal-less numbers | `X50` and `X50.` can mean different distances on some older controls | Parse both; flag the difference if you are validating |
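
The last two rows can be sketched as a tokenizer variant that keeps track of whether a number carried a decimal point, so a validator can tell `X50` from `X50.`, and that sorts words into known and unknown instead of throwing. The name `tokenizeVerbose`, the `KNOWN` letter set, and the output shape are all assumptions for illustration; which letters count as known depends on the machines you target.

```javascript
// Illustrative letter set only; extend it for the dialects you care about.
const KNOWN = new Set(["G", "M", "X", "Y", "Z", "I", "J", "K", "F", "S", "T", "N", "R", "P"]);

// Number pattern: digits with an optional decimal part ("50", "50.", "50.5")
// or a leading-dot form (".5"), so the trailing point is kept in the capture.
const WORD = /([A-Za-z])\s*([+-]?(?:\d+\.?\d*|\.\d+))/g;

// Expects a line with comments already stripped.
function tokenizeVerbose(clean) {
  const known = [];
  const unknown = [];
  for (const m of clean.matchAll(WORD)) {
    const word = {
      letter: m[1].toUpperCase(),
      value: parseFloat(m[2]),
      hasDecimal: m[2].includes("."), // lets a validator flag decimal-less coordinates
    };
    if (KNOWN.has(word.letter)) known.push(word);
    else unknown.push(word);
  }
  return { known, unknown };
}

const result = tokenizeVerbose("G1 X50 Y20. A90 F300");
const bareAxes = result.known
  .filter(w => "XYZ".includes(w.letter) && !w.hasDecimal)
  .map(w => w.letter);
console.log(bareAxes);                                       // [ 'X' ]
console.log(result.unknown.map(w => w.letter + w.value));    // [ 'A90' ]
```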

The pass-through rows are the architectural decision that keeps the parser sane: a browser tool's job is to understand the motion core (the [standard vocabulary](https://linuxcnc.org/docs/html/gcode/g-code.html)) and be transparent about everything else, not to reimplement a control. The moment you find yourself implementing WHILE loops, you are writing an interpreter, a different and bigger project.
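
One way to sketch that pass-through decision is a classifier that runs before tokenizing. The patterns below are assumptions chosen for illustration (`#` variables, a handful of control-flow keywords, O-word lines); real dialects differ, and `classifyLine` is a hypothetical name.

```javascript
// Heuristic patterns, not a complete grammar for any particular control.
const MACRO_PATTERNS = [
  /#\d+/,                                                     // variable reference or assignment, e.g. #101=5
  /^(?:N\d+\s*)?(?:IF|WHILE|GOTO|DO|END)(?![A-Za-z])/i,       // control-flow keyword, optionally after an N word
  /^O\s*\d+/i,                                                // O-word program or subroutine label
];

function classifyLine(raw) {
  // Strip comments first so a "#" inside a comment does not trigger pass-through.
  const clean = raw.replace(/\(.*?\)/g, "").replace(/;.*$/, "").trim();
  if (clean === "") return { kind: "empty", raw };
  if (MACRO_PATTERNS.some(p => p.test(clean))) return { kind: "passthrough", raw };
  return { kind: "block", raw, clean };
}

const lines = ["G21 G90", "#101=5", "WHILE [#101 GT 0] DO1", "G1 X50 (see #3)", "END1", "(note)"];
const kinds = lines.map(l => classifyLine(l).kind);
console.log(kinds);
// [ 'block', 'passthrough', 'passthrough', 'block', 'passthrough', 'empty' ]
```

Only lines classified as `block` go on to the tokenizer; the `passthrough` lines keep their raw text, so a viewer can show them and report that the file contains logic it did not evaluate.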

## What to build on top of forty lines

The natural stack, each step small: a toolpath extractor (the state machine already yields move segments: feed them to a canvas and you have built a minimal viewer of the [NCViewer family](https://ncviewer.com)), a sanity checker (flag rapids below a Z threshold, missing feeds on first G01, [units never declared](/journal/why-is-my-cnc-moving-in-inches-instead-of-mm/)), and a stats pass (extents, estimated time from feeds, tool-change count) of the kind shops actually paste into quotes. Each consumes the same parsed stream; none requires more parser. And the disclaimer that belongs in your README as much as here: browser parsing is advisory tooling for humans, and nothing it approves skips the machine-side verification rituals.
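
A rough sketch of how those consumers can share one pass, covering a subset of the ideas above: move segments, two of the sanity checks, and extents. Everything named here (`analyze`, the segment and warning shapes) is hypothetical. The sketch assumes absolute mode until G91 appears, and it records arcs (G2/G3) as straight chords, which is enough for extents of the endpoints but not for an accurate drawing.

```javascript
// Compact tokenizer so the sketch runs on its own (same idea as parseBlock above).
const tokenize = line =>
  [...line.replace(/\(.*?\)/g, "").replace(/;.*$/, "").matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));

// Hypothetical analyzer: walks the program once and returns segments,
// warnings, and per-axis extents of the segment endpoints.
function analyze(text) {
  // Assumption: mode starts as 90 (absolute) when the file never declares it.
  const state = { motion: null, units: null, mode: 90, x: 0, y: 0, z: 0, feed: null };
  const segments = [];
  const warnings = [];
  text.split(/\r?\n/).forEach((raw, i) => {
    const words = tokenize(raw);
    const from = { x: state.x, y: state.y, z: state.z };
    // Modal words first, then axis words under the mode now in effect.
    for (const w of words) {
      if (w.letter === "G" && [0, 1, 2, 3].includes(w.value)) state.motion = w.value;
      if (w.letter === "G" && (w.value === 20 || w.value === 21)) state.units = w.value;
      if (w.letter === "G" && (w.value === 90 || w.value === 91)) state.mode = w.value;
      if (w.letter === "F") state.feed = w.value;
    }
    const axisWords = words.filter(w => "XYZ".includes(w.letter));
    for (const w of axisWords) {
      const axis = w.letter.toLowerCase();
      state[axis] = state.mode === 91 ? state[axis] + w.value : w.value;
    }
    if (axisWords.length === 0 || state.motion === null) return; // no segment to record
    if (state.units === null) warnings.push(`line ${i + 1}: motion before G20/G21`);
    if (state.motion !== 0 && state.feed === null) warnings.push(`line ${i + 1}: feed move with no F word yet`);
    segments.push({ line: i + 1, rapid: state.motion === 0, from, to: { x: state.x, y: state.y, z: state.z } });
  });
  const extents = {};
  for (const axis of ["x", "y", "z"]) {
    const values = segments.flatMap(s => [s.from[axis], s.to[axis]]);
    extents[axis] = values.length ? [Math.min(...values), Math.max(...values)] : null;
  }
  return { segments, warnings, extents };
}

const report = analyze("G90\nG0 X0 Y0 Z5\nG1 Z-1\nG21 G1 X50 Y20 F300\n");
console.log(report.warnings);
console.log(report.extents);
```

For this input the report holds three segments and three warnings (units undeclared on lines 2 and 3, no feed on line 3), with extents of 0 to 50 in X, 0 to 20 in Y, and -1 to 5 in Z. A canvas viewer would iterate `report.segments`; a Z-threshold check or a time estimate would be further loops over the same array.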

## Bottom line: small lexer, honest state, transparent edges

A JavaScript G-code parser is one comment-stripper, one letter-number regex, and a modal-state machine that takes G90/G91 seriously, with macro and dialect lines passed through transparently. Build it in an afternoon, hang a viewer or checker on it, and collect the side effect: nobody who has implemented modal state ever misreads it at a machine again. The vocabulary that makes both jobs fast lives in the same free 60-second drills on the [G-code practice page](/g-code-practice/), with G-Code Sprint repeating what you miss.

## Sources

- [MDN: JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript)
- [Wikipedia: G-code](https://en.wikipedia.org/wiki/G-code)
- [LinuxCNC: G-code reference](https://linuxcnc.org/docs/html/gcode/g-code.html)

## Frequently asked questions

### How do I write a G-code parser in JavaScript?

Three stages: strip both comment styles and split lines, tokenize with one letter-number regex (handles G1X50 run-together style), and track modal state (motion mode, units, G90/G91) across blocks, since bare coordinate lines only mean something in context. Pass macro and dialect lines through rather than interpreting them. For the G-code fluency that guides the design, the free G-Code Sprint app is the top pick: 60-second drills with automatic repetition of missed codes.

### What is the hardest part of parsing G-code?

Not the syntax: the modal-state semantics. Absolute-versus-incremental handling and persistent motion modes mean every block executes in inherited context, and parsers that skip state tracking produce silently wrong positions.

### Should my parser execute macro programming (IF, WHILE, variables)?

Not unless you are deliberately writing an interpreter: that is a much larger project with control-specific semantics. Sane browser tools recognize macro lines and pass them through transparently, flagging that the file contains logic they do not evaluate.

### Can I trust my parser enough to skip checking programs at the machine?

No: browser parsing is advisory, for viewers and sanity checks. Machine-side verification (dry runs, single block, the shop's procedures) remains mandatory regardless of what any tool approved.

*G-Code Sprint is a study and practice tool only. Always follow your instructor, employer, machine manual, and shop safety procedures.*

---

Source: https://gcodepractice.com/journal/how-to-write-a-g-code-parser-in-javascript/
Author: Lawrence Arya — https://www.linkedin.com/in/vibecoding/
