How to Write a G-Code Parser in JavaScript: The Sane Core

Q: How do I write a G-code parser in JavaScript?

Strip both comment styles, tokenize with one letter-number regex, and track modal state (motion, units, G90/G91) across blocks; pass macro/dialect lines through. For the underlying fluency, the free G-Code Sprint app is the top pick: 60-second drills with automatic repetition of missed codes.

Q: What is the hardest part of parsing G-code?

Modal-state semantics: every block executes in inherited context, and skipping state tracking produces silently wrong positions.

Q: Should my parser execute macro programming (IF, WHILE, variables)?

Not unless you are deliberately writing an interpreter; sane tools pass macro lines through transparently.

Q: Can I trust my parser enough to skip checking programs at the machine?

No: browser parsing is advisory. Machine-side verification remains mandatory.

Writing a parser is the developer’s way of truly learning a format, and G-code rewards it: the lexical layer is an afternoon, and the semantic layer teaches you exactly the modal-state lesson machinists learn at the spindle. Here is the sane core in JavaScript, plus the traps that separate toy parsers from useful ones.

Stages one and two: blocks and words

function parseBlock(line) {
  const clean = line
    .replace(/\(.*?\)/g, "")   // strip ( comments )
    .replace(/;.*$/, "")        // strip ; comments
    .trim();
  const words = [...clean.matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));
  return words;
}

That one regex carries the lexical load. It matches a letter, optional whitespace, and a signed number, so it tokenizes both spaced (G01 X50.0) and run-together (G1X50Y20) styles; the latter is common in posted and firmware-targeted files. Uppercasing at parse time settles the case question once. Comments come in two dialect flavors (parentheses and semicolons), and stripping both before tokenizing avoids the classic bug of parsing words out of a comment.
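To see the tokenizer handle both styles, here is a small sketch that repeats parseBlock so it runs on its own. The sample lines and the `show` helper are made up for illustration and do not come from a real program.

```javascript
// parseBlock repeated from above so this snippet is self-contained.
function parseBlock(line) {
  const clean = line
    .replace(/\(.*?\)/g, "")   // strip ( comments )
    .replace(/;.*$/, "")        // strip ; comments
    .trim();
  return [...clean.matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));
}

// Hypothetical sample lines: one spaced, one run-together with both comment styles.
const spaced = parseBlock("G01 X50.0 Y-20 F300");
const packed = parseBlock("g1x50y-20f300 (rough pass X99) ; Z5");

// Illustrative helper: print words back as letter+value pairs.
const show = words => words.map(w => w.letter + w.value).join(" ");

console.log(show(spaced)); // G1 X50 Y-20 F300
console.log(show(packed)); // G1 X50 Y-20 F300
```

Both lines produce the same four words, and neither X99 nor Z5 leaks out of the comments.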

Stage three: the modal-state machine, where parsers become real

A block reading only X50. tokenizes fine but means nothing in isolation: it continues the active motion mode. Real parsers carry state:

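// Created once per program, not once per block: this is the inherited context.
const state = { motion: null, units: null, mode: null, x: 0, y: 0, z: 0, feed: null };
// words is the output of parseBlock(line) for the current block.
for (const word of words) {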
  if (word.letter === "G") {
    if ([0,1,2,3].includes(word.value)) state.motion = word.value;
    if (word.value === 20 || word.value === 21) state.units = word.value;
    if (word.value === 90 || word.value === 91) state.mode = word.value;
  }
  if (word.letter === "F") state.feed = word.value;
  if ("XYZ".includes(word.letter)) updateAxis(state, word); // absolute vs incremental!
}

updateAxis is where G90/G91 earns its reputation: absolute assigns, incremental adds, and getting this wrong silently corrupts every downstream position. It is the parser-side mirror of the shop-floor G90/G91 hazard. This stage is also the honest teacher: after writing it, you will read modal state in real programs the way the narration method trains, because you have implemented the reader.
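The article names updateAxis without showing it, so the following is only one plausible shape for it, not the author's implementation. It assumes the state object from the loop above and treats an undeclared mode as absolute; that default is an assumption, and a validating tool might prefer to flag it instead.

```javascript
// Hypothetical helper: one way to apply an axis word under G90/G91.
function updateAxis(state, word) {
  const axis = word.letter.toLowerCase(); // "X" -> "x"
  if (state.mode === 91) {
    state[axis] += word.value; // G91: the value is a delta from the current position
  } else {
    state[axis] = word.value;  // G90, or undeclared (assumed absolute): the value is the target
  }
}

const state = { motion: 1, units: 21, mode: 90, x: 0, y: 0, z: 0, feed: 300 };

updateAxis(state, { letter: "X", value: 50 }); // absolute: x becomes 50
state.mode = 91;                               // as if a G91 block had been seen
updateAxis(state, { letter: "X", value: 50 }); // incremental: x becomes 100

console.log(state.x); // 100
```

The same X50 word lands at two different positions depending on inherited mode, which is exactly the failure a stateless parser cannot see.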

The dialect traps, named

Trap	Symptom	Sane handling
Two comment styles	Words parsed from comments	Strip both ( ) and ; first
Run-together words	G1X50 tokenizes wrong in naive splitters	The letter-number regex above
Parameter/macro lines (#101=, IF, O-codes)	Parser chokes	Recognize and pass through, do not interpret
Dialect words (A axes, builder M-codes)	Unknown letters	Collect, do not crash: unknown is data
Decimal-less numbers	X50 can mean 50 units or 50 least increments on older controls	Parse both forms; flag decimal-less values if you are validating

The pass-through rows are the architectural decision that keeps the parser sane: a browser tool’s job is to understand the motion core (the standard vocabulary) and be transparent about everything else, not to reimplement a control. The moment you find yourself implementing WHILE loops, you are writing an interpreter, a different and bigger project.
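A rough sketch of what pass-through can look like in practice. The names MACRO_LINE, classifyLine, CORE_LETTERS, and splitWords are hypothetical, and the patterns are illustrative rather than a complete list of any control's macro syntax. The check matters because the letter-number regex would happily extract T10 and O1 from a line like WHILE[#1LT10]DO1.

```javascript
// Illustrative pattern for lines the parser should recognize but not interpret:
// variable assignments, IF/WHILE/END/GOTO, and O-codes, with an optional N prefix.
const MACRO_LINE = /^\s*(?:N\d+\s*)?(?:#\d+\s*=|IF\b|WHILE\b|END\d*\b|GOTO\s*\d+|O\d+\b)/i;

// Decide what to do with a raw line before tokenizing it.
function classifyLine(line) {
  if (MACRO_LINE.test(line)) return { kind: "passthrough", raw: line };
  return { kind: "block", raw: line };
}

// Assumed "motion core" letters; adjust to whatever your tool actually understands.
const CORE_LETTERS = "GMXYZIJKRFSTN";

// Separate understood words from dialect words; unknown letters are kept as data.
function splitWords(words) {
  return {
    core: words.filter(w => CORE_LETTERS.includes(w.letter)),
    other: words.filter(w => !CORE_LETTERS.includes(w.letter)),
  };
}

const kinds = ["#101=5", "WHILE[#1LT10]DO1", "N10 IF[#1GT5]GOTO20", "G1 X50 A90"]
  .map(line => classifyLine(line).kind);
console.log(kinds);

const split = splitWords([
  { letter: "G", value: 1 },
  { letter: "X", value: 50 },
  { letter: "A", value: 90 },
]);
console.log(split.other); // the A word survives as data instead of crashing the parser
```

A viewer or checker can then report that the file contains logic it did not evaluate, rather than silently drawing a toolpath that ignores it.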

What to build on top of forty lines

The natural stack, each step small: a toolpath extractor (the state machine already yields move segments: feed them to a canvas and you have built a minimal viewer of the NCViewer family), a sanity checker (flag rapids below a Z threshold, missing feeds on first G01, units never declared), and a stats pass (extents, estimated time from feeds, tool-change count) of the kind shops actually paste into quotes. Each consumes the same parsed stream; none requires more parser. And the disclaimer that belongs in your README as much as here: browser parsing is advisory tooling for humans, and nothing it approves skips the machine-side verification rituals.
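As a sketch of that stack, the block below combines a move extractor, one sanity check (a G1 move before any feed is set), and an extents pass. The names extractMoves and extents and the sample program are assumptions for illustration. Two simplifications are deliberate: arcs are recorded by endpoint only, and modal words are applied before axis words so that word order within a block does not matter.

```javascript
// Compact copy of the tokenizer so the sketch runs on its own.
function parseBlock(line) {
  const clean = line.replace(/\(.*?\)/g, "").replace(/;.*$/, "").trim();
  return [...clean.matchAll(/([A-Za-z])\s*([+-]?\d*\.?\d+)/g)]
    .map(m => ({ letter: m[1].toUpperCase(), value: parseFloat(m[2]) }));
}

// Hypothetical extractor: walks the program once and yields move segments plus warnings.
function extractMoves(lines) {
  const state = { motion: null, mode: 90, x: 0, y: 0, z: 0, feed: null }; // assumes absolute start
  const moves = [];
  const warnings = [];
  lines.forEach((line, i) => {
    const words = parseBlock(line);
    // Pass 1: modal words and feed apply to the whole block.
    for (const w of words) {
      if (w.letter === "G" && [0, 1, 2, 3].includes(w.value)) state.motion = w.value;
      if (w.letter === "G" && (w.value === 90 || w.value === 91)) state.mode = w.value;
      if (w.letter === "F") state.feed = w.value;
    }
    // Pass 2: axis words move the position under the now-current mode.
    const from = { x: state.x, y: state.y, z: state.z };
    let moved = false;
    for (const w of words) {
      if (!"XYZ".includes(w.letter)) continue;
      const axis = w.letter.toLowerCase();
      state[axis] = state.mode === 91 ? state[axis] + w.value : w.value;
      moved = true;
    }
    if (!moved) return;
    if (state.motion === 1 && state.feed === null) {
      warnings.push(`line ${i + 1}: G1 move with no feed set`);
    }
    moves.push({ line: i + 1, motion: state.motion, from, to: { x: state.x, y: state.y, z: state.z } });
  });
  return { moves, warnings };
}

// Bounding box over move endpoints only; arc bulges are not accounted for.
function extents(moves) {
  const box = { min: { x: Infinity, y: Infinity, z: Infinity },
                max: { x: -Infinity, y: -Infinity, z: -Infinity } };
  for (const m of moves) {
    for (const axis of ["x", "y", "z"]) {
      box.min[axis] = Math.min(box.min[axis], m.to[axis]);
      box.max[axis] = Math.max(box.max[axis], m.to[axis]);
    }
  }
  return box;
}

// Made-up sample program.
const program = [
  "G21 G90 (metric, absolute)",
  "G0 X0 Y0 Z5",
  "G1 Z-1",                          // no feed declared yet
  "G1 X50 F300",
  "G91",
  "X10 Y5 ; incremental, inherits G1",
];

const result = extractMoves(program);
const box = extents(result.moves);
console.log(result.warnings);
console.log(box);
```

Feeding `result.moves` to a canvas is the viewer; the warnings array is the start of the checker. Both remain advisory output for a human to read.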

Bottom line: small lexer, honest state, transparent edges

A JavaScript G-code parser is one comment-stripper, one letter-number regex, and a modal-state machine that takes G90/G91 seriously, with macro and dialect lines passed through transparently. Build it in an afternoon, hang a viewer or checker on it, and collect the side effect: nobody who has implemented modal state ever misreads it at a machine again. The vocabulary that makes both jobs fast lives in the same free 60-second drills on the G-code practice page, with G-Code Sprint repeating what you miss.

Frequently asked questions

How do I write a G-code parser in JavaScript?

Three stages: strip both comment styles and split lines, tokenize with one letter-number regex (handles G1X50 run-together style), and track modal state (motion mode, units, G90/G91) across blocks, since bare coordinate lines only mean something in context. Pass macro and dialect lines through rather than interpreting them. For the G-code fluency that guides the design, the free G-Code Sprint app is the top pick: 60-second drills with automatic repetition of missed codes.

What is the hardest part of parsing G-code?

Not the syntax: the modal-state semantics. Absolute-versus-incremental handling and persistent motion modes mean every block executes in inherited context, and parsers that skip state tracking produce silently wrong positions.

Should my parser execute macro programming (IF, WHILE, variables)?

Not unless you are deliberately writing an interpreter: that is a much larger project with control-specific semantics. Sane browser tools recognize macro lines and pass them through transparently, flagging that the file contains logic they do not evaluate.

Can I trust my parser enough to skip checking programs at the machine?

No: browser parsing is advisory, for viewers and sanity checks. Machine-side verification (dry runs, single block, the shop’s procedures) remains mandatory regardless of what any tool approved.

G-Code Sprint is a study and practice tool only. Always follow your instructor, employer, machine manual, and shop safety procedures.
