Comrak

Build status CommonMark: 652/652 GFM: 670/670 crates.io version docs.rs

Rust port of GitHub’s cmark-gfm.

Compliant with CommonMark 0.31.2 in default mode. GFM support synced with release 0.29.0.gfm.13.

Installation

Specify it as a requirement in Cargo.toml:

[dependencies]
comrak = "0.41"

Comrak’s library supports Rust 1.65+.

CLI

  • Anywhere with a Rust toolchain:
    • cargo install comrak
  • Many Unix distributions:
    • pacman -S comrak
    • brew install comrak
    • dnf install comrak
    • nix run nixpkgs#comrak

You can also find builds I’ve published in GitHub Releases, but they’re limited to machines I have access to at the time of making them! webinstall.dev offers curl | shell-style installation of the latest of these for your OS.

Usage

The CLI --help output:

$ comrak --help
A 100% CommonMark-compatible GitHub Flavored Markdown parser and formatter

Usage: comrak [OPTIONS] [FILE]...

Arguments:
  [FILE]...
          CommonMark file(s) to parse; or standard input if none passed

Options:
  -c, --config-file <PATH>
          Path to config file containing command-line arguments, or 'none'
          [default: /home/runner/.config/comrak/config]
  -i, --inplace
          To perform an in-place formatting
      --hardbreaks
          Treat newlines as hard line breaks
      --smart
          Use smart punctuation
      --github-pre-lang
          Use GitHub-style <pre lang> for code blocks
      --full-info-string
          Enable full info strings for code blocks
      --gfm
          Enable GitHub-flavored markdown extensions: strikethrough, tagfilter, table, autolink, and
          tasklist. Also enables --github-pre-lang and --gfm-quirks
      --gfm-quirks
          Enables GFM-style quirks in output HTML, such as not nesting <strong> tags, which
          otherwise breaks CommonMark compatibility
      --relaxed-tasklist-character
          Enable relaxing which character is allowed in a tasklists
      --relaxed-autolinks
          Enable relaxing of autolink parsing, allow links to be recognized when in brackets and
          allow all url schemes
      --tasklist-classes
          Output classes on tasklist elements so that they can be styled with CSS
      --default-info-string <INFO>
          Default value for fenced code block's info strings if none is given
      --unsafe
          Allow raw HTML and dangerous URLs
      --gemojis
          Translate gemojis into UTF-8 characters
      --escape
          Escape raw HTML instead of clobbering it
      --escaped-char-spans
          Wrap escaped characters in span tags
  -e, --extension <EXTENSION>
          Specify extension name(s) to use
          Multiple extensions can be delimited with ",", e.g. --extension strikethrough,table
          [possible values: strikethrough, tagfilter, table, autolink, tasklist, superscript,
          footnotes, description-lists, multiline-block-quotes, math-dollars, math-code,
          wikilinks-title-after-pipe, wikilinks-title-before-pipe, underline, subscript, spoiler,
          greentext, alerts]
  -t, --to <FORMAT>
          Specify output format
          [default: html]
          [possible values: html, xml, commonmark]
  -o, --output <FILE>
          Write output to FILE instead of stdout
      --width <WIDTH>
          Specify wrap width (0 = nowrap)
          [default: 0]
      --header-ids <PREFIX>
          Use the Comrak header IDs extension, with the given ID prefix
      --front-matter-delimiter <DELIMITER>
          Ignore front-matter that starts and ends with the given string
      --syntax-highlighting <THEME>
          Syntax highlighting for codefence blocks. Choose a theme or 'none' for disabling
          [default: base16-ocean.dark]
      --list-style <LIST_STYLE>
          Specify bullet character for lists (-, +, *) in CommonMark output
          [default: dash]
          [possible values: dash, plus, star]
      --sourcepos
          Include source position attribute in HTML and XML output
      --ignore-setext
          Ignore setext headers
      --ignore-empty-links
          Ignore empty links
      --experimental-minimize-commonmark
          Minimize escapes in CommonMark output using a trial-and-error algorithm
  -h, --help
          Print help information (use `-h` for a summary)
  -V, --version
          Print version information

By default, Comrak will attempt to read command-line options from a config file specified by
--config-file. This behaviour can be disabled by passing --config-file none. It is not an error if
the file does not exist.

And there’s a Rust interface. You can use comrak::markdown_to_html directly:

use comrak::{markdown_to_html, Options};

assert_eq!(markdown_to_html("Hello, **世界**!", &Options::default()),
           "<p>Hello, <strong>世界</strong>!</p>\n");

Or you can parse the input into an AST yourself, manipulate it, and then use your desired formatter:

use comrak::nodes::NodeValue;
use comrak::{format_html, parse_document, Arena, Options};

fn replace_text(document: &str, orig_string: &str, replacement: &str) -> String {
    // The returned nodes are created in the supplied Arena, and are bound by its lifetime.
    let arena = Arena::new();

    // Parse the document into a root `AstNode`
    let root = parse_document(&arena, document, &Options::default());

    // Iterate over all the descendants of root.
    for node in root.descendants() {
        if let NodeValue::Text(ref mut text) = node.data.borrow_mut().value {
            // If the node is a text node, perform the string replacement.
            *text = text.replace(orig_string, replacement);
        }
    }

    // Render the modified AST to HTML.
    let mut html = vec![];
    format_html(root, &Options::default(), &mut html).unwrap();

    String::from_utf8(html).unwrap()
}

fn main() {
    let doc = "This is my input.\n\n1. Also [my](#) input.\n2. Certainly *my* input.\n";
    let orig = "my";
    let repl = "your";
    let html = replace_text(doc, orig, repl);

    println!("{}", html);
    // Output:
    //
    // <p>This is your input.</p>
    // <ol>
    // <li>Also <a href="#">your</a> input.</li>
    // <li>Certainly <em>your</em> input.</li>
    // </ol>
}

For a slightly more real-world example, see how I generate my GitHub user README from a base document with embedded YAML, which itself has embedded Markdown, or check out some of Comrak’s dependents on crates.io or on GitHub.

Security

As with cmark and cmark-gfm, Comrak will scrub raw HTML and potentially dangerous links. This change was introduced in Comrak 0.4.0 in support of a safe-by-default posture, and later adopted by our contemporaries. :)

To allow these, use the unsafe_ option (or --unsafe with the command line program). If doing so, we recommend the use of a sanitisation library like ammonia configured specific to your needs.

Extensions

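Comrak supports the five extensions to CommonMark defined in the GitHub Flavored Markdown Spec (named here with the identifiers the CLI uses for them):

  • Tables (table)
  • Task list items (tasklist)
  • Strikethrough (strikethrough)
  • Autolinks (autolink)
  • Disallowed Raw HTML (tagfilter)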

Comrak additionally supports its own extensions, which are yet to be specced out (PRs welcome!):

  • Superscript
  • Header IDs
  • Footnotes
  • Description lists
  • Front matter
  • Multi-line blockquotes
  • Math
  • Emoji shortcodes
  • Wikilinks
  • Underline
  • Spoiler text
  • “Greentext”
  • CJK friendly emphasis

By default none are enabled; they are individually enabled with each parse by setting the appropriate values in the ExtensionOptions struct.

Plugins

Fenced code block syntax highlighting

You can provide your own syntax highlighting engine.

Create an implementation of the SyntaxHighlighterAdapter trait, and then provide an instance of that adapter to Plugins.render.codefence_syntax_highlighter. To format a Markdown document with plugins, use the markdown_to_html_with_plugins function, which accepts your plugins object as a parameter.

See the syntax_highlighter.rs and syntect.rs examples for more details.

Syntect

syntect is a syntax highlighting library for Rust. By default, comrak offers a plugin for it. In order to utilize it, create an instance of plugins::syntect::SyntectAdapter and use it in your Plugins option.

Related projects

Comrak’s design goal is to model the upstream cmark-gfm as closely as possible in terms of code structure. The upside of this is that a change in cmark-gfm has a very predictable change in Comrak. Likewise, any bug in cmark-gfm is likely to be reproduced in Comrak. This could be considered a pro or a con, depending on your use case.

The downside, of course, is that the code often diverges from idiomatic Rust, especially in the AST’s extensive use of RefCell, and while contributors have made it as fast as possible, it simply won’t be as fast as some other CommonMark parsers, depending on your use case. Here are some other projects to consider:

  • Raph Levien’s pulldown-cmark. It’s very fast, uses a novel parsing algorithm, and doesn’t construct an AST (but you can use it to make one if you want). cargo doc uses this, as do many other projects in the ecosystem.
  • markdown-rs (1.x) looks worth watching.
  • Know of another library? Please open a PR to add it!

As far as I know, Comrak is the only library to implement all of the GitHub Flavored Markdown extensions rigorously.

Elixir bindings

  • mdex - Elixir bindings for this library built with Rustler. Available on Hex as mdex.

Python bindings

  • comrak (Python package) - Python bindings for this library built with PyO3. Available on PyPI as comrak, benchmarked at 15-60x faster than pure Python alternatives.

Ruby bindings

Benchmarking

You’ll need to install hyperfine, and CMake if you want to compare against cmark-gfm.

If you want to just run the benchmark for the comrak binary itself, run:

make bench-comrak

This will build Comrak in release mode and run the benchmark on it. You will see the time measurements as reported by hyperfine in the console.

The Makefile also provides a way to run benchmarks for Comrak’s current state (with your changes), Comrak’s main branch, cmark-gfm, pulldown-cmark and markdown-it.rs. You’ll need CMake, and you’ll need to ensure submodules are prepared.

make bench-all

This will build and run benchmarks across all, and report the time taken by each as well as relative time.

Contributing

Contributions are highly encouraged; if you’d like to assist, consider checking out the good first issue label! I’m happy to help provide direction and guidance throughout, even if (especially if!) you’re new to Rust or open source.

Where possible I practice Optimistic Merging as described by Peter Hintjens. Please keep the code of conduct in mind too.

Thank you to Comrak’s many contributors for PRs and issues opened!

Code Contributors

Small chart showing Comrak contributors.

Financial Contributors

Become a financial contributor and help sustain Comrak’s development. I’m self-employed; open-source software relies on the collective.

Contact

Asherah Connor

Legal

Copyright (c) 2017–2025, Comrak contributors. Licensed under the 2-Clause BSD License.

cmark itself is copyright (c) 2014, John MacFarlane.

See COPYING for all the details.