tgeng/dotty-parser-combinators


Dotty Parser Combinators

This is a parser combinator library for Dotty. Because Dotty is still under active development, its language features and APIs are unstable. As a result, this library offers no guarantee of backward compatibility between minor releases. The library will also try to keep up with each Dotty release.

This project focuses on simplicity over performance. It has a flexible API and is powerful enough for typical use cases in implementing a programming language. It also has good support for error reporting.

It's not production-ready yet. I created this library because I could not find a parser library that works with Dotty. I use it for my personal projects and will do my best to fix bugs and improve performance along the way. Anyone is welcome to contribute, either by raising issues or sending PRs, and I will try to address them as soon as possible.

The library takes inspiration from parsec and Lightyear.

Get Started

TODO: add this section after this is published.

Usage

Import the following where this library is needed.

import io.github.tgeng.parse._
import io.github.tgeng.parse.string.{_, given _}
  • io.github.tgeng.parse: the core ParserT[I, T] trait and various generic combinators.

  • io.github.tgeng.parse.string: the string parser type Parser[T], which is a type alias for ParserT[Char, T], and other combinators that are specific to parsing strings.

Basics

For simplicity, assume we are parsing strings. Essentially, we want to use this library to build a Parser[T] object that parses a string and outputs a result of type T. For example,

scala> val positiveInt = "[0-9]+".rp.map(_.toInt)
val positiveInt: io.github.tgeng.parse.ParserT[Char, Int] = Parser{/[0-9]+/}

scala> positiveInt.parse("123")
val res0: Either[io.github.tgeng.parse.ParserError[Char], Int] = Right(123)
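
To make the idea of "a Parser[T] that turns a string into a T" more concrete, here is a minimal, self-contained toy model. It is not the library's implementation: the names ToyParserSketch, run, and regex are hypothetical, and the error type is simplified to a plain String. It only illustrates how a regex parser plus map could produce an Either result like the one above.

```scala
// Toy model for illustration only; NOT the library's actual implementation.
object ToyParserSketch {
  // A parser reads `in` starting at `pos` and returns either an error message
  // or the parsed value together with the position after the match.
  final case class Parser[T](run: (String, Int) => Either[String, (T, Int)]) {
    // Transform the value of a successful parse; failures pass through.
    def map[U](f: T => U): Parser[U] =
      Parser[U]((in, pos) => run(in, pos).map { case (t, next) => (f(t), next) })

    // Run from the start of the input and drop the final position.
    // Unlike a real library, this toy ignores any trailing input.
    def parse(in: String): Either[String, T] = run(in, 0).map(_._1)
  }

  // Hypothetical helper: match `pattern` as a prefix at the current position.
  def regex(pattern: String): Parser[String] = {
    val compiled = pattern.r
    Parser[String] { (in, pos) =>
      compiled.findPrefixOf(in.substring(pos)) match {
        case Some(m) => Right((m, pos + m.length))
        case None    => Left(s"$pos: /$pattern/")
      }
    }
  }

  val positiveInt: Parser[Int] = regex("[0-9]+").map(_.toInt)
}
```

In this sketch, ToyParserSketch.positiveInt.parse("123") yields Right(123), while an input that does not start with a digit yields a Left carrying the position and the pattern that was expected.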

JSON parser Example

Assume we have the following data classes representing JSON values.

enum JValue {
  case JNull
  case JBoolean(value: Boolean)
  case JNumber(value: Double)
  case JString(value: String)
  case JArray(value: Vector[JValue])
  case JObject(value: Map[String, JValue])
}

Below is a simple parser that converts a JSON string to a JValue.

  //          ┌ a macro that names the parser according to the enclosing definition. For example, in
  //          | this case, the created parser is named "<jNull>". `S` in `PS` is for strong name,
  //          | which means the `jNull` parser will not report its internals in an error message. To
  //          | allow reporting internals, use `P` instead, as shown below with `jArray`, `jObject`,
  //          | and `jObjectEntry`.
  //          |
  //          |         ┌───── as ────┐
  //          |  ┌───  >> ───┐        │
  val jNull = PS { 'n' >>! "ull" as JNull }
  //                    │        │    │
  //                    │        │    └ give it an intuitive name so the error message is
  //                    │        │      easier to understand
  //                    │        │
  //                    │        └ convert the string parser returning "ull" to a parser returning
  //                    │          `JNull`
  //
  //                    └ throw away result from the first parser, which matches 'n', and return
  //                      the result of the second parser, which, in this case, returns "ull". In
  //                      addition, commit right after seeing 'n' to speed up parsing failure in
  //                      case 'n' is followed by things other than "ull". Without committing, the
  //                      parser would keep trying <jBoolean>, <jNumber>, and so on.
  //
  //

  //                                               ┌ if matching "true" fails, try the following
  //                                               │ to match false
  //
  def jBoolean = ('t' >>! "rue" as JBoolean(true)) |
                 ('f' >>! "alse" as JBoolean(false)) withName "<jBoolean>"
  //                                                 |
  //                                                 └ one could name the parser explicitly like so
  //                                                   without the `P` macro as well

  val jNumber = PS { commitAfter(double.map(JNumber(_))) }
  //
  //                                     └ similar to `as`, but it consumes the result from the
  //                                       double parser

  def jString = PS { quoted().map(JString(_)) }

  def jArray : Parser[JValue] =
    P { '[' >>! (jValue sepBy ',').map(JArray(_)) << commitAfter(']') }
  //
  //                      └ matches `JValue` objects separated by `,` zero or more times and
  //                        returns the matched `JValue`s inside a `Vector`

  val jObjectKey = PS { whitespaces >> quoted() << whitespaces }

  def jObjectEntry : Parser[(String, JValue)] =
    P { lift(jObjectKey << commitAfter(":"), jValue) }
  //
  //     └ combines two parsers `jObjectKey << ":"` and `jValue` and produces a parser that returns a
  //       tuple containing the parsed key string and `JValue` object.

  def jObject : Parser[JValue] = P {
    '{' >>!
    (jObjectEntry sepBy ',').map(c => JObject(c.toMap))
    << commitAfter('}')
  }

  def jValue : Parser[JValue] = P {
    whitespaces >>
    (jNull | jBoolean | jNumber | jString | jArray | jObject)
    << whitespaces
  }

jValue.parse("""[1, "a", {}]""") // JArray([JNumber(1), JString("a"), JObject({})])
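
The comments above describe committing: once 'n' has been seen, a later failure should not fall back to the other alternatives. The following self-contained sketch models that behaviour in a toy parser. It is an assumption-laden illustration, not the library's code: Result, Ok, Fail, andThen, andThenCommit, and literal are hypothetical names, with andThenCommit standing in for the idea behind `>>!`.

```scala
// Toy model of commit semantics; NOT the library's actual implementation.
object CommitSketch {
  sealed trait Result[+T]
  final case class Ok[T](value: T, next: Int) extends Result[T]
  // `committed = true` means alternatives must not be tried after this failure.
  final case class Fail(message: String, committed: Boolean) extends Result[Nothing]

  final case class Parser[T](run: (String, Int) => Result[T]) {
    // Ordered choice: fall back to `other` only on an uncommitted failure.
    def |(other: Parser[T]): Parser[T] = Parser[T] { (in, pos) =>
      run(in, pos) match {
        case Fail(_, false) => other.run(in, pos)
        case result         => result
      }
    }

    // Run `this`, then `next`, keeping the result of `next`. When `commit` is
    // set, a failure of `next` is marked as committed.
    private def sequence[U](next: Parser[U], commit: Boolean): Parser[U] =
      Parser[U] { (in, pos) =>
        run(in, pos) match {
          case Ok(_, p) =>
            next.run(in, p) match {
              case Fail(msg, c) => Fail(msg, c || commit)
              case ok           => ok
            }
          case f: Fail => f
        }
      }

    def andThen[U](next: Parser[U]): Parser[U] = sequence(next, commit = false)
    def andThenCommit[U](next: Parser[U]): Parser[U] = sequence(next, commit = true)

    // Replace the parsed value with a constant.
    def as[U](value: U): Parser[U] = Parser[U] { (in, pos) =>
      run(in, pos) match {
        case Ok(_, p) => Ok(value, p)
        case f: Fail  => f
      }
    }
  }

  def literal(s: String): Parser[String] = Parser[String] { (in, pos) =>
    if (in.startsWith(s, pos)) Ok(s, pos + s.length)
    else Fail(s"$pos: expected '$s'", committed = false)
  }

  // Same grammar twice: once committing after 'n', once not.
  val withCommit: Parser[String] =
    literal("n").andThenCommit(literal("ull")).as("null") | literal("nan").as("nan")
  val withoutCommit: Parser[String] =
    literal("n").andThen(literal("ull")).as("null") | literal("nan").as("nan")
}
```

On the input "nan", withCommit stops with a committed failure at position 1 because 'n' was already accepted, whereas withoutCommit backtracks and matches the second alternative. This is the trade-off the comments describe: committing gives faster, more local failures at the cost of giving up on later alternatives.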

Matching strings

Scala Char, String, and Regex values can be implicitly converted to parsers that match them in the intuitive way. In addition to implicit conversions, the extension methods rp (regex parser) and rpm (regex parser outputting a Match object) are defined on String. They convert the string to a regex parser that outputs the matched String and a scala.util.matching.Regex.Match, respectively.

scala> import scala.language.implicitConversions

scala> val p1 : Parser[String] = "abc"
val p1: io.github.tgeng.parse.string.Parser[String] = Parser{"abc"}

scala> p1.parse("abc")
val res1: Either[io.github.tgeng.parse.ParserError[Char], String] = Right(abc)

scala> p1.parse("def")
val res2: Either[io.github.tgeng.parse.ParserError[Char], String] = Left(0: "abc")

scala> val p2 : Parser[String] = "[0-9]".rp
val p2: io.github.tgeng.parse.string.Parser[String] = Parser{/[0-9]/}

scala> p2.parse("123")
val res3: Either[io.github.tgeng.parse.ParserError[Char], String] = Right(1)

scala> p2.parse("abc")
val res4: Either[io.github.tgeng.parse.ParserError[Char], String] = Left(0: /[0-9]/)

scala> val p3 : Parser[Char] = 'x'
val p3: io.github.tgeng.parse.string.Parser[Char] = Parser{'x'}

scala> p3.parse("x")
val res5: Either[io.github.tgeng.parse.ParserError[Char], Char] = Right(x)

scala> p3.parse("y")
val res6: Either[io.github.tgeng.parse.ParserError[Char], Char] = Left(0: 'x')
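
As a rough picture of how such conversions and extension methods can be wired up, the sketch below defines them for a toy parser type. Everything in it is hypothetical (StringConversionSketch, describe, stringToParser, charToParser, RegexSyntax) and the error type is simplified to a String. Only the standard-library calls Regex.findPrefixOf and Regex.findPrefixMatchOf are real; the library itself may be implemented quite differently.

```scala
import scala.language.implicitConversions
import scala.util.matching.Regex

// Toy model for illustration only; NOT the library's actual implementation.
object StringConversionSketch {
  // `describe` is what gets shown in the error when the parser fails.
  final case class Parser[T](describe: String, run: (String, Int) => Option[(T, Int)]) {
    // This toy always starts at position 0, so failures are reported there.
    def parse(in: String): Either[String, T] =
      run(in, 0).map(_._1).toRight(s"0: $describe")
  }

  // A String becomes a parser matching exactly that string.
  implicit def stringToParser(s: String): Parser[String] =
    Parser[String]("\"" + s + "\"", (in, pos) =>
      if (in.startsWith(s, pos)) Some((s, pos + s.length)) else None)

  // A Char becomes a parser matching exactly that character.
  implicit def charToParser(c: Char): Parser[Char] =
    Parser[Char](s"'$c'", (in, pos) =>
      if (pos < in.length && in.charAt(pos) == c) Some((c, pos + 1)) else None)

  // Toy counterparts of `rp` and `rpm`, added to String as extension methods.
  implicit class RegexSyntax(pattern: String) {
    private val regex: Regex = pattern.r
    // Outputs the matched text.
    def rp: Parser[String] =
      Parser[String](s"/$pattern/", (in, pos) =>
        regex.findPrefixOf(in.substring(pos)).map(m => (m, pos + m.length)))
    // Outputs the full Match, giving access to capture groups.
    def rpm: Parser[Regex.Match] =
      Parser[Regex.Match](s"/$pattern/", (in, pos) =>
        regex.findPrefixMatchOf(in.substring(pos)).map(m => (m, pos + m.end)))
  }

  val p1: Parser[String] = "abc"
  val p2: Parser[String] = "[0-9]".rp
  val p3: Parser[Char] = 'x'
  val keyValue: Parser[Regex.Match] = "([a-z]+)=([0-9]+)".rpm
}
```

The rpm variant is useful when capture groups matter: in this sketch, keyValue.parse("a=1").map(_.group(2)) extracts the value part of the match.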

Combinators

Many common combinators are provided. To see all of them please refer to core.scala and extension.scala.

For more examples, please refer to ParserTest.scala.
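
For readers who want intuition before opening those files, here is a toy sketch of three combinators used in the JSON example: keeping the right result of a sequence (like `>>`), keeping the left result (like `<<`), and sepBy. The names keepRight, keepLeft, char, and digits are hypothetical, failures are reduced to None, and the real definitions in core.scala and extension.scala may differ.

```scala
// Toy model for illustration only; NOT the library's actual implementation.
object CombinatorSketch {
  final case class Parser[T](run: (String, Int) => Option[(T, Int)]) {
    def map[U](f: T => U): Parser[U] =
      Parser[U]((in, pos) => run(in, pos).map { case (t, next) => (f(t), next) })

    // Sequence two parsers and keep the right-hand result (toy `>>`).
    def keepRight[U](that: Parser[U]): Parser[U] = Parser[U] { (in, pos) =>
      run(in, pos).flatMap { case (_, next) => that.run(in, next) }
    }

    // Sequence two parsers and keep the left-hand result (toy `<<`).
    def keepLeft[U](that: Parser[U]): Parser[T] = Parser[T] { (in, pos) =>
      run(in, pos).flatMap { case (t, next) =>
        that.run(in, next).map { case (_, end) => (t, end) }
      }
    }

    // Zero or more occurrences of `this`, separated by `sep`, as a Vector.
    def sepBy[S](sep: Parser[S]): Parser[Vector[T]] = {
      val more = sep.keepRight(this) // a separator followed by one more element
      Parser[Vector[T]] { (in, pos) =>
        @annotation.tailrec
        def loop(acc: Vector[T], at: Int): (Vector[T], Int) =
          more.run(in, at) match {
            case Some((t, next)) => loop(acc :+ t, next)
            case None            => (acc, at)
          }
        run(in, pos) match {
          case Some((first, next)) => Some(loop(Vector(first), next))
          case None                => Some((Vector.empty[T], pos)) // zero elements
        }
      }
    }

    def parse(in: String): Option[T] = run(in, 0).map(_._1)
  }

  def char(c: Char): Parser[Char] = Parser[Char] { (in, pos) =>
    if (pos < in.length && in.charAt(pos) == c) Some((c, pos + 1)) else None
  }

  private val digitRun: Parser[String] = Parser[String] { (in, pos) =>
    "[0-9]+".r.findPrefixOf(in.substring(pos)).map(m => (m, pos + m.length))
  }
  val digits: Parser[Int] = digitRun.map(_.toInt)

  // '[' followed by comma-separated integers followed by ']'.
  val intList: Parser[Vector[Int]] =
    char('[').keepRight(digits.sepBy(char(','))).keepLeft(char(']'))
}
```

In this sketch, sepBy never fails on its own: if no element matches, it succeeds with an empty Vector, which is why "[]" parses. A trailing separator as in "[1,]" makes the surrounding parser fail, because the dangling ',' is left unconsumed before ']'.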

Versions

Dotty       dotty-parser-combinators
0.23-RC1    0.1.0
0.24-RC1    0.1.1+
0.24-RC1    0.2.0-8
0.25-RC2    0.2.8+

About

A simple parser combinator library for dotty (Scala 3)
