
How to Develop a DSL in Kotlin

by Fedor Yaremenko, December 11th, 2023

Too Long; Didn't Read

Domain-specific languages make it easier, more convenient, and more expressive to describe the tasks of a subject area. Kotlin is a modern programming language with many features and much syntactic sugar, which makes it a great host language for internal DSLs. This article describes how to use various Kotlin features to create a DSL for defining business processes.



Programmers are constantly arguing about which language is the best. Once we compared C and Pascal, but time passed. The battles of Python/Ruby and Java/C# are already behind us. Each language has its pros and cons, which is why we compare them. Ideally, we would like to expand the languages to suit our own needs. Programmers have had this opportunity for a very long time. We know different ways of metaprogramming, that is, creating programs to create programs. Even trivial macros in C allow you to generate large chunks of code from small descriptions. However, these macros are unreliable, limited, and not very expressive. Modern languages have much more expressive means of extension. One of these languages is Kotlin.


The definition of a domain-specific language

A domain-specific language (DSL) is a language developed for a particular subject area, unlike general-purpose languages such as Java, C#, C++, and others. This makes it easier, more convenient, and more expressive for describing tasks in that subject area, but inconvenient and impractical for everyday programming tasks; in other words, it is not a universal language. A good example of a DSL is the regular expression language, whose subject area is the format of strings.


To check the string for compliance with the format, it is enough to simply use a library that implements support for regular expressions:

private boolean isIdentifierOrInteger(String s) {
  return s.matches("^\\s*(\\w+\\d*|\\d+)$");
}


If you check the string for compliance with the specified format in a universal language, for example, Java, you will get the following code:

// Assumes static imports of Character.isSpaceChar, Character.isLetter, and Character.isDigit.
private boolean isIdentifierOrInteger(String s) {
  int index = 0;

  while (index < s.length() && isSpaceChar(s.charAt(index))) {
     index++;
  }

  if (index == s.length()) {
     return false;
  }

  if (isLetter(s.charAt(index))) {
     index++;

     while (index < s.length() && isLetter(s.charAt(index)))
        index++;

     while (index < s.length() && isDigit(s.charAt(index)))
        index++;
  } else if (isDigit(s.charAt(index))) {
     while (index < s.length() && isDigit(s.charAt(index)))
        index++;
  }

  return index == s.length();
}


The code above is harder to read than the regular expression, easier to get wrong, and trickier to change.


Other common examples of DSL are HTML, CSS, SQL, UML, and BPMN (the latter two use graphical notation). DSLs are used not only by developers but also by testers and non-IT specialists.


Types of DSL

DSLs are divided into two types: external and internal. External DSLs have their own syntax and do not depend on the general-purpose programming language in which their support is implemented.


Pros and cons of external DSLs:

🟢 Code generation in different languages / ready-made libraries

🟢 More options for setting your syntax

🔴 Use of specialized tools: ANTLR, yacc, lex

🔴 Sometimes it is difficult to describe grammar

🔴 There is no IDE support out of the box; you need to write your own plugins


Internal DSLs are based on a specific universal programming language (host language). That is, with the help of standard tools of the host language, libraries are created that allow you to write more compactly. As an example, consider the Fluent API approach.
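To make the Fluent API approach concrete, here is a minimal sketch of a builder whose methods return this so that calls can be chained. RouteBuilder, step, build, and fluentRoute are hypothetical names chosen for illustration; they are not the API used in our product.

```kotlin
// Hypothetical fluent builder: every configuration method returns `this`,
// so several calls can be chained into one expression.
class RouteBuilder {
    private val steps = mutableListOf<String>()

    // Records a step as "executor:days" and returns the builder for chaining.
    fun step(executor: String, days: Int): RouteBuilder {
        steps += "$executor:$days"
        return this
    }

    // Returns a read-only snapshot of the collected steps.
    fun build(): List<String> = steps.toList()
}

fun fluentRoute(): List<String> =
    RouteBuilder()
        .step("HEAD_OF_DEPARTMENT", 7)
        .step("SECRETARY", 1)
        .build()
```

The chain reads reasonably well for a flat list of steps, but nested structures such as parallel blocks tend to need extra begin/end calls, which is where a lambda-based DSL becomes more compact.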


Pros and cons of internal DSLs:

🟢 Uses the expressions of the host language as a basis

🟢 It is easy to embed DSL code into host-language code and vice versa

🟢 Does not require code generation

🟢 Can be debugged as a subroutine in the host language

🔴 Limited possibilities in setting the syntax


A real-life example

Recently, our company needed to create its own DSL. Our product implements purchase acceptance functionality, and this module is a mini BPM (Business Process Management) engine. Business processes are often represented graphically. For example, in BPMN notation you can draw a process that consists of executing Task 1 and then executing Task 2 and Task 3 in parallel.


It was important for us to be able to create business processes programmatically, including dynamically building a route, setting performers for approval stages, setting the deadline for stage execution, etc. We first tried to solve this problem using the Fluent API approach.


We then concluded that defining acceptance routes with the Fluent API was still cumbersome, so our team considered creating its own DSL. We investigated what the acceptance route would look like in an external DSL and in an internal DSL based on Kotlin (because our product code is written in Java and Kotlin).


External DSL:

acceptance
	addStep
		executor: HEAD_OF_DEPARTMENT
		duration: 7 days
		protocol should be formed
	parallel
		addStep
			executor: FINANCE_DEPARTMENT or CTO or CEO
			condition: ${!request.isInternal}
			duration: 7 work days after start date
		addStep
			executor: CTO
			dueDate: 2022-12-08 08:00 PST
			can change
	addStep
		executor: SECRETARY
		protocol should be signed


Internal DSL:

acceptance {
	addStep {
		executor = HEAD_OF_DEPARTMENT
		duration = days(7)
		protocol shouldBe formed
	}
	parallel {
		addStep {
			executor = FINANCE_DEPARTMENT or CTO or CEO
			condition = !request.isInternal
			duration = startDate() + workDays(7)
		}
		addStep {
			executor = CTO
			dueDate = "2022-12-08 08:00" timezone PST
			+canChange
		}
	}
	addStep {
		executor = SECRETARY
		protocol shouldBe signed
	}
}


Except for curly brackets, both options are almost the same. Therefore, it was decided not to waste time and effort on developing an external DSL, but to create an internal DSL.


Implementation of the basic structure of the DSL

Let's start by developing an object model:

interface AcceptanceElement

class StepContext : AcceptanceElement {

	lateinit var executor: ExecutorCondition
	var duration: Duration? = null
	var dueDate: ZonedDateTime? = null
	val protocol = Protocol()
	var condition = true
	var canChange = ChangePermission()

}

class AcceptanceContext : AcceptanceElement {

	val elements = mutableListOf<AcceptanceElement>()

	fun addStep(init: StepContext.() -> Unit) {
		elements += StepContext().apply(init)
	}

	fun parallel(init: AcceptanceContext.() -> Unit) {
		elements += AcceptanceContext().apply(init)
	}
}

object acceptance {

	operator fun invoke(init: AcceptanceContext.() -> Unit): AcceptanceContext {
		val acceptanceContext = AcceptanceContext()
		acceptanceContext.init()
		return acceptanceContext
	}
	
}


Lambdas

First, let's look at the AcceptanceContext class. It is designed to store a collection of route elements and is used to represent the entire diagram as well as parallel-blocks.

The addStep and parallel methods take a lambda with a receiver as a parameter.


A lambda with a receiver is a way to define a lambda expression that has access to a specific receiver object. Inside the body of the function literal, the receiver object passed to a call becomes an implicit this, so that you can access the members of that receiver object without any additional qualifiers, or access the receiver object using a this expression.
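As a small illustration of this mechanism, the sketch below builds an object through a lambda with a receiver. Task, task, and sampleTask are hypothetical names used only for this example.

```kotlin
class Task {
    var executor: String = ""
    var days: Int = 0
}

// `init` has the type Task.() -> Unit: a lambda whose receiver is a Task.
fun task(init: Task.() -> Unit): Task {
    val result = Task()
    result.init()            // run the lambda with `result` as its receiver
    return result
}

fun sampleTask(): Task = task {
    executor = "CTO"         // implicit receiver: same as this.executor = "CTO"
    this.days = 3            // the receiver can also be referenced explicitly
}
```

Writing result.init() is equivalent to Task().apply(init), the form used in the AcceptanceContext class above.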


Also, if the last argument of a method call is a lambda, then the lambda can be placed outside the parentheses. That's why in our DSL we can write code like the following:

parallel {
	addStep {
		executor = FINANCE_DEPARTMENT
		...
	}
	addStep {
		executor = CTO
		...
	}
}


This is equivalent to the following code without syntactic sugar:

parallel({
	this.addStep({
		this.executor = FINANCE_DEPARTMENT
		...
	})
	this.addStep({
		this.executor = CTO
		...
	})
})


Lambdas with receivers and lambdas placed outside the parentheses (trailing lambdas) are Kotlin features that are particularly useful when working with DSLs.


Object declaration

Now let's look at the entity acceptance. acceptance is an object. In Kotlin, an object declaration is a way to define a singleton — a class with only one instance. So, the object declaration defines both the class and its single instance at the same time.
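A minimal sketch of an object declaration follows. StepCounter and sameInstance are hypothetical names for illustration only.

```kotlin
// An object declaration defines a class and its single instance together.
object StepCounter {
    var count = 0
        private set          // readable from outside, writable only inside the object

    fun next(): Int = ++count
}

fun sameInstance(): Boolean {
    val a = StepCounter      // no constructor call: the name refers to the instance
    val b = StepCounter
    return a === b           // referential equality: both names point to one object
}
```

Because there is exactly one instance, state stored in the object is shared by all callers, so DSL entry points such as acceptance are usually kept stateless.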


“invoke” operator overloading

In addition, the invoke operator is overloaded for the acceptance object. The invoke operator is a special function that you can define in your classes. When you invoke an instance of a class as if it were a function, the invoke operator function is called. This allows you to treat objects as functions and call them in a function-like manner.
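The sketch below shows invoke on a regular class and on an object, the second case mirroring how acceptance { ... } works. Multiplier, shout, and invokeDemo are hypothetical names for illustration.

```kotlin
class Multiplier(private val factor: Int) {
    // `operator fun invoke` lets an instance be called like a function.
    operator fun invoke(value: Int): Int = value * factor
}

// An object with `invoke` taking a lambda with receiver: callable as shout { ... }.
object shout {
    operator fun invoke(build: StringBuilder.() -> Unit): String =
        StringBuilder().apply(build).toString().uppercase()
}

fun invokeDemo(): Pair<Int, String> {
    val triple = Multiplier(3)
    // triple(5) is shorthand for triple.invoke(5);
    // shout { ... } is shorthand for shout.invoke({ ... }).
    return triple(5) to shout { append("ok") }
}
```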


Note that the parameter of the invoke method is also a lambda with a receiver. Now we can define an acceptance route …

val acceptanceRoute = acceptance {
	addStep {
		executor = HEAD_OF_DEPARTMENT
		...
	}
	parallel {
		addStep {
			executor = FINANCE_DEPARTMENT
			...
		}
		addStep {
			executor = CTO
			...
		}
	}
	addStep {
		executor = SECRETARY
		...
	}
}


…and walk through it

val headOfDepartmentStep = acceptanceRoute.elements[0] as StepContext
val parallelBlock = acceptanceRoute.elements[1] as AcceptanceContext
val ctoStep = parallelBlock.elements[1] as StepContext
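Casting by index works when the shape of the route is known in advance. For an arbitrary route, a recursive walk is one possible approach. The sketch below uses simplified stand-ins (Element, Leaf, Group) for AcceptanceElement, StepContext, and AcceptanceContext so that it is self-contained; these names and the executorsOf function are assumptions, not part of the DSL above.

```kotlin
interface Element
class Leaf(val executor: String) : Element

class Group : Element {
    val elements = mutableListOf<Element>()
    fun addStep(executor: String) { elements += Leaf(executor) }
    fun parallel(init: Group.() -> Unit) { elements += Group().apply(init) }
}

// Collects executors depth first, descending into nested groups.
fun executorsOf(group: Group): List<String> =
    group.elements.flatMap { element ->
        when (element) {
            is Leaf -> listOf(element.executor)
            is Group -> executorsOf(element)
            else -> emptyList()      // unknown element types are skipped
        }
    }

fun sampleExecutors(): List<String> {
    val route = Group().apply {
        addStep("HEAD_OF_DEPARTMENT")
        parallel {
            addStep("FINANCE_DEPARTMENT")
            addStep("CTO")
        }
        addStep("SECRETARY")
    }
    return executorsOf(route)
}
```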


Adding details

Infix functions

Take a look at this code:

addStep {
	executor = FINANCE_DEPARTMENT or CTO or CEO
	...
}


We can implement this as follows:

enum class ExecutorConditionType {
	EQUALS, OR
}

data class ExecutorCondition(
	private val name: String,
	private val values: Set<ExecutorCondition>,
	private val type: ExecutorConditionType,
) {
	infix fun or(another: ExecutorCondition) =
		ExecutorCondition("or", setOf(this, another), ExecutorConditionType.OR)
}

val HEAD_OF_DEPARTMENT =
	ExecutorCondition("HEAD_OF_DEPARTMENT", setOf(), ExecutorConditionType.EQUALS)
val FINANCE_DEPARTMENT =
	ExecutorCondition("FINANCE_DEPARTMENT", setOf(), ExecutorConditionType.EQUALS)
val CEO = ExecutorCondition("CEO", setOf(), ExecutorConditionType.EQUALS)
val CTO = ExecutorCondition("CTO", setOf(), ExecutorConditionType.EQUALS)
val SECRETARY =
	ExecutorCondition("SECRETARY", setOf(), ExecutorConditionType.EQUALS)


The ExecutorCondition class allows us to set several possible task executors. In ExecutorCondition we define the infix function or. An infix function is a special kind of function that allows you to call it using a more natural, infix notation.
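One detail worth knowing is that chained infix calls associate to the left, so a or b or c is evaluated as (a or b) or c. With the implementation above, this produces an OR condition nested inside another OR condition. If a flat set is preferred, the function can merge the operands, as in the following sketch. Cond, Is, and AnyOf are hypothetical types used only for this illustration.

```kotlin
sealed interface Cond
data class Is(val name: String) : Cond
data class AnyOf(val options: Set<Cond>) : Cond

// Infix calls associate to the left: `a or b or c` is `(a or b) or c`.
// Merging the option sets keeps the result one level deep.
infix fun Cond.or(other: Cond): Cond {
    val left = if (this is AnyOf) this.options else setOf(this)
    val right = if (other is AnyOf) other.options else setOf(other)
    return AnyOf(left + right)
}
```

Whether to flatten is a design choice: the nested form preserves how the expression was written, while the flat form is easier to evaluate later.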


Without using this feature of the language, we would have to write like this:

addStep {
	executor = FINANCE_DEPARTMENT.or(CTO).or(CEO)
	...
}


Infix functions are also used to set the required state of the protocol and time with a timezone.

enum class ProtocolState {
    formed, signed
}

class Protocol {
    var state: ProtocolState? = null

    infix fun shouldBe(state: ProtocolState) {
        this.state = state
    }
}


enum class TimeZone {
    ...
    PST,
    ...
}

infix fun String.timezone(tz: TimeZone): ZonedDateTime {
    val format = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm z")
    return ZonedDateTime.parse("$this $tz", format)
}


Extension functions

String.timezone is an extension function. In Kotlin, extension functions allow you to add new functions to existing classes without modifying their source code. This feature is particularly useful when you want to augment the functionality of classes that you don't have control over, such as classes from standard or external libraries.


Usage in the DSL:

addStep {
	...
	protocol shouldBe formed
	dueDate = "2022-12-08 08:00" timezone PST
	...
}


Here "2022-12-08 08:00" is the receiver object on which the extension function timezone is called, and PST is the parameter. Inside the function, the receiver object is accessed using the this keyword.
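The days(7) call used in the DSL is not shown in this article. A minimal sketch of such a helper, together with an extension property variant, could look like the following. Both definitions are assumptions for illustration, and java.time.Duration is assumed to be the Duration type in use.

```kotlin
import java.time.Duration

// Hypothetical helper matching the `duration = days(7)` call in the DSL.
fun days(n: Long): Duration = Duration.ofDays(n)

// Extension property on Int: allows writing `7.days` instead of `days(7)`.
// Inside the getter, `this` is the Int the property is read from.
val Int.days: Duration
    get() = Duration.ofDays(this.toLong())
```

Like extension functions, extension properties are resolved statically and do not add a real member to Int.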

Operator overloading

The next Kotlin feature that we use in our DSL is operator overloading. We have already considered the overload of the invoke operator. In Kotlin, you can overload other operators, including arithmetic ones.

addStep {
	...
	+canChange
}


Here the unary operator + is overloaded. Below is the code that implements this:

class StepContext : AcceptanceElement {
    ...
    var canChange = ChangePermission()
}

data class ChangePermission(
    var canChange: Boolean = true,
) {
    operator fun unaryPlus() {
        canChange = true
    }

    operator fun unaryMinus() {
        canChange = false
    }
}
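Binary operators can be overloaded in the same way. The expression startDate() + workDays(7) from the DSL could be supported by a plus operator defined as an extension function. The sketch below is an assumption about how this might look: WorkDays and workDays are hypothetical, and public holidays are ignored.

```kotlin
import java.time.DayOfWeek
import java.time.ZoneId
import java.time.ZonedDateTime

// Hypothetical value type for "n working days"; not defined in the article.
data class WorkDays(val count: Int)

fun workDays(count: Int) = WorkDays(count)

// Binary `+`: steps forward one day at a time, counting only Monday to Friday.
operator fun ZonedDateTime.plus(amount: WorkDays): ZonedDateTime {
    var result = this
    var remaining = amount.count
    while (remaining > 0) {
        result = result.plusDays(1)
        val day = result.dayOfWeek
        if (day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY) {
            remaining--
        }
    }
    return result
}
```

For example, adding workDays(3) to Thursday, December 8, 2022 skips the weekend and lands on Tuesday, December 13.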


Finishing touch

Now we can describe acceptance routes in our DSL. However, DSL users should be protected from possible errors. For example, in the current version, the following code is acceptable:

val acceptanceRoute = acceptance {
    addStep {
        executor = HEAD_OF_DEPARTMENT
        duration = days(7)
        protocol shouldBe signed

        addStep {
            executor = FINANCE_DEPARTMENT
        }
    }
}


addStep within addStep looks strange, doesn't it? Let's figure out why this code compiles without any errors. As mentioned above, the methods acceptance#invoke and AcceptanceContext#addStep take a lambda with a receiver as a parameter, and the receiver object is accessible through the this keyword. So we can rewrite the previous code like this:

val acceptanceRoute = acceptance {
    this@acceptance.addStep {
        this@addStep.executor = HEAD_OF_DEPARTMENT
        this@addStep.duration = days(7)
        this@addStep.protocol shouldBe signed

        this@acceptance.addStep {
            executor = FINANCE_DEPARTMENT
        }
    }
}


Now you can see that this@acceptance.addStep is called both times. Kotlin has the DslMarker annotation especially for such cases. You can use @DslMarker to define custom annotations. Receivers marked with the same such annotation cannot be accessed implicitly from inside one another.

@DslMarker
annotation class AcceptanceDslMarker

@AcceptanceDslMarker
class AcceptanceContext : AcceptanceElement {
	...
}

@AcceptanceDslMarker
class StepContext : AcceptanceElement {
	...
}


Now the code

val acceptanceRoute = acceptance {
    addStep {
    	...

        addStep {
            ...
        }
    }
}


will not compile due to the error: 'fun addStep(init: StepContext.() -> Unit): Unit' can't be called in this context by implicit receiver. Use the explicit one if necessary.
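As the error message suggests, the outer receiver stays reachable through an explicit labeled this. The self-contained sketch below demonstrates this with simplified hypothetical types (RouteDsl, Route, Stage, route) rather than the classes of our DSL.

```kotlin
@DslMarker
annotation class RouteDsl

@RouteDsl
class Stage {
    var executor: String = ""
}

@RouteDsl
class Route {
    val stages = mutableListOf<Stage>()
    fun addStep(init: Stage.() -> Unit) { stages += Stage().apply(init) }
}

fun route(init: Route.() -> Unit): Route = Route().apply(init)

fun markedRoute(): Route = route {
    addStep {
        executor = "HEAD_OF_DEPARTMENT"
        // addStep { }              // would not compile: the outer receiver is hidden
        this@route.addStep {        // an explicit receiver is still allowed
            executor = "FINANCE_DEPARTMENT"
        }
    }
}
```

The marker therefore does not forbid nesting outright; it only makes accidental access to an outer receiver an explicit, visible decision.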


Conclusion

Domain-specific languages offer a powerful means to enhance productivity, reduce errors, and improve collaboration by providing a specialized and expressive way to model and solve problems within a specific domain. Kotlin is a modern programming language with many features and much syntactic sugar, which makes it a great host language for internal DSLs.