
2 posts tagged with "cdk"


Avoiding mutability pitfalls in constructs-based API design: part 2

· 9 min read
Chris Rybicki
Software Engineer at Wing Cloud

Hey there! 👋 My name's Chris and I'm a software engineer on the Wing Cloud team. Lately I've been helping out building Wing's compiler, and designing APIs for Wing's standard library.

In the first post in this series, I introduced some of the challenges of designing APIs for constructs and frameworks like AWS CDK, CDKTF, and cdk8s when mutation is involved.

To recap, a construct can have public methods that mutate the object's private state. But if this state is replaced or destroyed, then application code becomes more sensitive to the order of statements and method calls. This is usually undesirable when our constructs are modeling declarative information, like infrastructure configuration.

To that end, we proposed two solutions for designing construct methods:

  1. Only add state, never subtract or update
  2. Document destructive APIs

These measures are great for addressing a lot of our concerns with mutation. But as we'll see, mutation has wider effects than just the design of methods.

Sharing construct state through properties

Another key capability of most constructs is that they can expose parts of their state for other constructs to use through properties. We can see how they're used in an example from the AWS CDK framework below, written in Wing:

bring "aws-cdk-lib" as cdk;

let table = new cdk.aws_dynamodb.Table(
partitionKey: {
name: "path",
type: cdk.aws_dynamodb.AttributeType.STRING
}
) as "hits";

let handler = new cdk.aws_lambda.Function(
runtime: cdk.aws_lambda.Runtime.NODEJS_14_X,
handler: "hitcounter.handler",
code: cdk.aws_lambda.Code.fromAsset("./lambda"),
environment: {
HITS_TABLE_NAME: table.tableName
}
);

The construct named Table has a public property named tableName that stores the table's physical name for identifying it on AWS. The table's property tableName is passed as the HITS_TABLE_NAME environment variable so that the AWS Lambda function can use the table's dynamic name at runtime -- for example, to query the table (not shown).

Any construct state that isn't meant to be a private implementation detail can be made public. But, as we've mentioned before, it's also possible for construct state to change after it was first initialized in the code.

Uh oh - this smells like a recipe for problems.

When properties get stale

Let's understand what causes properties to not play well with mutating methods through an example. I'll start by taking my Flower class from the previous post and adding options to specify the regions in the world where it's natively found. (Note that most of the code snippets from here on out are in JavaScript.)

class Flower extends Construct {
constructor(scope, id, props) {
super(scope, id);
this._kind = props.kind;
// copy the caller's array, defaulting to an empty list when none is given
this._nativeRegions = [...(props.nativeRegions ?? [])];
}

addNativeRegion(region) {
this._nativeRegions.push(region);
}

toJson() {
return {
id: this.node.path,
kind: this._kind,
nativeRegions: this._nativeRegions,
};
}
}

I've prefixed the instance fields with underscores to indicate that they're not meant to be accessed outside of the class's implementation. (JavaScript technically supports private class members, but it's a somewhat recent addition, so you don't find them in the wild too often [1].)

Here's how the updated construct is used:

let flower = new Flower(garden, `tulip`, {
kind: "tulip",
nativeRegions: ["Turkey", "Greece"],
});
flower.addNativeRegion("Romania");

Everything's good so far. If we try synthesizing a garden.json file with the new Flower, it will output the flower's definition in JSON as we expect:

[
{
"id": "root/tulip",
"kind": "tulip",
"nativeRegions": [
"Turkey",
"Greece",
"Romania"
]
},
// ... rest of the garden data
]

Now let's say we add the capability for users to get the native regions of a flower. I'll also add a construct for representing a signpost in front of our garden.

class Flower extends Construct {
get nativeRegions() {
return [...this._nativeRegions];
}

// ... rest of the class unchanged
}

class Signpost extends Construct {
constructor(scope, id, props) {
super(scope, id);
const allRegions = new Set(props.flowers.flatMap((f) => f.nativeRegions));

this._message = "Welcome to Tuple Trove, home to flowers from: ";
this._message += [...allRegions].join(", ");
this._message += ".";
}

toJson() {
return {
id: this.node.path,
message: this._message,
};
}
}

Inside Signpost, I'm collecting all of the native regions of the flowers passed to the signpost, de-duplicating them, and embedding them into a friendly message.

Finally, I'll write some client code that tries using the signpost with some flowers:

const garden = new Garden(undefined, "root");

// add a flower
const rose = new Flower(garden, "rose", { kind: "rose", color: "red" });
rose.addNativeRegion("Denmark");

// add a signpost
new Signpost(garden, "signpost", { flowers: [rose] });

// add more regions to our first flower
rose.addNativeRegion("Turkey");
rose.addNativeRegion("Greece");

garden.synth();

When I synthesize my garden with node garden.js, I'm expecting the signpost to have a message like "Welcome to Tuple Trove, home to flowers from: Denmark, Turkey, Greece." But when I check garden.json, I find the signpost message only mentions Denmark:

[
{
"id": "root/rose",
"kind": "rose",
"color": "red",
"nativeRegions": [
"Denmark",
"Turkey",
"Greece"
]
},
{
"id": "root/signpost",
"message": "Welcome to Tuple Trove, home to flowers from: Denmark."
}
]

Aw shucks.

The problem, as you may have guessed, is that the state read by Signpost was stale. Since the signpost's message was calculated immediately in its constructor, it wasn't updated when more native regions were added to the rose afterwards.

But in some sense, it's not entirely Signpost's fault - how was it supposed to know the field could change? It doesn't seem right to have to look at the implementation of Flower in order to determine whether the data will be calculated later or not. We need a better way.

Laziness is a virtue

The approach we're going to take to solve this problem is to add support for a way of modeling values that aren't available yet, called Lazy values.

Each construct framework has a slightly different way of doing this, but the general idea is that instead of returning some state that could become stale, as we did here in Flower:

class Flower extends Construct {
get nativeRegions() {
return [...this._nativeRegions];
}

// ... rest of the class unchanged
}

... we will instead return a Lazy value that promises to return the correct value:

class Flower extends Construct {
get nativeRegions() {
return new Lazy(() => [...this._nativeRegions]);
}

// ... rest of the class unchanged
}

Representing delayed values with lazy values (sometimes called "thunks") is a well-trodden path in the history of computer science, which sees popular use in all kinds of frameworks. React's useEffect hook is a good example of this pattern being used in one of the most popular web frameworks.

If we were using TypeScript for these examples, we would also model this with a different type. Instead of the nativeRegions getter returning Array<string>, it will return Lazy<Array<string>>. This extra Lazy "wrapper" matches up with the fact that to access the value stored inside, we have to write some extra code to unwrap it.
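The Lazy class itself isn't shown in this post, so here's a minimal sketch of what such a wrapper could look like in plain JavaScript. The class shape (a constructor taking a function, and a produce() method) is an assumption based on how Lazy is used in the snippets here; real construct frameworks each have their own variations.

```javascript
// Hypothetical minimal Lazy wrapper. The name and the produce() method mirror
// how Lazy is used in this post, not any particular framework's API.
class Lazy {
  constructor(producer) {
    // function that computes the value whenever it is asked for
    this._producer = producer;
  }

  // Runs the producer now and returns whatever it computes at this moment.
  produce() {
    return this._producer();
  }
}

const regions = ["Denmark"];
const lazyRegions = new Lazy(() => [...regions]);

// A mutation that happens after the Lazy was created...
regions.push("Turkey");

// ...is still visible, because the copy is only made when produce() runs.
console.log(lazyRegions.produce()); // [ 'Denmark', 'Turkey' ]
```

The key design choice is that the wrapper stores a function instead of a value, so nothing is read from the underlying state until someone asks for it.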

Now let's update Signpost to make it work with the fixed Flower construct:

class Signpost extends Construct {
constructor(scope, id, props) {
super(scope, id);

this._message = new Lazy(() => {
const allRegions = new Set(props.flowers.flatMap((f) => f.nativeRegions.produce()));

let message = "Welcome to Tuple Trove, home to flowers from: ";
message += [...allRegions].join(", ");
message += ".";
return message;
});
}

// toJson unchanged
}

Since nativeRegions is a Lazy value, and the message depends on nativeRegions, it's clear that the message also needs to be a Lazy value -- so in the code above, we've wrapped it in new Lazy(() => { ... }).

Besides that, we also have to call produce() on the Lazy value in order to force its value to be computed. In the example above, I've replaced f.nativeRegions with f.nativeRegions.produce().

The core implementation of Lazy requires some changes to Garden as well, but they're not too interesting to look at. If you're curious, the code from this post in its entirety is available as a gist [2] for your perusal.

Ideas for making Lazy less complicated

Lazy values can be pretty powerful -- but one thing holding them back is the ergonomics of using them. In the code above, we saw that in order to create a Lazy value, the code for producing the value had to be wrapped in this clunky new Lazy(() => { ... }) syntax.

But even with that aside, we have also potentially introduced new issues, because of this fact:

Lazy.produce() should only be called inside of other Lazy definitions

If we tried calling f.nativeRegions.produce() directly inside of Signpost's constructor, we'd obtain a list of native regions that could get stale, putting us back at square one. The only way to guarantee we're using Lazy properly is if it's only evaluated at the end of our app, when we call garden.synth().
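To make the difference concrete, here's a small standalone sketch comparing an eager produce() call with one that's deferred inside another Lazy. The Lazy and Flower classes below are simplified, hypothetical stand-ins (no constructs base class) written just for this illustration.

```javascript
// Hypothetical stand-in for the post's Lazy class.
class Lazy {
  constructor(producer) {
    this._producer = producer;
  }
  produce() {
    return this._producer();
  }
}

// Simplified Flower without the constructs base class.
class Flower {
  constructor() {
    this._nativeRegions = [];
  }
  addNativeRegion(region) {
    this._nativeRegions.push(region);
  }
  get nativeRegions() {
    return new Lazy(() => [...this._nativeRegions]);
  }
}

const rose = new Flower();
rose.addNativeRegion("Denmark");

// Eager: produce() runs right away, so this is just a snapshot.
const eagerRegions = rose.nativeRegions.produce();

// Deferred: produce() only runs when the outer Lazy is itself produced.
const deferredRegions = new Lazy(() => rose.nativeRegions.produce());

rose.addNativeRegion("Turkey");

console.log(eagerRegions); // [ 'Denmark' ]
console.log(deferredRegions.produce()); // [ 'Denmark', 'Turkey' ]
```

The eager call has the same staleness problem as the original getter, while the deferred one stays correct as long as it's only produced at synthesis time.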

In addition, having to call produce() on each Lazy is tedious and it's easy to forget.

But perhaps... there's a better way?

It turns out the issues I've described above (like checking for errors in your code, and automatically generating code for you) are the kinds of problems that compilers are perfect for!

We don't have an RFC available yet, but it's possible that a future version of Wing could have built-in support for safe and easy Lazy usage:

// in Wing...

class Flower {
// ...

get nativeRegions(): Lazy<Array<str>> {
// easier syntax!
return lazy { this._nativeRegions.copy() };
}
}

class Signpost {
new(props) {
this._message = lazy {
let allRegions = Set<str>.from(
// no need to call .produce() manually - it's automatically called
// since this code is inside a `lazy { ... }` block
props.flowers.flatMap((f) => f.nativeRegions)
);

let var message = "Welcome to Tuple Trove, home to flowers from: ";
message += allRegions.toArray().join(", ");
message += ".";
return message;
};
}
}

What do you think? Let us know on our GitHub or Discord if you have any thoughts or feedback about the ideas in this post, or if you have suggestions for new topics!

Footnotes

  1. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Classes/Private_class_fields

  2. https://gist.github.com/Chriscbr/58384bdd7b8ce5e8fedf24ddba55e103

Avoiding mutability pitfalls in constructs-based API design

· 6 min read
Chris Rybicki
Software Engineer at Wing Cloud


At Wing Cloud, we're building a programming language named Winglang that makes it easier to build cloud applications. One of the main features of Winglang is that it lets you model an app's cloud resources alongside its application code. Every cloud resource in Winglang is modeled as a construct, similar to the AWS CDK and CDKTF infrastructure-as-code frameworks. In the Wing application below, the classes named Bucket and Function are both constructs:

bring cloud;

let bucket = new cloud.Bucket();
new cloud.Function(inflight () => {
bucket.put("Hello", "world");
});

One of the cooler capabilities of constructs is that their properties can be configured after they have been initialized, through methods. For example, environment variables can be added to a serverless function during or after initialization:

let fn = new cloud.Function(
inflight () => { /* code */ },
env: { DB_HOST: "af43b12" }
);

// ...later

fn.addEnvironment("DB_NAME", "orders");

However, the flexibility to mutate constructs introduces some challenges once we try to compose them together. In this blog post I'll highlight some of these challenges, and explain several of the best practices for designing APIs that avoid these pitfalls.

What are constructs?

First, let's familiarize ourselves with constructs to get an idea of how they work.

constructs is a JavaScript library that provides an API for organizing classes into trees. A construct is created in JavaScript by writing a class that extends the Construct class, with a constructor signature of (scope, id, props). Constructs are always created in the scope of another construct [1] and must always have an identifier that is unique within the scope in which it's created. A construct's identifier is used to generate unique names for every cloud component.

In this running example, I'll create a made-up construct framework for modeling gardens. Let's imagine our garden framework produces a garden.json file that declaratively specifies all of the flowers in our garden in a flat list.

// garden.js
const { Construct } = require("constructs");
const { writeFileSync } = require("node:fs");

// --- garden framework classes ---

class Flower extends Construct {
constructor(scope, id, props) {
super(scope, id);
this.kind = props.kind;
this.color = props.color;
}

toJson() {
return {
id: this.node.path,
kind: this.kind,
color: this.color,
};
}
}

class Garden extends Construct {
constructor() {
super(undefined, "root");
}

synth() {
const isFlower = (node) => node instanceof Flower;
// every construct class has a `.node` field for accessing construct-related APIs
const flowers = this.node.findAll().filter(isFlower).map((c) => c.toJson());
writeFileSync("garden.json", JSON.stringify(flowers, null, 2));
}
}

// --- application code ---

const garden = new Garden();
for (let i = 0; i < 5; i++) {
new Flower(garden, `tulip${i}`, {
kind: "tulip",
color: "yellow",
});
}
garden.synth();

Above, we have two constructs: Flower and Garden.

Flower represents a single flower, with two pieces of state (its kind and color).

Garden is the root of our garden application, and it will contain all of the flower constructs. It will also be responsible for finding all flowers in the constructs tree, converting them to JSON, and writing the garden.json file.

By running node garden.js, we produce a garden.json, which looks like:

[
{
"id": "root/tulip0",
"kind": "tulip",
"color": "yellow"
},
{
"id": "root/tulip1",
"kind": "tulip",
"color": "yellow"
},
{
"id": "root/tulip2",
"kind": "tulip",
"color": "yellow"
},
{
"id": "root/tulip3",
"kind": "tulip",
"color": "yellow"
},
{
"id": "root/tulip4",
"kind": "tulip",
"color": "yellow"
}
]

When you create an app in Wing and compile it to a target like tf-azure, instead of creating garden.json, it creates a Terraform JSON file that describes all of the resources in your app -- but the essential structure is the same.

Using methods to mutate state

The default way to configure a construct is to provide a list of properties (sometimes called "props") during initialization. We saw this in the previous example when creating new flowers:

let flower = new Flower(garden, `tulip${i}`, {
kind: "tulip",
color: "yellow",
});

But as we saw in the introduction, it's also possible for methods to change a construct's properties. For example, we could add a method that changes the flower's color:

flower.setColor("blue");

This works like you'd imagine - and it's easy to implement. However, it's not without drawbacks.

By making the construct's state mutable, it's possible for it to be changed in more than one place. This can lead to surprising behavior.

For example, take the following code where I've defined two new constructs, OrangePatch and PurplePatch, each accepting a flower in its props:

class OrangePatch extends Construct {
constructor(scope, id, props) {
super(scope, id, props);
props.flower.setColor("orange");
}
}

class PurplePatch extends Construct {
constructor(scope, id, props) {
super(scope, id, props);
props.flower.setColor("purple");
}
}

const garden = new Garden();
const rose = new Flower(garden, "rose", {
kind: "rose",
color: "red",
});
new OrangePatch(garden, "orange-patch", { flower: rose });
new PurplePatch(garden, "purple-patch", { flower: rose });

Since they both set the color of rose, one of them is going to override the decision of the other (in this case, the final rose will be purple). Uh oh!
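To see the order sensitivity in isolation, here's a standalone sketch that applies the same two recoloring steps in both orders. The Flower class is a simplified stand-in without the constructs base class, and setColor is a hypothetical implementation that simply overwrites the field.

```javascript
// Simplified Flower without the constructs base class (an assumption for brevity).
class Flower {
  constructor(props) {
    this.kind = props.kind;
    this.color = props.color;
  }

  // Hypothetical implementation: overwrites whatever color was set before.
  setColor(color) {
    this.color = color;
  }
}

// Each helper recolors the flower it's given, like OrangePatch and PurplePatch.
const orangePatch = (flower) => flower.setColor("orange");
const purplePatch = (flower) => flower.setColor("purple");

const roseA = new Flower({ kind: "rose", color: "red" });
orangePatch(roseA);
purplePatch(roseA);

const roseB = new Flower({ kind: "rose", color: "red" });
purplePatch(roseB);
orangePatch(roseB);

// Same two statements, different order, different result.
console.log(roseA.color); // purple
console.log(roseB.color); // orange
```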

To avoid these kinds of issues, I recommend following these two rules when designing methods on constructs:

Rule 1: Only add state, never subtract or update

Methods should add state, not update or subtract state. If you're always adding state, then state that was configured or added earlier in the application won't get removed or overridden. The additions should also be commutative - which means re-ordering them should not change the application's functional behavior.

We can see an example of this rule with the addEnvironment method on cloud.Function:

let fn = new cloud.Function(/* props */);
fn.addEnvironment("DB_NAME", "orders");

If you try calling addEnvironment with the same string twice, it throws an error. Since environment variables can only be added, you can pass around fn throughout your codebase - including third party libraries! - without worrying about environment variables being removed or changed.
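Here's a rough sketch of what an add-only method could look like in plain JavaScript. FunctionConfig is a made-up class for illustration, not the real cloud.Function implementation; the only behavior taken from the text above is that adding the same variable twice throws.

```javascript
// Hypothetical stand-in for a function construct; not the real cloud.Function.
class FunctionConfig {
  constructor(env = {}) {
    this._env = new Map(Object.entries(env));
  }

  // Add-only: refuses to replace a variable that already exists.
  addEnvironment(name, value) {
    if (this._env.has(name)) {
      throw new Error(`Environment variable "${name}" is already defined`);
    }
    this._env.set(name, value);
  }

  toJson() {
    // Sort by name so the output doesn't depend on the order of the calls.
    const entries = [...this._env.entries()];
    entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
    return Object.fromEntries(entries);
  }
}

const fn = new FunctionConfig({ DB_HOST: "af43b12" });
fn.addEnvironment("DB_NAME", "orders");

let duplicateRejected = false;
try {
  fn.addEnvironment("DB_NAME", "customers");
} catch (err) {
  duplicateRejected = true;
}

console.log(fn.toJson()); // { DB_HOST: 'af43b12', DB_NAME: 'orders' }
console.log(duplicateRejected); // true
```

Sorting in toJson is one way to make the additions commutative: two programs that add the same variables in a different order produce the same output.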

Rule 2: Document destructive APIs

While methods that destroy existing state are worth avoiding, if there's a need for them, document the APIs accordingly.

For example, if changing a flower's color is truly necessary, it's a good practice to give the method a descriptive name like overrideColor() to make it clear when reading the code that something exceptional is happening.
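As a sketch, a documented destructive method might look something like this. The Flower class is again a simplified stand-in without the constructs base class, and the wording of the doc comment is just one possible way to flag the behavior.

```javascript
// Simplified Flower without the constructs base class (an assumption for brevity).
class Flower {
  constructor(props) {
    this.kind = props.kind;
    this.color = props.color;
  }

  /**
   * Replaces the flower's color, discarding the previous value.
   *
   * Destructive: if several callers override the color, the last call wins,
   * so the result depends on the order of statements in the app.
   * Prefer passing `color` in props when possible.
   */
  overrideColor(color) {
    this.color = color;
  }
}

const rose = new Flower({ kind: "rose", color: "red" });
rose.overrideColor("white");
console.log(rose.color); // white
```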

Another common use case for mutating APIs is to provide escape hatches for when an abstraction doesn't expose all of the capabilities you need. You shouldn't need them often, but when you do, you're usually glad they're available.

Summary

By following the rules above, you'll design safer APIs, like the classes in the AWS CDK and Wing's standard library, that lead to fewer mutation surprises when they're used by other developers.

If you're interested in learning more about constructs, check out the constructs documentation on the AWS CDK website.

Footnotes

  1. An exception is the "root" construct, often named App or something similar.