The End of the Road for Bespoke Web

Exploring abstractions and principles for a less fragmented web ecosystem

Posted 8 Jun 2024


Next.js: A case study in bespoke APIs

Next.js is a widely used, but still controversial, piece of technology. On one hand, it pushes the boundaries of capability for full-stack frameworks: it has adopted experimental React functionality in its main, “stable” branch, which is simultaneously risky and bleeding edge, and which enables greater power from new technologies.

However, the downside of this philosophy is that stability and interoperability take a back seat. Remix, a Next.js competitor, has instead opted to avoid the experimental React functionality in favour of “using the platform”. This philosophy prefers approaches that are not framework-specific, and as such tacitly promotes developing skills and knowledge that are more transferable.

Next.js, for example, provides the bespoke cookies and headers functions from the next/headers package. How do these work? I have no idea. It also monkey-patches the fetch API, implicitly overriding the standard fetch function with a slightly different implementation that accepts a next configuration object, which standard fetch does not. None of this works outside of Next.js.
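To make this concrete, here is a rough sketch of an App Router route handler using these APIs (the cookie name, URL and revalidation value are purely illustrative):

// A rough illustration of the Next.js-specific APIs described above
import { cookies, headers } from "next/headers";

export async function GET() {
  const theme = cookies().get("theme")?.value; // bespoke cookie access
  const userAgent = headers().get("user-agent"); // bespoke header access

  // The patched fetch accepts a non-standard `next` configuration object
  const res = await fetch("https://api.example.com/data", {
    next: { revalidate: 60 },
  });

  return Response.json({ theme, userAgent, data: await res.json() });
}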

The Crux

Aside from requiring developers to learn new, non-standard APIs, these decisions cause tooling to break, or at least require adjustment to work with this particular flavour. “Tooling” here might be middleware, or utilities for handling cookies and headers.

When every framework adopts its own idiosyncratic approach, this problem multiplies, and the burden is placed on the library or framework authors responsible for (or interested in) implementing the n necessary integrations. And this work is never finished, as ecosystems are constantly in flux.

This issue extends past full-stack frameworks and plagues basically all server-side frameworks, in JS or any other language. Each has its own namespaced @framework/tool flavour, often completely reimplementing the underlying functionality.

In some cases it may genuinely be necessary to completely reimplement the functionality itself, according to the goals of the framework. For example, a performance-obsessed framework might justify reimplementing things according to those principles. In these instances, this is completely reasonable and is not a “problem to be fixed”.

But this isn’t the case most of the time: a special implementation is rarely necessary. Ideally, the ecosystem would converge on a handful of high-quality, interchangeable implementations that can then be used everywhere else – except where such special cases genuinely preclude it.

In practice, the reimplementations are usually the result of unnecessary modifications. In other words, they could be articulated in a standardised or compatible manner, but aren’t, for no particularly good reason. Instead, the bespoke-ness tends to emerge from circumstance, or from priorities other than interoperability. That is a perfectly reasonable preference, but I am claiming it’s often a mistaken one.

The fundamental culprit for this issue is the core of any server-side framework: the Request and Response objects. As it happens, basically every server-side framework has its own bespoke Request and Response objects. Not only do they have different properties and shapes, but they also inevitably carry bespoke behaviour which requires learning, remembering and respecting. This carries an implementation burden for both users and library authors.
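To illustrate, here is the same information accessed through two different shapes – Express’s bespoke request object versus the web standard Request (Koa, Fastify, Hapi and others differ yet again):

import express from "express";
const app = express();

// Express: bespoke request/response objects
app.get("/search", (req, res) => {
  const q = req.query.q; // parsed query object, Express-specific
  const accept = req.get("Accept"); // Express header accessor
  res.send(`q=${q}, accept=${accept}`);
});

// Web standards: the same information via Request/Response
const handler = (req: Request): Response => {
  const q = new URL(req.url).searchParams.get("q");
  const accept = req.headers.get("Accept");
  return new Response(`q=${q}, accept=${accept}`);
};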

Reconciling this problem

With the introduction of web standard Request and Response objects, we’ve been handed an opportunity to build a compatible ecosystem that remedies all of the above issues. Personally, I’m a fan of the implementation itself. But regardless of whether it’s good or not, by virtue of being a standard it almost doesn’t even matter: the standardisation itself is inherently valuable.

However, the existence of these standards probably isn’t sufficient to reliably produce a flourishing and compatible ecosystem. Once again, people can create their own bespoke abstractions on top and then build tools against those abstractions. And they certainly have: the new wave of server-side tools built on web standards immediately deviated from those standards in meaningful ways. That is, they have built tooling on their own bespoke abstractions, and thus many of the benefits of standardisation are lost.

To avoid the nth framework with its accompanying bespoke middleware implementations – which require battle-testing and edge-case handling from scratch – it’s worth defining a set of principles that enable the flexibility and decoupling needed to usher in an ecosystem that doesn’t have to be rewritten perennially.

Principles of Interoperability

On top of the standards themselves, we should develop some conventions and principles to ensure things can interoperate. This way, we can actually stop reinventing the wheel – or at least do so significantly less.

The conventions and principles should be specifically designed for the general use case, as opposed to catering to every single possible use case. These principles should ultimately constrain what we can achieve, but will do so in a way that entails less reinvention of the same wheels. They should also aspire to make things more comprehensible, less magical and more reliable.

Below are what I believe (some of) those principles are. They apply to those creating reusable, packaged code, and are designed to make the end-user experience maximally simple, reliable, comprehensible and future-proof. Any magic, incomprehensibility or unreliability is then reserved for the end-developer to conjure up – or not.

No Extending or Wrapping Web Standards

As touched on, the fundamental mistake of previous work was made on day one: abstracting away the underlying lingua franca of the web (of HTTP, at least), Request and Response. By leaving these alone entirely, an elegantly simple boundary can be established between tools. Any additional information can be provided as with any other code: as extra options, arguments, context, whatever.

This principle includes avoiding monkey-patching, messing with prototypes, or extending or wrapping the underlying Request and Response. This greatly improves compatibility and provides a solid foundation for anything else built within the ecosystem. Being a standard means any changes affect everyone else too, and thus there is inherent industry coordination to some kind of objective-ish, bedrock reference point.

Adhering to this little principle provides a lot of power. The function (req: Request) => Response can be used in almost any runtime or infrastructure, and support will only increase moving forward. In contrast, Express middleware, for example, will become less and less applicable, as modern runtimes are incompatible with that particular Request and Response paradigm.
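As a small illustration, a single handler of that shape can – at the time of writing – be handed more or less directly to several runtimes:

// One web-standard handler, many runtimes
const handler = (req: Request): Response =>
  new Response(`Hello from ${new URL(req.url).pathname}`);

// Deno:               Deno.serve(handler);
// Bun:                Bun.serve({ fetch: handler });
// Cloudflare Workers: export default { fetch: handler };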

Immutability-ish

To produce a comprehensible ecosystem, mutability needs to – at the least – be contained, if not outright forbidden. Being able to mutate any aspect of a Request object means middleware becomes hard to reason about, and more fragile. Ultimately, it abstracts away too much.

Instead, immutability should be a broad, but somewhat lenient, goal. In fact, this is very much codified into the web standards, by virtue of the Request and Response objects both being (largely) immutable. With that said, they are not unchangeable as such: one can update elements of a Request or Response by instantiating a new one based on the old, a la new Request(request). And we should lean into this.
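As a sketch, “updating” a request then looks like deriving a new one and leaving the original untouched (the trace header here is just an illustrative example):

// Derive a new Request with an extra header, rather than mutating the original
const withTraceId = (req: Request): Request => {
  const headers = new Headers(req.headers);
  headers.set("X-Trace-Id", crypto.randomUUID());
  return new Request(req, { headers });
};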

Immutability works well in conjunction with the next principle.

Explicit State

Aside from external state like databases or storage, backends also typically require some per-request state. While immutability is an important principle, it is not at odds with statefulness. However, this state should be explicitly stored somewhere, as opposed to being tacked onto a Request instance, with mutable state, data and functions kludged together.

More on state below.

Framework-pure I/O Middleware

Here we need to distinguish between “framework-specific” and “functionality-specific” side effects, for lack of better terms. For example, assigning the user to req.user is framework-specific. This only makes sense if the framework supports it, and should therefore be avoided. In contrast, saving some rate-limiting information in a Redis cache would be “functionality-specific”.

Here, how the rate limiter functions is the responsibility of the middleware. Maybe it can even be configured to use a different data store. But the result of the rate limiting, in terms of how the web request is handled, is a concern of the framework. By deferring these “framework-specific” side effects to the developer, slightly more work is required on their part. But in exchange, we’ve successfully decoupled our rate-limiter middleware, such that it’s entirely framework-agnostic. This is the exact type of tradeoff we should prefer as an industry: namely, a one-time code-writing cost in exchange for long-term maintenance, visibility and reusability benefits.
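A minimal sketch of what that separation might look like, where store is a hypothetical injected counter (Redis-backed, in-memory, whatever):

// Framework-agnostic rate limiter: does its own work, defers framework-specific parts
const rateLimit = async (
  req: Request,
  store: { increment: (key: string) => Promise<number> }, // hypothetical store interface
  limit = 100
): Promise<Response | { remaining: number }> => {
  const key = req.headers.get("X-Forwarded-For") ?? "anonymous";
  const count = await store.increment(key); // functionality-specific side effect

  if (count > limit) {
    // Suggest an early exit; the consuming framework decides whether to honour it
    return new Response("Too Many Requests", { status: 429 });
  }

  // Framework-specific concerns (headers, state, logging) are left to the caller
  return { remaining: limit - count };
};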

Rely on Return Values

Following from the above principle, inputs and outputs should be how developers interface with our middleware and/or utilities. For example, instead of mutating the Request object with some data, just return the data. It’s then up to the user to use that information, and perhaps that means assigning it to a req object directly – but that’s up to them, and it’s explicit.

By really leveraging return values, we can produce the same result as mutation, but with much more reusability and flexibility, by decoupling the “work done” from any side effects. For a given request, a middleware might compute some information, derive some state, or want to exit early by returning an error response.

These are all mutually exclusive options – or at the very least, they could be enforced as such. In the rare case you want to both update state and return early, you could create one middleware that updates the state and another that returns early, perhaps based on that state. In practice, empirically speaking, most existing middleware adds state to req, throws early errors or transforms input or output (yes, I checked this).
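As a sketch, those mutually exclusive outcomes map naturally onto return values (the middleware names here are illustrative):

// Each middleware returns exactly one of: an early Response, some derived
// information, or nothing at all
const requireJson = (req: Request): Response | void => {
  if (req.headers.get("Content-Type") !== "application/json") {
    return new Response("Unsupported Media Type", { status: 415 });
  }
};

const parseApiKey = (req: Request): Response | { apiKey: string } => {
  const apiKey = req.headers.get("X-Api-Key");
  if (!apiKey) return new Response("Missing API key", { status: 401 });
  return { apiKey };
};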

Stateless Requests, Stateful Framework

Following several of the above points, we can support statefulness effectively by returning state updates instead of mutably assigning them. State is framework-dependent: by assigning state to the request, we are not only violating the web standard Request implementation, but also shoehorning in application code. This approach is convenient, but that convenience comes at a meaningful cost.

State is a function of our app, not of HTTP, web standards or the request per se. When our “middleware”-esque functions return values, we know their type and can deal with them according to whatever a given framework offers. Whether that is req.state = state or db.save(state) is irrelevant to getting our user from an Authorization header, for example. We are effectively doing all the work but deferring this last step to the user. Both the user and the compiler will thus know more about the flow of the program.

Given the complexity of large-scale APIs, leveraging computers to do as much of the “thinking” as possible is a very worthwhile pursuit.

A Demonstration

In isolation, these principles have hopefully resonated, but are perhaps still a bit abstract.

To paint a real use case, consider a common piece of middleware meant to extract a JWT and return the user ID, or otherwise respond with an unauthorized error. Traditionally, this might look something like:

// ... express-style example
const authHeader = req.headers.authorization;
const token = authHeader?.replace("Bearer ", "");
const payload = JWT.verify(token, secret); // throws if the token is invalid

if (payload.id) {
  req.userId = payload.id; // framework-specific mutation
  next(); // framework-specific orchestration
  return;
}

next(new UnauthorizedException("Not logged in."));
// ...

The work being done is primarily extracting, parsing and verifying a JWT. The assignment (req.userId) and orchestration (next()) are framework-specific, but comprise only a small amount of the actual code. Moreover, the input request object is framework-specific, so the header extraction is also framework-specific. This is typical of almost all middleware you will have used over the years.

Instead, if we were to refactor this according to the principles outlined above, we might end up with:

// ...
const authHeader = req.headers.get("Authorization");
const token = authHeader?.replace("Bearer ", "");
const payload = JWT.verify(token, secret);

if (payload.id) {
  return payload.id; // Return some product/result
}

// Indicate we want to return early
return new Response("Not logged in", { status: 401 });
// ...

This implementation has no mutations and no framework-specific code. The only coupling is to the web standard Request and Response objects. Our user understands that this tool either retrieves an ID or exits early with a response. Both of these can be modified, ignored or respected, as desired.
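An end-developer (or a framework) might then consume it explicitly – here getUserId is assumed to be the refactored logic above, wrapped up as a function:

// Consuming the middleware: the caller decides what happens with the result
const handle = async (req: Request): Promise<Response> => {
  const result = await getUserId(req);

  if (result instanceof Response) {
    return result; // honour the suggested early exit (or ignore it)
  }

  // Explicitly decide where the user ID lives – no hidden req.userId
  return new Response(`Hello, user ${result}`);
};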

What about response handling?

Typically, middleware can – one way or another – handle both incoming requests and outgoing responses, in a concentric-ring fashion: InA -> InB -> [...] -> OutB -> OutA. Again, instead of deferring responsibility to third-party code via the common next() approach, we can instead utilise return values to communicate to the consumer what’s happening.

Perhaps the convention could be that returning a function indicates a response callback.

// ...poweredByMiddleware
return (response: Response) => {
  const headers = new Headers(response.headers);
  headers.set("X-Powered-By", "Custom-Framework");

  // Preserve the original status and body; only the headers change
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
};

Now our consumer and/or the framework is responsible for coordinating this back through the “stack” of middleware.

Our middleware type signature might then look something like:

type MiddlewareFn<TReturn> = (
  req: Request,
  ...rest: any[]
) => Response | TReturn | Promise<Response | TReturn>;
// If a response is returned we know this middleware _intends_ to exit.
// Whether we honour that is up to us.

In other words, we either return early with a Response, or we can (optionally) return some information back to the caller. Unlike when Express was created, we need not rely on callback() patterns to communicate when our routine is finished.
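A minimal sketch of how a framework might orchestrate such middleware, including the concentric unwinding of any returned response callbacks:

// Hypothetical framework loop coordinating the middleware stack
const run = async (
  req: Request,
  middlewares: MiddlewareFn<unknown>[],
  handler: (req: Request) => Promise<Response>
): Promise<Response> => {
  const responders: Array<(res: Response) => Response> = [];

  for (const middleware of middlewares) {
    const result = await middleware(req);

    if (result instanceof Response) return result; // honour the early exit
    if (typeof result === "function") {
      responders.push(result as (res: Response) => Response); // response hook
    }
    // Any other return value is state/information for the framework to handle
  }

  let response = await handler(req);

  // Unwind in reverse: InA -> InB -> handler -> OutB -> OutA
  for (const respond of responders.reverse()) {
    response = respond(response);
  }

  return response;
};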

Notably, we cannot change the Request, which means the Request the end-developer received is canonical, untouched, unchanged, “real”. And any additional information or functionality is provided separately by the framework – for example via a state parameter or context object.

This approach allows us to hook into the operations of the middleware more, if need be, thus simplifying the API and providing us with more control. Instead of encapsulating everything in a black-box fashion, we can understand what the middleware does by simply inspecting its type signature. The intentions, side effects and results of middleware operations become much more explicit, and our ability to modify behaviour is greatly improved.

Fundamentally, we are ensuring our server-side app is always orchestrated by the framework and not by third-party functionality. As a result, this middleware could be used in infinitely more places, and frameworks can avoid having to create framework-specific wrappers for it.

Conclusion

By largely adhering to these principles, there need only be one middleware/utility for each piece of functionality, and it would work across any framework implementation. Any variations or competitors would then be based purely on the merits of the implementation itself, instead of necessity.

Ultimately, the core nugget of value here is that the boundaries between middleware, framework, application/userland code and everything else are redefined – only slightly – such that interoperability is all but established automatically. By revoking some responsibility from obscure, black-box third-party packages, and granting it back to the user and/or framework, apps become substantially easier to reason about and more interchangeable.

On the surface, interchangeability or interoperability appears to confer the benefit of being able to swap stuff out easily. True, this is a benefit. But more fundamentally, it allows us to learn and become accustomed to fewer tools. It allows us to stop reinventing the wheel for the nth time and progress tooling instead of merely proliferating it. It means we can produce more reliable software by avoiding rudely encountering the edge cases of some other variation of fundamentally the same functionality. And that is worth pursuing, I think.


Addendum

This long-winded exploration of how interoperability might be achievable is part of a larger project of blog posts and real code-work occurring under the Webroute umbrella. Webroute is fundamentally a design philosophy, while also providing some building blocks for this “new web”, intended for real-world applications.