For decades, Extract, Transform, and Load (ETL) has been the backbone of data integration. But let's be honest: traditional ETL often feels like a relic from a bygone era. Monolithic, brittle, and slow, these pipelines are frequently a black box—managed by a specialized team and feared by the developers who depend on them. A minor change in a source schema can trigger a cascade of failures, and building new integrations can take weeks or months.
What if we could break free from this paradigm? What if we treated data transformations not as massive, one-off jobs, but as lightweight, reusable, and composable microservices?
This is the promise of Intelligent Data Transformation as Code. By exposing complex data mapping, cleansing, and format conversion logic through a simple API, we can build more resilient, agile, and developer-friendly data workflows.
If you're a developer, you've likely felt the friction of traditional ETL. The core problems are deeply ingrained in its monolithic nature: pipelines are tightly coupled to source schemas, the transformation logic is a black box owned by a separate team, and even small changes demand slow, coordinated releases.
It's a model that doesn't fit the fast-paced, API-driven world we live in today. We need a new approach.
Imagine a world where data transformation is just another service you can call. This is the core idea behind the "data microservice" model enabled by transform.do.
A data microservice is a small, independent service with a single responsibility: to transform data from a source structure to a target structure. Its interface isn't a complex dashboard; it's a clean, stable API endpoint.
This shift to ETL as Code unlocks the same benefits that microservices brought to application development: single-responsibility services, stable interfaces, independent versioning, and easy composition into larger workflows.
At transform.do, we turn this concept into a practical reality. We provide a simple API and SDK that let you define and execute complex transformations without managing any infrastructure. Our AI-powered agentic workflow handles the heavy lifting, so you can focus on the logic.
Here’s how simple it is to reshape a data structure using our JavaScript/TypeScript SDK:
import { Agent } from "@do/sdk";

// Initialize the transformation agent
const transform = new Agent("transform.do");

// Define your source data and transformation rules
const sourceData = [
  { "user_id": 101, "first_name": "Jane", "last_name": "Doe", "join_date": "2023-01-15T10:00:00Z" },
  { "user_id": 102, "first_name": "John", "last_name": "Smith", "join_date": "2023-02-20T12:30:00Z" }
];

const transformations = {
  targetFormat: "json",
  rules: [
    { rename: { "user_id": "id", "first_name": "firstName", "last_name": "lastName" } },
    { convert: { "join_date": "date('YYYY-MM-DD')" } },
    { addField: { "fullName": "{{firstName}} {{lastName}}" } }
  ]
};

// Execute the transformation
const result = await transform.run({
  source: sourceData,
  transform: transformations
});

console.log(result.data);

/*
Output:
[
  {
    "id": 101,
    "firstName": "Jane",
    "lastName": "Doe",
    "join_date": "2023-01-15",
    "fullName": "Jane Doe"
  },
  {
    "id": 102,
    "firstName": "John",
    "lastName": "Smith",
    "join_date": "2023-02-20",
    "fullName": "John Smith"
  }
]
*/
In this example, the transformations object is your ETL as code. It’s a declarative, version-controllable definition of your intent. You can perform powerful actions like renaming and remapping fields, converting formats and data types, cleansing values, and deriving new fields from existing ones.
You define the what, and our agent handles the how.
This API-first approach is designed to scale with your needs.
Worried about large datasets? Our platform processes data in efficient streams and runs workflows asynchronously. You can kick off a transformation on terabytes of data, get back to your work, and receive a webhook when it's complete.
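As a rough sketch of what that asynchronous flow could look like with the SDK shown above (the options object, async flag, and webhookUrl parameter here are illustrative assumptions, not documented names):

import { Agent } from "@do/sdk";

const transform = new Agent("transform.do");

// Kick off a large transformation without blocking on the result.
// NOTE: `options`, `async`, and `webhookUrl` are hypothetical names used
// for illustration — consult the transform.do docs for the real ones.
const job = await transform.run({
  source: { type: "url", location: "https://example.com/exports/events.csv" }, // hypothetical remote source
  transform: { targetFormat: "json", rules: [ /* same rule shapes as above */ ] },
  options: {
    async: true,                                                    // run the workflow asynchronously
    webhookUrl: "https://api.example.com/hooks/transform-complete"  // notified when the job finishes
  }
});

console.log(job.id); // track the job, then handle the webhook instead of a full inline result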
Need to handle multiple formats? We natively support JSON, CSV, XML, and YAML, and our agentic workflow can be taught to handle proprietary formats as well.
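In practice, switching output formats should be a one-line change to the same run() call used earlier; this sketch reuses sourceData from the example above and assumes targetFormat also accepts "csv" (treat the exact value as an assumption):

// Reuses sourceData and the rename rule from the earlier example, but emits CSV.
// Assumes targetFormat accepts "csv" in addition to "json".
const csvResult = await transform.run({
  source: sourceData,
  transform: {
    targetFormat: "csv",
    rules: [
      { rename: { "user_id": "id", "first_name": "firstName", "last_name": "lastName" } }
    ]
  }
});

console.log(csvResult.data);
// e.g. "id,firstName,lastName\n101,Jane,Doe\n102,John,Smith"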
The true power, however, lies in composability. Since every transformation workflow is a service with a stable API, you can chain them together to orchestrate sophisticated data flows. For example, one service can convert a raw CSV export to JSON, a second can cleanse and standardize the records, and a third can enrich them before they're loaded downstream; a sketch of this chaining appears after the next paragraph.
This entire pipeline is composed of small, independent, and reusable services—all defined as code.
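Here is a minimal sketch of that kind of chaining, assuming each step uses the same run() interface shown earlier and that one step's output can be passed directly as the next step's source:

// A toy CSV export standing in for a real upstream feed.
const rawCsvExport = "user_id,first_name,last_name\n101,Jane,Doe\n102,John,Smith";

// Step 1: normalize the raw CSV into JSON with consistent field names.
const normalized = await transform.run({
  source: rawCsvExport,
  transform: {
    targetFormat: "json",
    rules: [
      { rename: { "user_id": "id", "first_name": "firstName", "last_name": "lastName" } }
    ]
  }
});

// Step 2: enrich the normalized records — the output of one service
// becomes the input of the next, so each step stays small and reusable.
const enriched = await transform.run({
  source: normalized.data,
  transform: {
    targetFormat: "json",
    rules: [
      { addField: { "fullName": "{{firstName}} {{lastName}}" } }
    ]
  }
});

console.log(enriched.data);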
The era of the slow, opaque ETL job is over. The future of data integration is agile, developer-driven, and built on the same microservice principles that have revolutionized application development. By thinking of data transformation as a composable service, you can build systems that are faster to develop, easier to maintain, and infinitely more scalable.
Ready to stop wrestling with pipelines and start building data microservices? Learn more at transform.do.
What kind of data transformations can I perform?
You can perform a wide range of transformations, including data mapping (e.g., renaming fields), format conversion (JSON to CSV), data cleansing (e.g., standardizing addresses), and data enrichment by combining or adding new fields based on existing data.
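As a concrete illustration, a single rule set can mix several of these transformation types; the rename, convert, and addField rules below mirror the example earlier in this post, while the cleanse rule is a hypothetical name used only to show where cleansing logic would slot in:

const profileRules = [
  { rename: { "email_address": "email" } },                       // data mapping
  { convert: { "signup_date": "date('YYYY-MM-DD')" } },           // format / type conversion
  { cleanse: { "country": "uppercase" } },                        // hypothetical cleansing rule
  { addField: { "displayName": "{{firstName}} {{lastName}}" } }   // enrichment from existing fields
];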
How does transform.do handle large datasets or ETL jobs?
Our platform is built for scale. Data is processed in efficient streams, and workflows can run asynchronously for large datasets. You can transform terabytes of data without blocking your own systems and receive a webhook or notification upon completion.
Can I chain multiple transformations together?
Yes. A transformation workflow on .do is a service with a stable API endpoint. This allows you to chain multiple transformations together or integrate them with other services to build complex, multi-step data processing pipelines, all defined as code.