Mastering Complex Data Structures: How to Flatten Nested JSON with an API

If you're a developer, you've been there. You query a third-party service, a NoSQL database, or an internal microservice, and what you get back is a deeply nested JSON object. While JSON's hierarchical structure is great for representing complex data entities, it becomes a major roadblock when you need to load that data into a BI tool, a spreadsheet, or a relational database.

The challenge is always the same: how do you effectively and reliably flatten this nested structure into a simple, tabular format? The traditional answer involves writing brittle, recursive scripts that are a pain to write and even more of a pain to maintain.

But what if you could describe the transformation you want and let an intelligent service handle the execution? Let's explore how to solve this common data transformation problem using a simple, powerful API.

The Problem with Nested JSON

Nested JSON is data that contains objects within objects or arrays of objects. For example, a customer order might look like this:

{
  "orderId": "ORD-123",
  "customer": {
    "id": "CUST-A",
    "contact": {
      "name": "Alice Johnson",
      "email": "alice@example.com"
    }
  },
  "items": [
    { "sku": "SKU-001", "name": "Widget A", "price": 10.00 },
    { "sku": "SKU-002", "name": "Widget B", "price": 15.50 }
  ]
}

This structure is perfectly logical, but spreadsheets, CSV importers, and most BI tools expect flat rows, and even warehouses that can store nested data (like BigQuery or Redshift) are usually simpler to query when the data arrives flat, like this:

order_id	customer_id	customer_name	customer_email	item_sku	item_name	item_price
ORD-123	CUST-A	Alice Johnson	alice@example.com	SKU-001	Widget A	10.00
ORD-123	CUST-A	Alice Johnson	alice@example.com	SKU-002	Widget B	15.50

Flattening the data involves two key tasks:

Un-nesting objects: Accessing fields from nested objects (like customer.contact.name) and bringing them to the top level.
Exploding arrays: Creating a separate record for each element in an array (like items).
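
To make those two tasks concrete, here is a minimal hand-written sketch in TypeScript. The helper names (unnest, explode) and the underscore-joined column names are assumptions for illustration only. Column names are derived mechanically from the path, so they come out as customer_contact_name rather than the friendlier customer_name in the table above.

```typescript
// Hand-written sketch of the two flattening tasks; all names are illustrative.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };
type FlatRow = Record<string, string | number | boolean | null>;

// Un-nest: copy every scalar leaf into `out`, joining path segments with "_".
// Arrays are skipped here; explode() below deals with them.
function unnest(obj: { [key: string]: Json }, prefix = "", out: FlatRow = {}): FlatRow {
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}_${key}` : key;
    if (value === null || typeof value !== "object") {
      out[name] = value;
    } else if (!Array.isArray(value)) {
      unnest(value, name, out);
    }
  }
  return out;
}

// Explode: emit one row per object in obj[arrayKey], repeating the parent fields.
function explode(obj: { [key: string]: Json }, arrayKey: string, itemPrefix: string): FlatRow[] {
  const parent = unnest(obj);
  const items = obj[arrayKey];
  if (!Array.isArray(items)) return [parent];
  const rows: FlatRow[] = [];
  for (const item of items) {
    if (item !== null && typeof item === "object" && !Array.isArray(item)) {
      rows.push({ ...parent, ...unnest(item, itemPrefix) });
    }
  }
  return rows;
}

const order: { [key: string]: Json } = {
  orderId: "ORD-123",
  customer: {
    id: "CUST-A",
    contact: { name: "Alice Johnson", email: "alice@example.com" },
  },
  items: [
    { sku: "SKU-001", name: "Widget A", price: 10.0 },
    { sku: "SKU-002", name: "Widget B", price: 15.5 },
  ],
};

const rows = explode(order, "items", "item");
console.log(rows);
```

Even this small version hard-codes which array to explode and how columns are named, which is the kind of detail that tends to break when the source schema shifts.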

Writing a custom script to handle this often leads to fragile code that breaks the moment the source schema changes. This is where a dedicated data transformation service becomes invaluable.

The Modern Solution: Data Transformation as Code

Instead of writing imperative code that details how to loop and recurse through a JSON object, a modern approach lets you declaratively define what the final structure should look like.

This is the core philosophy behind transform.do: intelligent data transformation as code. You define your transformation rules in a simple format that lives in version control, and our AI agents handle the execution. This turns a complex ETL job into a stable, reusable service you can call from anywhere.

A Practical Example: Flattening Order Data with transform.do

Let's take our nested order data and flatten it using the transform.do agent. All we need to do is define our source data and a set of transformation rules.

Here's how you can do it with a few lines of code:

import { Agent } from "@do/sdk";

// Initialize the transformation agent
const transform = new Agent("transform.do");

// 1. Define your nested source data
const sourceData = [{
  "orderId": "ORD-123",
  "orderDate": "2024-05-21T15:00:00Z",
  "customer": {
    "id": "CUST-A",
    "contact": { "name": "Alice Johnson", "email": "alice@example.com" }
  },
  "items": [
    { "sku": "SKU-001", "name": "Widget A", "price": 10.00, "quantity": 2 },
    { "sku": "SKU-002", "name": "Widget B", "price": 15.50, "quantity": 1 }
  ],
  "shipping": { "address": "123 Main St, Anytown, USA" }
}];

// 2. Define your flattening and transformation rules
const transformations = {
  rules: [
    // "Explode" the items array to create a row for each item
    { unwind: "items" },

    // Map nested fields to a flat structure using dot notation
    { rename: {
        "orderId": "order_id",
        "orderDate": "order_date",
        "customer.id": "customer_id",
        "customer.contact.name": "customer_name",
        "items.sku": "item_sku",
        "items.name": "item_name",
        "items.price": "item_price",
        "items.quantity": "item_quantity",
        "shipping.address": "shipping_address"
      }
    },

    // Create a new calculated field
    { addField: { "total_price": "{{item_price}} * {{item_quantity}}" } },

    // Clean up the original complex fields
    { removeFields: ["customer", "items", "shipping"] }
  ]
};

// 3. Execute the transformation
const result = await transform.run({
  source: sourceData,
  transform: transformations
});

console.log(result.data);

The Result: Perfectly Flat JSON

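Executing this workflow produces a clean, flat array of objects, ready for any analytics tool or database. The agent automatically handled the array explosion and nested field mapping based on our simple rules. Note that the customer's email does not appear in the output: no rule maps customer.contact.email to a flat field, so it is dropped along with the rest of the customer object.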

[
  {
    "order_id": "ORD-123",
    "order_date": "2024-05-21T15:00:00Z",
    "customer_id": "CUST-A",
    "customer_name": "Alice Johnson",
    "item_sku": "SKU-001",
    "item_name": "Widget A",
    "item_price": 10,
    "item_quantity": 2,
    "shipping_address": "123 Main St, Anytown, USA",
    "total_price": 20
  },
  {
    "order_id": "ORD-123",
    "order_date": "2024-05-21T15:00:00Z",
    "customer_id": "CUST-A",
    "customer_name": "Alice Johnson",
    "item_sku": "SKU-002",
    "item_name": "Widget B",
    "item_price": 15.5,
    "item_quantity": 1,
    "shipping_address": "123 Main St, Anytown, USA",
    "total_price": 15.5
  }
]
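
If you want to reason about what each rule does without calling the service, it can help to model the rules locally. The sketch below is one possible reading of the four rule types used in the example (unwind, rename, addField, removeFields). It is an assumption made for illustration, not transform.do's implementation, and the helper names getPath and applyRule are hypothetical. The addField handling only understands products of placeholders such as "{{a}} * {{b}}".

```typescript
// Local sketch of one possible reading of the four rule types.
// Assumption: this is NOT transform.do's implementation, only an illustration.
type Row = Record<string, unknown>;
type Rule =
  | { unwind: string }
  | { rename: Record<string, string> }
  | { addField: Record<string, string> }
  | { removeFields: string[] };

// Read a dot-separated path such as "customer.contact.name".
function getPath(row: Row, path: string): unknown {
  let current: unknown = row;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Row)[key];
  }
  return current;
}

function applyRule(rows: Row[], rule: Rule): Row[] {
  if ("unwind" in rule) {
    // One output row per array element; the field then holds that single element.
    const field = rule.unwind;
    const out: Row[] = [];
    for (const row of rows) {
      const value = row[field];
      if (!Array.isArray(value)) { out.push(row); continue; }
      for (const item of value) out.push({ ...row, [field]: item });
    }
    return out;
  }
  if ("rename" in rule) {
    const mapping = rule.rename;
    return rows.map((row) => {
      const next: Row = { ...row };
      for (const [from, to] of Object.entries(mapping)) {
        next[to] = getPath(row, from);
        if (!from.includes(".")) delete next[from]; // top-level keys are moved, not copied
      }
      return next;
    });
  }
  if ("addField" in rule) {
    // Only handles products of placeholders like "{{a}} * {{b}}".
    const fields = rule.addField;
    return rows.map((row) => {
      const next: Row = { ...row };
      for (const [name, template] of Object.entries(fields)) {
        const placeholders = template.match(/\{\{\w+\}\}/g) ?? [];
        next[name] = placeholders
          .map((p) => Number(row[p.slice(2, -2)]))
          .reduce((product, n) => product * n, 1);
      }
      return next;
    });
  }
  const toRemove = rule.removeFields;
  return rows.map((row) => {
    const next: Row = { ...row };
    for (const field of toRemove) delete next[field];
    return next;
  });
}

const order: Row = {
  orderId: "ORD-123",
  customer: { id: "CUST-A", contact: { name: "Alice Johnson" } },
  items: [
    { sku: "SKU-001", price: 10.0, quantity: 2 },
    { sku: "SKU-002", price: 15.5, quantity: 1 },
  ],
};

const rules: Rule[] = [
  { unwind: "items" },
  { rename: {
      "orderId": "order_id",
      "customer.id": "customer_id",
      "customer.contact.name": "customer_name",
      "items.sku": "item_sku",
      "items.price": "item_price",
      "items.quantity": "item_quantity",
  } },
  { addField: { total_price: "{{item_price}} * {{item_quantity}}" } },
  { removeFields: ["customer", "items"] },
];

// Rules run in order, each one consuming the rows produced by the previous rule.
const flat = rules.reduce<Row[]>((rows, rule) => applyRule(rows, rule), [order]);
console.log(flat);
```

The ordering matters in this reading: unwind has to run before rename so that items.sku resolves against a single item, and removeFields runs last so the nested sources are still available to the earlier rules.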

Beyond Flattening: A Complete Transformation Toolkit

Flattening is just one piece of the puzzle. The transform.do platform is designed to handle a wide range of data transformation tasks within the same simple workflow:

Data Mapping: Easily rename fields (e.g., orderId to order_id).
Data Enrichment: Add new fields based on existing data (e.g., calculating total_price).
Data Cleansing: Standardize formats, clean up addresses, or handle null values.
Format Conversion: Effortlessly convert the output from JSON to CSV, XML, or other formats by changing a single parameter.
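
Once the data is flat, converting it to another format is mostly mechanical. As an illustration of what a JSON-to-CSV step involves, here is a small local sketch. The toCsv and escapeCell helpers are hypothetical and are not part of any transform.do API. They assume every row has the same keys as the first row and follow the common RFC 4180 quoting convention.

```typescript
// Illustrative CSV serializer for flat rows; not part of any transform.do API.
type Cell = string | number | boolean | null;
type FlatRecord = Record<string, Cell>;

// Quote a cell when it contains a comma, a double quote, or a line break.
function escapeCell(value: Cell): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Assumption: the first row's keys define the column order for every row.
function toCsv(rows: FlatRecord[]): string {
  if (rows.length === 0) return "";
  const headers = Object.keys(rows[0]);
  const lines = [headers.map((h) => escapeCell(h)).join(",")];
  for (const row of rows) {
    lines.push(headers.map((h) => escapeCell(row[h] ?? null)).join(","));
  }
  return lines.join("\r\n");
}

const flatRows: FlatRecord[] = [
  { order_id: "ORD-123", item_sku: "SKU-001", shipping_address: "123 Main St, Anytown, USA", total_price: 20 },
  { order_id: "ORD-123", item_sku: "SKU-002", shipping_address: "123 Main St, Anytown, USA", total_price: 15.5 },
];

const csv = toCsv(flatRows);
console.log(csv);
```

The shipping address is the interesting case here: it contains commas, so it has to be quoted or the row would split into the wrong number of columns.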

By defining these complex ETL pipelines as simple, version-controlled services, you can chain them together, integrate them into CI/CD, and build robust, scalable data processing workflows without the maintenance headache.

Stop writing one-off scripts. Start building intelligent data transformation services.

Ready to simplify your data workflows? Visit transform.do to learn more and get started for free.