Chakra Replication: Real-Time Data Transfer, Built for Scale

Intro
Today, we’re excited to share the public launch of Chakra Replication: seamless, scalable data transfers that let businesses maintain data consistency across platforms effortlessly. With this new functionality, organizations can move terabyte-scale data footprints in near real-time with high performance, and with best-in-class developer ergonomics and cost profiles.
Background
In the early 2000s, tech companies would often ask interviewees a simple question: “What’s the fastest way to transfer 100 TB of data between SF and NYC?”
The answers ranged from high-speed network transfers to P2P transfer solutions, but the humorous (and truthful) answer was physically shipping the data. Dubbed “sneakernet”, this method involves loading the data onto physical media and moving it from one location to another. Surprisingly enough, this is still an enterprise service offering from AWS (Snowball).
Over the last decade, data replication (moving data from one location to another) has become more efficient across the structured data landscape thanks to tools like Fivetran and Airbyte, but these products have grown bloated and expensive.
In particular, the larger the data footprint, the more these products struggle on performance. On cost, moving 100 TB a month on Fivetran today will run an enterprise several hundred thousand dollars, often making replication a cost-prohibitive exercise.
Solution
With the release of replication in our data warehouse, we’re aiming to simplify, accelerate, and cut the cost of this once-cumbersome process.
Out of the gate, Chakra supports the following data locations, with more integrations in the pipeline:
- AWS S3
- Google Cloud Storage (GCS)
- Snowflake
- Databricks (Delta Lake support in S3)

With a few clicks, you can create a source and a sink and start transferring structured data between these locations. Data is fully partitioned according to your requirements to ensure fast query performance on reads.
We’ve built a fully custom indexing, partitioning, and monitoring platform to make the process as efficient as possible while minimizing egress costs where appropriate.
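Our partitioning engine is custom, but to make the idea concrete, here’s what a Hive-style partitioned layout (the kind that keeps filtered reads fast) looks like, sketched with pyarrow; the table and column names are illustrative, not part of the product:

```python
# Illustration only: Chakra's partitioning engine is custom and proprietary.
# This sketch shows the general idea of a Hive-style partitioned layout,
# written with pyarrow. Column names are made up.
import pyarrow as pa
import pyarrow.dataset as ds

table = pa.table({
    "order_id": [1, 2, 3, 4],
    "region": ["us-east", "us-east", "eu-west", "eu-west"],
    "order_date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"],
    "amount": [19.99, 5.00, 42.50, 7.25],
})

# Writing with partition columns produces a directory tree like
#   catalog/region=us-east/order_date=2024-01-01/part-0.parquet
# so a query filtered on region/date only touches the matching files.
ds.write_dataset(
    table,
    base_dir="catalog",
    format="parquet",
    partitioning=ds.partitioning(
        pa.schema([("region", pa.string()), ("order_date", pa.string())]),
        flavor="hive",
    ),
)
```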
Sample Use Case
Consider an e-commerce provider that keeps their product catalog in Snowflake. The data scientists need that same data in Google Cloud Storage for their machine learning models, while the analytics team wants it in Databricks for their dashboards.
Previously, they would write custom ETL pipelines: code to extract the data, transform it, and load it into each destination. These homegrown solutions are notorious time sinks, brittle to operate, and often leave teams waiting hours for fresh data.
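To see the kind of glue code this replaces, here’s a minimal sketch of the extract-and-load half of such a pipeline, assuming snowflake-connector-python and google-cloud-storage; credentials, table names, and bucket names are placeholders, and a real version would also need scheduling, retries, and schema-change handling:

```python
# A minimal sketch of a homegrown Snowflake -> GCS pipeline.
# Credentials, names, and paths are placeholders.
import snowflake.connector
from google.cloud import storage

def export_catalog_to_gcs() -> None:
    conn = snowflake.connector.connect(
        account="my_account",   # placeholder
        user="etl_user",        # placeholder
        password="...",         # placeholder
        warehouse="ETL_WH",
        database="SHOP",
        schema="PUBLIC",
    )
    try:
        cur = conn.cursor()
        cur.execute("SELECT * FROM product_catalog")
        df = cur.fetch_pandas_all()  # pulls the full table into memory
    finally:
        conn.close()

    # Serialize and upload; large tables would need chunking/partitioning.
    bucket = storage.Client().bucket("ml-feature-store")  # placeholder
    blob = bucket.blob("catalog/product_catalog.parquet")
    blob.upload_from_string(df.to_parquet())

export_catalog_to_gcs()
```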
Chakra Replication simplifies this process to just a few clicks (sketched in code after the list):
- Connect Snowflake to GCS for the ML team
- Set up Snowflake to Databricks for the analysts
- Configure refresh rates and partitioning strategy
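In code terms, the setup amounts to something like the following. To be clear, this is a hypothetical sketch rather than our actual API: the `chakra` client and every name in it are invented to show the shape of the configuration, which in the product happens through the UI.

```python
# Hypothetical sketch only: the `chakra` client and every call below are
# invented for illustration. In the product, this setup happens in the UI.
from chakra import ReplicationClient  # hypothetical client

client = ReplicationClient(api_key="...")  # placeholder credentials

snowflake_source = client.create_source(
    kind="snowflake",
    database="SHOP",
    table="product_catalog",
)

# One source can fan out to multiple sinks.
client.create_pipeline(
    source=snowflake_source,
    sink=client.create_sink(kind="gcs", bucket="ml-feature-store"),
    refresh_interval="5m",                  # near real-time refresh
    partition_by=["region", "order_date"],  # partitioning strategy
)
client.create_pipeline(
    source=snowflake_source,
    sink=client.create_sink(kind="databricks", catalog="analytics"),
    refresh_interval="15m",
    partition_by=["region"],
)
```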
The teams get fresher data, the engineers spend less time maintaining pipelines, and the customer saves on both infrastructure and operational costs. What used to take weeks of engineering work to build and maintain can now be set up in minutes through our intuitive interface.
How do I get started?
To get started, check out our replication feature directly in the product here. Once you’ve signed up, you’ll receive a link to schedule an onboarding session, where we’d be happy to walk you through any questions about the new feature launch.
You can check out consolidated docs for the replication feature here.