Challenges with Incremental S3 Objects

Why You Can’t Easily List “New Files” in S3

S3-compatible object storage systems are designed for massive scalability and simplicity, but that comes with trade-offs:

  • Stateless Operations: Each S3 API request is self-contained, and the service keeps no record of what a client has already seen. This makes the API highly scalable but limits features like incremental change tracking.
  • Flat Namespace, No File System Semantics: S3 stores objects in a flat namespace within buckets. There’s no concept of folders or file system metadata; what looks like a folder is only a shared key prefix.
  • No Built-in Change Log or Index: S3 doesn’t maintain a native index of recently added or modified objects. To find new files, you must list every object in the bucket and compare timestamps or versions, which can be slow and expensive for large buckets.
  • Eventual Consistency (in some systems): Some S3-compatible systems offer eventual consistency for listing operations, meaning newly added objects might not appear immediately in a list. This makes real-time change tracking unreliable without application add-ons or external tooling.
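
The list-and-compare approach described above can be sketched as follows. This is a minimal illustration, not a production implementation: the function names (snapshot_from_pages, diff_snapshots) and the sample data are hypothetical, and the page dictionaries only assume the "Contents", "Key", "ETag", and "LastModified" fields that a ListObjectsV2-style response carries.

```python
from typing import Dict, Iterable, List, Tuple

# A snapshot maps object key -> (ETag, LastModified) as reported by a listing.
Snapshot = Dict[str, Tuple[str, str]]


def snapshot_from_pages(pages: Iterable[dict]) -> Snapshot:
    """Flatten ListObjectsV2-style response pages into a key -> metadata map."""
    snapshot: Snapshot = {}
    for page in pages:
        # A page with no matching objects has no "Contents" entry.
        for obj in page.get("Contents", []):
            snapshot[obj["Key"]] = (obj["ETag"], str(obj["LastModified"]))
    return snapshot


def diff_snapshots(
    previous: Snapshot, current: Snapshot
) -> Tuple[List[str], List[str], List[str]]:
    """Compare two full listings; the work grows with the total object count,
    not with the number of objects that actually changed."""
    added = sorted(k for k in current if k not in previous)
    deleted = sorted(k for k in previous if k not in current)
    modified = sorted(
        k for k in current if k in previous and current[k] != previous[k]
    )
    return added, modified, deleted


# Hypothetical sample pages standing in for real listing responses.
yesterday = snapshot_from_pages([
    {"Contents": [
        {"Key": "a.txt", "ETag": '"1"', "LastModified": "2024-01-01T00:00:00Z"},
        {"Key": "b.txt", "ETag": '"2"', "LastModified": "2024-01-01T00:00:00Z"},
    ]},
])
today = snapshot_from_pages([
    {"Contents": [
        {"Key": "a.txt", "ETag": '"9"', "LastModified": "2024-01-02T00:00:00Z"},
        {"Key": "c.txt", "ETag": '"3"', "LastModified": "2024-01-02T00:00:00Z"},
    ]},
    {},  # an empty trailing page
])

added, modified, deleted = diff_snapshots(yesterday, today)
print("added:", added)
print("modified:", modified)
print("deleted:", deleted)
```

With boto3, pages of this shape could come from boto3.client("s3").get_paginator("list_objects_v2").paginate(Bucket=...), where the bucket name is whatever the deployment uses. Even a single changed object requires listing the whole bucket and holding or storing the previous snapshot, which is why this pattern scales poorly.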

Because of these limitations, determining which objects are new, modified, or deleted is computationally expensive and time-consuming. In use cases like backup, large object stores with more than 200 million objects may take more than 24 hours to process. Most organizations would find this situation unacceptable.
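
To put the figures above in perspective, the following back-of-the-envelope calculation estimates how many list requests a full scan needs and what throughput a 24-hour window implies. The 1,000-keys-per-request page size is an assumption based on the AWS S3 ListObjectsV2 limit; other S3-compatible systems may use a different maximum.

```python
import math

OBJECTS = 200_000_000        # object count quoted in the text
MAX_KEYS_PER_PAGE = 1_000    # assumed page size (AWS S3 ListObjectsV2 maximum)
WINDOW_SECONDS = 24 * 3600   # the 24-hour figure quoted in the text

# Number of list requests needed to enumerate every object once.
list_requests = math.ceil(OBJECTS / MAX_KEYS_PER_PAGE)

# Sustained rate needed just to finish the scan inside the window.
required_objects_per_second = OBJECTS / WINDOW_SECONDS

print(f"list requests per full scan: {list_requests:,}")
print(f"objects/second to finish in 24 h: {required_objects_per_second:,.0f}")
```

Under these assumptions a single pass takes 200,000 list requests, and the scan must sustain roughly 2,300 objects per second, including any per-object comparison work, just to complete within a day.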