The Problem with Timeseries Data in Machine Learning Feature Systems

• Etsy’s Feature Systems introduced real‑time features via Rivulet, feeding ML models with timeseries data.
• A recommendation engineer flagged that Avro timestamp logic caused precision mismatches across Pandas, NumPy, Spark.
• Millisecond‑level timestamps were read as nanoseconds, creating potential training‑serving skew and silent failures.
• The team opted to drop the timestamp type, serving features as plain numeric longs instead.
• This change reduces cross‑framework incompatibilities and stabilizes downstream recommendation, search, and ad models.
• The incident highlights the importance of data type consistency in feature pipelines.
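
To make the precision mismatch concrete, here is a minimal sketch using only the Python standard library. It is not Etsy's code: the sample value and the helper name `as_utc` are assumptions chosen for illustration. It decodes one integer twice, once as epoch milliseconds (the unit it was written in) and once as epoch nanoseconds (the unit a reader might assume).

```python
from datetime import datetime, timezone

MS_PER_SECOND = 1_000
NS_PER_SECOND = 1_000_000_000


def as_utc(epoch_value: int, units_per_second: int) -> datetime:
    """Decode an epoch integer, given how many of its units make up one second."""
    return datetime.fromtimestamp(epoch_value / units_per_second, tz=timezone.utc)


# Hypothetical event time, written by the producer as epoch milliseconds.
event_epoch_ms = 1_700_000_000_000

# Reader that respects the producer's unit.
correct = as_utc(event_epoch_ms, MS_PER_SECOND)

# Reader that assumes nanoseconds: same integer, very different instant.
misread = as_utc(event_epoch_ms, NS_PER_SECOND)

print(correct.isoformat())  # a date in November 2023
print(misread.isoformat())  # a date in January 1970
```

Neither decode raises an error, which is why this kind of mismatch can fail silently: the integer is valid under both interpretations, and only the resulting instant differs.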

Article Summaries:

Etsy’s Feature Systems team, which supplies machine‑learning models with feature data, recently added real‑time features via its Rivulet streaming platform. An ML practitioner flagged that the team’s use of Avro’s timestamp logical type, set to millisecond precision, would be interpreted inconsistently by popular libraries (Pandas, NumPy, Spark). The differing precisions could create a training‑serving mismatch and silent failures in production. After investigating the root cause, the team concluded that the safest fix is to drop the timestamp type altogether and serve the values as plain numeric longs, thereby avoiding cross‑framework precision drift.
Etsy’s Feature Systems team rolled out real‑time features through its Rivulet streaming platform, enabling models for search, ads, and recommendations to ingest live data such as “most recent add‑to‑carts.” Soon after launch, an ML engineer flagged a “major problem” with the timestamp datatype used to export these features. The team’s Avro‑based store keeps timestamps at millisecond precision, but when read by Pandas, NumPy, or Spark, the values are interpreted as nanosecond‑precision datetime objects, creating a training‑serving skew. After investigating the root cause, the team is moving to a plain numeric type (e.g., Long) to avoid silent failures and ensure consistency across frameworks.
Etsy’s Feature Systems team has identified a critical issue with the way time‑series data is handled in its machine‑learning pipeline. After launching real‑time features via the Rivulet streaming platform, an ML practitioner warned that the Avro “timestamp” logical type, configured for millisecond precision, would be interpreted inconsistently by Pandas, NumPy, and Spark, producing nanosecond‑level timestamps. This mismatch could create a training‑serving skew and silent failures in downstream models for search, ads, and recommendations. The team is investigating the root cause and plans to replace the timestamp type with a plain numeric format (e.g., Long) to ensure consistent precision across all frameworks.
Etsy’s Feature Systems team discovered that real‑time timeseries features delivered via its Rivulet streaming platform were causing downstream machine‑learning models to misbehave. The problem stemmed from the Avro “timestamp” logical type, which was stored with millisecond precision but interpreted as nanoseconds by Pandas, NumPy, and Spark, creating a training‑serving skew. After receiving a warning from a recommendation‑model engineer, the team investigated the root cause and found that the discrepancy was a symptom of broader cross‑framework representation issues. To avoid silent failures, they recommended abandoning the timestamp type altogether and serving features as plain numeric values such as longs.
Etsy’s Feature Systems team rolled out real‑time features through its Rivulet streaming platform, enabling machine‑learning models for search, ads, and recommendations to use time‑series data such as “most recent add‑to‑carts.” Soon after launch, an ML engineer flagged that the timestamp datatype used in the Avro‑based offline store would be interpreted inconsistently by downstream frameworks: Pandas, NumPy, and Spark would read millisecond‑precision timestamps as nanosecond‑precision datetime objects, creating a training‑serving skew. The team investigated the root cause, confirming that the Avro timestamp annotation was not respected across systems. To avoid silent failures, they plan to replace the timestamp type with a basic numeric type (e.g., long) while continuing to analyze the underlying mismatch.
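
The fix the summaries describe, serving the value as a plain long, can be sketched as follows. This is an illustration under stated assumptions, not Etsy's implementation: the field name `last_add_to_cart_epoch_ms`, the helper `seconds_since`, and the sample values are all hypothetical. The idea is that the unit lives in the field name and the arithmetic stays on integers, so no framework gets a chance to reinterpret the precision.

```python
MS_PER_SECOND = 1_000


def seconds_since(event_epoch_ms: int, now_epoch_ms: int) -> float:
    """Recency in seconds, computed from two epoch-millisecond longs."""
    if event_epoch_ms > now_epoch_ms:
        # A future event usually signals a unit mix-up or clock problem upstream.
        raise ValueError("event timestamp is later than 'now'; check the unit")
    return (now_epoch_ms - event_epoch_ms) / MS_PER_SECOND


# Hypothetical feature row: a plain long whose unit is spelled out in the name.
feature_row = {"last_add_to_cart_epoch_ms": 1_700_000_000_000}

# Hypothetical request time, 90 seconds after the event, also in epoch ms.
request_time_ms = 1_700_000_090_000

recency = seconds_since(feature_row["last_add_to_cart_epoch_ms"], request_time_ms)
print(recency)  # 90.0
```

Because the same function can run on the same integer representation in both training and serving code, the derived feature is computed identically in both paths.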

Sources: