Here's my non-exhaustive checklist of what I think every good ML-for-DB paper should include, especially papers on query optimization:

  • Tails: not just an average. Show the distribution, or at least the 90th/95th/99th percentiles (see the first sketch after this list)
  • Query performance: show me that the queries actually get faster
  • Overhead: how much do training and inference cost?
  • Optimality: if possible, how close are you to the optimal prediction / latency?
  • Comparisons: existing systems (commercial ones too, when possible), other learned approaches, and naive baselines
  • Failures: show me systematic failures. If you think there aren't any, look harder
  • Pareto analysis: show me the tradeoffs (see the Pareto sketch after this list). No way you're better at everything.
  • Interpretation: what did the model learn? Examine the weights. Examine especially successful cases. It's ok to hypothesize!
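
For the tails point, here's a minimal sketch of reporting percentiles rather than only a mean. The latency arrays are stand-ins (lognormal noise), not real measurements; swap in your own per-query numbers.

```python
import numpy as np

# Stand-in per-query latencies in seconds; replace with real measurements.
rng = np.random.default_rng(0)
baseline = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)
learned = rng.lognormal(mean=-0.1, sigma=1.2, size=10_000)

for name, lat in [("baseline", baseline), ("learned", learned)]:
    p50, p90, p95, p99 = np.percentile(lat, [50, 90, 95, 99])
    print(f"{name:>8}: mean={lat.mean():.2f}s  "
          f"p50={p50:.2f}s  p90={p90:.2f}s  p95={p95:.2f}s  p99={p99:.2f}s")
```

A model can easily win on the mean while losing badly at p99, which is exactly what this kind of table surfaces.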

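And for the Pareto point, a minimal sketch of checking which approaches are actually non-dominated on two costs (here: optimizer overhead vs. p99 latency). The system names and numbers are made up for illustration.

```python
def pareto_front(points):
    """Return the points not dominated on (cost1, cost2); lower is better on both."""
    front = []
    for i, (a1, a2) in enumerate(points):
        dominated = any(
            b1 <= a1 and b2 <= a2 and (b1 < a1 or b2 < a2)
            for j, (b1, b2) in enumerate(points) if j != i
        )
        if not dominated:
            front.append((a1, a2))
    return front

# Made-up (optimization overhead in ms, p99 latency in s) per approach.
systems = {
    "postgres":  (1.0, 12.0),
    "learned-A": (45.0, 7.5),
    "learned-B": (300.0, 7.3),
    "naive-ml":  (60.0, 13.0),
}
front = pareto_front(list(systems.values()))
for name, (overhead, p99) in systems.items():
    tag = "on front" if (overhead, p99) in front else "dominated"
    print(f"{name:>10}: overhead={overhead:6.1f}ms  p99={p99:.1f}s  ({tag})")
```

If your learned optimizer only shows up on one end of that frontier, say so; that's a result, not a weakness.
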
None of the ML-for-systems papers at this VLDB (including my own!) check all these boxes. Let's raise the bar!