  • Combine the gateway log files for a given date (assuming, at this point, that the datestamp in the filename is reliable and that the log entries cover roughly equal spans of time).

  • Write Awk scripts to extract file sizes into one file and deletion times into another.

    • File sizes look like:

      • path,date,size,writetime

    • File deletion times look like:

      • path,date,deletetime

  • Correlate these entries by a simple sqlite join query:

    • .mode csv
      .headers on
      .import atlas-file-sizes.csv sizes
      .import atlas-delete-times.csv times
      select distinct sizes.path, size, deletetime, round(size*1.0/deletetime, 2) as "deleterate"
      from sizes inner join times using (path);

  • Running that query produces a correlation between file sizes and their deletion times, with rows that look like:

    • path,size,deletetime,deleterate

  • This can be plotted with, e.g., gnuplot, or summary statistics can be produced on the command line with datamash.
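
The Awk extraction step above is described but not shown. The following is a minimal sketch of what the two scripts might look like. The gateway log format is not given here, so the line layout in the comments is purely hypothetical, as are the function names `extract_sizes` and `extract_delete_times`; the field numbers would need adjusting to match the real logs.

```shell
#!/bin/sh
# HYPOTHETICAL log layout (an assumption, not the real gateway format),
# whitespace-separated:
#   <date> <time> WRITE  <path> <size-bytes> <write-seconds>
#   <date> <time> DELETE <path> <delete-seconds>

# Emit path,date,size,writetime for every WRITE entry.
# The header row is printed first so that sqlite's .import can use it
# for column names when it creates the table.
extract_sizes() {
    awk 'BEGIN { OFS = ","; print "path", "date", "size", "writetime" }
         $3 == "WRITE" { print $4, $1, $5, $6 }' "$@"
}

# Emit path,date,deletetime for every DELETE entry.
extract_delete_times() {
    awk 'BEGIN { OFS = ","; print "path", "date", "deletetime" }
         $3 == "DELETE" { print $4, $1, $5 }' "$@"
}
```

Each function takes the combined log file(s) as arguments, and its output can be redirected to atlas-file-sizes.csv or atlas-delete-times.csv respectively. This sketch assumes paths contain no spaces or commas; either would need quoting or a different field separator.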