Adding a new pystat currently involves:

1. Adding a new field to the `PyStats` struct (or one of its substructs)
2. Outputting it from `print_stats` (or one of its subfunctions)
3. Actually collecting the statistic by adding a call to `STAT_INC` (or friend)
4. Adding code to add it to an output table in `summarize_stats.py`
Step (3) is inherently required, but the other steps could probably be merged into one through use of macros and/or codegen, making it much easier to add new stats and to understand what we have and how they flow through the system.
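To make the codegen idea concrete, here is a minimal sketch, assuming a single list of dotted stat names acts as the source of truth. The names `STAT_DEFS`, `c_print_line`, `generate_print_stats`, and `output_keys` are hypothetical, and the generated C assumes every stat is a `uint64_t` counter reachable from a `stats` pointer; none of this is taken from the current implementation.

```python
# Hypothetical single source of truth: one dotted name per statistic.
# The entries are illustrative and do not mirror the real PyStats layout.
STAT_DEFS = [
    "rare_event.set_class",
    "rare_event.set_bases",
]


def c_print_line(name):
    """Return a C statement that writes one stat under its dotted name.

    Assumption: the dotted name is also a valid member path in C, so
    "rare_event.set_class" is read as stats->rare_event.set_class.
    """
    return f'fprintf(out, "{name}: %" PRIu64 "\\n", stats->{name});'


def generate_print_stats(defs):
    """Generate the body of a print function, one line per stat."""
    return "\n".join(c_print_line(name) for name in defs)


def output_keys(defs):
    """Return the keys a summarizing script would expect in the output file."""
    return list(defs)


if __name__ == "__main__":
    print(generate_print_stats(STAT_DEFS))
```

With a table like this, adding a stat would mean adding one entry plus the `STAT_INC` call, and the same list could drive both the C output code and the table definitions on the Python side.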
The hardest part will be preserving the "English phrases" used in the output file. For example, the `stats.rare_event.set_class` statistic is written to the file as `Rare event (set_class):`. I think it would be better if we just used the same dot notation everywhere, only converting to friendly English phrasing at the very last step in `summarize_stats.py`.
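As a rough illustration of doing that conversion only at the last step, the sketch below derives the friendly phrase from the dotted key. The helper names `friendly_name` and `parse_stats_line` and the `FRIENDLY_OVERRIDES` table are hypothetical, and the `key: value` line format is an assumption about what a dot-notation output file could look like.

```python
# Hypothetical table for keys whose friendly name cannot be derived mechanically.
FRIENDLY_OVERRIDES = {}


def friendly_name(key):
    """Convert a dotted key such as 'rare_event.set_class' to display text."""
    if key in FRIENDLY_OVERRIDES:
        return FRIENDLY_OVERRIDES[key]
    # Everything before the last dot is the group; the rest is the field.
    group, _, field = key.rpartition(".")
    if not group:
        return key.replace("_", " ").capitalize()
    label = group.replace(".", " ").replace("_", " ").capitalize()
    return f"{label} ({field})"


def parse_stats_line(line):
    """Split an assumed 'dotted.key: 123' line into (key, int value)."""
    key, _, value = line.rpartition(":")
    return key.strip(), int(value.strip())


if __name__ == "__main__":
    key, value = parse_stats_line("rare_event.set_class: 3")
    print(f"{friendly_name(key)}: {value}")
```

The file itself would then carry only machine-friendly keys, and any phrasing that does not follow the mechanical rule would live in one override table.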
I think it's fair to say that `pystats` is an implementation detail that only interpreter hackers care about, and we are free to change the file format on a whim. This would create a "hard break" before and after such a change, but comparing stats across long time periods is cumbersome anyway. But perhaps comparing `main` against 3.13.0 is something we still care about.

Thoughts, @brandtbucher, @markshannon (others...)?
I wouldn't worry about breakage. The main use of stats is to guide improvements and tell us about individual changes. We don't care about long term trends (at least, I don't).
Using the same dot notation everywhere is fine, but some stats are tables, like execution counts, and some have tables within fields and/or fields within tables. Instead of just `a.b.c` it can be `a[b].c` or `a.b[c]`. How would that work?
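One possible way to handle mixed keys, offered only as a sketch and not as something decided in this thread, is to parse each key into a path of field and index steps. The function name `parse_key` and the token rules (identifiers for fields, anything between brackets for table indices) are assumptions.

```python
import re

# A step is either an identifier (optionally preceded by a dot) or a
# bracketed table index such as [b].
_TOKEN = re.compile(r"\.?([A-Za-z_][A-Za-z0-9_]*)|\[([^\]]+)\]")


def parse_key(key):
    """Split a key like 'a[b].c' into a list of (kind, name) steps."""
    path = []
    pos = 0
    while pos < len(key):
        match = _TOKEN.match(key, pos)
        if match is None:
            raise ValueError(f"bad stat key: {key!r}")
        if match.group(1) is not None:
            path.append(("field", match.group(1)))
        else:
            path.append(("index", match.group(2)))
        pos = match.end()
    return path


if __name__ == "__main__":
    for example in ("a.b.c", "a[b].c", "a.b[c]"):
        print(example, parse_key(example))
```

A summarizing script could then group entries by their leading steps to rebuild tables, treating `index` steps as row labels and `field` steps as columns.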