Persistence of Training Data (+misc)

(Perceptilabs 0.11.8)

Having run the Textile demo to completion over 10 epochs, I shut down PerceptiLabs.

Today I noticed for the first time the explicit statement that only stats from the current session are available. OK, that is in fact clear, so my bad for muttering that it was not saving training data, but: this model takes ~half an hour to run. The chances that I can build & run & tweak a model in one session are slim… please can you advise when stats from previous runs will be supported?

[Side bar: how do you intend to deal with model structure, parameter/hyperparameter changes? The user should be able to compare results meaningfully, which means those things should be ~stored with the training data - but then the question is presentation & display - I can offer one approach but this margin is too narrow to contain it :wink: ]

In the model hub it says “training complete” (and the time), but this is practically inconsistent with the stats situation. The model is trained but I can’t use the previous info -> that status should probably be cleared if the training data is not actually accessible.

Finally, the misc items: before logging out yesterday I was prompted that I had unsaved models, but, as the screenshot shows, there is no indication that the only open model is unsaved (no dot on the tab).

And - even in the same session as the training - after training was complete the Test view said I had not trained any models.

Hi @JulianSMoore,

To start with persistent statistics: this is something we have had in our pipeline for a loooong time but have had difficulty getting around to implementing. We now finally have a time frame attached to that feature, and you can expect to see it in 2 (larger) releases (not the next one but the one after that).
The estimated time between those two releases will be a lot shorter than the gap between the current release and the upcoming one, though.

[Side bar: how do you intend to deal with model structure, parameter/hyperparameter changes? The user should be able to compare results meaningfully, which means those things should be ~stored with the training data - but then the question is presentation & display - I can offer one approach but this margin is too narrow to contain it :wink: ]

In the same release that the statistics persistence is planned for, we are also planning a bit of an interface update, so I would love to hear your suggestion on this :slight_smile:

In the model hub it says “training complete” (and the time), but this is practically inconsistent with the stats situation. The model is trained but I can’t use the previous info -> that status should probably be cleared if the training data is not actually accessible.

Agreed, this will be clearer with the next release.

The other misc items look like bugs; I’ve added them to our bug list, thanks! :slight_smile:

A wider margin (for error…? :wink: )

Managing model state and training - this may not be practicable, but it may nonetheless contain some useful ideas/principles.

At present, models are organised with the model itself as the central concept & key management item.

If one changes perspective a little, things become a lot easier: consider that the results are what the user is ultimately interested in, and organise the workflow elements around that, like this… Just some ideas, not a fully worked-out design :wink:

  • The perceptilabs model (as a graph of nodes, layers + other ops, together with parameters/hyperparameters) is a very compact specification that generates code to be run. As such I see no harm in storing it multiple times: it will always be a relatively small overhead in comparison with the parameter storage required after training. (Even with custom code? I think it would also still be small compared to parameter storage.)
  • The term model is already generic in that we speak of trained and untrained models, so when the user creates a new “model” they create both the structures for recording the graph and those for recording the training state of the model in a single atomic entity (e.g. a json). Then any such thing can be opened, the model specification inspected and the results applicable to that specification viewed/reported on (see the sketch after this list).
  • At any time (modulo batch/mini-batch) the model can be saved, and that saved state subsequently reloaded to continue training
  • If a model has had any training, any significant change to the model (there could be exceptions, such as the number of epochs) invalidates the pre-existing training and requires a new version to be created when either a) the user chooses save or b) training starts
  • What determines whether saving creates a new model or a new version? User always decides: following an initial save, regardless of the magnitude of changes made, versions are created unless the user chooses Save As [new model]. (Note that even with save as new model, a new model could be compared with an old model with this approach)
  • The model hub then lists model version files (could be grouped under initial/current version with expand/collapse) to allow for a hub that shows many models. (Also - deleting versions would not delete a folder and all its content… potentially including things one would rather not delete, such as data… :wink: )
  • Aside: hub to show both model Name/Title (which should be in the file) and filename (with access to path, option to open location, etc.)
  • How to name, etc. versions: TBD
  • NB UI tabs on LHS are tabs within model tabs across the top, so one can already look at model, statistics, test for each model currently open.
  • Changes to underlying data potentially invalidate comparisons or even training in progress… you cannot know for sure, but you can indicate to the user that the data changed since [TBD] (and trigger new version saving) by saving a source data hash computed at the time training starts (how expensive is that in compute/time?) and/or file timestamps… since source data is linked to local data elements, adding & checking timestamps would not be a big overhead and is probably better/cheaper than hashing
  • OT: random data does not seem to have seed control - that’s essential to reproducibility (and if I have >1 random thing going on, is the sequence of calls to random guaranteed to be the same from run to run?)
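
To make the “single atomic entity”, the seed and the timestamp check a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical - the field names (graph_spec, training_state, data_fingerprint, etc.) are my own invention, not PerceptiLabs’ actual file format - it just shows how the spec, the hyperparameters, the seed and a cheap source-data fingerprint could all live in one json:

```python
import json
import os
import time


def data_fingerprint(paths):
    """Cheap source-data fingerprint: size + modification time per file.
    (As suggested above: cheaper than hashing the data itself.)"""
    return {p: {"size": os.path.getsize(p), "mtime": os.path.getmtime(p)}
            for p in paths}


def new_model(name, graph_spec, hyperparams, data_paths, seed=42):
    """One atomic entity: model specification + training state in one dict."""
    return {
        "name": name,
        "created": time.time(),
        "seed": seed,                       # essential for reproducibility
        "graph_spec": graph_spec,           # nodes, layers + other ops
        "hyperparams": hyperparams,         # epochs, learning rate, ...
        "data_fingerprint": data_fingerprint(data_paths),
        "training_state": None,             # filled in during/after training
    }


def data_changed(model, data_paths):
    """True if the source data no longer matches the stored fingerprint,
    i.e. previous results may be invalid and a new version should be saved."""
    return model["data_fingerprint"] != data_fingerprint(data_paths)


def save_model(model, path):
    with open(path, "w") as f:
        json.dump(model, f, indent=2)


# Usage sketch (paths and spec are placeholders):
# m = new_model("Textile demo", graph_spec={"layers": []},
#               hyperparams={"epochs": 10}, data_paths=["data/textile.csv"])
# save_model(m, "textile.v1.json")
# if data_changed(m, ["data/textile.csv"]):
#     ...  # prompt the user and save a new version before training
```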

Hope that’s useful…


PS Trained models could get quite big… it would be nice to be able to resave without the trained parameter values, e.g. to archive a model (that could be re-run if the data needs to be regenerated)
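
Continuing the hypothetical single-json layout sketched earlier (same invented field names), archiving without the trained parameter values could be as simple as dropping the training-state entry before resaving:

```python
import json


def archive_model(src_path, dst_path):
    """Resave a model json without its trained parameter values, so it can be
    archived compactly and re-trained later if the results need regenerating."""
    with open(src_path) as f:
        model = json.load(f)
    model["training_state"] = None      # drop the (large) trained parameters
    with open(dst_path, "w") as f:
        json.dump(model, f, indent=2)
```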

Thank you very much @JulianSMoore!

I really love the idea of having both the training and the non-training state in the same json; it makes things very clean.
How would you manage the versioning of the model jsons? Through Git, or just by creating multiple json files?

OT: random data does not seem to have seed control - that’s essential to reproducibility (and if I have >1 random thing going on, is the sequence of calls to random guaranteed to be the same from run to run?)

Yea… we actually have that setting behind the scenes but it was accidentally left out of the frontend. It’s fixed in our dev env but has not been pushed to production yet.
The good news though is that the seed will be the same every time (unless you change it in the custom code).
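
For reference, seeding comes down to fixing every RNG in play up front; a generic TensorFlow/NumPy sketch (not exactly the code we generate) looks like this:

```python
import random

import numpy as np
import tensorflow as tf


def set_global_seed(seed=42):
    """Seed every RNG in play so the sequence of random calls is identical
    from run to run (Python, NumPy and TensorFlow keep separate state)."""
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)
```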

How would I manage the versioning of the jsons? Well, since I’ve never met a wheel that couldn’t be improved (even if I might never have seen a wheel before :wink:), here’s a very simple scheme that supports the way I often work: go forward until I reach a conceptual or practical dead-end (bad idea or terrible implementation), go back to the last known good version and move forward again - taking care not to overwrite the last known good version.

I can’t comment on Git, but perhaps not everyone always wants to be that productionised. TBH, I don’t know how clever Git is generally, let alone for json, but I would expect that Git integration would be a useful addition to basic versioning in the longer term.

This scheme can certainly be broken (if you were to start deleting versions and then recreating them, effectively erasing save/saveas history) but it does a workmanlike job of identifying sequential versions and forks.

And, if the save/saveas characters are OS-compatible, they could be included by default in the filename (with a choice of renaming the file on save or saving a new version file, but always a new file on saveas); since version information is also embedded in the json, even if the file is renamed, its place in history is retained internally.

As it says on the diagram, even if one cleans up the version history by deleting files, the implicit hierarchy is also preserved.

Additional considerations (sketched in code after the list):

  • Assign every new model (from “Create” action) a unique internal GUID
  • Record dates and times of last save/saveas as well for info
  • Last save/saveas by +username?
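
Purely as illustration (the helper names and fields are invented on the spot, not how anything actually behaves today), the save/saveas rules, the GUID and the embedded lineage might look something like this:

```python
import time
import uuid


def create_model(name, username):
    """“Create” assigns a fresh internal GUID; every version of this model keeps it."""
    return {
        "guid": str(uuid.uuid4()),
        "name": name,
        "version": 1,
        "parent_version": None,     # lineage lives inside the json itself,
        "saved": time.time(),       # so renaming the file loses nothing
        "saved_by": username,
        "graph_spec": {},
        "training_state": None,
    }


def save(model, username):
    """Plain save: a new version of the same model (same GUID)."""
    new = dict(model)
    new["parent_version"] = model["version"]
    new["version"] = model["version"] + 1
    new["saved"] = time.time()
    new["saved_by"] = username
    return new


def save_as(model, new_name, username):
    """Save As: a new model (new GUID), recording where it was forked from
    so the new model can still be compared with the old one."""
    new = dict(model)
    new["guid"] = str(uuid.uuid4())
    new["name"] = new_name
    new["forked_from"] = {"guid": model["guid"], "version": model["version"]}
    new["version"] = 1
    new["parent_version"] = None
    new["saved"] = time.time()
    new["saved_by"] = username
    return new


def filename_for(model):
    """Optional: put the version in the filename too (OS-safe characters),
    e.g. 'mymodel.v3.json' - the json stays the source of truth."""
    return f"{model['name']}.v{model['version']}.json"
```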

Thank you very much! :pray:
