feat: Allow study progress to be saved & resumed by anrp · Pull Request #106 · p-e-w/heretic

anrp · 2026-01-15T14:00:16Z

Fixes #95
Followup/improvement of https://github.com/spikymoth/heretic/tree/save-and-resume

gemini-code-assist · 2026-01-15T14:00:37Z

Summary of Changes

Hello @anrp, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the robustness of optimization studies by implementing a comprehensive checkpointing and resumption system. It leverages Optuna's JournalStorage to persist study progress, allowing users to seamlessly continue interrupted or long-running experiments. The changes include new configuration options for managing checkpoint files, intelligent handling of study identification through parameter hashing, and dynamic adjustment of Optuna's sampler behavior to correctly account for startup trials upon resumption.

Highlights

Study Checkpointing and Resumption: Introduced the ability to save and resume the progress of an optimization study, preventing loss of work due to interruptions or crashes.
Configurable Checkpoint Files: Added new settings (study_checkpoint_file and study_autoresume) to control where study progress is saved and whether to automatically resume.
Robust Study Identification: Implemented a mechanism to generate a unique study name by hashing key configuration parameters, ensuring that resumed studies are consistent with their original setup.
Dynamic Startup Trial Management: Adjusted Optuna's optimization process to correctly handle the n_startup_trials (random trials) when resuming a study, ensuring proper exploration even after an interruption.
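
To illustrate the study-identification idea from the highlights, here is a minimal sketch of deriving a study name by hashing configuration parameters. The function name, the `study-` prefix, and the choice of which keys are hashed are all assumptions made for illustration; the PR's actual hashing code is not shown in this conversation.

```python
import hashlib
import json


def study_name_from_settings(settings: dict, keys: list[str]) -> str:
    """Derive a stable study name from the settings that define a study."""
    # Only the listed keys take part in the hash (which keys matter is an assumption).
    relevant = {key: settings[key] for key in keys}
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(relevant, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"study-{digest[:16]}"


# Same identifying settings, different n_trials and key order: same study name.
a = study_name_from_settings({"model": "example/model", "n_trials": 200, "seed": 1}, ["model", "seed"])
b = study_name_from_settings({"seed": 1, "n_trials": 500, "model": "example/model"}, ["model", "seed"])
# A different model yields a different study name.
c = study_name_from_settings({"model": "other/model", "n_trials": 200, "seed": 1}, ["model", "seed"])
```

Leaving a setting such as `n_trials` out of the hashed keys is what would let a resumed run change it without being treated as a different study.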

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, and Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a valuable feature to save and resume study progress, which is crucial for long-running optimization tasks. The implementation correctly uses Optuna's JournalStorage for persistence, and the logic for handling resumed studies, including the continuation of startup trials, is well-implemented. I've identified a critical typo that would prevent the code from running, along with a medium-severity style guide violation. After addressing these points, this will be a solid contribution.

p-e-w · 2026-01-16T07:36:29Z

src/heretic/main.py

+        remaining_trials = settings.n_trials - start_index
+        if remaining_trials > 0 and random_trials_to_run > 0:
+            if start_index > 0:
+                print(f"Running additional {random_trials_to_run} random trials")


This should not be displayed in the frontend. Sampling is an implementation detail that the user can do nothing with.
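
For context, the bookkeeping behind the diff above can be sketched as follows. The diff only shows `remaining_trials`, `random_trials_to_run`, and `start_index`; how `random_trials_to_run` is actually computed in the PR is not visible here, so the formula below (startup trials still owed, capped by the trials that remain) is an assumption, and `plan_resumed_run` is a hypothetical helper name.

```python
def plan_resumed_run(n_trials: int, n_startup_trials: int, completed_trials: int) -> tuple[int, int]:
    """Return (remaining_trials, random_trials_to_run) for a possibly resumed study."""
    # Trials still to run; never negative if the study already ran past n_trials.
    remaining_trials = max(n_trials - completed_trials, 0)
    # Random (startup) trials not yet done, capped by the number of trials left.
    random_trials_to_run = min(max(n_startup_trials - completed_trials, 0), remaining_trials)
    return remaining_trials, random_trials_to_run


fresh = plan_resumed_run(n_trials=200, n_startup_trials=60, completed_trials=0)
early_crash = plan_resumed_run(n_trials=200, n_startup_trials=60, completed_trials=25)
late_crash = plan_resumed_run(n_trials=200, n_startup_trials=60, completed_trials=100)
```

In this sketch, a run interrupted during the random phase finishes the outstanding random trials first, while one interrupted later runs none.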

anrp · 2026-01-16T13:29:26Z

Ultimately, from all of this discussion: would you like a "configuration option for directory", "all settings saved in the study name so that it can't restart from a different place", and "no logic for more random trials"? That converts this to a resume-only capability (you can extend the number of trials interactively, and that's it). Maybe just don't include n_trials in the study name, and read it from the study file (when resuming) or the command line (on the initial run)?

p-e-w · 2026-01-17T12:26:48Z

We need to decide what we actually want here. My idea was to have a system that can recover from a crash by resuming where it left off (e.g. after another program used up VRAM and led to an OOM), or allow the user to stop the run and continue the next day. In that case, it's obvious what should happen: All settings are restored, and the trials continue from the cutoff point.

But it seems that you and @spikymoth have a more ambitious vision: That the user should be able to stop the run and resume it with different settings. I agree that this ability would be nice to have in principle, but it seems very difficult to implement in a way that's correct, non-confusing, and maintainable.

anrp · 2026-01-17T12:55:27Z

The only setting that really makes sense to change when resuming is n_trials, and that can effectively already be overridden with a resume-only capability via the interactive menu option. I would be fine with that, because it does simplify every other part of this.

p-e-w · 2026-01-17T14:12:49Z

Here's a sketch of how I think a reasonable implementation might look:

1. The study progress is journaled to a file with the same name as the model. The complete settings object is stored in there.
2. When Heretic is started and a journal file matching the model name already exists, the user is offered the choice between resuming the study or starting a new study.
3. If the user chooses to resume the study, the entire settings object is replaced with the one from the journal file.
4. If the user chooses to start a new study, the existing journal file is deleted and replaced with a journal file corresponding to the current run.
5. When the run is complete, the journal file is deleted.
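
The five steps above can be sketched with the standard library alone. This is not Heretic's implementation: the file naming scheme, the one-record journal layout, and the `resume` flag standing in for the user's choice are all hypothetical, and the real PR journals through Optuna's JournalStorage rather than hand-written JSON.

```python
import json
import tempfile
from pathlib import Path


def checkpoint_path(directory: Path, model: str) -> Path:
    # Hypothetical naming scheme: slashes in a model ID cannot appear in a file name.
    return directory / (model.replace("/", "--") + ".jsonl")


def start_run(directory: Path, settings: dict, resume: bool) -> dict:
    """Apply steps 1-4 and return the settings the run should actually use."""
    path = checkpoint_path(directory, settings["model"])
    if path.exists() and resume:
        # Step 3: the stored settings replace the current ones entirely.
        with path.open() as f:
            first_record = json.loads(f.readline())
        return first_record["settings"]
    # Step 4 (or a first run): write a fresh journal whose first record holds the settings.
    with path.open("w") as f:
        f.write(json.dumps({"settings": settings}) + "\n")
    return settings


def finish_run(directory: Path, model: str) -> None:
    # Step 5: delete the journal once the run is complete.
    checkpoint_path(directory, model).unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp)
    first = start_run(d, {"model": "example/model", "n_trials": 200}, resume=False)
    resumed = start_run(d, {"model": "example/model", "n_trials": 50}, resume=True)
    fresh = start_run(d, {"model": "example/model", "n_trials": 50}, resume=False)
    finish_run(d, "example/model")
    journal_left = checkpoint_path(d, "example/model").exists()
```

Note that resuming ignores the `n_trials=50` passed on the second start, which is exactly the "all settings are restored" behavior described in step 3.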

anrp · 2026-01-17T17:10:33Z

1-4 SGTM, but I'd like to push back on 5 being automatic, since it's sometimes useful to just be able to drop back into the chat interface to test things; i.e., I'd like to keep the few-KB result file, which represents the $hours of computation. Thoughts?

spikymoth · 2026-01-17T17:36:48Z

I think a "Delete study log? (y/n)" question at the end if you select "None (exit program)" would make sense, personally. And I agree that n_trials is the main setting that makes sense to allow changing between sessions (which should only require setting exclude=True on that Field() to avoid serialization). There's a case to be made for some other settings, but I think n_trials is the main one for just picking up a promising study again later.

p-e-w · 2026-01-18T05:13:23Z

> 1-4 SGTM, but I'd like to push back on 5 being automatic, since it's sometimes useful to just be able to drop back into the chat interface to test things; i.e., I'd like to keep the few-KB result file, which represents the $hours of computation.

Okay, but how exactly will this work? Let's say the study is complete and we keep the journal. Now Heretic is run again with the same model, and the user chooses to "resume" the study. But the study is already complete, and the number of completed trials is equal to the number of trials to complete. Now what? We just tell them that the run they just started is already complete, and show the trial selection menu with the Pareto front?

> And I agree that n_trials is the main setting that makes sense to allow changing between sessions (which should only require setting exclude=True on that Field() to avoid serialization).

But we already have that functionality. When the Pareto front is displayed, one of the options is "Continue optimization (run more trials)". This was implemented in #76.

anrp · 2026-01-18T05:18:24Z

> [...] We just tell them that the run they just started is already complete, and show the trial selection menu with the Pareto front?

is necessary if you want to

> [...] "Continue optimization (run more trials)". This was implemented in #76.

after deciding to exit.
Maybe print a distinct message about no work actually happening, but basically, yes.

p-e-w · 2026-01-18T05:42:27Z

Ok, how about this:

Instead of asking the user whether to delete the journal file when they exit (which interrupts the exit process and might be the wrong time to make that decision), we prompt them when the program starts?

If the previous run is complete:

You have already processed this model. How would you like to proceed?
[1] Show the results from the previous run, allowing you to export models, or to run additional trials.
[2] Ignore the previous run and start from scratch. This will delete the checkpoint file and all results from the previous run.

If the previous run is incomplete:

You have already processed this model, but the run was interrupted. How would you like to proceed?
[1] Continue the previous run from where it stopped.
[2] Ignore the previous run and start from scratch. This will delete the checkpoint file and all results from the previous run.
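
The choice between the two prompts comes down to comparing completed trials against the target. A minimal sketch, assuming a hypothetical `startup_prompt` helper and assuming "complete" means the completed trial count has reached `n_trials` (the merged code may decide this differently):

```python
FRESH_START = (
    "Ignore the previous run and start from scratch. "
    "This will delete the checkpoint file and all results from the previous run."
)


def startup_prompt(completed_trials: int, n_trials: int) -> tuple[str, list[str]]:
    """Pick the question and the two options to show when a checkpoint exists."""
    if completed_trials >= n_trials:
        # The previous run finished: offer its results instead of resuming.
        question = "You have already processed this model. How would you like to proceed?"
        first = (
            "Show the results from the previous run, allowing you to export models, "
            "or to run additional trials."
        )
    else:
        # The previous run was cut short: offer to continue it.
        question = (
            "You have already processed this model, but the run was interrupted. "
            "How would you like to proceed?"
        )
        first = "Continue the previous run from where it stopped."
    return question, [first, FRESH_START]


question, options = startup_prompt(completed_trials=120, n_trials=200)
print(question)
for number, option in enumerate(options, start=1):
    print(f"[{number}] {option}")
```

The second option is identical in both cases, so only the question and the first option depend on the state of the checkpoint.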

anrp · 2026-01-18T13:05:23Z

I like that, basically moves the delete question to startup time. Implemented, PTAL.

p-e-w · 2026-01-18T15:30:07Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a valuable feature to save and resume study progress, which is a great improvement for long-running optimization tasks. The implementation is mostly solid, leveraging Optuna's journaling storage. I've found one area in src/heretic/main.py that could be improved for robustness and efficiency by handling potential missing keys in study attributes and avoiding redundant API calls. My suggestion addresses a potential crash and makes the code cleaner. The rest of the changes are well-implemented and consistent with the new feature.

p-e-w

I think the logic is sound now.

p-e-w · 2026-01-22T14:05:45Z

/gemini review

gemini-code-assist · 2026-01-22T14:07:31Z

Warning

Gemini encountered an error creating the review. You can try again by commenting /gemini review.

p-e-w · 2026-01-22T14:21:21Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a valuable feature allowing study progress to be saved and resumed, which significantly improves the user experience by preventing loss of work. The changes are well-structured, touching .gitignore for checkpoint exclusion, config.py for new settings, and main.py and utils.py for the core logic of saving, loading, and resuming studies. The refactoring of settings source customization in config.py is also a good improvement. Overall, the implementation seems robust and directly addresses the stated goal of the pull request.

p-e-w · 2026-01-23T14:19:57Z

Yup, that looks good now. Merged!

spikymoth added 2 commits January 15, 2026 06:56

feat: Store active study in log/study.jsonl and allow resuming

eb04b4a

Simplify resume logic with load_if_exists=True

5a2ecfb

gemini-code-assist bot reviewed Jan 15, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from af67313 to b49e0ce Compare January 15, 2026 14:03

anrp changed the title ~~Allow study progress to be saved & resumed~~ feat: Allow study progress to be saved & resumed Jan 15, 2026

anrp mentioned this pull request Jan 15, 2026

State save possible? #95

Closed

anrp force-pushed the anrp/save-and-resume-v2 branch from b49e0ce to 3c37150 Compare January 15, 2026 14:12

p-e-w reviewed Jan 16, 2026

p-e-w reviewed Jan 16, 2026

anrp mentioned this pull request Jan 17, 2026

feat: refactor save machinery #110

Merged

anrp force-pushed the anrp/save-and-resume-v2 branch from 3c37150 to ba1f085 Compare January 18, 2026 13:05

p-e-w reviewed Jan 18, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from ba1f085 to 2d1fdbb Compare January 18, 2026 15:11

p-e-w reviewed Jan 18, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from 2d1fdbb to 19d9f52 Compare January 18, 2026 15:24

gemini-code-assist bot reviewed Jan 18, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from 19d9f52 to 73d7dba Compare January 18, 2026 15:50

anrp force-pushed the anrp/save-and-resume-v2 branch 2 times, most recently from f95b0ce to a68fc36 Compare January 18, 2026 16:10

anrp commented Jan 18, 2026

p-e-w reviewed Jan 19, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch 5 times, most recently from ca72b30 to 083a6b3 Compare January 19, 2026 13:34

spikymoth reviewed Jan 20, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from 083a6b3 to d1d0fb6 Compare January 20, 2026 15:12

p-e-w reviewed Jan 20, 2026

p-e-w reviewed Jan 20, 2026

anrp force-pushed the anrp/save-and-resume-v2 branch from d1d0fb6 to e38921c Compare January 20, 2026 17:58

Significantly improve flexibility of study save/load

6483012

anrp force-pushed the anrp/save-and-resume-v2 branch from e38921c to 6483012 Compare January 20, 2026 18:08

Put constructor arguments at the highest precedence

3b63699

anrp force-pushed the anrp/save-and-resume-v2 branch from d83a056 to 3b63699 Compare January 20, 2026 19:10

anrp commented Jan 22, 2026

p-e-w reviewed Jan 22, 2026

gemini-code-assist bot reviewed Jan 22, 2026

Review comments

0c7582e

p-e-w merged commit ebc22c2 into p-e-w:master Jan 23, 2026
4 checks passed

p-e-w mentioned this pull request Jan 26, 2026

Session restore #120

Closed
