perf: empty cache after using residuals and between trials by red40maxxer · Pull Request #15 · p-e-w/heretic

red40maxxer · 2025-11-17T14:47:20Z

Great work btw, this is an awesome project and I'm looking forward to seeing it grow. I'm still learning, and this has been a fun repo to hack around in :)

I'm pretty sure we can free the residuals as soon as the refusal directions have been computed, and we can also clear the cache between trials.

Tested on my laptop's RTX 4060 while abliterating Qwen3-0.6B; peak GPU usage saw a slight decline.
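For readers following along, here is a minimal sketch of the pattern being proposed: drop the last references to the residuals once the directions exist, then collect garbage and release the CUDA cache. It is not the code from this PR. `FakeResiduals`, `mean_vector`, `release_unused_memory` and `compute_direction` are hypothetical names, plain lists stand in for tensors, and the difference-of-means step is only an assumed placeholder for the real direction computation. `torch` is treated as optional so the sketch also runs on a machine without it.

```python
import gc
import weakref

try:
    import torch  # optional; only used to release the CUDA caching allocator
except ImportError:
    torch = None


class FakeResiduals:
    """Hypothetical stand-in for a large residual tensor (one row per prompt)."""

    def __init__(self, rows):
        self.rows = rows


def mean_vector(residuals):
    """Column-wise mean over all prompts."""
    count = len(residuals.rows)
    return [sum(column) / count for column in zip(*residuals.rows)]


def release_unused_memory():
    """Collect unreachable objects, then release cached, unused CUDA blocks."""
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()


def compute_direction(good_rows, bad_rows):
    good = FakeResiduals(good_rows)
    bad = FakeResiduals(bad_rows)
    probe = weakref.ref(good)  # lets us observe whether `good` was really freed

    # Assumed placeholder computation: difference of the per-group means.
    direction = [b - g for g, b in zip(mean_vector(good), mean_vector(bad))]

    # The residuals are not needed past this point. Drop the last references
    # first; emptying the cache cannot free memory that is still referenced.
    del good, bad
    release_unused_memory()
    return direction, probe() is None


direction, freed = compute_direction(
    [[0.0, 1.0], [2.0, 3.0]],
    [[4.0, 5.0], [6.0, 9.0]],
)
print(direction, freed)
```

The ordering is the point of the sketch: `del` comes before the cache release, because the allocator can only return blocks that no live object still owns.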

p-e-w · 2025-11-17T15:23:52Z

src/heretic/main.py

        score, kl_divergence, refusals = evaluator.get_score()
-
+        # free memory between trials
+        empty_cache()


This is done by reload_model() already, which happens at the start of each trial. I'd be very surprised if that line makes a difference because we're literally milliseconds away from calling that function anyway.

Oops, you're right, I'll remove that line.

Can you retest and check if the performance gain is still there with just the residuals garbage collected?
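A retest like this needs memory readings taken at fixed points. The sketch below shows one way lines of the form "GPU memory ...: N (peak so far: M)" could be produced. It assumes the readings come from `torch.cuda.memory_allocated()` and `torch.cuda.max_memory_allocated()`, which the thread does not confirm, and the helper names `format_memory_report` and `report_gpu_memory` are hypothetical.

```python
try:
    import torch  # optional; the helper degrades gracefully without it
except ImportError:
    torch = None


def format_memory_report(label, allocated, peak):
    """Render one line in the same shape as the logs in this thread."""
    return f"GPU memory {label}: {allocated} (peak so far: {peak})"


def report_gpu_memory(label):
    """Return a report line for the current CUDA device, or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None
    # Assumption: currently allocated bytes and the running peak are the
    # two quantities being logged.
    return format_memory_report(
        label,
        torch.cuda.memory_allocated(),
        torch.cuda.max_memory_allocated(),
    )


line = report_gpu_memory("before residuals")
print(line if line is not None else "CUDA not available")
```

Calling `report_gpu_memory` before obtaining the residuals, after obtaining them, and after clearing them would give three comparable readings per run.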

Memory usage without residual gc:

█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.0.1
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic

GPU type: NVIDIA GeForce RTX 4060 Laptop GPU
Loading model Qwen/Qwen3-0.6B...
* Trying dtype auto... Ok
* Transformer model with 28 layers
* Abliterable components:
  * attn.o_proj: 1 matrices per layer
  * mlp.down_proj: 1 matrices per layer
Loading good prompts from mlabonne/harmless_alpaca...
* 400 prompts loaded
Loading bad prompts from mlabonne/harmful_behaviors...
* 400 prompts loaded
Loading good evaluation prompts from mlabonne/harmless_alpaca...
* 100 prompts loaded
* Obtaining first-token probability distributions...
Loading bad evaluation prompts from mlabonne/harmful_behaviors...
* 100 prompts loaded
* Counting model refusals...
* Initial refusals: 52/100
Calculating per-layer refusal directions...
GPU memory before residuals: 1261976064 (peak so far: 3135785472)
* Obtaining residuals for good prompts...
* Obtaining residuals for bad prompts...
GPU memory after residuals: 1358445056 (peak so far: 3135785472)
GPU memory after clearing residuals: 1358563840 (peak so far: 3135785472)

With GC:

█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.0.1
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic

GPU type: NVIDIA GeForce RTX 4060 Laptop GPU
Loading model Qwen/Qwen3-0.6B...
* Trying dtype auto... Ok
* Transformer model with 28 layers
* Abliterable components:
  * attn.o_proj: 1 matrices per layer
  * mlp.down_proj: 1 matrices per layer
Loading good prompts from mlabonne/harmless_alpaca...
* 400 prompts loaded
Loading bad prompts from mlabonne/harmful_behaviors...
* 400 prompts loaded
Loading good evaluation prompts from mlabonne/harmless_alpaca...
* 100 prompts loaded
* Obtaining first-token probability distributions...
Loading bad evaluation prompts from mlabonne/harmful_behaviors...
* 100 prompts loaded
* Counting model refusals...
* Initial refusals: 52/100
Calculating per-layer refusal directions...
GPU memory before residuals: 1261976064 (peak so far: 3135785472)
* Obtaining residuals for good prompts...
* Obtaining residuals for bad prompts...
GPU memory after residuals: 1358445056 (peak so far: 3135785472)
GPU memory after clearing residuals: 1262094848 (peak so far: 3135785472)

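So clearing the residuals frees about 96 MB (1358563840 vs. 1262094848 bytes after clearing), but it doesn't reduce the peak usage, which stays at 3135785472 bytes in both runs. I think the saving would scale with the number of prompts?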
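The figures can be checked directly from the byte counts in the two logs. The snippet below only does arithmetic on those logged values; the constant names are made up for illustration.

```python
# Allocated-byte counts copied from the two runs logged in this thread.
BEFORE_RESIDUALS = 1261976064
AFTER_RESIDUALS = 1358445056
AFTER_CLEAR_WITHOUT_GC = 1358563840
AFTER_CLEAR_WITH_GC = 1262094848

MIB = 2**20  # bytes per mebibyte


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


# Memory taken up by the residuals while they are alive.
residual_size = AFTER_RESIDUALS - BEFORE_RESIDUALS

# Difference between the two runs at the "after clearing residuals" point.
saved = AFTER_CLEAR_WITHOUT_GC - AFTER_CLEAR_WITH_GC

# What is still allocated after clearing, relative to the starting point.
left_over = AFTER_CLEAR_WITH_GC - BEFORE_RESIDUALS

print(f"residuals: {residual_size} bytes = {to_mib(residual_size)} MiB")
print(f"saved:     {saved} bytes = {to_mib(saved)} MiB")
print(f"left over: {left_over} bytes")
```

The saving equals the full size of the residuals, 96468992 bytes or exactly 92 MiB, and only 118784 bytes remain allocated above the starting point after clearing.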

src/heretic/main.py

p-e-w · 2025-11-17T16:48:41Z

Thanks, this is a reasonable change!

p-e-w reviewed Nov 17, 2025


perf: clear residuals after computing direction

cec47df

red40maxxer force-pushed the optimize-memory-usage branch from e29e667 to cec47df · November 17, 2025 16:42

p-e-w merged commit 7bad84b into p-e-w:master Nov 17, 2025
