RL context compaction
Experiments in using reinforcement learning to decide which parts of a context window to retain, compress, or drop.
What it is
A small RL framework that uses context-window compaction as its case study: it defines the state, actions, rewards, baselines, and trajectories, then compares simple RL methods against heuristics.
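To make the pieces above concrete, here is a minimal sketch of how state, actions, rewards, a heuristic, and a simple RL method could fit together. It is not this project's actual code. Every name in it (`Segment`, `Action`, `reward`, `heuristic_policy`, `train_bandit`) and every constant is a hypothetical choice made for illustration. It also simplifies the problem to a one-step contextual bandit with an epsilon-greedy learner, so it does not model multi-step trajectories.

```python
import random
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    RETAIN = 0
    COMPRESS = 1
    DROP = 2


@dataclass(frozen=True)
class Segment:
    kind: str         # observable state feature (hypothetical categories)
    tokens: int       # observable size in tokens
    relevance: float  # hidden usefulness; only the reward function reads it


# Illustrative constants, not measured values.
TOKEN_KEEP = {Action.RETAIN: 1.0, Action.COMPRESS: 0.3, Action.DROP: 0.0}
INFO_KEEP = {Action.RETAIN: 1.0, Action.COMPRESS: 0.7, Action.DROP: 0.0}
TOKEN_COST = 0.002  # reward penalty per token left in the window

# One fixed prototype per kind keeps the toy environment deterministic.
PROTOTYPES = {
    "system": Segment("system", 100, 0.9),
    "tool_output": Segment("tool_output", 400, 0.5),
    "chat": Segment("chat", 200, 0.1),
}


def reward(seg: Segment, action: Action) -> float:
    """Relevance preserved minus a cost for tokens still occupying the window."""
    kept_tokens = seg.tokens * TOKEN_KEEP[action]
    return seg.relevance * INFO_KEEP[action] - TOKEN_COST * kept_tokens


def heuristic_policy(seg: Segment) -> Action:
    """Size-only heuristic: compress anything above 300 tokens, retain the rest."""
    return Action.COMPRESS if seg.tokens > 300 else Action.RETAIN


def train_bandit(steps: int = 2000, epsilon: float = 0.1, seed: int = 0):
    """Epsilon-greedy value estimates keyed by (segment kind, action)."""
    rng = random.Random(seed)
    q = {(kind, a): 0.0 for kind in PROTOTYPES for a in Action}
    n = {key: 0 for key in q}
    for _ in range(steps):
        seg = PROTOTYPES[rng.choice(list(PROTOTYPES))]
        if rng.random() < epsilon:
            action = rng.choice(list(Action))  # explore
        else:
            action = max(Action, key=lambda a: q[(seg.kind, a)])  # exploit
        key = (seg.kind, action)
        n[key] += 1
        q[key] += (reward(seg, action) - q[key]) / n[key]  # running mean
    return q


def greedy_policy(q):
    """Turn learned values into a policy that picks the best-valued action."""
    return lambda seg: max(Action, key=lambda a: q[(seg.kind, a)])


def mean_reward(policy, segments) -> float:
    return sum(reward(s, policy(s)) for s in segments) / len(segments)


q_values = train_bandit()
learned_policy = greedy_policy(q_values)
eval_set = list(PROTOTYPES.values())

if __name__ == "__main__":
    print("heuristic:", mean_reward(heuristic_policy, eval_set))
    print("learned:  ", mean_reward(learned_policy, eval_set))
```

With these made-up constants, the best action differs by segment kind, so the learned policy can beat a heuristic that only looks at size. A real setup would replace the hidden `relevance` field with a downstream task signal.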