Progress Log

May 13, 2026

Pushing on the DICE Engine paper draft for the DL4C workshop at ICML 2026 — submission deadline is May 14, 2026. Final day to tighten the experiments section and prep the anonymized code link before submission.

May 6, 2026

Added a note to the publications tracker on using anonymous.4open.science for double-blind submissions — strips author identity from a repo URL while keeping it browsable for reviewers. Small process detail, but worth pinning before the first submission.

May 4, 2026

Embedded dice-engine as a Git submodule under preliminary/ so the first paper’s code lives alongside its prose, and documented the submodule-embedding convention in the repo guide. Also queued four RL and code-LLM papers in toRead.md to set up the next reading sprint.

May 1, 2026

Launched the research site at unphd.yonglinzhu.com. Set up the custom subdomain, configured GitHub Pages, and redesigned the site to match the al-folio academic style — adding a Publications page, reading time and tags for writing posts, a recent progress section on the homepage, and footer social links.

April 30, 2026

Defined the qualifier output requirement: three papers across distinct topics, at least one first-authored and one led by a new collaborator. The first paper is underway — a project called dice-engine. Title and venue TBD.

April 29, 2026

Narrowed candidate research areas to two: LLM post-training alignment and Physics of AI. Both are recorded in the qualifier notes. The next step is building structured overviews of each area before committing to a direction.

April 28, 2026

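Read the InstructGPT paper (Ouyang et al., 2022), foundational reading for understanding RLHF. It fine-tunes GPT-3 on labeler-written demonstrations, then with reinforcement learning against a reward model trained on labeler rankings. Headline result: human evaluators preferred outputs from the 1.3B-parameter InstructGPT over those of the 175B-parameter GPT-3, despite ~100x fewer parameters, with only minimal regressions on public NLP benchmarks. Good grounding for the LLM alignment exploration, though large-scale labeler-based training is unlikely to be a direction I pursue directly.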

April 27, 2026

Launched this independent PhD journey. Set up the repository structure: three phases (qualifier, preliminary, defense), network and publications trackers, and this site. No institution, no advisor — just the work.