OPEN_SOURCE ↗
REDDIT // 3h ago // NEWS
GLM-4.7 Flash ignores story-edit prompt
A user running an uncensored Q8 build of GLM-4.7-Flash on a 5090 says the model can handle normal chats but keeps reproducing an attached 8.5k-token story nearly verbatim instead of inserting a new scene. The thread reads like a long-context failure, but it is more likely a prompt-structure and document-editing problem than a raw context-window limit.
// ANALYSIS
The hot take: 200K context does not mean 200K of reliable instruction-following. This looks less like the model “forgetting” the prompt and more like a weakly framed edit task, especially when the source text arrives via PDF ingestion.
- The model’s advertised long context can retain the story, but that does not guarantee it will prioritize a vague “rewrite with changes” instruction over faithfully reconstructing the source.
- PDF attachments often worsen this kind of task because extraction can flatten structure, erase section boundaries, and make the model treat the input as a document to continue rather than a text to surgically modify.
- Better results usually come from explicit edit framing: identify the insertion point, request a diff or the revised scene only, and anchor the change to a concrete section or paragraph.
- For an 8.5k-token story, chunking or retrieval is often more reliable than asking for a full end-to-end rewrite with one added scene.
- Community uncensored builds may be strong at chat and generation, but they are not automatically optimized for precise editorial transformations.
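A minimal sketch of the edit framing described above. Everything here is hypothetical (the function name, delimiters, and wording are not from the thread); the point is the structure: name a concrete anchor passage, ask for only the new scene, and explicitly forbid restating the source.

```python
def build_edit_prompt(story: str, anchor: str, scene_brief: str) -> str:
    """Build a prompt that requests only the inserted scene, never a rewrite.

    Hypothetical helper: the anchor is a verbatim passage from the story
    that marks where the new scene should go.
    """
    if anchor not in story:
        raise ValueError("anchor text not found in story")
    return (
        "You are editing the story below. Do NOT reproduce or rewrite it.\n"
        "Insert ONE new scene immediately after this passage:\n"
        f"<<<{anchor}>>>\n"
        f"The new scene should: {scene_brief}\n"
        "Output ONLY the new scene text, nothing else.\n\n"
        "--- STORY ---\n"
        f"{story}\n"
        "--- END STORY ---"
    )

story = "Chapter 1. The door closed.\nChapter 2. Morning came."
prompt = build_edit_prompt(story, "The door closed.",
                           "the narrator hears footsteps in the hall")
```

Because the model only ever emits the new scene, it never has to echo 8.5k tokens faithfully, which is exactly the failure mode the thread describes.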
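The chunking point can be taken further: do the splice locally instead of asking the model for an end-to-end rewrite. A sketch, assuming the model has already returned the new scene as plain text (the function name and sample strings are illustrative):

```python
def splice_scene(story: str, anchor: str, new_scene: str) -> str:
    """Insert new_scene immediately after the first occurrence of anchor.

    The surrounding story text is passed through untouched, so nothing
    outside the insertion point can drift or be paraphrased.
    """
    idx = story.find(anchor)
    if idx == -1:
        raise ValueError("anchor not found")
    cut = idx + len(anchor)
    return story[:cut] + "\n\n" + new_scene + "\n\n" + story[cut:]

story = "Scene A ends here. Scene B begins."
result = splice_scene(story, "Scene A ends here.", "[NEW SCENE]")
```

This keeps the model's job small (generate one scene) and the risky job deterministic (reassembling the 8.5k-token document), which is usually the more reliable division of labor.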
// TAGS
glm-4.7-flash · llm · prompt-engineering · inference · self-hosted · open-weights
DISCOVERED
3h ago
2026-04-16
PUBLISHED
22h ago
2026-04-16
RELEVANCE
8/10
AUTHOR
NeuroPalooza