About the Programme
Research Programme.
Mission, methodology, ethics, and publication plan. This document is versioned with every structural change; the current state is binding for all surveys.
1 Mission
How readers find an author has changed. Someone searching for a book is now less likely to land on the publisher's page than on an answer assembled by Gemini or Claude, or on a knowledge panel shown directly by Google's Knowledge Graph or Bing's AI overview. Which of an author's actions actually reach this new answer layer, how long that takes, and how faithfully works are cited there has so far been documented empirically only in part.
This programme measures it openly: on a single case, but with methodological honesty. Later claims about the effects of individual author actions become more than anecdote only if they rest on a measurement apparatus whose reliability has been tested. Language models, knowledge graphs, and AI answer engines are not stable instruments for this purpose: their answers drift with model updates, indexing changes, and platform policy. Instrument drift can be separated from genuine action effects only when the measurement properties of the instruments themselves are known.
Hence three phases, overlapping in time, with continuously pre-registered interventions. Phase 1 (May → Sep 2026) is the active pre-launch phase: deliberate interventions on identity surfaces (Wikidata co-occurrence, Zenodo DOI cadence, Common-Crawl optimisation, machine-readable identity surfaces, Reddit karma, Cross-LLM Trust Graph) run in parallel with instrument validation of the fourteen measurement surfaces (test-retest reliability, intra-set consistency, CUSUM drift, coverage). Phase 2 (Sep 2026 → Q3 / 2027) is post-launch effect detection: the book launch on 22 September 2026 is the central deliberate intervention, followed by aggregated effect measurement across all surfaces and long-tail observation of AI-answer reach. Phase 3 (from Q3 / 2027) runs long-term controlled experiments on the then-validated apparatus, with effect measures appropriate to an n-of-1 design (interrupted time series, Bayesian structural time series, hierarchical Bayesian models) rather than pre/post Cohen’s d on individual actions.
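The CUSUM drift check named for Phase 1 can be illustrated with a small sketch. It is not the programme's actual script: the function name `cusum_alarms`, the baseline window, and the parameters `k` and `h` are assumptions chosen for illustration, and the sketch presumes that each surface yields one numeric score per measurement run.

```python
from statistics import mean, stdev


def cusum_alarms(values, baseline_n, k=0.5, h=4.0):
    """Two-sided tabular CUSUM on standardised values (illustrative sketch).

    values      repeated scores of one measurement surface, in time order
    baseline_n  number of leading observations used to estimate mean and sd
    k           allowance (slack) in sd units; hypothetical default
    h           decision threshold in sd units; hypothetical default
    Returns the indices at which an alarm fires. Both sums reset after an alarm.
    """
    baseline = values[:baseline_n]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has zero variance; cannot standardise")

    upper = lower = 0.0
    alarms = []
    for i, x in enumerate(values):
        z = (x - mu) / sigma
        upper = max(0.0, upper + z - k)   # accumulates upward deviations
        lower = max(0.0, lower - z - k)   # accumulates downward deviations
        if upper > h or lower > h:
            alarms.append(i)
            upper = lower = 0.0
    return alarms


# Hypothetical scores: eight stable runs, then an upward shift.
stable = [0.50, 0.52, 0.48, 0.51, 0.49, 0.50, 0.52, 0.48]
shifted = stable + [0.60, 0.60, 0.60]
print(cusum_alarms(shifted, baseline_n=8))
```

An alarm on a surface during a window without any pre-registered intervention would point to instrument drift rather than to an action effect, which is the distinction the validation phase is meant to support.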
The programme aims at an openly documented measurement basis that can be transferred to other author identities. The observation is declared a single-case study; the goal is not statistical generalisation but a methodology that remains traceable and replicable.
2 Programme Architecture
The programme observes a single literary project across the entire publication and reception cycle. Phase 1 is operationalised in four lines of work (Citation Inventory, Measurement-Instrument Validation, Codebook Iteration, and Open Materials), each addressing one isolated question and remaining separately citable. The detailed description of each line, including observed endpoints and measurement procedures, is in the programme index at /en/research.
This document focuses on the methodological commitments (§3), the ethics and policy (§4), and the reproducibility specification (§6), against which the programme can be externally audited.
3 Methodological Commitments
Four methodological commitments apply to every survey of the programme. Violations are noted in the quarterly report.
01 Pre-registration
Before every data collection a pre-registration in OSF style is published: hypothesis (H₀/H₁), measurement operationalisation, data source, sampling plan, a-priori power analysis, stop criterion, and analysis plan. Deviations are openly disclosed in the report.
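The pre-registration items listed above can be held in a small record that is checked for completeness before publication. This is a sketch only: the class `PreRegistration`, its field names, and the helper `missing_fields` are hypothetical and not taken from the programme's archive.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PreRegistration:
    """One pre-registration; field names are illustrative assumptions."""
    h0: str
    h1: str
    operationalisation: str
    data_source: str
    sampling_plan: str
    power_analysis: str
    stop_criterion: str
    analysis_plan: str


def missing_fields(reg):
    """Return the names of all fields left empty or whitespace-only."""
    return [f.name for f in fields(reg) if not getattr(reg, f.name).strip()]


# Hypothetical draft with the stop criterion still missing.
draft = PreRegistration(
    h0="No change in citation rate after the intervention.",
    h1="Citation rate changes after the intervention.",
    operationalisation="Share of sampled answers naming the author.",
    data_source="Answer-engine responses to a fixed prompt set.",
    sampling_plan="Fixed prompt set, repeated at fixed intervals.",
    power_analysis="Documented separately.",
    stop_criterion="",
    analysis_plan="Interrupted time series.",
)
print(missing_fields(draft))
```

A check of this kind could run before a pre-registration is published, so that an incomplete draft is caught before data collection starts.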
02 Open Methodology
All measurement, aggregation, and test scripts are published
with the report as a replication archive: Python 3,
MIT-licensed, with dependency versions pinned in environment.yml,
endpoints recorded with retrieval timestamp and user agent, and raw
data under CC0 where platform terms permit.
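Recording an endpoint together with its retrieval timestamp and user agent might look like the following sketch. The function name `provenance_record`, the dictionary keys, and the added content hash are assumptions for illustration; the example URL and user-agent string are placeholders.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(endpoint, user_agent, body, retrieved_at=None):
    """Describe one retrieval so that it can be audited later.

    body is the raw response as bytes. The SHA-256 digest is an assumed
    extra that lets an auditor confirm the archived raw data is unchanged.
    """
    timestamp = retrieved_at or datetime.now(timezone.utc)
    return {
        "endpoint": endpoint,
        "user_agent": user_agent,
        "retrieved_at": timestamp.isoformat(),
        "sha256": hashlib.sha256(body).hexdigest(),
    }


# Placeholder values, not real programme endpoints.
record = provenance_record(
    endpoint="https://example.org/api/entity",
    user_agent="example-research-client/0.1",
    body=b"{}",
    retrieved_at=datetime(2026, 5, 11, tzinfo=timezone.utc),
)
print(json.dumps(record, indent=2))
```

Storing the timestamp in UTC with an explicit offset avoids ambiguity when measurements from different machines are merged.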
03 Failure Log
Null effects, findings that contradict the hypothesis, and drafts blocked by the linter must be published just like positive results. Aggregated excerpts are part of every quarterly report; individual findings are recorded with date, reason, and context in the replication archive.
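A failure-log entry with date, reason, and context, plus the aggregated excerpt for a quarterly report, could be modelled as below. This is a sketch under assumptions: the class `FailureEntry`, the three `KINDS` labels, and `quarterly_excerpt` are hypothetical names, and the sample entries are invented placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Assumed labels for the three kinds of finding the log must contain.
KINDS = {"null_effect", "against_hypothesis", "linter_blocked"}


@dataclass(frozen=True)
class FailureEntry:
    day: date
    kind: str
    reason: str
    context: str

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")


def quarterly_excerpt(entries, start, end):
    """Count entries per kind whose date lies in [start, end]."""
    return Counter(e.kind for e in entries if start <= e.day <= end)


# Placeholder entries for illustration only.
log = [
    FailureEntry(date(2026, 7, 3), "null_effect", "no measurable change", "Q1"),
    FailureEntry(date(2026, 8, 1), "linter_blocked", "draft rejected", "report draft"),
    FailureEntry(date(2026, 11, 2), "null_effect", "no measurable change", "Q3"),
]
print(quarterly_excerpt(log, date(2026, 7, 1), date(2026, 9, 30)))
```

The per-kind counts correspond to the aggregated excerpt, while the full entries would go into the replication archive.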
04 Quarterly Cadence
Reports appear in the middle of October, January, April, and July, with a tolerance of ± two weeks. An omitted report is explained in the following report. An adversarial reviewer role is established from quarterly report Q4 / 2026 onward.
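The tolerance window can be made explicit in a few lines. The sketch assumes that "mid" means the 15th of the month, which the text does not specify; `report_window` and `is_on_time` are hypothetical helper names.

```python
from datetime import date, timedelta

REPORT_MONTHS = (1, 4, 7, 10)      # January, April, July, October
TOLERANCE = timedelta(weeks=2)     # the stated tolerance of two weeks


def report_window(year, month, mid_day=15):
    """Earliest and latest on-time publication date for one report.

    mid_day=15 is an assumption; the programme only says "mid".
    """
    if month not in REPORT_MONTHS:
        raise ValueError(f"no report is scheduled for month {month}")
    target = date(year, month, mid_day)
    return target - TOLERANCE, target + TOLERANCE


def is_on_time(published):
    """True if the date falls inside the window of a report month."""
    if published.month not in REPORT_MONTHS:
        return False
    start, end = report_window(published.year, published.month)
    return start <= published <= end


print(report_window(2026, 10))
print(is_on_time(date(2026, 10, 30)))
```

Under the 15th-of-month assumption every window stays inside its report month, so checking the publication month first is sufficient.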
4 Ethics and Policy
4.1 Pseudonymity
Marin T. Kael is an openly declared pseudonym. The separation between pseudonym and real person rests on private law and will not be lifted. For the scientific citability of the programme, the person behind the pseudonym is irrelevant: the materials stand on their own.
4.2 Platform Policy
The programme operates strictly within the policies of the respective platforms. For Reddit the Responsible Builder Policy applies; posts are not submitted automatically but filed manually by the author through the standard web interface. For Wikidata the Wikimedia Foundation editor policy applies; edits carry documented sources. Violations of platform policy are openly noted in the quarterly report.
4.3 Data Protection and Reach
The programme does not collect personal data of third parties. It analyses publicly accessible materials (posts, comments, reviews) in aggregated form. Individual reader comments are cited only when the writer has explicitly consented. No material is prepared for model training or transferred onward.
4.4 Conflict of Interest
The programme lead is also the subject of the inquiry, which is a genuine methodological limitation. It is restated in every report. An adversarial reviewer role will be established from quarterly report Q4 / 2026; the position is currently open.
5 Publication Plan
The following plan is binding; deviations require a justified update to this document and are publicly documented.
- 11 May 2026 · Methodology Note 01 · Baseline Measurement: Author Identity in the Citation Behaviour of Language Models (Pre-Launch) · published
- 13 May 2026 · Methodology Note 01, v2.7 · Major redesign: active three-phase design and 6 pre-registrations Q0–Q5; DOI 10.5281/zenodo.20170615 · published
- May – Sept. 2026 · Phase 1, Active Pre-Launch · Q0–Q5 interventions with parallel instrument validation; active since T+0 · active
- 22 September 2026 · Phase Transition, Book Launch · Release of "Das vierte Feld": central deliberate intervention, transition to Phase 2 (post-launch effect detection) · scheduled
- October 2026 · Activity Report Q3 / 2026 · First 90 days of Q0–Q5 with parallel validation of the measurement apparatus · scheduled
- January 2027 · Activity Report Q4 / 2026 · Post-launch effect detection and Codebook v0.2 (external annotators, inter-rater agreement) · scheduled
- July 2027 · Transition Note · Phase-2 closure; Phase-3 pre-registrations for long-term controlled experiments from Q3 / 2027 · scheduled
6 Reproducibility
Every publication is accompanied by a replication archive. The
archive contains the measurement code (Python, MIT), the raw
measurement values in machine-readable form, the versioned style sheet
as it stood at the time of the survey, and an environment.yml file
for restoring the software environment. A preliminary version of the
measurement code is made available to external auditors on request,
even before the first regular publication.
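An external auditor needs a way to confirm that an archive is complete and unaltered. One common approach, shown here as a sketch and not as the programme's documented procedure, is a checksum manifest; `build_manifest` and `changed_files` are hypothetical names.

```python
import hashlib
from pathlib import Path


def build_manifest(archive_dir):
    """Map each file's relative POSIX path to its SHA-256 hex digest."""
    root = Path(archive_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def changed_files(archive_dir, manifest):
    """List files that were added, removed, or modified since the manifest."""
    current = build_manifest(archive_dir)
    return sorted(
        p for p in set(current) | set(manifest)
        if current.get(p) != manifest.get(p)
    )
```

Publishing the manifest alongside the archive would let an auditor run `changed_files` on a downloaded copy; an empty list means every file matches the published state.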
7 Contact, Licence, Versioning
- Enquiries: research@marin-t-kael.de
- Licence: texts CC BY 4.0 · code MIT · raw data CC0 (where platform terms permit)
- Versioning: v1.0 · 11 May 2026 · first version
- Citation: Kael, M. T. (2026). Marin T. Kael Research Programme — Mission and Methodology, v1.0. Marin T. Kael, Independent. marin-t-kael.de/en/research/programme