Using LLMs to Evaluate the Quality of Position Papers?

By Paul Shotton, Advocacy Strategy
A few years ago, my colleague and research collaborator Dr Adam Chalmers introduced me to a series of ideas drawn from computational linguistics that offered a different way of thinking about written advocacy. Adam was exploring how concepts such as readability, complexity, concision and action orientation could be used to evaluate the effectiveness of written communication in a more systematic way. Together with Alan Hardacre, we subsequently worked with Adam to transform these ideas into a practical training module for Advocacy Academy.
At the time, we were already experimenting with ChatGPT to make these concepts more accessible to practitioners. The models could explain the underlying theories and often produce useful assessments. However, they sometimes struggled with more technical calculations and were less consistent when applying structured evaluation frameworks. The ideas themselves were compelling, but the tools had limitations.
Over the last few years, the situation has changed considerably. Today's large language models are far more capable. They can analyse lengthy documents, compare multiple texts, apply structured frameworks and explain their reasoning in ways that are genuinely useful for practitioners. This prompted me to revisit Adam's original ideas and ask a broader question: can AI help us evaluate the quality of position papers more systematically, and if so, what are the most useful frameworks to apply?
To explore this, I recently conducted a simple experiment. I compared two position papers addressing the European Commission's Omnibus Simplification Package. Both covered broadly similar topics but were written by different organisations with different audiences and objectives. Rather than asking which paper was better, I asked a different question: what dimensions of quality can we evaluate consistently, and which of those dimensions are best suited to analysis by large language models?
The exercise led me to a framework built around three layers of position paper quality.
Layer One: Communication Quality
The first layer focuses on the quality of the writing itself. This includes readability, concision, comprehensibility, persuasiveness and action orientation.
These concepts originate largely from the work Adam introduced several years ago.
Readability examines how easy a document is to understand. Concision assesses whether the paper communicates its message efficiently without unnecessary repetition.
Comprehensibility considers the complexity of the language and how accessible it is to the intended audience. Persuasiveness focuses on the ability of the text to influence its readers, while action orientation examines whether the paper clearly encourages action rather than simply describing a problem.
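To make the idea of a readability measure concrete, here is a minimal sketch in Python of one widely used formula, the Flesch Reading Ease score. This is an illustration rather than the method used in the training module or in my experiment: the choice of formula, the function names and the syllable counter (a crude heuristic that simply counts runs of vowels) are all assumptions. Higher scores indicate text that is easier to read.

```python
import re

# Assumption: syllables are approximated by counting runs of vowels.
# This is a rough heuristic, not a dictionary-based syllable count.
VOWEL_RUN = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Approximate the number of syllables in a word (minimum of one)."""
    return max(1, len(VOWEL_RUN.findall(word.lower())))


def split_sentences(text: str) -> list:
    """Split on sentence-ending punctuation and drop empty fragments."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text: str) -> list:
    """Extract alphabetic words (apostrophes allowed)."""
    return re.findall(r"[A-Za-z']+", text)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier reading."""
    sentences = split_sentences(text)
    words = split_words(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    plain = "We ask the Commission to cut red tape."
    dense = ("The undersigned organisations respectfully recommend "
             "comprehensive regulatory simplification.")
    print(round(flesch_reading_ease(plain), 2))
    print(round(flesch_reading_ease(dense), 2))
```

A score like this only captures sentence and word length. It says nothing about persuasiveness or action orientation, which is why the more interpretive dimensions still call for a reviewer, human or AI.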
This is the area where current AI tools perform particularly well. Large language models are remarkably effective at identifying excessive jargon, unnecessarily complex language, weak calls to action and repetitive arguments. They can compare documents and explain why one is easier to read or more persuasive than another. For public affairs professionals, this is valuable because many advocacy documents are written by experts for other experts, while policymakers are often operating under significant time constraints.
Layer Two: Advocacy Quality
The second layer focuses on the effectiveness of the document as an advocacy tool. This includes strategic framing, audience alignment and advocacy utility.
A position paper is not simply a piece of writing. It is designed to influence a policy process. As a result, we need to assess whether it frames the issue effectively, whether it speaks to the priorities of its intended audience and whether it provides clear and usable recommendations.
This layer often reveals interesting differences between organisations. Some papers are highly accessible and politically compelling but provide limited technical detail. Others are legally precise and analytically rigorous but are unlikely to be read in full by a busy policymaker. Neither approach is necessarily better. They simply reflect different advocacy strategies and different audiences.
AI performs reasonably well in this area. It can identify framing choices, assess audience fit and evaluate the clarity of recommendations. However, human judgement remains important because advocacy effectiveness is always influenced by political context.
Layer Three: Policy Quality
The third layer focuses on the substance of the position paper itself. This includes argument strength, evidence quality and policy quality.
This is where AI becomes less of an evaluator and more of an assistant. Large language models are capable of identifying claims, supporting evidence, logical gaps and the overall structure of an argument. They can compare how different organisations justify their recommendations and identify areas where evidence appears weak or underdeveloped.
However, assessing whether the evidence is actually correct, whether alternative evidence has been ignored, or whether a policy proposal is politically feasible still requires expertise and judgement. This is not a weakness of AI; it simply reflects the reality that policy analysis often requires knowledge that exists outside the document itself.
Using AI as a Reviewer Rather Than a Writer
Perhaps the most interesting conclusion from this exercise is that AI may be more valuable as a reviewer than as a writer.
Much of the discussion around AI in public affairs focuses on content generation. We ask how AI can help us draft position papers, consultation responses or briefing notes. These are useful applications, but I increasingly find myself using AI to evaluate documents rather than produce them.
When used in this way, AI becomes a structured reviewer. It helps identify strengths and weaknesses, compare approaches across organisations and challenge assumptions about what makes an effective advocacy document. Rather than replacing professional judgement, it provides a framework for applying that judgement more systematically.
For those interested in experimenting with this approach, I would recommend starting with your own work. Select a policy issue your organisation is currently engaging with and gather a small sample of position papers written by your organisation and others active on the same file. Then evaluate them across these three layers: communication quality, advocacy quality and policy quality.
The results can be surprisingly revealing. You may discover that another organisation frames the issue more effectively. You may find that your own recommendations are stronger but less accessible. You may identify opportunities to improve clarity without sacrificing substance.
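For readers who want to try the three-layer review in a repeatable way, the sketch below shows one possible way to turn the framework into a structured prompt. The layer and dimension names come from this article. Everything else is a hypothetical choice made for illustration: the function name, the 1 to 5 scoring scale and the wording of the instructions. The code only builds the prompt text; it does not call any particular AI service.

```python
# Layers and dimensions as described in the article.
RUBRIC = {
    "Communication quality": [
        "readability", "concision", "comprehensibility",
        "persuasiveness", "action orientation",
    ],
    "Advocacy quality": [
        "strategic framing", "audience alignment", "advocacy utility",
    ],
    "Policy quality": [
        "argument strength", "evidence quality", "policy quality",
    ],
}


def build_review_prompt(papers: dict, scale: str = "1 (weak) to 5 (strong)") -> str:
    """Build a review prompt for one or more position papers.

    `papers` maps a label (for example "Paper A") to the paper's text.
    The scoring scale is an assumption, not part of the framework itself.
    """
    if not papers:
        raise ValueError("at least one paper is required")

    lines = [
        "You are reviewing position papers on the same policy file.",
        f"Score each paper on every dimension below from {scale},",
        "and give a one-sentence justification for each score.",
        "",
    ]
    for layer, dimensions in RUBRIC.items():
        lines.append(f"{layer}: {', '.join(dimensions)}")
    lines.append("")
    for label, text in papers.items():
        lines.append(f"### {label}")
        lines.append(text.strip())
        lines.append("")
    lines.append("Finish by comparing the papers layer by layer.")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_review_prompt({
        "Paper A": "Text of your organisation's paper goes here.",
        "Paper B": "Text of another organisation's paper goes here.",
    })
    print(prompt)
```

Keeping the rubric in one place means every paper is assessed against the same dimensions, which makes comparisons across organisations easier to interpret. The scores for the second and third layers should still be read alongside your own knowledge of the political context and the evidence base.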
Revisiting Adam's original ideas has reminded me that many of the concepts we now associate with AI-assisted analysis are not new. What has changed is the accessibility of the tools. Frameworks that once required specialist knowledge can now be applied by almost any public affairs practitioner. Whether these approaches become standard practice remains to be seen, but they offer a useful starting point for anyone interested in improving the quality and effectiveness of their advocacy communications.



