Atlassian · 2025

Improve Writing in the Atlassian Editor

How I achieved the first statistically significant quality improvement in the Atlassian Editor's most-used AI feature and grew monthly active users roughly 2.5x, to more than 304,000.

Role

Product strategy

Prompt engineering

Content design

Tools + Tech

Cursor

Statsig

Confluence

304K+

Monthly active users

100K+

Daily usage

60+

Golden examples

Prompt iterations

Context

Improve Writing was the most popular AI writing tool in the Atlassian Editor, but also a persistent source of negative feedback.

The output was verbose and robotic. A feature that was meant to make writing clearer was doing the opposite.

The machine learning team had run tactical experiments on the feature: tweaking parameters, testing latency, maintaining locales. But they hadn't looked at it through the lens of what makes great writing.

Try original prompt

Input

RFCs are a way for Atlassian to share what we're working on with our valued developer community. It's a document for building shared understanding of a topic. It expresses a technical solution, but can also communicate how it should be built or even document standards. The most important aspect of an RFC is that a written specification facilitates feedback and drives consensus. It is not a tool for approving or committing to ideas, but more so a collaborative practice to shape an idea and to find serious flaws early.

Approach

I proposed a new collaboration model that pairs content design with machine learning engineering (MLE), combining language expertise with experimentation infrastructure.

This approach has since become a defining model for prompt engineering at Atlassian:

1. Audit outputs and customer feedback to understand where and how the feature could be improved.
2. Define actionable rules that are specific enough for the LLM to follow and that let the feature live up to its name.
3. Create a golden dataset of 60+ examples and run an evaluation pipeline across it to compare the outputs of different prompt iterations.
4. A/B test with customers to compare results between the control and test variants.
5. Ship and measure impact.
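
To make steps 2 and 3 concrete, here is a minimal sketch of what a rule-based evaluation over a golden dataset could look like. It is an illustration under stated assumptions, not the pipeline used on the project: the `GoldenExample` shape, the two rules, the filler-phrase list, and the two stand-in "variants" are all hypothetical, and a real pipeline would call the model with each prompt instead of the string functions below.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class GoldenExample:
    """One hand-picked input from the golden dataset (hypothetical shape)."""
    name: str
    text: str


# Hypothetical filler phrases; a real rule set would come from the audit step.
FILLER_PHRASES = ("in order to", "it is important to note")


def not_longer(source: str, output: str) -> bool:
    """Rule: the improved text should not use more words than the original."""
    return len(output.split()) <= len(source.split())


def no_filler(source: str, output: str) -> bool:
    """Rule: the improved text should avoid stock filler phrases."""
    lowered = output.lower()
    return not any(phrase in lowered for phrase in FILLER_PHRASES)


RULES: Dict[str, Callable[[str, str], bool]] = {
    "not_longer": not_longer,
    "no_filler": no_filler,
}


def evaluate(rewrite: Callable[[str], str], dataset: List[GoldenExample]) -> Dict[str, float]:
    """Return the share of examples that pass each rule for one prompt variant."""
    passes = {name: 0 for name in RULES}
    for example in dataset:
        output = rewrite(example.text)
        for name, rule in RULES.items():
            passes[name] += int(rule(example.text, output))
    return {name: count / len(dataset) for name, count in passes.items()}


# Stand-ins for two prompt variants; a real pipeline would call the LLM here.
def verbose_variant(text: str) -> str:
    return "It is important to note that " + text


def concise_variant(text: str) -> str:
    return text.replace("in order to", "to")


dataset = [
    GoldenExample("rfc", "We write RFCs in order to gather feedback early."),
    GoldenExample("status", "The release is delayed in order to fix a bug."),
]

scores = {
    "control": evaluate(verbose_variant, dataset),
    "test": evaluate(concise_variant, dataset),
}
print(scores)
```

Scoring every prompt iteration against the same fixed examples and the same rules makes iterations directly comparable before any of them reaches customers.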

Impact

During the A/B test, we saw a statistically significant lift in insertion rate — the first statistically significant quality change the feature had seen in its two-year history.
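
For readers unfamiliar with what "statistically significant" means for a rate metric, the sketch below shows one common way to check it: a two-sided two-proportion z-test. It is only an illustration. It assumes insertion rate means the share of generated outputs that users inserted into their page, the counts are invented, and it does not reproduce the method or data the experimentation platform used for the actual test.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    inserted_control: int,
    shown_control: int,
    inserted_test: int,
    shown_test: int,
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in insertion rate."""
    rate_control = inserted_control / shown_control
    rate_test = inserted_test / shown_test
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (inserted_control + inserted_test) / (shown_control + shown_test)
    standard_error = math.sqrt(
        pooled * (1 - pooled) * (1 / shown_control + 1 / shown_test)
    )
    z = (rate_test - rate_control) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts, for illustration only.
z, p_value = two_proportion_z_test(
    inserted_control=4000, shown_control=10000,
    inserted_test=4300, shown_test=10000,
)
print(f"z = {z:.2f}, p = {p_value:.5f}")
```

A result is conventionally called significant when the p-value falls below a threshold chosen before the test, often 0.05.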

Since we released the new prompt, daily usage has roughly tripled, from 33,000 to 100,000+ uses per day, and monthly active users have grown about 2.5x, from 120,000 to 304,000+. The feature is now the fastest-growing in Editor AI and accounts for almost one third of Atlassian's total AI monthly active users.

It also changed the way content and machine learning collaborate at Atlassian. Content design expertise is now seen as a key driver of prompt writing, golden dataset creation, and LLM evaluation — and our project became the template other teams follow.