Privacy News Atlassian’s new data collection policy protects rich customers while AI eats the rest

Gandalf_The_Grey

Unless a customer pays for the most expensive enterprise license, or the law forbids it, Atlassian is going to collect their data to train its AI models. And you can't fully opt out.

Beginning in August, the company will seek to collect two types of data from its 300,000 global customers: metadata and in-app data from Jira, Confluence, and its other cloud products, which will then be fed into the company's models.

Metadata includes readability scores and complexity ratings for Confluence page content, task classifications assigned to content (such as "sales work item"), semantic similarity scores measuring how similar two Confluence pages are, and numbers entered into Atlassian-created fields – specifically story points assigned to a Jira work item, the end date of a sprint in Jira, and the Service Level Agreement of a Jira Service Management request.

For the metadata collection, lower-paying customers are on the hook no matter what. “If an Atlassian customer's highest active plan is Free, Standard, or Premium, metadata contribution is always on, and they are not able to opt out,” Arseny Tseytlin, head of product communications at Atlassian, told The Register via email. “All metadata is de-identified and aggregated before it is used to improve apps and experiences for all customers. We remove information that directly identifies individuals, such as name and email addresses.”

Once it collects the information, Atlassian says it will store it for up to seven years.
 
That is a reasonable privacy concern, but the safest approach is to separate what the news report says from what Atlassian’s own documentation says.

What appears to be confirmed

From the quoted material and the linked Atlassian support page, Atlassian is describing customer data “contributions” for AI-related product improvement, including certain metadata and, depending on plan and settings, some in-app data from cloud products.

The main points that appear consistent with the documentation are:

  • Certain cloud customer data categories may be used to improve Atlassian services and AI-related features.
  • Metadata contribution appears to be treated differently from in-app content contribution.
  • Lower-tier customers may have fewer or no opt-out controls for some metadata categories.
  • Atlassian says some data is de-identified and aggregated before use.

What should be treated carefully

The article’s wording should not be read as “Atlassian trains all of its AI models on all customer content” unless Atlassian’s policy states that exact claim.

Vendors often distinguish between:

  • Using service data to improve platform functionality
  • Using limited derived metadata
  • Using customer content for generative AI model training
  • Using data for tenant-specific features versus global model improvement

Those are not always the same thing, and the details matter a lot for privacy and compliance.

Why this matters for organizations

Even if the data is described as de-identified, many businesses will still care because:

  • De-identified does not always mean anonymous in the strict legal or technical sense
  • A retention period of up to seven years may be hard for some compliance teams to accept
  • “No full opt-out” on lower plans can be a problem for regulated environments
  • Semantic, classification, and workflow metadata can still reveal sensitive operational patterns

For some companies, metadata alone can be commercially sensitive even if names and email addresses are removed.

Practical takeaway

If an organization uses Jira, Confluence, or other Atlassian cloud products, the sensible next step is to review Atlassian’s official policy pages directly and map them against internal compliance requirements, especially:

  • Which data types are contributed
  • Which plans allow opt-out
  • How long contributed data is retained
  • Whether data is used for product improvement, AI feature improvement, or foundation model training
  • Whether data residency or contractual controls apply
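
As a rough illustration, part of that checklist can be written down as a small script. This is only a sketch under assumptions: the plan names and the seven-year figure come from the quoted article, while `InternalPolicy`, `review_findings`, and the example thresholds are hypothetical names and values made up for illustration. It is not an Atlassian API, and plans outside the listed three are simply not flagged here, which should be verified against Atlassian's own documentation.

```python
from dataclasses import dataclass

# Plans the quoted article reports as having metadata contribution always on.
NO_METADATA_OPT_OUT_PLANS = {"free", "standard", "premium"}

# Upper bound on retention reported in the quoted article (verify against the policy).
REPORTED_MAX_RETENTION_YEARS = 7


@dataclass
class InternalPolicy:
    """Hypothetical internal requirements; replace with your organization's own."""
    requires_metadata_opt_out: bool
    max_vendor_retention_years: int


def review_findings(highest_active_plan: str, policy: InternalPolicy) -> list[str]:
    """Return items to raise with compliance; an empty list means nothing was flagged."""
    findings = []
    plan = highest_active_plan.strip().lower()
    if policy.requires_metadata_opt_out and plan in NO_METADATA_OPT_OUT_PLANS:
        findings.append("metadata opt-out not available on this plan")
    if REPORTED_MAX_RETENTION_YEARS > policy.max_vendor_retention_years:
        findings.append("reported retention exceeds internal limit")
    return findings


if __name__ == "__main__":
    policy = InternalPolicy(requires_metadata_opt_out=True, max_vendor_retention_years=5)
    for finding in review_findings("Premium", policy):
        print(finding)
```

The point of keeping it this small is that each rule maps to one sentence in the vendor's policy, so it is easy to correct once the official wording has been checked.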

For security and privacy discussions, I would treat this less as malicious behavior and more as a data governance and vendor trust issue.

Conclusion

The concern is valid, but the strongest position is to rely on Atlassian’s exact policy language rather than the headline framing. If the policy really removes meaningful opt-out choices for lower-tier cloud customers, that is a significant privacy and compliance issue.
