Provides essential data security guidance for organisations that develop AI or ML systems

The Australian Signals Directorate (ASD), in collaboration with international partners, has released new advice on best practices for securing data throughout the artificial intelligence (AI) and machine learning (ML) system lifecycle.

The cyber security advice highlights the importance of data security in ensuring the accuracy and integrity of AI outcomes, and outlines potential risks arising from data integrity issues at various stages of AI development and deployment.

It also provides essential data security guidance for organisations that develop and/or use AI systems, and is aimed primarily at those using AI systems in their operations, with a focus on protecting sensitive, proprietary or mission-critical data. The principles outlined are meant to provide a "robust" foundation for securing AI data and ensuring the reliability and accuracy of AI-driven outcomes.

According to the ASD, data security is of paramount importance when developing and operating AI systems. As organisations in various sectors rely more and more on AI-driven outcomes, data security becomes crucial for maintaining accuracy, reliability and integrity.

The guidance provided in the ASD's cybersecurity information sheet (CSI) outlines a "robust approach to securing AI data and addressing the risks associated with the data supply chain, malicious data and data drift".

Data security is an ever-evolving field, and continuous vigilance and adaptation are key to staying ahead of emerging threats and vulnerabilities, the CSI noted. The best practices presented encourage the highest standards of data security in AI while helping ensure the accuracy and integrity of AI-driven outcomes.

AI system lifecycle

When it comes to the AI system lifecycle, securing data is paramount to maintaining information integrity and system reliability. The CSI advises that, starting in the initial 'plan and design' phase, organisations carefully plan data protection measures to proactively mitigate potential risks.

In the next phase, 'collect and process data', data must be carefully analysed, labelled, sanitised and protected from breaches and tampering (a checksum sketch follows this section). Securing data is equally paramount in the 'build and use model' phase, to help ensure models are trained on reliably sourced, accurate and representative information.

In the 'verify and validate' phase, organisations are urged to rigorously test AI models built from training data to uncover security flaws and address them effectively. This stage will be necessary each time new data or user feedback is introduced into the model, and that data must be handled to the same security standards as the original training data.

Implementing strict access controls protects data from unauthorised access, especially in the 'deploy and use' phase, while continuous data risk assessments in the 'operate and monitor' phase will help the model adapt to evolving threats.

"Neglecting these practices can lead to data corruption, compromised models, data leaks, and non-compliance, emphasising the critical importance of robust data security at every phase," noted the CSI.
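To make the 'collect and process data' phase concrete, the sketch below shows one common way to detect tampering in storage or transit: verifying a dataset file against a known-good SHA-256 digest before it enters the training pipeline. The file name and digest are illustrative assumptions, not details from the ASD guidance.

```python
# Minimal integrity-check sketch: compare a dataset file's SHA-256 digest
# against a known-good value obtained over a separate, trusted channel.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large datasets don't exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_dataset(path: Path, expected: str) -> None:
    actual = sha256_of(path)
    if actual != expected:
        raise ValueError(f"Integrity check failed for {path}: got {actual}")

# Example usage (hypothetical file and digest):
# verify_dataset(Path("train.csv"), "9f86d081884c7d659a2feaa0c55ad015...")
```

A failed check should quarantine the file rather than silently drop it, so the event can feed the continuous risk assessments described above.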
Best practices to secure data for AI-based systems

Included in the CSI was a list of recommended practical steps that system owners can take to better protect the data used to build and operate their AI-based systems, regardless of whether those systems run on premises or in the cloud. The list recommends:

1. Source reliable data and track data provenance. Verify that data sources use trusted, reliable and accurate data for training and operating AI systems. To the extent possible, only use data from authoritative sources.

2. Verify and maintain data integrity during storage and transport. Maintaining data integrity is essential to preserving the accuracy, reliability and trustworthiness of AI data.

3. Employ digital signatures to authenticate trusted data revisions. Digital signatures help ensure data integrity and prevent tampering by third parties. Adopt quantum-resistant digital signature standards to authenticate and verify datasets used during AI model training, fine-tuning, alignment, reinforcement learning from human feedback (RLHF) and/or other post-training processes that affect model parameters (a signing sketch follows this list).

4. Leverage trusted infrastructure. Use a trusted computing environment that leverages Zero Trust architecture, and provide secure enclaves for data processing so that sensitive information stays protected and unaltered during computations.

5. Classify data and use access controls. Categorise data using a classification system based on sensitivity and required protection measures. In general, the output of AI systems should be classified at the same level as the input data, rather than creating a separate set of guardrails (a classification sketch follows this list).

6. Encrypt data. Adopt advanced encryption protocols proportional to the organisational data protection level, securing data at rest, in transit and during processing. AES-256 encryption is the de facto industry standard and is considered resistant to quantum computing threats (an encryption sketch follows this list).

7. Store data securely. Store data in certified storage devices that enforce NIST FIPS 140-3 compliance, ensuring that the cryptographic modules used to encrypt the data provide high-level security against advanced intrusion attempts.

8. Leverage privacy-preserving techniques. Several privacy-preserving techniques can be leveraged for increased data security, including data depersonalisation techniques, differential privacy and decentralised learning techniques (a differential privacy sketch follows this list).

9. Delete data securely. Prior to repurposing or decommissioning any functional drives used for AI data storage and processing, erase them using a secure deletion method such as cryptographic erase (sketched after this list), block erase or data overwrite.

10. Conduct ongoing data security risk assessments. Conduct ongoing risk assessments using industry-standard frameworks, such as the NIST SP 800-37r2 Risk Management Framework and the NIST AI 100-1 Artificial Intelligence Risk Management Framework.
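To illustrate practice 3, the following sketch signs and verifies a dataset manifest. The guidance calls for quantum-resistant signature standards; Ed25519 is used here purely to show the sign-and-verify workflow, since post-quantum schemes such as ML-DSA are not yet available in every library. The manifest contents are hypothetical, and the example assumes the third-party Python `cryptography` package.

```python
# Sketch: a data producer signs a manifest of dataset hashes; consumers
# verify the signature before trusting the revision. Ed25519 stands in
# for the quantum-resistant scheme the guidance actually recommends.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()   # producer's signing key
manifest = b"train.csv sha256=9f86d081...\nval.csv sha256=a1b2c3d4...\n"
signature = private_key.sign(manifest)

# The consumer verifies using the producer's published public key.
public_key = private_key.public_key()
try:
    public_key.verify(signature, manifest)
    print("Manifest authentic: safe to use this data revision.")
except InvalidSignature:
    print("Manifest tampered with: reject this data revision.")
```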
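Practice 5's rule that AI output inherits the classification of its input can be captured in a few lines. The levels below are hypothetical placeholders for an organisation's own scheme.

```python
# Sketch: an AI system's output is classified at the level of its most
# sensitive input, rather than given a separate set of guardrails.
from enum import IntEnum

class Classification(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    SECRET = 3

def output_classification(*inputs: Classification) -> Classification:
    """The output inherits the highest classification among its inputs."""
    return max(inputs)

print(output_classification(Classification.INTERNAL, Classification.SECRET).name)
# SECRET
```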
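For practice 6, the sketch below encrypts a record at rest with AES-256-GCM, matching the AES-256 standard the information sheet cites. Key management (an HSM or cloud KMS) is assumed and out of scope; the example again relies on the third-party `cryptography` package.

```python
# Sketch: AES-256-GCM encryption at rest. GCM also authenticates the
# ciphertext, so decryption fails if the stored data has been tampered with.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # in practice, fetch from a KMS
aesgcm = AESGCM(key)

plaintext = b"sensitive training record"
nonce = os.urandom(12)                      # must be unique per encryption
ciphertext = aesgcm.encrypt(nonce, plaintext, b"dataset-v3")  # bound to context

# Store the nonce alongside the ciphertext; decryption verifies integrity.
assert aesgcm.decrypt(nonce, ciphertext, b"dataset-v3") == plaintext
```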
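Among the privacy-preserving techniques in practice 8, differential privacy is the easiest to sketch: the Laplace mechanism below releases a noisy count so that no single record measurably changes the answer. The epsilon value and query are illustrative choices, not recommendations from the guidance.

```python
# Sketch: the Laplace mechanism for a differentially private count.
# A count query has sensitivity 1, so the noise scale is 1/epsilon.
import numpy as np

def dp_count(records, predicate, epsilon: float = 1.0) -> float:
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

ages = [23, 35, 41, 52, 29, 67]
print(dp_count(ages, lambda a: a > 40))   # noisy answer near the true 3
```

Smaller epsilon values give stronger privacy at the cost of noisier answers.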
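Of the secure-deletion methods in practice 9, cryptographic erase deserves a short sketch: if stored data is always encrypted, destroying the only copy of the key renders the ciphertext unrecoverable without overwriting every block. Real deployments do this in the drive controller or key management service, not in application code; the example below is conceptual only.

```python
# Conceptual sketch of cryptographic erase: encrypted data becomes
# unreadable once its key is destroyed, even though the ciphertext remains.
# (Illustrative only: Python may keep immutable key copies in memory.)
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = bytearray(AESGCM.generate_key(bit_length=256))
ciphertext = AESGCM(bytes(key)).encrypt(os.urandom(12), b"AI training data", None)

# "Erase" by zeroising the only managed copy of the key.
for i in range(len(key)):
    key[i] = 0
# The ciphertext still exists on disk but is computationally unrecoverable.
```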
General risk for data consumers

According to the CSI, from the moment data is ingested for use with AI systems, the data acquirer must secure it against insider threats and malicious network activity to prevent unauthorised modification. The use of web-scale datasets carries all the risks outlined earlier, and data consumers cannot "simply assume that these datasets are clean, accurate, and free of malicious content". Web-scraped data used by a third party to train a model for downstream tasks could also affect the model's learning process and result in behaviour that was unintended by the AI system designer.

For mitigation strategies, the CSI recommends:

Dataset verification: Once a dataset is ingested, the consumer or curator should verify, as much as possible, whether it is free of malicious or inaccurate material.

Content credentials: Use content credentials to track the provenance of media and other data.

Foundation model assurances: Where a foundation model is trained by another party, the developers of the foundation model need to be able to provide assurances regarding the data and sources used, and certify that their training data did not contain any known compromised data.

Require certification: Data consumers should strongly consider requiring formal certification from dataset and model providers, attesting that their systems are free from known compromised data, before using third-party data and/or foundation models.

Secure storage: Data needs to be stored in a database that adheres to the best practices for digital signatures, data integrity and data provenance described above.

The CSI was created by the Australian Signals Directorate's Australian Cyber Security Centre (ACSC), together with New Zealand's Government Communications Security Bureau's National Cyber Security Centre (NCSC-NZ), the US National Security Agency's Artificial Intelligence Security Centre (AISC), the Cybersecurity and Infrastructure Security Agency (CISA), the Federal Bureau of Investigation (FBI), and the UK's National Cyber Security Centre (NCSC-UK).