Base Requirements
This section describes base requirements for the archiving solution and the data that it archives. These requirements apply to all archivable data, regardless of the specific Moodle activity it originates from. Therefore, this set of requirements can be extended by additional requirements for specific Moodle activities.
This set of requirements underpin each of the user stories, but do not directly map to them. Instead, they are meant as a framework of standards that the archiving solution must adhere to with every feature that is implemented or define specific technical standards that must be met.
Data Integrity
This section focuses on measures to verify and ensure the integrity of the archived data throughout the archiving process.
[REQ-DI-01] Equality of archived and original data
The archived data must resemble the original data as closely as possible and no critical information, e.g., grades, must be lost or altered. Archived data must only be transformed into an alternate representation when inevitable.
[REQ-DI-02] Static presentation of archived data
The presentation of the archived data must not change in the future. Therefore, all data must be fully rendered at the point of archiving so that it does not depend on future software behavior1. For PDF files, the PDF/A standard should be used.
[REQ-DI-03] Checksums for data integrity
Archived data must be stored alongside checksums (SHA256 or similar) to verify the integrity of the archived data.
[REQ-DI-04] Digital cryptographic signatures
Digital cryptographic signatures should be able to be issued for archived data by a third party to attest their integrity and time of creation using the Time-Stamp Protocol (TSP).
Traceability and Reproducibility
This section focuses on the traceability and reproducibility of the archiving process and the resulting archive data.
[REQ-TR-01] Unique identifiers for archive jobs
Each archive job must possess an identifier that uniquely identifies the archive job. This can be a universally unique identifier (UUID) with a length of 128 bit.
[REQ-TR-02] Reproducibility of archive jobs
An archive job must produce the same result every time it is executed with the same input data. Archive job metadata, e.g., the date of archiving, is excluded from this requirement.
[REQ-TR-03] Metadata for archive jobs
Representative metadata about the archive job must be stored alongside the archived data and be kept until the archived data can be deleted. This includes, but is not limited to, the date of archiving, the user who initiated the archiving, a list of the users whose data is included in the archive, and the Moodle course / activity the data originates from.
[REQ-TR-04] Identifying archived data
The user should be able to determine, which data is already archived and which data currently is not.
[REQ-TR-05] Identifying data to be archived
The user should be able to determine, which data requires archiving and which data does not.
[REQ-TR-06] Traceability of automated archiving processes
If the archiving process is initiated automatically, it must be possible to trace back which data was marked for archiving at which point in time.
[REQ-TR-07] Error handling
If an error occurs during the archiving process, the user must be informed about the error and the archiving process must be aborted to prevent archiving invalid or incomplete data.
Compatibility and Autonomy
This section focuses on ensuring the future readability of the archived data, its independence from Moodle itself, and the compatibility of the archiving solution with external systems.
[REQ-CA-01] Future readability
Archives must be readable in up to 10 years.
[REQ-CA-02] Autonomy of archived data
Archives must be readable without requiring the Moodle instance it was created on.
[REQ-CA-03] Archive retention
Archives must be kept, even if the Moodle course or activity is deleted.
[REQ-CA-04] Transferring archived data
Archives must be transferable to external third party systems, such as network storages or document management systems.
[REQ-CA-05] Open data formats
If an archiving process requires transforming data into a different format, an open format must be used to ensure that the data can still be accessed in the future, without requiring specific software2.
[REQ-CA-06] Open data storage standards and transmission protocols
The archiving software must use open standards for data storage and data transmission to ensure that the data can still be accessed in the future, without requiring specific software2.
[REQ-CA-07] Source code availability
The code of the archiving solution should be publicly available, free of charge.
Data Protection, Regulatory, and Privacy
This section focuses on regulatory and privacy aspects of the archiving solution, as well as data protection aspects.
[REQ-DP-01] GDPR compliance
The archiving process must comply with the general data protection regulation (GDPR).
[REQ-DP-02] Legal user data requests
Users must be able to request the data stored about them.
[REQ-DP-03] Automatic deletion of archived data
Archived data should automatically be deleted after the legal retention period.
[REQ-DP-04] Policies
Global archiving policies must be enforceable to ensure that all data is archived according to the institution's regulations and policies. This includes, but is not limited to, the retention period of the data, the obligatory data that is always archived, the way data is stored, and the way data is accessed.
[REQ-DP-05] Data retention policy enforcement
If a data retention policy is in place, all archives must store their retention period alongside the archive.
[REQ-DP-06] Access rights and user capabilities
Access to archived data must be controlled by appropriate capabilities, integrating with the Moodle role concept.
[REQ-DP-07] Third-party services
Private data must not compulsorily be sent to third-party services that are not under the control of the institution. It must be possible to run all software components of the archiving solution on-premises.
[REQ-DP-08] Data security
The archived data must be storable in a secure manner3, e.g., encrypted at rest.
Miscellaneous
This section lists additional requirements that do not fit into any of the previous categories.
[REQ-MI-01] Machine-readable metadata
Archive metadata must be machine-readable (e.g., as CSV files).
[REQ-MI-02] Machine-readable archive data
Archive data should be machine-readable, whenever possible.
[REQ-MI-03] Text-searchable archive data
Archived data should be text-searchable by the user, whenever possible, e.g., exporting full-text PDFs instead of images.
[REQ-MI-04] Data compression
Archives should be compressed to preserve storage space using open compression standards (e.g., gzip, deflate, ...).
-
This explicitly excludes dynamic web content, such as JavaScript-based content and HTML/CSS DOMs, from being a valid archiving format on its own. This is due to it requiring rendering inside a browser at the time of viewing, which is not guaranteed to reproduce the exact same data presentation in the future. ↩
-
If proprietary software or standards are used, it can not be guaranteed that the software will still be maintained and work in the future. Open source software and open standards, on the other hand, can be maintained by the community and are less likely to become fully unavailable in the foreseeable future. ↩↩
-
It is sufficient to support encryption on storage level, e.g., using an encrypted file system. ↩