Ylioppilastutkinto 2020

Suggestion for an environment in which a Matriculation Exam can take place in an informationally secure way.

© Vertebratan 2017

Abstract

This work describes a software architecture in response to the Hackabi2 contest held by the Finnish Matriculation Examination Board. The goal of this work is to provide answers to the information security issues with the current electronic exam system by describing a completely different approach.

The architecture aims to be flexible: parts of it can be deployed independently, which keeps the deployment goals realistic. The new approach will simplify the system facing the end users, such as the supervisors, teachers, principals and candidates. This work is written on the assumption that a fundamentally correct architecture will be safer and more efficient.

License

This work (kilpailutyo.html) and all of its attachments, authored by Vertebratan, are licensed under a Creative Commons Attribution 4.0 International License. Only publish this work with a reference to its signature (kilpailutyo.signed).

What is an exam system?

First, we must understand what we are protecting. The Matriculation Exam is an exam in which candidates are reviewed and scored on their ability to answer the given assignments with restricted material. There must be a device that the candidate uses to answer; this is the endpoint. There must also be an exam server, or a couple of them, in the exam environment, because many schools don’t have the requisite bandwidth available for interfacing with cloud services in real time.

The most important aspect of the exam is to give every candidate a fair opportunity in the exam. The exam consists of:

  1. The assignments. These must be revealed to all candidates at the same time.

  2. The answers to these assignments. It is important that all answers remain untampered for review.

  3. The exam environment. It is important that all candidates have access to the same material during the exam. The board declares the material that the candidate has access to.

Our attacker is a malicious candidate trying to gain an unfair advantage in the exam, or simply to sabotage it, by attacking at least one of these aspects of the system.

Key issues with the current exam system

  1. USB thumb drives

    USB thumb drives are used to distribute the software that the candidates use during the exam. The candidates boot the software on their endpoints by themselves at the beginning of the exam. The distribution and use of these USB thumb drives has proved to be clumsy and very time-consuming at the beginning of the exam.

    Also, the software on the USB drive is unprotected, and there is general disorder within the exam space during this ‘initialization period’ of the exam. Together these create an attack vector in which a malicious candidate could run any software during the exam. This arbitrary code execution flaw is the biggest issue with the current system, and it cannot be fixed with any software security measures, as it is the architecture that is flawed.

  2. Social security numbers

    Social security numbers are used to authenticate each candidate. The supervisors have a list of every candidate and their SSN. The supervisors authenticate every student into the exam environment by checking that the candidate has entered their name and SSN just before entering the exam. Once the supervisor has confirmed these fields as valid, the supervisor gives the candidate a key that can be used to enter the exam. This is very clumsy, and some think that giving everybody’s SSNs to the supervisors is questionable.

    Also, as it is not publicly stated how the key used to enter is derived, it could be derived from values known to the candidate. This would create an attack vector in which a malicious candidate could enter the exam as someone else, or without supervision. However, this attack is unlikely, because it is fairly easy to notice during the exam and there is little motive for it. It may be possible to reverse engineer the key derivation function with the help of the public file ‘exam_Katastrofiharjoitus_Katastrof_vning.meb’.

  3. Integrity of answers

    The authenticity and integrity of the answers is not verifiable in a trusted way. If the current system doesn’t use encryption for communication during the exam, it is trivial to MitM other candidates’ answers by running unauthorized code in the exam environment. The answers could also be tampered with on the USB drive or in the Abitti cloud, in which the answers are stored for review.

    These attacks may seem unlikely given the risks involved, but proper verification of answer authenticity is necessary for the legal protection of the candidate. For example, if a candidate claims that their answer has been tampered with, it should be possible to investigate the issue. Currently, no cryptographically valid proof can be presented either way.

  4. Distribution of assignments

    The board distributes the confidential assignments to the school principals on USB thumb drives. These drives are to be plugged into the exam space servers. The assignment files (file extension .meb) are decrypted using keys sent to the principals by mail. The only key I’ve seen contains four lowercase Finnish words, so the keys would seem to be based on a word list. This is a very complicated process that provides questionable security.

  5. Hardware support

    Not all computers that the candidates own are compatible with the exam system. Most notably, the new MacBooks cannot boot Linux at all. Also, as DigabiOS uses a very old kernel, many laptops are missing very basic hardware support, most of which has been in the mainline kernel for years. (At least this was the case as of DigabiOS build 1539C.)
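
To put a rough number on the four-word key described under “Distribution of assignments”: if each word is drawn independently and uniformly from a list of N words, a four-word key carries 4 · log2(N) bits of entropy. The sketch below only does this arithmetic. The word list sizes are assumptions for illustration, as the size of the real list is not known, and `passphrase_entropy_bits` is a hypothetical helper name.

```python
import math


def passphrase_entropy_bits(wordlist_size: int, words: int) -> float:
    """Entropy of a passphrase of `words` words drawn uniformly from a list."""
    return words * math.log2(wordlist_size)


# Assumed list sizes; the size of the board's actual word list is unknown.
for size in (2048, 7776, 100_000):
    bits = passphrase_entropy_bits(size, 4)
    print(f"{size:>7} words in list -> {bits:.1f} bits for a four-word key")
```

Under these assumptions a 2048-word list would give 44 bits, which is far less than a randomly generated 128-bit key.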

Key aspects of any exam system

To recap, some key aspects of any secure exam system are:

  1. Secrecy of the assignments beforehand
  2. Authentication of the candidates
  3. Authenticity of the answers
  4. A restricted software environment for the candidate

Architecture

See attachment 1 (attachment1.jpg).

1. Endpoint

  1. Devices

  2. A traditional OS, no booting over USB

  3. System integrity protection

  4. VPN

  5. PKI

  6. Operating system

  7. Restricted user

  8. Candidate authentication

    When a candidate wants to access the exam environment, they must be authenticated. These authentication methods will dramatically cut down the time spent initializing the exam.

    1. Simple authentication

    This authentication method is more cost-effective than #2 and should provide similar security.

    This authentication option is deployed if the (Architecture-1-i-2) “Strong verification of answer integrity” is deployed.

  9. Answer integrity

    Answer integrity is verified using asymmetric cryptography, that is, digital signatures.

    1. Minimalistic verification of answer integrity

      This is a simple way of checking answer authenticity. This method provides limited security, but could be more cost-effective.

      • Every endpoint has a unique asymmetric key pair
      • The keys are stored on the encrypted partition of the endpoint
      • These keys are endpoint-specific, so the candidate or the supervisors may want to write down which endpoint was used.
        • The CA has a database of endpoints and their corresponding public keys
      • Every time the exam server is provided with more answers, the answers will be signed with the private key
        • If the candidate uses different endpoints during the exam, the answers should be verified against the key of the last endpoint
      • The public key is stored with the answers
      • This way, when a candidate claims that their answer has been modified, the claim can be investigated
    2. Strong verification of answer integrity

      • Every time the exam server is provided with more answers, the answers will be signed with the private key
      • This private key is stored only on the smart card (Architecture-1-h-2)
      • The Matriculation Board has the public keys of every candidate
        • The integrity of the answers can be verified.
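
The signing flow described in the bullets above can be sketched as follows. This is a minimal illustration under stated assumptions, not the proposed implementation: it assumes Ed25519 keys and the third-party Python `cryptography` package, neither of which is specified by this architecture. The key is generated in software here only so that the example runs; with the strong option the private key would stay on the smart card. The names `canonical_bytes`, `sign_snapshot`, `verify_snapshot` and `candidate-001` are hypothetical.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)


def canonical_bytes(candidate_id: str, answers: dict) -> bytes:
    """Serialize a snapshot deterministically so signer and verifier see the same bytes."""
    record = {"candidate": candidate_id, "answers": answers}
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_snapshot(private_key: Ed25519PrivateKey, candidate_id: str, answers: dict) -> bytes:
    """Sign the answers each time they are handed to the exam server."""
    return private_key.sign(canonical_bytes(candidate_id, answers))


def verify_snapshot(public_key: Ed25519PublicKey, candidate_id: str,
                    answers: dict, signature: bytes) -> bool:
    """Return True only if the stored answers match what was signed."""
    try:
        public_key.verify(signature, canonical_bytes(candidate_id, answers))
        return True
    except InvalidSignature:
        return False


# Assumption: a software key stands in for the smart card or endpoint key.
candidate_key = Ed25519PrivateKey.generate()
board_public_key = candidate_key.public_key()  # the copy the board would hold

answers = {"q1": "Helsinki", "q2": "42"}
signature = sign_snapshot(candidate_key, "candidate-001", answers)

tampered = dict(answers, q2="43")
print(verify_snapshot(board_public_key, "candidate-001", answers, signature))
print(verify_snapshot(board_public_key, "candidate-001", tampered, signature))
```

The first check succeeds and the second fails, which is the property needed to investigate a tampering claim in either direction.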
  10. Updates

  11. Network

2. Proposed server

For simplicity, the server should be very similar to the endpoint (Architecture-1). The use of system integrity protection enables the use of the server computer for other school administrative tasks. However, the endpoints and exam servers shouldn’t share the same OS, as using the same OS for both would reduce maintainability and increase the attack surface of the exam environment.

  1. VPN

  2. PKI

  3. Operating system

  4. Distribution of assignments

    As the exam servers can connect to the internet, the assignments can be downloaded to the exam server at the beginning of each exam.

    For use in an end-of-course test, there is a mechanism for the supervisor to download the assignments from the Abitti cloud. For resilience against internet connection failure, there should also be a manual way of importing assignments. Remote use is described in (Architecture-4).

  5. Redundancy and delivery of answers

    As the exam servers can connect to the internet, the answers could be backed up to the Abitti cloud in real time. This may reduce complexity or increase redundancy if the system is used for end-of-course tests.

    Multiple exam servers can be used for redundancy. In this architecture the redundant servers offer the same services, and the endpoint mirrors all of its connections to each of them. This will have to be implemented at the application level.

  6. Headless servers
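
The application-level mirroring described under “Redundancy and delivery of answers” could look roughly like the sketch below. It is an illustration only: `ExamServer` is a hypothetical in-memory stand-in for a server that would really be reached over the VPN, and the rule that one acknowledgement is enough is an assumption, not something this architecture specifies.

```python
from dataclasses import dataclass, field


@dataclass
class ExamServer:
    """Hypothetical in-memory stand-in for one exam server."""
    name: str
    online: bool = True
    stored: dict = field(default_factory=dict)

    def store(self, candidate_id: str, answers: dict) -> None:
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.stored[candidate_id] = dict(answers)


def mirror_answers(servers: list, candidate_id: str, answers: dict) -> list:
    """Send the same snapshot to every server and return the names that acknowledged."""
    acknowledged = []
    for server in servers:
        try:
            server.store(candidate_id, answers)
            acknowledged.append(server.name)
        except ConnectionError:
            continue  # keep going; another server may still accept the answers
    if not acknowledged:
        # Assumption: the endpoint must warn the candidate if nothing was saved.
        raise RuntimeError("no exam server accepted the answers")
    return acknowledged


primary = ExamServer("server-a")
secondary = ExamServer("server-b", online=False)  # simulate a failed server
acks = mirror_answers([primary, secondary], "candidate-001", {"q1": "Helsinki"})
print(acks)
```

Because every server receives every snapshot, losing one server during the exam does not lose answers, at the cost of doubling the traffic from each endpoint.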

3. Endpoint and exam server communication

  1. The endpoint discovers the server with mDNS
  2. The endpoint initiates a VPN connection with the server
  3. The endpoint adds firewall rules

If multiple exam servers are needed in one exam environment, each of them should connect to all of the endpoints individually.

4. Cloud

New features of the cloud platform are (Abitti and potentially new applications):

Production

Example solutions to the key questions in deploying the described architecture

OS

There is no reason to reinvent the wheel. It would be easy to base both operating systems on an already existing distribution. I will focus on CentOS 7, the free (as in coffee) version of RHEL 7, as I have experience with it. However, any up-to-date Linux distribution could be used.

There is no need to fork any aspect of CentOS; all modifications can be distributed with a kickstart configuration file. A kickstart file describes the repositories and software that Anaconda, the CentOS 7 installer, will install on the system. Most packages can be maintained by the CentOS project, and the other applications and modifications can be installed through a repository maintained by the board. This will greatly reduce the complexity compared to the current exam operating system, freeing the resources of the development team. A kickstart file can also execute bash scripts, which enables practically limitless configuration. All of the keys would be managed remotely. (Architecture-4)

The OS would be installed by the school, and the devices must be added to the school fleet in the cloud so that they can receive their keys.

There would be two different kickstart files, one for the installation of the exam servers and the other for the endpoints.

Please see (Known flaws and considerations) for information on system integrity protection.

Candidate authentication

  1. OTP keys

    An approach where the candidate uses their mobile device to create the codes is the most cost-effective.

  2. USB Smartcard

    A YubiKey 4 can provide GPG signing and U2F (or simpler OTP) simultaneously over a commodity USB port. This device completely answers all questions regarding answer signing and candidate authentication in this architecture. There may be other similar devices on the market.
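
One way the OTP option could work is with time-based one-time passwords (TOTP, RFC 6238), which common mobile authenticator apps can generate. This is an assumption, as the architecture does not name a specific OTP scheme. The sketch below shows how a server could compute and check such codes using only the Python standard library. The function names, the 30-second step and the one-step tolerance for clock drift are illustrative choices.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (RFC 6238, HMAC-SHA1) for the given time."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as in RFC 4226
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, unix_time: int,
                step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept the code for the current time step or `window` steps either side."""
    for drift in range(-window, window + 1):
        shifted = unix_time + drift * step
        if shifted < 0:
            continue
        expected = totp(secret, shifted, step, digits)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False


# Test secret from RFC 6238; a real deployment would use a per-candidate secret.
shared_secret = b"12345678901234567890"
code = totp(shared_secret, 59, digits=8)
print(code)
```

With the RFC 6238 test secret at time 59, the 8-digit code is 94287082, which matches the published test vector. Each candidate would need their own secret, enrolled before the exam.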

Server discovery

On Linux, Avahi is an open implementation of mDNS, and it can be used to advertise and discover services in the network. Using Avahi will be the simplest way of discovering the exam servers. All different school network implementations should work with this approach. An Avahi package is maintained for most Linux distros, further reducing complexity.

VPN

The industry-standard way of establishing VPN connections is IKEv2 with IPsec. An open implementation like strongSwan would make these connections easy to deploy and maintain. Alternatively, some other standard, like OpenVPN, could be used, but IKEv2 with IPsec is the most widely supported option.

Remote management

As I have no practical experience with remote management of a large fleet of servers and PCs, implementing the cloud-based remote management of the servers and endpoints is beyond the scope of this work. An easy-to-use web interface seems like the optimal solution, but existing remote management tools can be used. (Architecture-4) (Architecture-2-f)

Cost

The hardware cost of deploying the described architecture includes the acquisition cost of the laptops, servers and USB smartcards. In some cases, the school’s existing devices can be used. The central acquisition of laptops for the candidates can be justified by the fact that it makes the exam, and upper secondary education in general, more accessible, as the candidates don’t need to buy a laptop to participate.

The school’s network can be used as is. In the future, the reduced complexity of the physical network can create cost savings.

This architecture overall reduces the complexity that faces all school officials, when compared to the current system, saving their time. This will bring cost savings.

A big chunk of the cost is created by development. The key idea in writing this architecture was that there would be no need to reinvent the wheel, so that as little time as possible would be spent on solving issues that are already solved by others. The majority of the operating system development is carried out by the chosen distribution, saving the development team’s time for the applications, like Abitti.

Known flaws and considerations

Author

This work is authored by Vertebratan, who is currently studying in an upper secondary school in Finland and who wishes to remain behind this pseudonym. The subject interests me largely because I have yet to take the matriculation exam. One thing I love about this education system is the emphasis on equality, which largely influences the architecture itself.

Terminology

Abitti

A cloud service run by the board that provides services concerning the creation and distribution of assignments and the collection and review of answers. Referred to as a cloud service because it is largely abstract and undocumented. In this work, the term also refers to the similar system used exclusively for matriculation exams.

CA

Certificate authority

Chain of trust

In this work, a chain of trust ensures that only trusted, unmodified software can decrypt the system partitions. With TPMs, not every stage of the boot process is verified by the previous one, but the process is supervised by the TPM.

Current architecture

This architecture is used by the Matriculation Examination board for hosting current Matriculation exams.

Digabi

Current, already existing application software for the exam environment, a server for the exam server and a client for the endpoint.

DigabiOS

Debian-based Linux operating system, the base of the current exam environment.

Endpoint

The computer that the candidate uses to connect to the examination environment.

Exam server

A server used to store the candidates’ answers and provide the questions during the exam.

Exam space

Space provided by the school for the exam. This space is supposed to be very quiet.

Exam space servers

The exam servers set up by the school for the exam. These servers are located in the exam space.

Exam system

The whole digital system that is deployed by the board for hosting the matriculation exams. Any digital system used to host any traditional exams.

Exam environment

The digital space in which the exam takes place. The private network that connects the endpoints and the exam servers.

Fair opportunity

The meaning of fair opportunity is defined by the matriculation board. For example, people with dyslexia are given extended time for completing some assignments.

Initialization period

The process of setting up the exam space and booting all the necessary devices of the exam environment, such as the candidate booting their laptop.

MitM

Man-in-the-Middle attack.

One time password

OTP. A simple password, either generated electronically or stored in a paper envelope.

Proposed architecture

The architecture that this document describes.

Trusted Platform Module

TPM. A chip or a feature of the CPU found in many modern laptops.