Data leaks in the HPI school cloud

Learning platforms for schools are among the IT environments that have been in the public eye since the start of the pandemic. Some federal states offer their schools the HPI School Cloud, which the Hasso Plattner Institute (HPI) has been developing as open source software since 2016. In addition to an instance that the HPI operates itself, there are separate instances in Brandenburg, Lower Saxony and Thuringia.

The schools use the platform partly as a stopgap because they do not yet have a solution of their own in place, partly as a permanent solution. Video conferences are held and assignments are set and completed via the HPI School Cloud.

In February, a whistleblower informed us that he had taken a closer look at the School Cloud and had come across various security problems. The starting point of his research was the GitHub repositories in which the HPI School Cloud is developed as open source. He first looked for demo access data and found it in a configuration file. With that user name he was ultimately able to log in to the Thuringian instance.

Not only teachers and students could log in to the Thuringian School Cloud. Anyone could have gained access via a forgotten demo account.

Because Thuringia actually uses a single sign-on solution, he had to take a small detour: open the login page directly and enter the access data he had found. Only after clicking the browser's back button did he find himself logged in to the HPI School Cloud as the demo user. On other instances this back door did not work.

In the JavaScript code in the GitHub repository, our whistleblower noticed the endpoint /teachersOfSchool. In Thuringia it returned a list of hundreds of teachers from all over the state as a JSON object, including surname, first name and an ID. A search parameter also made it possible to filter by school.

The endpoint //metrics turned out to be even more revealing; note the two slashes at the beginning. You did not even have to be logged in to open it in the browser, so we were able to try it out not only in Thuringia but also on the HPI instance. It displayed measurements from the inner workings of the server, including processor and main memory utilization.

For attackers planning a denial-of-service (DoS) attack, i.e. who want to load the server to the point that it becomes unusable, such measurements are a gift. With this information they can measure exactly which actions drive up the load and thus make their attacks more efficient. At the beginning of 2021, the school clouds were repeatedly hit by DoS attacks that brought the servers to their knees.

The other information the endpoint offered was even more disastrous. At the end of the document there were log excerpts with the URLs that had been called. The system tried to make IDs unrecognizable but failed at a crucial point: the log contained numerous entries following the pattern “/link/aBc123Defg”. Users share files with other people via such links, and anyone who knows the link is allowed to read the file. The result was a serious data leak: the demo student account, whose credentials were publicly available, was enough, and the endpoint presented the supposedly secret links on a silver platter.
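How the School Cloud actually masks IDs in its logs is not described here, so the following is only a minimal sketch of the general idea: before a URL is written to a log or a metrics page, the secret part of a share link is replaced by a placeholder. The function name `redact_share_links` and the regular expression are assumptions made for illustration; only the “/link/…” pattern comes from the log entries described above.

```python
import re

# Assumption for illustration: a share link is "/link/" followed by an
# opaque token made of letters, digits, "_" or "-".
SHARE_LINK = re.compile(r"(/link/)[A-Za-z0-9_-]+")


def redact_share_links(logged_url: str) -> str:
    """Replace the secret token of a share link with a fixed placeholder."""
    return SHARE_LINK.sub(r"\1[redacted]", logged_url)


if __name__ == "__main__":
    for url in ("/link/aBc123Defg", "/courses/overview"):
        print(url, "->", redact_share_links(url))
```

A redaction rule like this only helps if it covers every URL pattern that carries a secret, which is exactly the point at which the real system failed.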

What we found in random samples via these links was very diverse. It included handwritten tests and exercise sheets, some with names and grades. Even worse were class and teacher lists, also with contact details. The list of finds was rounded off by various videos: schoolchildren reciting poems in their living rooms and greeting their class, schoolgirls dancing in a farewell video for their teacher, and entire screen recordings of virtual lessons.

Recordings of digital lessons from Thuringia – including class list and student chat – were not intended for the public. Due to a configuration error, the link to the video was publicly available.

We documented all of these observations and looked for explanations and similar problems in the publicly available code. The fact that the demo user was active in Thuringia can be classified as a classic configuration error by the Thuringian operators. The endpoint that exposed all teachers as JSON data, on the other hand, was apparently built in on purpose. The front end that offered this endpoint fetched the data from an API in the background and passed it on to the browser in JSON format. To be on the safe side, we checked other API endpoints but were unable, for example, to download all student names with the demo account. The permission system was therefore intact, just configured very loosely at this point.

Finally, the //metrics endpoint, which accidentally exposed the links and thus led to the largest data leak, was a mixture of programming and configuration errors. It was reachable in Thuringia and at the HPI itself but not, for example, in Lower Saxony.

With all these observations, we contacted the HPI and the operators of the Thuringian instance as well as the data protection officers named in the respective legal notices.

The reaction to our report could be published as a model answer for data protection officers. In less than an hour we received an initial confirmation of receipt, and after less than 25 hours a very detailed statement from the press and public relations team of the HPI School Cloud, evidently written in cooperation with several of those responsible. All technical and data protection questions were answered thoroughly.

According to the reply, the demo user should not have been active in Thuringia: it is normally deactivated on production instances, but this step had been missed. As expected, the problem was quickly fixed. The //metrics endpoint was likewise not intended for the public but, as we had assumed, for internal analyses. The filter that was supposed to prevent access only worked when the address was called via /metrics; it failed with //metrics. That the links to student and teacher files were exposed as a result was a serious knock-on error. As a precaution, all previously issued links have been deleted.
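The reply does not say how the filter was implemented. As a sketch, assume it compared the requested path with a fixed string: then “//metrics” slips through although the web server behind it treats both spellings as the same resource. The usual remedy is to normalize the path before checking it. All names below (`INTERNAL_PATHS`, `naive_is_blocked`, `normalize`, `is_blocked`) are hypothetical and not taken from the School Cloud code.

```python
import posixpath
import re

# Hypothetical list of endpoints that must not be reachable from outside.
INTERNAL_PATHS = {"/metrics"}


def naive_is_blocked(path: str) -> bool:
    """Exact string comparison: misses spellings such as '//metrics'."""
    return path in INTERNAL_PATHS


def normalize(path: str) -> str:
    """Collapse repeated slashes, then resolve '.', '..' and trailing slashes.

    The slashes are collapsed by hand because posixpath.normpath keeps
    exactly two leading slashes untouched.
    """
    collapsed = re.sub(r"/{2,}", "/", "/" + path)
    return posixpath.normpath(collapsed)


def is_blocked(path: str) -> bool:
    """Apply the block list to the normalized path."""
    return normalize(path) in INTERNAL_PATHS


if __name__ == "__main__":
    for candidate in ("/metrics", "//metrics", "/metrics/", "/x/../metrics"):
        print(candidate, naive_is_blocked(candidate), is_blocked(candidate))
```

The sketch ignores percent-encoding and upper/lower case, which a real filter would also have to handle. The more robust design is not to filter at all but to bind internal endpoints to a separate port or require authentication for them.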

The endpoint that listed the state's teacher names was also switched off for students as a precaution, although there was a plausible reason for its existence: only teachers who had ticked a box indicating that they wanted to be included in cross-school teams were shown. The associated search function was initially deactivated as well. The search had previously been implemented on the client side, so the browser loaded the entire list from the server and displayed only a selection once you typed in a search term. Our recommendation to filter the data on the server and never deliver the entire data set was gratefully taken up by the developers.
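What such server-side filtering can look like is sketched below under simple assumptions: the server only answers queries of a minimum length and caps the number of hits, so that no single request returns the whole list. The `Teacher` record, the function `search_teachers` and both limits are invented for this example and do not reflect the actual School Cloud API.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Illustrative limits, not values from the School Cloud.
MIN_QUERY_LENGTH = 3
MAX_RESULTS = 20


@dataclass(frozen=True)
class Teacher:
    id: str
    first_name: str
    last_name: str


def search_teachers(
    teachers: Iterable[Teacher], query: str, limit: int = MAX_RESULTS
) -> List[Teacher]:
    """Return at most `limit` teachers whose name contains the query.

    Queries that are too short return nothing, so an empty search term
    cannot be used to download the complete list.
    """
    needle = query.strip().casefold()
    if len(needle) < MIN_QUERY_LENGTH:
        return []
    matches = [
        t for t in teachers
        if needle in f"{t.first_name} {t.last_name}".casefold()
    ]
    return matches[:limit]


if __name__ == "__main__":
    staff = [
        Teacher("1", "Anna", "Schmidt"),
        Teacher("2", "Bernd", "Schmitz"),
        Teacher("3", "Clara", "Vogel"),
    ]
    print(search_teachers(staff, "schm"))  # both teachers named Schm...
    print(search_teachers(staff, ""))      # empty list
```

In a real service the filter would run as a database query and be combined with rate limiting, since a patient attacker could otherwise still piece the list together from many small queries.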

The written answer makes it clear that the HPI was by now prepared for such cases and had emergency plans ready in the drawer. In May 2020, its reaction to a security report was very different. When the ARD magazine Kontraste pointed out an active registration link through which outsiders could obtain an account, the HPI reacted clumsily and filed criminal charges against persons unknown for data espionage.

No such steps were taken in response to our report. The HPI classified the case as notifiable under the GDPR and, according to its own statement, informed the responsible state data protection commissioners as well as its partners in the federal states. In consultation with the authorities, users are also to be informed that their data may have leaked.

The case is instructive from several perspectives. In his investigation, our whistleblower proceeded like a penetration tester who offers such tests for money. Hiring one can be worthwhile. Anyone who operates infrastructure on the internet themselves should at least occasionally take the time to look at their own systems from the outside and try out at least the most obvious steps.

It is also important to be well prepared just in case. If technicians and data protection officers have already drafted the basic structure of a report in advance, it is more likely that a notification to the state data protection commissioner will be written within the 72 hours required by the GDPR once a data leak becomes apparent.

In c’t 6/2021 we would like to make it easier for you to get started with the smart home: we provide practical tips and purchase advice for more security, comfort and efficiency in the intelligent home. If you want to keep your finances under control with home banking, you should consult issue 6: in it, we tested six home banking programs, paying particular attention to data protection. We also show you how to cleanly separate your personal phone calls and data from your work in the home office. We test GPS trackers for e-bikes, compact document scanners for more order in the office and the first e-car with Android. The school cloud of the Hasso Plattner Institute (HPI) recently revealed a huge security breach. Fortunately, the hole in the platform was closed after our report. You can read about this and much more in issue 6/2021, which will be available from February 26th in the Heise shop and at well-stocked newsstands.

