AI Development

Data collection in online education | Ethics, Law and Practices

This article describes our views on the ethics of data collection and how they combine with the Swiss legal framework to inform our practices at Swiss Connect Academy (SCA). While data collection can provide numerous benefits, including product improvement, better planning, and advertising, it also poses potential privacy issues. We also describe our data collection practices and some of the proactive steps we take to mitigate privacy risks: prioritizing transparency, adhering to legal obligations, and aligning our interests with those of our users.

How an organism acts is guided by what it perceives, and the efficacy of its actions depends on the acuity of its perceptions. This is as true for a social organism as it is for a biological one. It applies equally to plants, animals, companies and countries. For animals, perception and the associated cognition mainly take place at the level of the individual, while for social organisms, the collection and treatment of information happen at both the individual level and the group level. On the one hand, this implies that social organisms can perceive without any explicit supra-individual mechanism for acquiring information, because their members are informing themselves. On the other hand, once a company (since it will be the organism of interest to us) starts acquiring information in a systematic, explicit and organized way, this process of acquisition can more easily become the object of reflection. At the same time, information technology has made the collection, retention and analysis of data so much more efficient, and therefore so much more significant in its costs, potential benefits and potential harms, that reflecting on it becomes ever more necessary.

The collection of user data by large companies is a topic that has often been considered before, and we hope to present, in this article, some of the best and most relevant of those ideas. Since Swiss Connect Academy’s clients are usually businesses and the users of our learning platform are those businesses’ employees, our situation has an unusual twist in comparison to data collection by Meta or Google: our users are usually not directly our clients but rather their employees [footnote: It could be argued that for Meta and Google the clients are more the buyers of advertising than the users, but this difference does not impact our argument.].

Section 1 – Who benefits from data collection?

For a company, deciding on a data collection policy is a matter of balancing trade-offs: minimizing harms and costs while maximizing benefits. A suspicious reader might object that the harms of data collection are often incurred by third parties while the monetary benefits are earned by the firm. This is a correct but partial view of benefits: in general, a company mainly earns money in exchange for the benefits it provides to the rest of society. [Footnote: Exceptions to that rule are well publicized and well known because they are unacceptable, not because they are the norm. At Swiss Connect Academy (SCA), we are, while not disinterested, plainly uninterested in working to obtain unearned monetary benefits.] A baker earns money by making good bread, and Google by helping people find what they want on the web. [Footnote: Google does make money from its advertisements, but people see the ads because they use Google to do searches.] We will thus candidly continue to write of the benefits being maximized as benefits for society in general.

There are, for a business, several reasons to collect data. The main ones are internal planning, product development, advertising, pricing and reselling the data. Better planning and product development tend to benefit everybody, since the firm becomes better at satisfying the needs of its clients. However, better-informed advertising and pricing can have disadvantages for a company’s clients: manipulative advertising can become more effective, and prices can be raised for clients who are predicted to be able and willing to pay more. Of course, positive aspects are still present: better advertising might inform clients about a product that is useful or agreeable to them, and some people will be able to afford more services thanks to better pricing. Meanwhile, the reselling of data is a Pandora’s box, because it becomes unclear to what end the data might be used: political propaganda or repression, evaluation of candidates by potential employers or insurers, blackmail, identity theft and other criminal uses become possible, even if, as before, better product targeting, advertising and adaptive pricing are the more common consequences. At SCA, product improvement is the principal use, and we also use data for planning and for advertising our own products, but we avoid personalized pricing. Data collection can be used in two ways to improve products. First, it can influence how new products are chosen, designed and targeted. Second, products using machine learning directly benefit from collected data, because the AI system is trained on (a part of) it.

In the previous paragraph, we considered the benefits and harms caused intentionally by data collection. However, data collection can also have unintended consequences: in particular, the collected data can be stolen and possibly leaked. Leaks are the more direct threat to privacy, since they transform private information into widely accessible or even common knowledge. Stolen data can, however, also be used without being leaked, especially for criminal activities like stealing from bank accounts, illegally accessing private or confidential information (using stolen passwords), identity theft or blackmail.

Thus, it is important for a firm to protect its users by protecting the data it collects. Again, we might ask how the interests of a company can be aligned with the interests of its users. Protecting collected data is costly to the company, while the costs of data theft are often, at first, borne by the users. The answer is two-fold: transparency and legal obligations. Transparency allows a firm to benefit from being security-conscious by communicating about it and being chosen by more clients for this reason. (The present document hopes, in its own way, to perform exactly this function!) However, transparency does not solve the problem completely, because it puts the burden on the clients to check that the firm is diligent, and it requires the firm to be sufficiently honest. Consequently, laws were passed that oblige firms to protect the data and the privacy of their clients (or users) and of third parties. [Footnote: Companies are often able to acquire information about many more people than their own users. For example, when a photo is uploaded to a social network, data about everyone in it becomes accessible.] This will be the topic of the next section.

Section 2 – Data privacy regulations are catalyzing responsible data collection.

The General Data Protection Regulation (GDPR) is the regulation passed by the European Union (EU) to guard consumers’ personal data privacy and security in all member states. Its stipulations, fines, and sanctioning powers apply in all member states but also extend beyond EU territory when companies collect personal data relating to people living in the EU or when the data processing might have consequences in EU territory.^[1]

Due to the GDPR’s importance in building consumer trust in data processing and giving increased rights to individuals concerning their data, this law is seen globally as a positive legislative example that sets international standards for data protection and privacy.

While Switzerland is not part of the EU, the Swiss government has enacted its own data protection law called the Federal Act on Data Protection (FADP), which applies to the processing of personal data by natural persons, private persons or companies.

In 2020, the FADP was completely revised (nFADP). The revised act was supposed to come into force in 2022, but this was delayed to 1 September 2023.

The nFADP is broadly inspired by GDPR; however, it introduces a few distinct provisions in an effort to adapt to the new social and technological changes we are experiencing nowadays. For example, the nFADP strengthens data processing transparency toward consumers by increasing processors’ obligations to provide more extensive information about the data they collect.^[2] Also, the nFADP seeks to increase accountability for breaches and non-compliance inside companies’ structures by stipulating sanctions in the form of fines for natural persons/employees, instead of fining companies only as we see in the GDPR. Therefore, any employee who commits a privacy violation will be held accountable, provided their action was intentional.^[3]

The higher the risks associated with a certain data set, the stricter the legal stipulations are.

A risk assessment approach was built by dividing data into three categories: personal, sensitive, and non-personal data.

In general, personal data is anything that could be used to identify a person, for example an IP address, a cookie ID, or a title or distinction that could uniquely identify a person. Therefore, personal data includes not only the name, address, portrait or picture, and phone number, but also any other information that could be combined with other data to identify someone.

Only personal data is subject to the FADP, but what about personal data anonymization?

Non-personal data can be any type of data that does not contain personally identifiable information. Non-personal data is not regulated and can be collected for any purpose without consent and stored for an indefinite time.

Suppose the data is altered so that it becomes impossible to retrieve the connection between the anonymized data and an individual. From a legal perspective, this data is considered to be non-personal.

An interesting case is when personal identifiers in a data set are encrypted, masked, or hidden, but a later re-association with the data subjects is still possible or desired. In such circumstances, the pseudonymized data has the legal status of personal data, but only for those who can decrypt it or re-identify the data subjects in any way.

Anonymizing or pseudonymizing data is a great tool for protecting private information, reducing data breach risks, and ensuring compliance. Still, it can be challenging, especially with more complex or mixed data sets. It is hard to rule out that singling out an individual, or linking the subjects to the data set, will become possible again due to technological advancements or the large amounts of data collected and stored over time. In the next section of the article we will return to this topic and talk about anonymization at SCA in more detail.

Section 3 – How do we keep data secure at SCA?

Keeping some personal data is inevitable: for instance, when students register for a course on our website, we receive basic personal information about them, which we protect. We ensure that we obtain the data subjects’ informed consent by formulating our privacy policy clearly and transparently, with specific, up-to-date information about the data we collect. Our privacy policy is available in our website’s footer. However, we do not collect any sensitive data, and we do not sell or share any personal data.

At SCA, personal data is anonymized in line with GDPR and nFADP standards, and we keep the same high security standards for both non-personal and personal data. Internally, only a few employees have access to the non-personal data at SCA, and even fewer have access to the personal data. We manage individual users’ permissions to restrict access to data sets to the minimum. Even within the company we avoid sharing data; instead, we share the results derived from the data processing on a “need-to-know” basis.

SCA ensures that the safety of its user data does not depend on the firms it works with by anonymizing the information it shares with them. This means that, in the information those companies access, all learner names (or in some cases email addresses) have been replaced by codes (and code-based email addresses), and that SCA alone is in possession of the table of correspondence between the codes and the learners’ names (and other identifying characteristics, like email addresses or physical addresses, that constitute personal data). Moreover, as often as possible, we only give access to (anonymized) learner data to companies that are based in Switzerland and host their servers there.

The replacement of names by codes, called identifiers, happens automatically. For example, when a student answers a question in the SCA App and the question is hosted by Taskbase, the question is requested using a code for the learner, and the Taskbase server sends data back to us using the same code. The data received by our servers is then de-anonymized to provide feedback to the student and for internal use in analytics.
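The round trip described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not SCA's actual implementation: the class `PseudonymTable`, the functions `outgoing_request`, `fake_provider` and `incoming_response`, the `learner-` code format and the payload fields are all hypothetical names chosen for this example, and the external service is replaced by a local stand-in.

```python
import secrets


class PseudonymTable:
    """Hypothetical table of correspondence between learners and codes.

    In the scheme described in the text, only the organization that
    collects the data holds this table; partners never see it.
    """

    def __init__(self):
        self._code_by_email = {}
        self._email_by_code = {}

    def code_for(self, email):
        """Return the learner's code, creating a random one on first use."""
        if email not in self._code_by_email:
            # A random code cannot be recomputed from the email address,
            # unlike a plain hash of it.
            code = "learner-" + secrets.token_hex(8)
            self._code_by_email[email] = code
            self._email_by_code[code] = email
        return self._code_by_email[email]

    def email_for(self, code):
        """Re-identify a learner; only possible with access to the table."""
        return self._email_by_code[code]


def outgoing_request(table, email, question_id):
    """Build the payload sent to an external provider: code only, no name or email."""
    return {"learner": table.code_for(email), "question": question_id}


def fake_provider(request, correct):
    """Stand-in for an external service; it only ever sees the code."""
    return {
        "learner": request["learner"],
        "question": request["question"],
        "correct": correct,
    }


def incoming_response(table, response):
    """Re-associate the provider's answer with the learner, internally."""
    return {
        "email": table.email_for(response["learner"]),
        "question": response["question"],
        "correct": response["correct"],
    }


if __name__ == "__main__":
    table = PseudonymTable()
    request = outgoing_request(table, "ada@example.com", "q1")
    response = fake_provider(request, correct=True)
    print(incoming_response(table, response))
```

In this sketch the codes are drawn at random rather than computed from the email address, so a partner who receives a code cannot work backwards to the learner without the table. Because re-identification remains possible for whoever holds the table, data handled this way is pseudonymized in the sense discussed in Section 2.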

At SCA we raise data awareness and educate our employees about how they can better contribute to data protection. In addition to using security tools and software (such as encryption, anti-virus and two-factor authentication), we share files responsibly, whether externally or internally, and avoid sending them via personal email or consumer file-sharing tools. It matters to us to have a safe infrastructure where we keep our data, and to avoid the temporary transfer of data through servers located abroad. That is why we chose a cloud storage service that is based in Switzerland and fully compliant with the data protection laws.

The importance of data collection and its potential benefits and harms cannot be overstated, especially in today’s society, where information technology has made it so much easier to collect, retain, and analyze data. We described data collection practices at Swiss Connect Academy and contextualized those practices by explaining their legal framing and the values to which they answer. However, we tried to portray a moving target: we regularly revise our security practices and reflect on how to remain faithful to our values, like protecting our users and their privacy, because, as the Red Queen proclaims, “it takes all the running you can do, to keep in the same place.”

[1] Article 3 GDPR. Territorial scope

[2] Chapter 3 revFADP: Obligations of the person responsible and the processor

[3] Article 60 revFADP: Violation of information, disclosure and cooperation obligations
