Think of a fruit. Think of a bird. Did you imagine a red apple and a robin? Many people do! These specific mental images associated with a given noun are called prototypical images in the field of cognitive linguistics.
Maybe like us, you’ve played countless games of skribbl.io, an online Pictionary-style game that has grown significantly since the beginning of the pandemic. There’s always that one person in every game of skribbl who guesses the answer as soon as a vague shape is drawn and who can use their cursor to perfectly replicate your own mental image of a noun. This person has developed a good sense of the common prototypical images for many nouns and thus is able to quickly determine what people are drawing. Prototypical images are often very similar for people in the same region or culture, likely due to the similarities in their environments during critical years for language development. Prototyping is studied within cognitive linguistics, which takes an interdisciplinary approach to examining our use of language. By combining the fields of cognitive science, psychology, neuroscience, and linguistics, we can form a more complete picture of language and our relationship with it.
What happens when we consider prototyping in technology? In order to represent our world, software often must form a prototypical image of real-life objects and concepts. Take Apple’s Siri and Amazon’s Alexa, for example: these technologies are designed to recognize and respond intelligently to the human voice. Thus, they must form a prototype of the human voice to be able to identify when something is said and what it means. Since the introduction of Siri in 2011, we have made significant progress in voice recognition software. Initially, Siri made many mistakes and struggled to understand accents outside of Mainstream US English (MUSE); this was due to the software having a prototype of the human voice in which MUSE was overrepresented. Technologies introduced later improved on the voice recognition model that Siri used. Rather than building a model of the human voice only from inherently biased databases, Amazon’s Alexa builds a user-specific prototype; that is, it builds a prototype of YOUR own voice. This is why Alexa and other similar software ask you to say several predetermined phrases; it’s learning how you speak! By creating a prototype for each individual user, Alexa can avoid many of the errors that come from using a database to create one prototype for the human voice. Voice recognition technology is a clear example of how prototypes in technology have biases that greatly affect users, and of how we may be able to overcome these biases.
Prototyping in Machine Learning
Artificial Intelligence is built to resemble the thinking process of human beings, making it easier for users to interact with the technology and speeding up common computing tasks. One of the main issues with AI stems from how these systems are built: with machine learning algorithms. For example, an artificial intelligence system is programmed by mirroring the structure of human cognition for a specific task, right? Well, in this instance the system is learning from its interaction with that human: seeing how the human works, how it can take certain shortcuts, how it can predict what it may be useful for, and overall collecting data to make the human’s life easier and its own “artificial” life easier.
You may be thinking: okay, I hear you, but what does this have to do with prototypes and skribbl? Well, in the same way that the brain establishes prototypes through experience and the frequency of a certain image in the development of an idea, AI forms prototypes to prioritize certain aspects of its function. For example, while you shop for a dress online, AI algorithms will register the fact that you are looking for dresses and then proceed to show you targeted advertisements from stores that sell similar dresses. Therefore, AI decision-making is influenced not only by its programmers but also by the users interacting with the program. There are many shortcomings of AI for this reason (among others). However, these biases in AI are not necessarily the fault of programmers. A programmer could give an AI a myriad of methods to prevent bias, whether based on race, gender, economic status, or anything else, in an attempt to create a “perfect type of thinking”, but that is logistically impossible. It is impossible because AI is at the whim of the prejudiced society it is meant to collectively reflect.
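To make the idea of a frequency-based prototype concrete, here is a minimal sketch in Python. It is a toy illustration rather than how any real AI system works: the class name PrototypeMemory and its methods are made up for this example, and the "prototype" is simply the exemplar seen most often for a category.

```python
from collections import Counter, defaultdict


class PrototypeMemory:
    """Toy model (hypothetical): the prototype is the most frequent exemplar."""

    def __init__(self):
        # Maps each category to a count of the exemplars seen for it.
        self._seen = defaultdict(Counter)

    def observe(self, category, exemplar):
        """Record one exposure to an exemplar of a category."""
        self._seen[category][exemplar] += 1

    def prototype(self, category):
        """Return the most frequently seen exemplar, or None if none was seen."""
        counts = self._seen.get(category)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


memory = PrototypeMemory()
for exemplar in ["apple", "apple", "banana", "apple", "mango"]:
    memory.observe("fruit", exemplar)

print(memory.prototype("fruit"))  # apple
print(memory.prototype("bird"))   # None
```

Whatever shows up most often wins, which is exactly why a skewed stream of examples produces a skewed prototype.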
There are many concepts that computer and cognitive scientists have tried to implement in AI in order to prevent bias in the program’s function. Typical models of Machine Learning will include a form of deep learning and data analysis, including the use of internet and web searches, typical human behavioral data, and other preexisting forms of usable information. In these past models, however, there are many existing issues due to how these data are collected and even in the type of data. These data tend to include information that may contain biases such as prejudiced drawings, biased portrayals of demographics, biased historical accounts, and other biased information of this sort. There are also more common issues within data sets, including a sampling bias which occurs when there is an oversampling and overexposure to one community of people or specific demographic and an undersampling or lack of exposure to other groups in comparison.
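One rough way to spot the sampling bias described above is to compare each group's share of a data set with its share of the population the system is meant to serve. The sketch below assumes we already have a group label for every sample; the function name, the group labels, and all the numbers are invented for illustration.

```python
from collections import Counter


def representation_gap(sample_groups, reference_shares):
    """Return each group's share in the sample minus its reference share.

    Positive values mean the group is oversampled; negative values mean
    it is undersampled.
    """
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in reference_shares.items()
    }


# Hypothetical toy data: group A dominates the sample.
sample = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
reference = {"A": 0.5, "B": 0.3, "C": 0.2}

for group, gap in representation_gap(sample, reference).items():
    print(f"{group}: {gap:+.2f}")
```

In this made-up example group A is oversampled by roughly 30 percentage points, while groups B and C are each undersampled by roughly 15.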
We see the effects of this biased sampling in a tool that many of us are now incredibly familiar with: Zoom. There has been an overarching issue of racial bias in how Zoom’s virtual backgrounds work. These backgrounds tend to work less well for darker skin tones than for lighter skin tones. This is one way in which Zoom’s facial recognition AI is problematic. This seemingly small issue forces people of color to make more adjustments to their use of the program than those with a lighter skin tone. It is most likely due to a sampling bias: when the program was built, the facial recognition AI was not exposed to a representative range of human skin tones. This heavy bias is already concerning, but the issue deepens when we look at how machine learning models consistently reflect similar errors.
Embodied AI & Current Solutions
A new concept in AI that attempts to eliminate this form of bias is known as Embodied AI. This form of machine learning gained traction due to the shortcomings of deep learning. Embodied AI has a different priority: individual human interaction is of higher importance in how the system learns. This kind of learning contrasts with learning from an established data set. Embodied AI instead builds its data set progressively from individual user interactions, oriented toward a goal. The AI is programmed to do this by taking in information from interactions and then applying constraints to the data in order to prevent any individual’s group-influenced bias from carrying over.
For example, when the idea of goal orientation and Embodied AI was first implemented, programmers tried to model the idea with an AI robotic arm that was tasked with moving balls around a table. The arm could move the balls any way that it wanted: bouncing, dropping, or completely flinging the balls off the table, which caused many problems for obvious and hilarious reasons. Then the AI was given the constraint to keep the balls on the table: it could not push the balls so hard that they would fall off, nor could it pick them up or force them to leave the surface of the table in any way. This constraint initially made it harder for the AI to learn and required more computing, but the AI eventually learned to respect the constraint with ease. Introducing these constraints to how an AI interacts with a human sounds easy in theory, but it gets to be significantly more complicated than just adding assumption and bias constraints.
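The robotic arm story can be boiled down to a tiny sketch. This is not the original experiment: it assumes a one-dimensional table, a ball that simply moves by the amount it is pushed, and a constraint expressed as a penalty on the reward. All names and numbers are hypothetical.

```python
TABLE_LENGTH = 1.0  # hypothetical table spanning positions 0.0 to 1.0


def reward(start, push, penalty):
    """Reward moving the ball far; subtract a penalty if it leaves the table."""
    position = start + push
    score = abs(push)
    if not (0.0 <= position <= TABLE_LENGTH):
        score -= penalty
    return score


def best_push(start, candidates, penalty):
    """Pick the candidate push with the highest reward."""
    return max(candidates, key=lambda push: reward(start, push, penalty))


candidates = [0.1, 0.2, 0.4, 0.9]

# Without a constraint, the arm flings the ball off the table.
print(best_push(0.5, candidates, penalty=0.0))   # 0.9

# With a large penalty, it picks the biggest push that stays on the table.
print(best_push(0.5, candidates, penalty=10.0))  # 0.4
```

Even in this toy version, someone has to decide what counts as "off the table" and how heavily to penalize it. With people instead of balls, those choices are where ethics comes in.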
Overall, having Embodied AI and constraints as an option would be amazing, especially when you think about how it could be used. In our justice system, for example, one could use Embodied AI with bias constraints to help determine someone’s time in jail. This option could be beneficial in the opposite case as well, where a constraint is introduced that is biased toward releasing people from jail within what the law allows, while looking for the best aspects of the person involved.
Unfortunately, there is still no perfect way to create constraints, let alone apply them. The challenge of incorporating constraints into AI lies in the fact that both ethics and morals must be taken into account. Furthermore, the fact that such biases exist in the first place is indicative of a sad reality: our society contains systemic biases against certain groups.
Then again, there is still the issue of sampling and only having access to disproportionately represented demographics. Even when applying things like weights and adjusting the proportions of certain groups, there is still not an equal display of opinion and experience. This is then paired with the argument over whether AI should be blindly and blatantly equal or whether it should take a more complicated but equitable approach that takes into account the historical contexts of marginalized groups.
Humans have a much easier job at this kind of prioritization and filtering of information. Human reasoning inherently has the ability to account for the constraints of social norms and moralities and adjust behavior accordingly. AI could follow a similar path of social evolution, but it would take time, practice, a lot of money, and incredible machine power, which, unfortunately, many companies do not see as a worthy investment.
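As a sketch of what "applying weights" can look like, the snippet below gives every sample a weight inversely proportional to the size of its group, so that each group contributes the same total weight. This is one common reweighting scheme, chosen here as an assumption; the function name and the toy data are made up.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Weight each sample so that every group has the same total weight."""
    counts = Counter(groups)
    total = len(groups)
    n_groups = len(counts)
    return [total / (n_groups * counts[group]) for group in groups]


# Hypothetical toy data: 80 samples from A, 15 from B, 5 from C.
groups = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
weights = inverse_frequency_weights(groups)

# Total weight per group after reweighting.
totals = Counter()
for group, weight in zip(groups, weights):
    totals[group] += weight
print({group: round(value, 2) for group, value in totals.items()})
```

Notice what the weights cannot do: the five samples from group C now count for as much as the eighty from group A, but they are still only five people’s worth of experience.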
We often do not recognize how valuable the ability to immediately prioritize external information is for processing social interactions. Our brains are trained throughout our entire lives to decide what information is irrelevant and ignore it, allowing us to focus on what is important in any given situation. Take the cocktail party effect as an example: we as humans are able to carry on a conversation in a noisy, crowded room because this ability to prioritize lets us filter out background noise as irrelevant stimuli. But as soon as someone says our name or starts talking about something we have an interest in, the information is automatically processed as relevant and immediately triggers our conscious attention.
However, a computer does not have this inherent ability to automatically ignore information. Instead of working with already filtered and automatically prioritized information, the program has to take in every piece of information from the cocktail party. Then, the program has to sift through it and prioritize the conversation and information that it wants to, which requires significantly more computational power than required of the brain to accomplish the same task.
This same kind of ability for a program would require a highly monitored dataset and is generally a HUGE task for both the computer and for the programmers and data analysts that are creating the AI. This kind of tech practice is simply not cost-effective in terms of computing power for the AI or in terms of finances for human-interface-based companies.
So many of the issues within this area of prototypes and the formation of profiles for human interaction come down to wealth and economics. It is more cost-effective for AI companies to have almost purposefully biased data sets based on who fits the demographic for using and purchasing AI, which unfortunately tends to be those who already have access to technology or have the means to buy it. This, in turn, gives companies an incentive not to prioritize economically marginalized communities, which tend to be communities of color, in their target consumer population. As a result, there is something inherently unjust about marketing AI and establishing its “target audience”.
We see this again and again for the reasons stated above: it’s more cost-effective to feed in a bunch of easily accessible data and take whatever comes out. If this were to change, Embodied AI would need an enormous number of constraints. Generally, because of unequal access and homogeneous demographics within the tech industry, many people and experiences are drowned out in machine learning techniques.
We don’t typically think of prototyping as anything important, and instead see it more so as stock images that our brain immediately uploads. It seems as though it isn’t until we are confronted with niche cognitive issues, like prototyping, that we see how our society has trained us to prioritize certain images over time — and we can see how subtle influences affect our thinking long term. This effect is why diversity in kids' development and exposure is so important, but also why general diversity in all fields — especially in this circumstance of tech and wealth — is so important. What we build and manifest as a society is completely affected by the exposure we have to the prototypical images and implicit biases we develop.
Prejudices are built into the structure of our society, and as we see in the way these structures show up as biases in AI, society as a whole affects the trajectory of technology and, with that, the perpetuation of disparities in access and wealth. If we consider everything that AI and technology are supposed to resemble in a user interface, we see that societal limitations are all tied together. These types of issues make it so we can’t even program AI to prioritize one thing over another, because they all have some sort of influence. Our society, in all its current injustices and inequalities, is unable to have an AI free of prejudice because AI in a way becomes a reflection of society in its current and historical state. There is little that can be done immediately to address this issue, but with gradual growth and diversity in computer and tech-based fields, along with a broader distribution of wealth and access, we will be able to see a better and more equitable reflection of society in AI.
This article was written by Annabel Davis, who is a junior undergraduate student at UC Berkeley studying Cognitive Science, and Mridula Vardhan, who is a senior undergraduate student at UC Berkeley studying Molecular and Cellular Biology with an emphasis in Neurobiology. This article was edited by Oliver Krentzman, a senior undergraduate student at UC Berkeley studying Cognitive Science.