Stereotypes are not "just patterns". Google "stereotype" and the first definition is "a widely held but fixed and oversimplified image or idea of a particular type of person or thing".
In this case, the machine is learning, for example, that women do dishes and men drink beer. That association isn't grounded in empirical patterns about the world; it comes from the data the algorithm is trained on, which in this case is "...more than 100,000 images of complex scenes drawn from the web, labeled with descriptions". Those descriptions inevitably reflect human stereotypes (which, again, aren't just patterns). As the article notes, "Both datasets contain many more images of men than women, and the objects and activities depicted with different genders show what the researchers call “significant” gender bias."
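To make that concrete, here is a minimal sketch of how that kind of label skew can be quantified. The `(activity, gender)` pairs below are invented for illustration, and this is not the researchers' actual code or metric; it just counts, per activity, how often the labeled agent is a man.

```python
from collections import Counter

# Hypothetical (activity, gender) labels standing in for the kind of
# annotations the real datasets contain; the actual corpora are far larger.
labels = [
    ("cooking", "woman"), ("cooking", "woman"), ("cooking", "man"),
    ("drinking", "man"),  ("drinking", "man"),  ("drinking", "woman"),
]

counts = Counter(labels)

def male_fraction(activity):
    """Share of images of `activity` whose labeled agent is a man."""
    men, women = counts[(activity, "man")], counts[(activity, "woman")]
    return men / (men + women)

for activity in sorted({a for a, _ in labels}):
    print(f"{activity}: {male_fraction(activity):.2f}")
# cooking: 0.33, drinking: 0.67 -- the labels themselves carry the skew.
```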
Should machines understand that men are more likely to be construction workers than women? I think so. But that doesn't mean biased data isn't a huge problem.
At the very least, we should strive to teach machines to understand the world as it is, not as we view it through flawed, biased eyes. Datasets generated by humans are going to be invariably flawed and far from reality, unless we take careful steps to ensure otherwise. That makes these datasets all the more important: we should be particularly concerned when we are teaching machines to act on our biases and find that, in the case of this article, the machine is actually learning to amplify them.
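Since "amplify" is doing a lot of work in that sentence, here is a rough sketch of what it means operationally: compare how far each activity's gender split sits from 50/50 in the training labels versus in the model's predictions. This is only an analogue of the researchers' bias-amplification measure, not its exact definition, and the toy data is made up.

```python
from collections import Counter

def male_fraction_by_activity(pairs):
    """Per-activity share of male agents in a list of (activity, gender) pairs."""
    c = Counter(pairs)
    return {a: c[(a, "man")] / (c[(a, "man")] + c[(a, "woman")])
            for a in {act for act, _ in pairs}}

def amplification(train_pairs, predicted_pairs):
    """Average increase in distance from a 50/50 split between the training
    labels and the model's predictions. Positive means the model's outputs
    are more gender-skewed than the data it learned from."""
    train = male_fraction_by_activity(train_pairs)
    pred = male_fraction_by_activity(predicted_pairs)
    shared = train.keys() & pred.keys()
    return sum(abs(pred[a] - 0.5) - abs(train[a] - 0.5) for a in shared) / len(shared)

# Toy example: training images of cooking are 67% women, but the model
# labels the agent "woman" 100% of the time -- it amplified the skew.
train = [("cooking", "woman")] * 2 + [("cooking", "man")]
pred  = [("cooking", "woman")] * 3
print(round(amplification(train, pred), 2))   # 0.33
```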
Patterns do fit stereotypes perfectly. They are oversimplified concepts of something real; some samples fall outside the pattern, and that is perfectly fine for pattern discovery. So stereotypes and patterns aren't distinct concepts in essence.
«Datasets generated by humans are going to be invariably flawed and far from reality, unless we take careful steps to ensure otherwise.»
If the data in the wild is not "real", we can only adapt it to your version of reality to make it real, and that's not necessarily my view of reality. You can't just pluck objectivity out of the air and freshen up fake data into real data.
No, they don't. "Stereotype" is a psychological concept, and therefore by definition incorporates human subjectivity. There are various conflicting definitions, but most include the possibility or likelihood that many stereotypes overstate their generalizations or construct them with no factual basis at all.
Or, as Wikipedia[1] states,
> By the mid-1950s, Gordon Allport wrote that, "It is possible for a stereotype to grow in defiance of all evidence."
> Research on the role of illusory correlations in the formation of stereotypes suggests that stereotypes can develop because of incorrect inferences about the relationship between two events (e.g., membership in a social group and bad or good attributes). This means that at least some stereotypes are inaccurate.