
  • Open access
  • Published: 02 May 2018

The Role of Intelligence in Social Learning

  • Alexander Vostroknutov 1 ,
  • Luca Polonio 1 &
  • Giorgio Coricelli 2  

Scientific Reports volume 8, Article number: 6896 (2018)


  • Human behaviour
  • Intelligence

Studies in cultural evolution have uncovered many types of social learning strategies that are adaptive in certain environments. The efficiency of these strategies also depends on the individual characteristics of both the observer and the demonstrator. We investigate the relationship between intelligence and the ways social and individual information is utilised to make decisions in an uncertain environment. We measure fluid intelligence and study experimentally how individuals learn from observing the choices of a demonstrator in a 2-armed bandit problem with changing probabilities of a reward. Participants observe a demonstrator with high or low fluid intelligence. In some treatments they are aware of the intelligence score of the demonstrator and in others they are not. Low fluid intelligence individuals imitate the demonstrator more when her fluid intelligence is known than when it is not. Conversely, individuals with high fluid intelligence adjust their use of social information, as the observed behaviour changes, independently of the knowledge of the intelligence of the demonstrator. We provide evidence that intelligence determines how social and individual information is integrated in order to make choices in a changing uncertain environment.

Introduction

Learning is an important and flexible process that allows humans to adapt to their environment. A first basic source of learning is personal experience. Humans interact directly with the environment and learn from the feedback they receive. A second source of learning is observing other people interacting with the same environment. In a world where we need to adapt quickly to ever-changing circumstances (e.g., climate fluctuations, socio-political commotion), the ability to learn from others is fundamental because it allows us to foresee the consequences of our actions without experiencing them directly. However, people should be selective about the situations in which they rely on a social learning strategy, as it can be efficient in some cases and inefficient in others 1 , 2 , 3 , 4 , 5 , 6 , 7 . An efficient social learning strategy should specify under which circumstances to pay attention to social information, which individual to imitate, and the type of information that should be taken into account 8 , 9 . The most studied classes of social learning strategies include: frequency-dependent rules, such as conformity or anti-conformity to the most chosen alternative 10 , 11 , 12 , 13 ; payoff-based rules, where the level of imitation depends on the payoffs achieved by a demonstrator in the recent past 14 , 15 ; confidence-based rules, where the confidence of individuals and demonstrators modulates imitation 13 ; and prestige-based rules, where the level of imitation depends on the status of the observed other 16 , 17 , 18 , 19 .

Another aspect is how the integration of social and individual information is accomplished 20 , 21 . In an environment where demonstrators are observed repeatedly, on the one hand, it is possible to learn from others using simple reinforcement learning 22 , which is the case when an agent imitates others, evaluates the feedback she receives after imitation, and chooses whether to keep imitating or not, depending on the outcome 23 , 24 , 25 , 26 , 27 . On the other hand, a more strategic use of social information involves understanding the rationale behind the observed choices 6 , 13 , 28 , 29 , 30 . This is a more sophisticated mechanism of learning that requires the agent to integrate what she has observed with the feedback she has directly received from the environment. The adoption of this integrated learning process can be more efficient, especially when the environment is changing or when the expertise of the demonstrator is unknown, but is also costlier since it requires a higher level of attention and an ability to understand and integrate signals coming from different sources. A key question is, thus, when and how to switch between social and individual learning. Experimental evidence shows that individuals increase their level of imitation with task difficulty and cost of individual learning, and decrease it with the probability of changes in the environment 13 , 31 . The tendency to use social learning is also related to the cognitive abilities of the individual. In particular, individuals with low intelligence scores use social (instead of individual) learning more often, as compared to the individuals with high intelligence scores, who, in addition, have a higher ability to understand when and whom to copy 32 .

The goal of this study is to determine how variation in fluid intelligence of participants and demonstrators modulates the use of social information and employment of different social learning strategies. In particular, we are interested in how individual characteristics related to fluid intelligence influence imitation decisions. To achieve this, we study behaviour of participants in a complex, stochastic, and unstable environment while they are observing another individual choosing in the same setting 33 , 34 . The task was designed so that individual learning requires effort and, more importantly, it is hard to recognise the competence of the observed model from her choices alone. To see how reliance on social information and rate of copying the demonstrator change, we have participants observe the actions of either a highly competent or less competent other, and vary the availability of the information about her fluid intelligence. In addition, the dynamic nature of the task and multiple imitation choices that each participant has to make allow us to investigate in detail how social and individual information is integrated.

Evidence from experimental economics literature on strategic thinking suggests that the cost of learning in interactive settings varies with fluid intelligence 35 , 36 , 37 , 38 , 39 , 40 . Studies in cultural evolution literature find a variation in social learning strategies that depends on social group and individual characteristics 13 , 32 , 41 , 42 . Having these findings in mind, we hypothesize that low fluid intelligence participants have a relatively high cost of learning, which implies that it should be difficult for them to perform efficiently in the task and, as a consequence, hard to interpret the observed actions of the demonstrator when her competence is unknown. This should lead to low confidence, low efficiency, and, as a result, strong dependence of the imitation rate on the information about the demonstrator’s intelligence (prestige bias). Behaviourally, we should, thus, observe (1) relatively low earnings; (2) inability to modulate imitation rate with changing characteristics of choices of the demonstrator; (3) no difference in imitation rates between competent and less competent demonstrators when their intelligence is unknown; (4) stable increase in imitation when the intelligence of the competent demonstrator is known (and, as a result, increase in earnings). High fluid intelligence participants, who have low cost of learning, should be able to learn well in the task and also able to recognise the competence level of the demonstrator from her actions even if her intelligence is unknown. This should lead to high confidence, high efficiency, and dependence of the imitation rate on the dynamic features of the demonstrator’s choices instead of the information about her intelligence (no prestige bias). 
For high fluid intelligence participants we should, thus, observe (1) relatively high earnings; (2) dependence of the imitation rate on the changing properties of the observed choices; (3) different imitation rates of competent and less competent demonstrators when their intelligence is unknown; (4) no difference in imitation rates of a competent demonstrator when her intelligence is known vs. when it is not.

To test our hypotheses we use a two-armed bandit problem in which the probabilities of reward from the two arms are determined by two independent stochastic processes. In each trial, participants choose one of the two arms, which gives them a fixed reward with some probability (Fig.  1A2 ). The probabilities of reward from the two arms change over time as shown in Fig.  1B . Participants were not informed about the exact processes guiding the probability changes, but they knew that these probabilities change gradually. In Experiment 1, participants make their decisions in 200 trials without the possibility to learn from others. In Experiments 2 and 3, participants are presented with exactly the same 200 trials as in Experiment 1, but, from time to time before their choice, they are able to observe the action of a demonstrator (Fig.  1A1 ). They are aware that this person, whom they observe, was making her choices in a previous experimental session (Experiment 1) and that the probabilities of reward that she faced were the same as the probabilities in the current experiment (Fig.  1B ). To assess intelligence, we use a 20-minute version of the Raven Advanced Progressive Matrices Test as a measure of fluid intelligence of our participants 43 . The Raven APM test was found to be a measure of the general ability to think in an abstract way, recognize patterns, reason, and discern relationships, all of which should be crucial for efficient learning from observation in a changing stochastic environment. Before starting the task, participants in Experiments 2 and 3 were provided with the histograms of the Raven APM scores (from now, Raven scores) of participants from Experiment 1 (Fig.  1C ). Two participants were used to act as demonstrators: a participant with a low Raven score (15 matrices solved correctly) and a participant with a high Raven score (28 correct matrices). 
Participants were not aware of the relation between the Raven score and the performance of the demonstrator in the learning task.
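The task environment described above can be illustrated with a minimal simulation sketch. The text does not specify the processes that drive the reward probabilities, so the bounded Gaussian random walk below, together with its parameters (`step_sd`, `low`, `high`) and the function names, is purely an assumption chosen to produce gradual change; only the 200 trials, the two arms, and the 10 cents reward come from the description of the experiment.

```python
import random


def drifting_probabilities(n_trials, start=0.5, step_sd=0.03,
                           low=0.1, high=0.9, rng=None):
    """Reward probabilities that change gradually from trial to trial.

    ASSUMPTION: a bounded Gaussian random walk stands in for the
    unspecified stochastic process used in the experiment.
    """
    if rng is None:
        rng = random.Random()
    probs, p = [], start
    for _ in range(n_trials):
        probs.append(p)
        # Small step, clipped so the probability stays within [low, high].
        p = min(high, max(low, p + rng.gauss(0.0, step_sd)))
    return probs


def play(choices, probs_a, probs_b, reward=10, rng=None):
    """Per-trial payoffs (in cents) for a sequence of choices 'A' or 'B'."""
    if rng is None:
        rng = random.Random()
    payoffs = []
    for choice, pa, pb in zip(choices, probs_a, probs_b):
        p = pa if choice == "A" else pb
        payoffs.append(reward if rng.random() < p else 0)
    return payoffs


rng = random.Random(0)
probs_a = drifting_probabilities(200, rng=rng)  # arm A
probs_b = drifting_probabilities(200, rng=rng)  # arm B, drawn independently
choices = [rng.choice("AB") for _ in range(200)]  # a purely random chooser
payoffs = play(choices, probs_a, probs_b, rng=rng)
```

A random chooser is used here only to exercise the environment; any learning rule can be substituted for the `choices` list.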

Figure 1

Experimental design. ( A ) Participants choose between two options which can give them a 10 cents reward with some probability (A2). In each trial, a red figurine, presented after a fixation screen, informs participants that it is time to choose. In Experiment 1 participants choose without observing anyone’s choices. In Experiments 2 and 3, periodically (6–12 trials in a row, for a total of 100 out of 200 trials across the experiment), and before their choice, participants can observe the choice of another participant who took part in Experiment 1 (A1). The observed other is represented by a green figurine and her choice is shown with a green tick. Participants do not observe the outcome of that choice. ( B ) Throughout the 200 trials of the experiment the probabilities of reward from the two arms change according to prespecified random processes (the two lines on the graph). It was carefully explained to participants in Experiments 2 and 3 that these probabilities were the same for them and the demonstrator they observe. ( C ) Participants in Experiments 2 and 3 received different information on the observed other. In the NovisHigh and NovisLow treatments they see only the distribution of the Raven scores of potential demonstrators (graph at the top). In the VisHigh and VisLow treatments they see the distribution and the Raven score of the other they observe marked by a red bar (graph in the middle and at the bottom). In Experiment 2 all participants know their own Raven score. In Experiment 3 participants took the Raven test before the main task, but were not informed about their Raven score (everything else was as in Experiment 2).

In order to test the hypothesis that high fluid intelligence participants can recognise the competence of the demonstrator by only observing her actions, whereas low fluid intelligence participants cannot (Hypothesis 1), we run two treatments in which participants observe a low or high Raven demonstrator, but her Raven score is not visible. Participants, included in these two treatments (NovisLow for a low Raven other and NovisHigh for a high Raven other), see the histogram of the Raven scores from Experiment 1 (Fig.  1C top) and are told that the person they observe is one of those represented on it. To test the hypothesis that low fluid intelligence participants react to the information about the Raven score of the demonstrator and high fluid intelligence participants do not (Hypothesis 2), we run two more treatments (VisLow for a low Raven other and VisHigh for a high Raven other) in which the Raven score of the demonstrator is visible to participants. This information is delivered through the same histogram as in the previously described treatments, only now the Raven score of the observed individual is indicated with a red bar (Fig.  1C middle and bottom). To test the hypothesis that high (low) fluid intelligence participants (do not) modulate their imitation with the changing properties of choices of the demonstrator (Hypothesis 3), we introduce in our analysis an observable measure of demonstrator’s confidence—the number of times she switches between actions—and check if the imitation rate is influenced by it dynamically 44 . Finally, to verify that the differences among the four treatments come from fluid intelligence and not from the information about participants’ own Raven score we provide information about one’s own Raven score in Experiment 2, and we do not in Experiment 3 (see Table  3 in Appendix  B.1 for the detailed information about all experiments and treatments).

We start with providing evidence for the hypotheses that are concerned with the reactions of participants to the Raven score of the demonstrator and its visibility (Hypotheses 1 and 2). We analyse the aggregate average levels of imitation in the four treatments of Experiment 2. By imitation we mean the situations when a participant chooses the same action as the observed other. There are two types of imitation. The first is pure imitation : a participant sees what the observed other has chosen and decides to choose the same action. The second is coincidental imitation : a participant chooses the same action as the observed other because she thinks this is the best action to choose regardless of what the demonstrator does (for example when the probabilities of reward from the two arms are very different and it is obvious which arm is better at the moment).

We would like to focus our analysis on the cases of pure imitation. However, in the periods in which participants observe a demonstrator’s choice we cannot tell apart pure imitation from coincidental one. One way to control for coincidental imitation is to notice that, in the periods when the other is not observed, all cases of imitation are coincidental. Therefore, assuming that, on average, coincidental imitation is the same when the demonstrator is observed and not observed, we can use the average rate of imitation in the periods when the demonstrator is not observed as a proxy for the average coincidental imitation when she is observed. Thus, we consider an adjusted imitation rate that, for each participant, is equal to the average rate of imitation in periods when the other is observed minus the average rate of imitation when her actions are not observed. This adjustment is necessary to correctly estimate pure imitation since the behaviour of participants is very heterogeneous: the rate of coincidental imitation when the demonstrator is not observed ranges from 0.38 to 0.87.
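The adjustment described in this paragraph amounts to a difference of two averages per participant. The following sketch shows the computation for a single participant; the function name and the toy data are illustrative assumptions, not taken from the experiment.

```python
def adjusted_imitation_rate(own, demo, observed):
    """Imitation rate when the demonstrator is observed minus the rate
    when she is not observed (the proxy for coincidental imitation).

    own, demo: per-trial choices of the participant and the demonstrator.
    observed:  per-trial flags, True if the demonstrator's choice was shown.
    """
    same_obs = [o == d for o, d, seen in zip(own, demo, observed) if seen]
    same_unobs = [o == d for o, d, seen in zip(own, demo, observed) if not seen]
    if not same_obs or not same_unobs:
        raise ValueError("need both observed and unobserved trials")
    rate_obs = sum(same_obs) / len(same_obs)
    rate_unobs = sum(same_unobs) / len(same_unobs)
    return rate_obs - rate_unobs


# Toy data (hypothetical): 4 observed trials followed by 4 unobserved ones.
own = list("AABBABAB")
demo = list("AABABBBA")
observed = [True] * 4 + [False] * 4
rate = adjusted_imitation_rate(own, demo, observed)  # 3/4 - 1/4 = 0.5
```

A positive value indicates that the participant matches the demonstrator more often when her choice is visible than would be expected from coincidental agreement alone.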

Figure  2A shows the adjusted rates of imitation in the NovisHigh, NovisLow, VisHigh, and VisLow treatments averaged by the terciles of the Raven score of participants. We see significant differences in the adjusted rate of imitation between observing high and low Raven demonstrators for all terciles of the Raven score of participants and both a visible and non-visible Raven score of the demonstrator except for Low Raven participants (tercile 1) in the NovisHigh and NovisLow treatments. Middle and high Raven participants (terciles 2 and 3) are able to recognise a competent demonstrator even when her Raven score is unknown, while low Raven participants are not, which provides support for Hypothesis 1. This result shows that fluid intelligence correlates with the ability to understand when copying the demonstrator is worthwhile and corroborates previous findings that high performance demonstrators (in our case high Raven other) are copied more often 13 . We provide an additional angle to these results: when participants are not informed about the competence of the demonstrator (the NovisHigh and NovisLow treatments), they copy the high performance other more often than the low performance one only when they can recognise her as such. In our experiment low Raven participants are less able to do that, so their rate of imitation of the high performance demonstrator does not change based on the observed behavior of the demonstrator.

Figure 2

Adjusted imitation rate in Experiment 2. ( A ) The adjusted rate of imitation by the terciles of the Raven score of the participants in treatments with a non-visible and visible Raven score of the other. Blue (red) bars represent the adjusted rate of imitation in treatments in which participants observed the high (low) Raven other. The p -values (from left to right: 0.003, 0.004, <0.001, <0.001, 0.033) denote the significance of the t -tests on the differences in coefficients of an ordinary least squares (OLS) regression that the bars represent (first column of Table  4 in Appendix  B.2 ). ( B ) The dynamics of the running average of the adjusted imitation rate of low Raven participants (below median Raven score) when they observe a high Raven other (only for 100 periods when the action of the other is observed). Ranges are ±1 SE. ( C ) Same as B only for high Raven participants (above median Raven score). ( D , E ) Analogous graphs for the situation when the low Raven other is observed.

Next, we turn to the analysis of the visibility of the Raven score of the demonstrator (Hypothesis 2). When we compare imitation rates in the VisHigh and NovisHigh treatments we find that low and middle Raven participants increase their imitation when they know that the other is high Raven (increase in imitation: 0.097*, t -test p  = 0.032 and 0.097*, p  = 0.034 respectively; see Appendix  B.2 for details). This supports Hypothesis 2 and suggests that low/middle Raven participants interpret the information about the Raven score of the demonstrator as a signal of competence in the task even though they do not know how much Raven score correlates with it. This is in line with the studies on unconditional copying of successful, knowledgeable, or prestigious models 14 , 17 , 45 , 46 , 47 , 48 . Conversely, high Raven participants are not significantly affected by the visibility of the high Raven score of the demonstrator (−0.031, p  = 0.419). Unlike low/middle Raven participants, they do not react to this information but identify a competent demonstrator by her actions. Taken together, we find that intelligence determines the sensitivity to (possibly irrelevant) information about the skills of the demonstrator. Similar differential reliance on learning from models was found in between-cultures studies 42 , 49 . We add to this literature by showing that variation in social learning strategies can arise from difference in the ability to interpret the actions of the demonstrator, which is correlated with fluid intelligence. It should be also noted that our results are robust when using a different measure of participants’ cognitive ability (see Appendix  B.3 ).

An additional question of interest is whether low Raven participants learn during the experiment and become more like high Raven participants, or whether they maintain their tendency to imitate a high Raven demonstrator more when her Raven score is known. Figure 2B–E show the dynamics of the adjusted imitation rate of low and high Raven participants (below and above the median Raven score). Figure 2B,C illustrate the moving averages in the VisHigh and NovisHigh treatments. Low Raven participants demonstrate a significantly increased rate of imitation when they know that the demonstrator is high Raven, which lasts almost until the end of the experiment as predicted by Hypothesis 2. High Raven participants are affected by this information only in the first 28 periods of observation and then exhibit the same imitation rate as in the NovisHigh treatment. This suggests that high Raven participants learn to understand the meaning behind the choices of the other after about 28 periods of observation, whereas low Raven participants do not and keep relying on the information about the demonstrator’s Raven score. Figure 2D,E show the dynamics of imitation for the low Raven other. In this case, neither high nor low Raven participants change their rate of imitation with the visibility of the Raven score of the demonstrator. The difference in behaviour of high and low Raven participants can be interpreted in terms of difficulty to learn. One possibility is that low Raven participants have a high cost of asocial learning, and it is difficult for them to interpret the observed choices of the demonstrator. Therefore, following the observed other, when her high competence is known, is adaptive for low Raven participants 50 . High Raven participants seem to be able to learn how to perform in the task as it unfolds and become more confident in interpreting the actions of the demonstrator. Thus, they rely on the information about the Raven score of the other only at the beginning of the task. This result supports the evidence provided in previous experiments 13 that an increase in confidence shifts the balance between social and asocial learning towards the latter.

Efficiency and Earnings

Next we test the hypothesis that high Raven participants exhibit higher efficiency of choices and earn more than low Raven participants. A choice of a participant is efficient if she chooses the action with the highest probability of reward. In our data the lowest efficiency rate is 0.45 and the highest efficiency rate is 0.87, which is a dramatic difference suggesting that some participants are much better at learning the task than others (see Fig.  5 in Appendix  B.4 for the distribution). Figure  3 shows the improvement in the efficiency rankings for the four treatments of Experiment 2 as compared with Experiment 1.
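The efficiency rate defined above can be computed directly from the choices and the true reward probabilities. The sketch below is illustrative; the function name, the toy data, and the treatment of ties (either arm counts as efficient when the probabilities are equal) are assumptions.

```python
def efficiency_rate(choices, probs_a, probs_b):
    """Share of trials in which the chosen arm had the highest reward
    probability. ASSUMPTION: with equal probabilities either arm counts."""
    efficient = 0
    for choice, pa, pb in zip(choices, probs_a, probs_b):
        best = "A" if pa > pb else "B"
        if pa == pb or choice == best:
            efficient += 1
    return efficient / len(choices)


# Toy example (hypothetical): arm A is better in trials 1-2, arm B in 3-4.
eff = efficiency_rate("ABBA", [0.7, 0.6, 0.3, 0.2], [0.4, 0.5, 0.6, 0.8])
# Trials 1 and 3 are efficient, so eff == 0.5.
```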

Figure 3

Efficiency improvements over Experiment 1. The bars show the proportion of periods in which the average efficiency in Experiment 2 (by treatment and Raven score of participants) exceeds average efficiency in Experiment 1 (both smoothed by 5-period moving average). The bar colours stand for the Raven score of the observed other. Ranges are ±1 SE.
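The quantity plotted in this figure can be sketched as follows. The caption specifies a 5-period moving average but not its alignment or its handling of the first periods, so the trailing window that uses only the values available so far is an assumption, as are the function names.

```python
def moving_average(xs, window=5):
    """Trailing moving average. ASSUMPTION: early periods average over
    the values available so far rather than being dropped."""
    out = []
    for i in range(len(xs)):
        chunk = xs[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def share_of_periods_improved(eff_treatment, eff_baseline, window=5):
    """Proportion of periods in which the smoothed average efficiency of a
    treatment group exceeds the smoothed average efficiency of the baseline."""
    a = moving_average(eff_treatment, window)
    b = moving_average(eff_baseline, window)
    return sum(x > y for x, y in zip(a, b)) / len(a)
```

Here `eff_treatment` and `eff_baseline` are per-period average efficiencies across participants (for example, a treatment group of Experiment 2 and Experiment 1). If the two series were exchangeable, a value near 0.5 would be expected, which matches the remark that an improvement of 0.5 is due to chance.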

The figure shows that the efficiency improvements of high Raven participants are larger than those of low Raven participants in all treatments except NovisLow (improvement of 0.5 is due to chance). To support this finding, we perform non-parametric tests on individual efficiency rates. The Kruskal-Wallis test of the nine groups (participants in Experiment 1 and low and high Raven participants in the four treatments of Experiment 2) shows a significant difference among them ( p  = 0.042). For low Raven participants we can reject the null hypothesis of equal distributions of efficiency rates only for the VisHigh treatment (rank-sum test: p  = 0.031). For high Raven participants we can reject the equal distributions hypothesis for all but the NovisLow treatment (rank-sum tests: p  = 0.063 in the VisLow treatment; p  = 0.018 in NovisHigh; p  = 0.004 in VisHigh). In support of Hypothesis 3, low Raven participants significantly increase their performance only when they know the Raven score of the high Raven demonstrator, while high Raven participants manage to increase their efficiency in all but the NovisLow treatment. This constitutes direct evidence that high Raven participants are able to extract useful information about the environment just by looking at the behaviour of the demonstrator, which confirms the importance of balancing social and individual information 42 .

Participants’ earnings are closely related to efficiency, so it is not surprising that we observe similar results. Low Raven participants show significantly higher earnings, as compared to the earnings of the participants in Experiment 1, only when they know that they observe a high Raven demonstrator (rank-sum test: p  = 0.044). High Raven participants increase their earnings in the NovisHigh and VisHigh treatments (rank-sum tests: p  = 0.056 in NovisHigh; p  = 0.002 in VisHigh). This shows that high Raven participants earn more money whenever they observe a high Raven other (with visible Raven score or not) and low Raven participants do so only when they know that they are observing a high Raven other.

Finally, we analyse where the difference in efficiency improvements between low and high Raven participants comes from. We relate it to two characteristics of participants: their Raven score and how often they switch between actions. The latter is a measure of how confident participants are about the expected rewards from the two actions. When expected probabilities of rewards are very different, participants are sure about which action is better. This leads to high confidence and low number of switches between actions. When expected rewards from the two actions are very similar, participants are uncertain about which action should be chosen. Thus, their level of confidence is low and they might switch a lot between actions. The number of switches is also a noticeable feature of the behavior of the demonstrator, which can be taken as a proxy for her level of confidence and, thus, can be utilised in the decision to imitate (see the next section). Table  5 in Appendix  B.5 reports the regressions that connect one’s own Raven score, number of switches, and efficiency/earnings. The regressions show that high Raven participants switch less than low Raven participants and also earn more money, and that a high number of switches decreases efficiency and earnings. Therefore, we find an observable behavioural property—number of switches—that is correlated with the Raven score, determines how efficient the asocial learning strategy is, and can potentially signal the confidence level of the demonstrator.
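The number of switches used here as a proxy for confidence is simply the count of consecutive trials in which the chosen action differs. A minimal sketch, with an illustrative function name:

```python
def count_switches(choices):
    """Number of times the chosen action differs from the previous one."""
    return sum(prev != cur for prev, cur in zip(choices, choices[1:]))


steady = count_switches("AAAABBBB")    # one switch: a confident chooser
erratic = count_switches("ABABABAB")   # seven switches: an uncertain chooser
```

The same function applies to the demonstrator's choice sequence, which is how an observer could read her confidence from her actions alone.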

Strategic Use of Social Information

To support the hypothesis that high Raven participants are more strategic than low Raven ones in their reliance on social information (Hypothesis 3), we analyse imitation choices period by period and test if participants are able to infer the connection between the number of switches of the demonstrator and her efficiency. We estimate a panel logit regression reported in Table  1 (see Table  7 in Appendix  B.6 for the linear probability model) where the dependent variable is an indicator of whether a participant has chosen the same action as the demonstrator or not.

We find that the imitation choices of both high and low Raven participants depend on the value of imitation, the variable that tracks how successful imitation was in the recent past: the higher its value, the more participants imitate the other. This finding is not surprising since such behaviour is the simplest and the most natural way of modulating imitation choices. The regression suggests, however, that low Raven participants do not seem to be able to make more complex inferences about the observed choices: they do not use the number of switches of the demonstrator as a signal of her efficiency and earnings (see Appendix  B.5 ). Conversely, high Raven participants do increase their imitation when they observe that the demonstrator does not switch too often. Thus, in accordance with Hypothesis 3, we conclude that high Raven participants are more strategic than low Raven participants in weighing up social against individual information. In particular, they are able to interpret the number of switches of the demonstrator as a signal of her confidence about the choice and to use it to modulate their imitation. This fits well with the previous findings that the use of social information increases with the confidence of the demonstrator 13 .
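To make the two regressors concrete, the sketch below shows one possible way to track a value of imitation and to map it, together with the demonstrator's recent switches, into an imitation probability through a logistic function. This is not the specification estimated in Table 1: the delta-rule update, the learning rate `alpha`, and the coefficients `b0`, `b_value` and `b_switch` are all hypothetical and serve only to illustrate the direction of the effects described above.

```python
import math


def update_value_of_imitation(value, imitated, rewarded, alpha=0.2):
    """ASSUMED delta-rule update: after a trial in which the participant
    imitated, move the value towards 1 if rewarded and towards 0 otherwise.
    Trials without imitation leave the value unchanged."""
    if not imitated:
        return value
    outcome = 1.0 if rewarded else 0.0
    return value + alpha * (outcome - value)


def imitation_probability(value, demo_switches,
                          b0=0.0, b_value=1.0, b_switch=-0.1):
    """Logistic choice rule with HYPOTHETICAL coefficients: imitation rises
    with the value of imitation and falls with the demonstrator's switches."""
    z = b0 + b_value * value + b_switch * demo_switches
    return 1.0 / (1.0 + math.exp(-z))


v = update_value_of_imitation(0.5, imitated=True, rewarded=True)  # 0.6
p_calm = imitation_probability(v, demo_switches=2)
p_erratic = imitation_probability(v, demo_switches=8)
```

In this illustration a participant who is insensitive to the demonstrator's switches corresponds to `b_switch = 0`, which is the pattern the regression suggests for low Raven participants.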

Experiment 3

All previous analysis was based on the data from Experiment 2, where participants knew their own Raven score. It is not inconceivable though that the effects on imitation reported above are caused simply by this information and not by the fluid intelligence. In order to show that this is not the case, we ran Experiment 3 that is the same as Experiment 2 in all respects except that participants were not informed about their own Raven score. We find two differences between the experiments. First, in Experiment 3 low Raven participants imitate the high Raven demonstrator significantly more than the low Raven one when her Raven score is not visible (NovisHigh and NovisLow treatments). This can be explained by the higher amount of effort that low Raven participants put into learning from the actions of the demonstrator when they do not know that their Raven score is low (see Appendix  B.6 for the detailed comparison of all analyses). It should be mentioned, though, that the imitation rate of low Raven participants in the NovisHigh treatment is not significantly different in the two experiments. Moreover, Table  6 in Appendix  B.6 shows that low Raven participants in Experiment 3, as well as in Experiment 2, do not react to the number of switches of the other. This confirms that high and low Raven participants use different modes of weighing up social and asocial information in both experiments. The second difference we find is that in Experiment 3 participants’ imitation choices are noisier. This can be related to variation in beliefs that participants have about their own ability to perform in the task. Such variability can lead to increased noise in the decisions to imitate as it is less clear to participants how their own competence relates to that of the demonstrator. In fact, this result supports our claim that the information about participants’ own Raven score as well as demonstrator’s is associated with their confidence. 
The uncertainty about one’s own Raven score in Experiment 3 does not change the results, and the support for our hypotheses is not undermined.

Many studies have attempted to identify factors that can explain the diversity of social learning strategies both within and between groups. It was found that many demographic characteristics, such as income, ethnicity, and gender, cannot explain this variability 32 , 51 , 52 , 53 , 54 . We show that intelligence plays a role in the way people choose to weigh up social and individual information in their decisions. In particular, high fluid intelligence individuals are able to recognize whether the demonstrator is getting high or low payoffs from just observing her actions and without knowing her intelligence. As a result, high fluid intelligence individuals imitate the demonstrator when they deem it worthwhile. Conversely, low fluid intelligence individuals are unable to extract this information from observations and resort to unconditional imitation when they know that the demonstrator has high fluid intelligence. This is in line with the hypothesis that people turn to social learning when they are uncertain about their own ability or, in other words, their level of confidence is low 55 , 56 . These two modes of processing social information are also related to the findings in studies that explore the degree of reliance on social learning when individual learning is costly or difficult 1 , 57 , 58 , 59 , 60 .

One possible reason why we do find different social learning strategies used by low and high fluid intelligence participants is that in our experiment we did not give explicit information about how hard the task is. When the difficulty of the task is provided by the experimenter 13 , it is plausible to expect that all participants react to it in a similar way. In our case, participants have to learn themselves how difficult the task is for them. Therefore, the choice of how much to rely on social and asocial information depends on participants’ confidence about how to choose in the task and their ability to recognise the competence of the demonstrator. Our findings suggest that these two features are determined by fluid intelligence.

Our results can be interpreted in the light of theories that integrate social and individual learning 1 , 61 , 62 , 63 , 64 . In particular, we find that, at the beginning of the learning task, high fluid intelligence individuals rely on the information about the intelligence of the demonstrator by increasing their imitation (see Fig.  2C ). In the rest of the task their imitation rate stops being dependent on this knowledge and is modulated only by the characteristics of the observed behaviour. This is in line with theories that suggest that in the absence of experience, behaviour is driven by social learning, and later the reliance on social learning decreases as individual information accumulates 1 , 62 . This does not apply to low fluid intelligence individuals who rely on social learning throughout the task when they know that the demonstrator has high intelligence. Thus, we cannot exclude the possibility that low fluid intelligence individuals are unable to integrate the two types of learning 65 , 66 .

The study consisted of three experiments: Experiment 1 in which participants made choices in a 2-armed bandit problem without observing others’ actions, and Experiments 2 and 3 in which participants made choices in the same environment with the only difference that in half the trials they observed the choices made by one of the two participants selected from the first experiment. The purpose of Experiment 1 was to select two participants, one with a high and one with a low Raven score, in order to use them as demonstrators in Experiments 2 and 3. Moreover, we used the data from Experiment 1 to evaluate the improvement in efficiency of the participants in Experiments 2 and 3.

The two demonstrators were chosen using the following procedure. First, we divided participants into deciles of Raven score. Then we calculated the median number of switches between actions for participants in the first and tenth deciles. We chose two participants (one in the first and one in the tenth decile) who were the closest to the median of their decile. The aim of this procedure was to select two participants who would show prototypical behaviour in terms of the number of switches at the two extremes of the Raven dimension (the numbers of switches of the low and high Raven participants are 49 and 20, respectively). We decided to use the number of switches for two reasons: (1) it is an index related to the earnings of the participants and (2) using simulations of optimal behaviour in a stationary 2-armed bandit problem we found that the number of switches is an important parameter that can be interpreted as a signal of confidence 44 : sophisticated learners interpret a relatively high number of switches as a signal of bad payoffs and learn to decrease imitation when the number of switches increases.
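The selection procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`id`, `raven`, `switches`), the function names, the way deciles are cut (equal-sized slices of the ranked list) and the handling of ties are all hypothetical and are not taken from the authors' materials.

```python
import statistics


def count_switches(choices):
    """Number of trials on which the chosen action differs from the previous one."""
    return sum(1 for a, b in zip(choices, choices[1:]) if a != b)


def pick_prototype(group):
    """Participant whose switch count is closest to the group median."""
    med = statistics.median(p["switches"] for p in group)
    return min(group, key=lambda p: abs(p["switches"] - med))


def select_demonstrators(participants):
    """Return (low Raven demonstrator, high Raven demonstrator).

    participants: list of dicts with keys 'id', 'raven', 'switches'.
    Assumption: a decile is one tenth of the ranked list; ties in Raven
    score at the decile boundary are not treated specially.
    """
    ranked = sorted(participants, key=lambda p: p["raven"])
    size = max(1, len(ranked) // 10)
    first_decile, tenth_decile = ranked[:size], ranked[-size:]
    return pick_prototype(first_decile), pick_prototype(tenth_decile)
```

With real data, `switches` would be computed by applying `count_switches` to each participant's sequence of choices in the bandit task.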

Participants in Experiments 2 and 3 were divided into four treatments with a 2 × 2 design. The dimensions were: (1) the Raven score of the observed other (High or Low) and (2) the information participants received about the Raven score of the observed other (Visible or Non-visible). The Raven score of the observed other could be high (28 correct matrices) or low (15 correct matrices). The treatments are, thus, called VisHigh, VisLow, NovisHigh, and NovisLow. Only participants in the VisHigh and VisLow treatments knew the Raven score of the demonstrator. Participants in the NovisHigh and NovisLow treatments were matched with the corresponding demonstrator without knowing his/her score on the Raven test.

Experiment 2 and Experiment 3 differed in only one respect: in Experiment 2 participants were informed about their own Raven score before the learning task and in Experiment 3 they were not (though, they did take the Raven test before the learning task). Apart from this difference, the experiments were identical.

For the main experiments (Experiments 2 and 3), nine NovisHigh and NovisLow, and ten VisHigh and VisLow sessions were conducted. In each session half the participants observed the high Raven other and the other half observed the low Raven other. All participants were recruited from the subject pool of the Cognitive and Experimental Economics Laboratory at the University of Trento (CEEL). The dates of the sessions and the number of participants per session are reported in Table  9 , Appendix  G . Summary statistics are provided in Appendix  B.1 . Non-parametric tests ensure that participants in all experiments come from the same population: no significant differences were found.

On average participants earned about € 20.06, in addition to the € 3 show-up fee. The presentation of the 2-armed bandit task was performed using a custom-made program implemented in Matlab Psychophysical toolbox. The tests and questionnaires were administered with the z-Tree software package 67 . A detailed timeline of the experiment and all instructions are reported in Appendix C .

Experiment 1

51 participants took part in Experiment 1. In the first part of the experiment participants made choices in a 2-armed bandit problem. In the second part they completed a 20-minute version of the Raven Advanced Progressive Matrices test 68 , the Holt & Laury Risk Aversion test, the Cognitive Reflection Test, and the Empathy Quotient questionnaire 69 , which was added to the study in order to assess whether empathic abilities affect the way participants imitate others. The time-constrained version of the Raven APM test has been shown to be an adequate predictor of the unconstrained Raven APM score 68 .

After entering the lab, participants were randomly assigned to a PC terminal and were given a copy of the instructions (see Appendix C ). Instructions were read aloud by the experimenter, and then a set of control questions was provided to ensure the understanding of the 2-armed bandit problem.

The probabilities of getting a 10-cent reward from each of the two hands followed independent stochastic processes (see Fig. 1B ). Each process is a decaying Gaussian random walk with decay parameter λ = 0.8, decay centre θ = 0.5, and Gaussian noise with standard deviation 0.2 70 .
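One way to picture such a process is the sketch below. It assumes the update p(t+1) = λ·p(t) + (1 − λ)·θ + ε, with ε drawn from a zero-mean normal distribution with standard deviation 0.2. The text gives only the parameter values, so the exact update rule, the starting value of 0.5, the clipping of probabilities to [0, 1], and the function name are illustrative assumptions rather than the authors' implementation.

```python
import random


def simulate_reward_probabilities(n_trials=200, lam=0.8, theta=0.5,
                                  sigma=0.2, p0=0.5, seed=None):
    """Simulate one arm's reward probability over n_trials trials."""
    rng = random.Random(seed)
    p = p0  # assumed starting value
    path = []
    for _ in range(n_trials):
        path.append(p)
        # Decay toward the centre theta, then add Gaussian noise (assumed form).
        p = lam * p + (1 - lam) * theta + rng.gauss(0.0, sigma)
        # Keep the value a valid probability (assumption: simple clipping).
        p = min(1.0, max(0.0, p))
    return path


# Two independent arms, as in the 2-armed bandit task.
left_arm = simulate_reward_probabilities(seed=1)
right_arm = simulate_reward_probabilities(seed=2)
```

Setting `sigma=0` makes the decay visible in isolation: a probability that starts at 1.0 moves to 0.9 on the next trial and keeps approaching 0.5.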

Participants were not told how the probabilities changed, but it was made clear that they would change slowly and independently of participants’ choices, earnings, and each other. The 2-armed bandit task included 200 trials divided into four blocks of approximately 50 trials each. Participants were not informed about their total earnings from the choice task until after they completed the second part of the experiment, though in principle they could have calculated them by observing the outcomes after each trial. In the second part of the experiment, participants were given the 20-minute version of the Raven Advanced Progressive Matrices test. They were told that they had 20 minutes to solve as many problems as they could and that they would earn 30 cents for each correct answer. An item that was left incomplete or answered incorrectly earned 0 cents. After the Raven test participants completed the Holt and Laury lottery task (with real incentives, see Appendix E ), the CRT test, and the EQ questionnaire (Appendices C and F ). There was no time limit for these three tasks, and no payment was provided for the CRT test or the EQ questionnaire. At the end of the Holt & Laury task a single lottery was selected at random and played by the computer to determine payment.

At the end of the second phase, participants were paid according to their choices in the 2-armed bandit problem, their performance in the Raven problems, the outcome of the selected lottery, and a show-up fee of € 3.
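The payment rule for Experiment 1 amounts to simple arithmetic, sketched here in integer cents to avoid rounding issues. The per-unit amounts (10 cents per reward, 30 cents per correct Raven item, a € 3 show-up fee) come from the text; the function name and the example numbers are hypothetical.

```python
def total_payment_cents(n_rewards, n_raven_correct, lottery_cents,
                        show_up_fee_cents=300):
    """Total payment in euro cents for one participant in Experiment 1."""
    bandit_cents = 10 * n_rewards        # 10 cents per rewarded trial
    raven_cents = 30 * n_raven_correct   # 30 cents per correct Raven item
    return bandit_cents + raven_cents + lottery_cents + show_up_fee_cents


# Hypothetical participant: 110 rewarded trials, 20 correct Raven items,
# and a lottery outcome of 2 euros.
example = total_payment_cents(110, 20, 200)
```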

Experiments 2 and 3

In Experiment 2, 160 participants first completed the 20-minute version of the Raven APM test, the Holt & Laury Risk Aversion test, the Cognitive Reflection Test, and the Empathy Quotient questionnaire, and then played the 2-armed bandit task. Before the learning task they were informed about their own Raven score. The only difference from Experiment 1 was that, in some trials and before making their own choices, participants in the 2-armed bandit problem also observed the choices (but not the outcomes) made by one of the two selected participants from Experiment 1. The choices of the demonstrator were provided in half the trials (in 100 out of the 200 trials) between trial 10 and trial 200 in blocks of randomized length of 6 to 12 consecutive trials. It was made clear to participants that the observed behaviour was from a real person who took part in the experiment approximately one month before and that he/she chose in exactly the same environment (the probabilities of reward in each period were identical). Participants knew that the observed other had completed all parts of the experiment, including the Raven test that they themselves completed at the beginning of the experiment. They were also informed that the demonstrator did not herself observe anyone while choosing in the 2-armed bandit task.
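A schedule with these properties could be generated as in the following sketch. The constraints (100 observed trials, none before trial 10, blocks of 6 to 12 consecutive trials) are from the text. Everything else is an assumption made for illustration: uniform sampling of block lengths, at least one unobserved trial between blocks, random placement of the remaining unobserved trials, and all function names.

```python
import random


def draw_block_lengths(total=100, lo=6, hi=12, rng=random):
    """Draw block lengths in [lo, hi] that sum exactly to `total`."""
    lengths, remaining = [], total
    while remaining > 0:
        # A length is allowed if it finishes the schedule or leaves
        # enough trials for at least one more block.
        options = [n for n in range(lo, hi + 1)
                   if n == remaining or remaining - n >= lo]
        n = rng.choice(options)
        lengths.append(n)
        remaining -= n
    return lengths


def observation_schedule(n_trials=200, first=10, total=100, seed=None):
    """Return the sorted list of trial numbers on which the demonstrator is shown."""
    rng = random.Random(seed)
    lengths = draw_block_lengths(total, rng=rng)
    k = len(lengths)
    window = n_trials - first + 1
    # Unobserved trials left after reserving one gap trial between blocks.
    spare = window - total - (k - 1)
    gaps = [0] * (k + 1)  # extra gap before each block and after the last
    for _ in range(spare):
        gaps[rng.randrange(k + 1)] += 1
    observed, t = [], first
    for i, n in enumerate(lengths):
        t += gaps[i] + (1 if i > 0 else 0)
        observed.extend(range(t, t + n))
        t += n
    return observed
```

Calling `observation_schedule(seed=0)` returns 100 trial numbers between 10 and 200; different seeds give different block layouts under the same constraints.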

Experiment 3 had 142 participants and was the same as Experiment 2, except that participants were not informed about their Raven score before the learning task. Also, participants in Experiment 3 did not complete the Holt & Laury Risk Aversion test, the Cognitive Reflection Test, or the Empathy Quotient questionnaire.

In Experiments 2 and 3 participants were shown (and explained) a histogram of the number of the Raven APM problems solved by the 51 participants from Experiment 1 (see Fig.  9 version A in Appendix  C ). In this way, in Experiment 2 they had the possibility to compare their score in the Raven test, which they knew before starting the 2-armed bandit task, with that of the group from which the demonstrator was chosen. No information about a possible connection between performance in the Raven test and the learning task was provided.

The instructions were identical for all participants, except for the information that was given about the score obtained by the observed other. In the NovisHigh and the NovisLow treatments the score obtained by the observed other remained unknown (only the distribution of all Raven scores was known). Conversely, in the VisHigh and the VisLow treatments the score of the observed other was marked in red on the histogram and shown on the screen during the experiment (see Fig. 9 versions B and C in Appendix C ).

Data availability

The data are available upon request.

Ethics committee

The study was approved by the Human Research Ethics Committee of the University of Trento ( http://www.unitn.it/en/ateneo/1755/human-research-ethics-committee ).

Informed consent

All participants gave informed consent to take part in the experiment.

Guidelines and Regulations

All experiments were carried out in accordance with relevant guidelines and regulations of the Human Research Ethics Committee of the University of Trento ( http://www.unitn.it/en/ateneo/1755/human-research-ethics-committee ).

Experimental protocol

The experimental protocol was approved by the Human Research Ethics Committee of the University of Trento ( http://www.unitn.it/en/ateneo/1755/human-research-ethics-committee ).

Images Used in the Experimental Design

All images used in the experimental design (Fig. 1 and all figures in Appendix C ) were drawn by the authors and are not subject to any copyright. In particular, the drawing of a person was made by the authors and the picture of a coin was obtained by scanning a real 10-cent coin.

Boyd, R. & Richerson, P. J. Culture and the evolutionary process (University of Chicago press, 1985).

Rogers, A. R. Does biology constrain culture? American Anthropologist 90 , 819–831 (1988).

Laland, K. N. Social learning strategies. Learning and Behavior 32 , 4–14 (2004).

Galef, J. B. G. & Laland, K. N. Social learning in animals: empirical studies and theoretical models. Bioscience 55 , 489–499 (2005).

Valone, T. J. From eavesdropping on performance to copying the behavior of others: a review of public information use. Behavioral Ecology and Sociobiology 62 , 1–14 (2007).

Rendell, L. et al . Why copy others? Insights from the social learning strategies tournament. Science 328 , 208–213 (2010).

Rendell, L. et al . Cognitive culture: theoretical and empirical insights into social learning strategies. Trends in Cognitive Sciences 15 , 68–76 (2011).

Hoppitt, W. & Laland, K. N. Social Learning: An Introduction to Mechanisms, Methods, and Models (Princeton University Press, 2013).

Molleman, L., van den Berg, P. & Weissing, F. J. Consistent individual differences in human social learning strategies. Nature Communications 5 , 3570 (2014).

Eriksson, K., Enquist, M. & Ghirlanda, S. Critical points in current theory of conformist social learning. Journal of Evolutionary Psychology 5 , 67–87 (2007).

Wakano, J. Y. & Aoki, K. Do social learning and conformist bias coevolve? Henrich and Boyd revisited. Theoretical Population Biology 72 , 504–512 (2007).

Efferson, C., Lalive, R., Richerson, P. J., McElreath, R. & Lubell, M. Conformists and mavericks: the empirics of frequency-dependent cultural transmission. Evolution and Human Behavior 29 , 56–64 (2008).

Morgan, T. J. H., Rendell, L. E., Ehn, M., Hoppitt, W. & Laland, K. N. The evolutionary basis of human social learning. Proceedings of the Royal Society of London B: Biological Sciences 279 , 653–662 (2012).

Schlag, K. H. Why imitate, and if so, how?: A boundedly rational approach to multi-armed bandits. Journal of Economic Theory 78 , 130–156 (1998).

Kendal, J., Giraldeau, L.-A. & Laland, K. The evolution of social learning rules: payoff-biased and frequency-dependent biased transmission. Journal of Theoretical Biology 260 , 210–219 (2009).

Schlag, K. H. Which one should I imitate? Journal of Mathematical Economics 31 , 493–522 (1999).

Henrich, J. & Gil-White, F. J. The evolution of prestige: Freely conferred deference as a mechanism for enhancing the benefits of cultural transmission. Evolution and Human Behavior 22 , 165–196 (2001).

Lehmann, L., Feldman, M. W. & Foster, K. R. Cultural transmission can inhibit the evolution of altruistic helping. The American Naturalist 172 , 12–24 (2008).

Lehmann, L. & Feldman, M. W. The co-evolution of culturally inherited altruistic helping and cultural transmission under random group formation. Theoretical Population Biology 73 , 506–516 (2008).

Kendal, R. L., Coolen, I., van Bergen, Y. & Laland, K. N. Trade-offs in the adaptive use of social and asocial learning. Advances in the Study of Behavior 35 , 333–379 (2005).

Heyes, C. What’s social about social learning? Journal of Comparative Psychology 126 , 193–202 (2012).

Sutton, R. S. & Barto, A. G. Reinforcement Learning: An Introduction (Cambridge, Mass.: MIT Press, 1998).

Seymour, B., Singer, T. & Dolan, R. The neurobiology of punishment. Nature reviews. Neuroscience 8 , 300 (2007).

Burke, C. J., Tobler, P. N., Baddeley, M. & Schultz, W. Neural mechanisms of observational learning. Proceedings of the National Academy of Sciences of the United States of America 107 , 14431–14436 (2010).

Dunne, S. & O’Doherty, J. P. Insights from the application of computational neuroimaging to social neuroscience. Current opinion in neurobiology 23 , 387–392 (2013).

Lindström, B. & Olsson, A. Mechanisms of social avoidance learning can explain the emergence of adaptive and arbitrary behavioral traditions in humans. Journal of Experimental Psychology: General 144 , 688–703 (2015).

Lindström, B., Selbing, I. & Olsson, A. Co-evolution of social learning and evolutionary preparedness in dangerous environments. PLoS One 11 , e0160245 (2016).

Biele, G., Rieskamp, J. & Gonzalez, R. Computational models for the combination of advice and individual learning. Cognitive science 33 , 206–242 (2009).

Liljeholm, M., Molloy, C. J. & O’Doherty, J. P. Dissociable brain systems mediate vicarious learning of stimulus—response and action—outcome contingencies. Journal of Neuroscience 32 , 9878–9886 (2012).

Suzuki, S. et al . Learning to simulate others’ decisions. Neuron 74 , 1125–1137 (2012).

McElreath, R. et al . Applying evolutionary models to the laboratory study of social learning. Evolution and Human Behavior 26 , 483–508 (2005).

Muthukrishna, M., Morgan, T. J. & Henrich, J. The when and who of social learning and conformist transmission. Evolution and Human Behavior 37 , 10–20 (2016).

Merlo, A. & Schotter, A. Learning by not doing: An experimental investigation of observational learning. Games and Economic Behavior 42 , 116–136 (2003).

Selbing, I., Lindström, B. & Olsson, A. Demonstrator skill modulates observational aversive learning. Cognition 133 , 128–139 (2014).

Hanaki, N., Jacquemet, N., Luchini, S. & Zylbersztejn, A. Cognitive ability and the effect of strategic uncertainty. Theory and Decision 81 , 101–121 (2015).

Fehr, D. & Huck, S. Who knows it is a game? On strategic awareness and cognitive ability. Experimental Economics 19 , 713–726 (2016).

Benito-Ostolaza, J. M., Hernández, P. & Sanchis-Llopis, J. A. Do individuals with higher cognitive ability play more strategically? Journal of Behavioral and Experimental Economics 64 , 5–11 (2016).

Gill, D. & Prowse, V. Cognitive ability, character skills, and learning to play equilibrium: A level-k analysis. Journal of Political Economy 124 , 1619–1676 (2016).

Kiss, H. J., Rodriguez-Lara, I. & Rosa-García, A. Think twice before running! Bank runs and cognitive abilities. Journal of Behavioral and Experimental Economics 64 , 12–19 (2016).

Proto, E., Rustichini, A. & Sofianos, A. Intelligence, personality and gains from cooperation in repeated interactions. Journal of Political Economy forthcoming (2018).

Chang, L. et al . Cultural adaptations to environmental variability: An evolutionary account of east–west differences. Educational Psychology Review 23 , 99–129 (2011).

Mesoudi, A., Chang, L., Dall, S. R. & Thornton, A. The evolution of individual and cultural variation in social learning. Trends in ecology and evolution 31 , 215–225 (2016).

Raven, J., Raven, J. C. & Court, J. H. The advanced progressive matrices. In Manual for Raven’s Progressive Matrices and Vocabulary Scales (Oxford, England: Oxford Psychologists Press/San Antonio, TX: The Psychological Corporation, 1998).

Ihssen, N., Mussweiler, T. & Linden, D. E. Observing others stay or switch how social prediction errors are integrated into reward reversal learning. Cognition 153 , 19–32 (2016).

Sniezek, J. A., Schrah, G. E. & Dalal, R. S. Improving judgement with prepaid expert advice. Journal of Behavioral Decision Making 17 , 173–190 (2004).

Apesteguia, J., Huck, S. & Oechssler, J. Imitation—theory and experimental evidence. Journal of Economic Theory 136 , 217–235 (2007).

Mesoudi, A. An experimental simulation of the copy-successful-individuals cultural learning strategy: adaptive landscapes, producer–scrounger dynamics, and informational access costs. Evolution and Human Behavior 29 , 350–363 (2008).

Henrich, J. & Broesch, J. On the nature of cultural transmission networks: evidence from Fijian villages for adaptive learning biases. Philosophical Transactions of the Royal Society of London B: Biological Sciences 366 , 1139–1148 (2011).

Mesoudi, A., Chang, L., Murray, K. & Lu, H. J. Higher frequency of social learning in China than in the West shows cultural variation in the dynamics of cultural evolution. Proceedings of the Royal Society of London B: Biological Sciences 282 , 20142209 (2015).

Acerbi, A., Tennie, C. & Mesoudi, A. Social learning solves the problem of narrow-peaked search landscapes: experimental evidence in humans. Royal Society Open Science 3 , 160215 (2016).

Laird, J. Chap. Theorizing culture: Narrative ideas and practice principles. Re-visioning Family Therapy (pp. 20–36. Guilford, New York, 1998).

Coon, H. M. & Kemmelmeier, M. Cultural orientations in the United States: (re) examining differences among ethnic groups. Journal of Cross-Cultural Psychology 32 , 348–364 (2001).

Sonn, C. Chap. Immigrant adaptation: Understanding the process through sense of community. Psychological Sense of Community: Research, Applications & Implications (pp. 205–222. Kluwer Academic Plenum Publishers, New York, 2002).

Eriksson, K. & Coultas, J. Are people really conformist-biased? an empirical test and a new mathematical model. Journal of Evolutionary Psychology 7 , 5–21 (2009).

Festinger, L. A theory of social comparison processes. Human relations 7 , 117–140 (1954).

Mesoudi, A. How cultural evolutionary theory can inform social psychology and vice versa. Psychological Review 116 , 929–952 (2009).

Bandura, A. Social learning theory (Oxford, England: Prentice-Hall, 1977).

Boyd, R. & Richerson, P. J. An evolutionary model of social learning: the effects of spatial and temporal variation. Social learning: psychological and biological perspectives 29–48 (1988).

Boyd, R. & Richerson, P. J. Why culture is common, but cultural evolution is rare. In Proceedings - British Academy , vol. 88, 77–94 (1996).

Henrich, J. & Boyd, R. The evolution of conformist transmission and the emergence of between-group differences. Evolution and Human Behavior 19 , 215–241 (1998).

Enquist, M., Eriksson, K. & Ghirlanda, S. Critical social learning: a solution to Rogers’s paradox of nonadaptive culture. American Anthropologist 109 , 727–734 (2007).

Borenstein, E., Feldman, M. W. & Aoki, K. Evolution of learning in fluctuating environments: when selection favors both social and exploratory individual learning. Evolution 62 , 586–602 (2008).

Lehmann, L. & Feldman, M. W. Coevolution of adaptive technology, maladaptive culture and population size in a producer–scrounger game. Proceedings of the Royal Society of London B: Biological Sciences 276 , 3853–3862 (2009).

Aoki, K. Evolution of the social-learner-explorer strategy in an environmentally heterogeneous two-island model. Evolution 64 , 2575–2586 (2010).

Tomasello, M. & Call, J. Primate cognition (Oxford University Press, USA, 1997).

Aoki, K. & Nakahashi, W. Evolution of learning in subdivided populations that occupy environmentally heterogeneous sites. Theoretical population biology 74 , 356–368 (2008).

Fischbacher, U. z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics 10 , 171–178 (2007).

Hamel, R. & Schmittmann, V. D. The 20-minute version as a predictor of the Raven Advanced Progressive Matrices test. Educational and Psychological Measurement 66 , 1039–1046 (2006).

Baron-Cohen, S. & Wheelwright, S. The empathy quotient: An investigation of adults with Asperger syndrome or high functioning autism, and normal sex differences. Journal of Autism and Developmental Disorders 34 , 162–175 (2004).

Wunderlich, K., Rangel, A. & O’Doherty, J. P. Neural computations underlying action-based decision making in the human brain. Proceedings of the National Academy of Sciences of the United States of America 106 , 17199–17204 (2009).

Acknowledgements

The authors gratefully acknowledge the financial support of the European Research Council (ERC Consolidation Grant 617629).

Author information

Authors and affiliations.

Center for Mind/Brain Sciences, University of Trento, Trento, Italy

Alexander Vostroknutov & Luca Polonio

Department of Economics, University of Southern California, California, USA

Giorgio Coricelli

Contributions

All authors (A.V., L.P. and G.C.) were equally involved in the data collection, statistical analysis, and the preparation of the manuscript.

Corresponding author

Correspondence to Alexander Vostroknutov .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Vostroknutov, A., Polonio, L. & Coricelli, G. The Role of Intelligence in Social Learning. Sci Rep 8 , 6896 (2018). https://doi.org/10.1038/s41598-018-25289-9

Received : 17 November 2017

Accepted : 17 April 2018

Published : 02 May 2018

DOI : https://doi.org/10.1038/s41598-018-25289-9

Title: SOTOPIA-$\pi$: Interactive Learning of Socially Intelligent Language Agents

Abstract: Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied by existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$\pi$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered social interaction data according to large language model (LLM) ratings. We show that our training method allows a 7B LLM to reach the social goal completion ability of an expert model (GPT-4-based agent), while improving the safety of language agents and maintaining general QA ability on the MMLU benchmark. We also find that this training paradigm uncovers some difficulties in LLM-based evaluation of social intelligence: LLM-based evaluators overestimate the abilities of the language agents trained specifically for social interaction.

Social Intelligence

  • Living reference work entry
  • First Online: 01 January 2016

  • Daniel A. Belton 2 ,
  • Ashley M. Ebbert 2 &
  • Frank J. Infurna 2  

Social intelligence ; Social relationships ; Emotional intelligence ; Perspective taking

Social relationships are an invaluable component of one’s life. The quality and structure of social relationships are consistently associated with better outcomes across the lifespan, ranging from academic achievement and substance use in adolescence to mental and physical health and longevity in adulthood through old age (Holt-Lunstad et al. 2015 ; Umberson and Montez 2010 ). More specifically, empirical evidence from research studies has repeatedly shown that reporting stronger social relationships and being more integrated or active in one’s social network is associated with feeling happier, better coping with daily and major life stressors, protecting individuals from the incidence of disease, and living a longer life (Cacioppo et al. 2015 ; House et al. 1988 ; Infurna and Luthar in press ). Given the importance of social relationships, understanding how to maintain and strengthen them is...

Albrecht K (2005) Social intelligence: the new science of success. Pfeiffer, New Jersey

Baumeister RF, Leary MR (1995) The need to belong: desire for interpersonal attachments as a fundamental human motivation. Psychol Bull 117(3):497–529

Berkman LF, Syme SL (1979) Social networks, host resistance, and mortality: a nine-year follow-up study of Alameda County residents. Am J Epidemiol 109(2):186–204

Bonanno RA, Hymel S (2010) Beyond hurt feelings: investigating why some victims of bullying are at greater risk for suicidal ideation. Merrill-Palmer Q 56(3):420–440

Cacioppo S, Grippo AJ, London S, Goossens L, Cacioppo JT (2015) Loneliness: clinical import and interventions. Perspect Psychol Sci 10(2):238–249

Dymond RF (1949) A scale for the measurement of empathic ability. J Couns Psychol 13:127–133

Dymond RF (1950) Personality and empathy. J Couns Psychol 14:343–350

Ertel KA, Glymour MM, Berkman LF (2009) Social networks and health: A life course perspective integrating observational and experimental evidence. J Soc Pers Relat 26(1):73–92

Everson-Rose SA, Lewis TT (2005) Psychosocial factors and cardiovascular disease. Annu Rev Public Health 26:469–500

Faust J, Baum CG, Forehand R (1985) An examination of the association between social relationships and depression in early adolescence. J Appl Dev Psychol 6(4):291–297

Goleman D, Boyatzis R (2008) Social intelligence and the biology of leadership. Harv Bus Rev 86(9):96–104

Guilford JP (1967) The nature of human intelligence. McGraw-Hill, New York

Hawkley LC, Cacioppo JT (2010) Loneliness matters: a theoretical and empirical review of consequences and mechanisms. Ann Behav Med 40(2):218–227

Holt-Lunstad J, Smith TB, Baker M, Harris T, Stephenson D (2015) Loneliness and social isolation as risk factors for mortality: a meta-analytic review. Perspect Psychol Sci 10(2):227–237

Honeywill R (2015) The man problem: destructive masculinity in Western culture. Palgrave Macmillan, New York

House JS, Landis KR, Umberson D (1988) Social relationships and health. Science 241(4865):540–545

Hunt T (1928) The measurement of social intelligence. J Appl Psychol 12(3):317–334

Infurna FJ, Luthar SS (in press) The multidimensional nature of resilience to spousal loss. J Personal Social Psychol

Jolliffe D, Farrington DP (2011) Is low empathy related to bullying after controlling for individual and social background variables? J Adolesc 34(1):59–71

Kosmitzki C, John OP (1993) The implicit use of explicit conceptions of social intelligence. Personal Individ Differ 15(1):11–23

La Greca AM, Lopez N (1998) Social anxiety among adolescents: linkage with peer relations and friendships. J Abnorm Child Psychol 26(2):83–94

Moss FA, Hunt T, Omwake KT, Woodward LG (1955) Manual for the George Washington University Series Social Intelligence Test. The Center for Psychological Service, Washington, DC

O’Sullivan M, Guilford JP, deMille R (1965) The measurement of social intelligence. Reports from the Psychological Laboratory, University of Southern California 34:1–44

O’Sullivan M, Guilford JP (1966) Six factor tests of social intelligence: manual of instructions and interpretations. Sheridan Psychological Services, Beverly Hills

Robles TF, Kiecolt-Glaser JK (2003) The physiology of marriage: pathways to health. Physiol Behav 79(3):409–416

Silvera DH, Martinussen M, Dahl TL (2001) The Tromsø Social Intelligence Scale, a self-report measure of social intelligence. Scand J Psychol 42(4):313–319

Snow NE (2010) Virtue as social intelligence: an empirically grounded theory. Taylor & Francis, New York

Thorndike EL (1920) Intelligence and its use. Harper’s Mag 140:227–235

Uchino BN (2006) Social support and health: a review of physiological processes potentially underlying links to disease outcomes. J Behav Med 29(4):377–387

Umberson D, Montez J (2010) Social relationship and health: a flashpoint for health policy. J Health Soc Behav 51(1):S54–S66

Walker RE, Foley JM (1973) Social intelligence: its history and measurement. Psychol Rep 33(3):839–864

Zaccaro SJ, Gilbert JA, Thor KK, Mumford MD (1991) Leadership and social intelligence: linking social perspective and behavioral flexibility to leader effectiveness. Leadersh Q 2(4):317–342

Zautra AJ, Infurna FJ, Zautra E, Gallardo CE, Velasco L (2016) The humanization of social relations: nourishment for resilience. In: Ong A, Löckenhoff CE (eds) Emotion, aging, and health. American Psychological Association, Washington, DC, pp 207–227


Zautra EK, Zautra AJ, Gallardo C, Velasco L (2015) Can we learn to treat one another better? A test of a social intelligence curriculum. PLoS One 10(6)

Zautra AJ, Zautra EK, Ribers C, Rivers D (2012) Foundations of social intelligence: a conceptual model with implications for business performance. Curr Top Manag 16:15–37


Author information

Authors and affiliations.

Arizona State University, Tempe, AZ, USA

Daniel A. Belton, Ashley M. Ebbert & Frank J. Infurna


Corresponding author

Correspondence to Frank J. Infurna .

Editor information

Editors and affiliations.

Florida Atlantic University, Boca Raton, Florida, USA

Ali Farazmand


Copyright information

© 2016 Springer International Publishing Switzerland

About this entry

Cite this entry.

Belton, D.A., Ebbert, A.M., Infurna, F.J. (2016). Social Intelligence. In: Farazmand, A. (eds) Global Encyclopedia of Public Administration, Public Policy, and Governance. Springer, Cham. https://doi.org/10.1007/978-3-319-31816-5_2393-1


DOI : https://doi.org/10.1007/978-3-319-31816-5_2393-1

Received : 28 March 2016

Accepted : 23 April 2016

Published : 30 May 2016

Publisher Name : Springer, Cham

Online ISBN : 978-3-319-31816-5

eBook Packages : Springer Reference Economics and Finance Reference Module Humanities and Social Sciences Reference Module Business, Economics and Social Sciences


  • Healthcare (Basel)


Emotional Intelligence Measures: A Systematic Review

Lluna María Bru-Luna

1 Department of Basic Psychology, Faculty of Psychology and Speech Therapy, Universitat de València, 46010 Valencia, Spain

Manuel Martí-Vilar

César Merino-Soto

2 Psychology Research Institute, Universidad de San Martín de Porres, Lima 15102, Peru

José L. Cervera-Santiago

3 Department of Psychology, Faculty of Psychology, Universidad Nacional Federico Villarreal, San Miguel 15088, Peru; jcervera@unfv.edu.pe

Associated Data

Not applicable.

Emotional intelligence (EI) refers to the ability to perceive, express, understand, and manage emotions. Current research indicates that it may protect against the emotional burden experienced in certain professions. This article aims to provide an updated systematic review of existing instruments to assess EI in professionals, focusing on the description of their characteristics as well as their psychometric properties (reliability and validity). A literature search was conducted in Web of Science (WoS). A total of 2761 items met the eligibility criteria, from which a total of 40 different instruments were extracted and analysed. Most were based on three main models (i.e., skill-based, trait-based, and mixed), which differ in the way they conceptualize and measure EI. All have been shown to have advantages and disadvantages inherent to the type of tool. The instruments reported in the largest number of studies are Emotional Quotient Inventory (EQ-i), Schutte Self Report-Inventory (SSRI), Mayer-Salovey-Caruso Emotional Intelligence Test 2.0 (MSCEIT 2.0), Trait Meta-Mood Scale (TMMS), Wong and Law’s Emotional Intelligence Scale (WLEIS), and Trait Emotional Intelligence Questionnaire (TEIQue). The main measure of the estimated reliability has been internal consistency, and the construction of EI measures was predominantly based on linear modelling or classical test theory. The study has limitations: only a single database was searched, inter-rater reliability could not be estimated, and some items required by PRISMA were not complied with.

1. Introduction

1.1. Emotional Intelligence

Emotional intelligence (EI) was first described and conceptualized by Salovey and Mayer [ 1 ] as an ability-based construct analogous to general intelligence. They argued that individuals with a high level of EI had certain skills related to the evaluation and regulation of emotions and that consequently they were able to regulate emotions in themselves and in others in order to achieve a variety of adaptive outcomes. This construct has received increasing attention from both the scientific community and the general public due to its theoretical and practical implications for daily life. The same authors defined EI as “the ability to carry out accurate reasoning about emotions and the ability to use emotions and emotional knowledge to enhance thought” [ 2 ] (p. 511). This definition suggests that EI is far from being conceptualized as a one-dimensional attribute and that a multidimensional operationalization would be theoretically coherent.

1.2. Conceptualizations of Emotional Intelligence

However, over the past three decades, different ways of conceptualizing EI have emerged, which are mainly summarized in three models: ability, trait, and mixed. These models have influenced the construction of measuring instruments. In the ability model, developed by Mayer and Salovey, EI is seen as a form of innate intelligence made up of several capacities that influence how people understand and manage their own emotions and those of others. These emotion processing skills are: (1) perception, evaluation and expression of emotions, (2) emotional facilitation of thought, (3) understanding and analysis of emotions, and (4) reflective regulation of emotions [ 3 , 4 ]. Consistent with this conceptualization, the measures were designed as performance tests. Subsequently, the model proposed by Petrides and Furnham [ 5 ], the trait model, was developed. This model defines EI as a trait; that is, as a persistent behaviour pattern over time (as opposed to skill, which increases with time and training), and it is associated with dispositional tendencies, personality traits or self-efficacy beliefs. It is composed of fifteen personality dimensions, grouped under four factors: well-being, self-control, emotionality and sociability [ 6 ]. The last of the three main models of conceptualization of EI is the mixed one. It is made up of two large branches that consider this construct a mixture of traits, competencies and abilities. According to the first one, developed by Bar-On [ 7 ], EI is a set of non-cognitive abilities and competences that influence the ability to be successful in coping with environmental demands and pressures, and it is composed of five key components: intrapersonal skills, interpersonal skills, adaptation skills, stress management skills and general mood. The second one, proposed by Goleman [ 8 ], also conceptualizes EI as a mixed model that shares certain aspects with the Bar-On model. 
Goleman’s model is made up of the following elements: recognition of one’s own emotions, management of emotions, self-motivation, recognition of emotions in others, and management of relationships. These emotional and social competencies would contribute to managerial performance and leadership.

1.3. Importance of Emotional Intelligence

The importance of EI has been recognized in the literature in many areas, such as the workplace. For example, burnout syndrome is common in professions that involve working with people. It is expressed by an increase in emotional exhaustion and indifference, as well as by a decrease in professional effectiveness [ 9 ]. Numerous studies have shown that EI can help change employee attitudes and behaviours in jobs involving emotional demands by increasing job satisfaction and reducing job stress [ 10 , 11 , 12 , 13 ]. Likewise, certain psychological variables, including EI and social competence, have been found to be related to less psychological distress. In addition, the acquisition of emotional and social skills can serve to develop resilience, which is a protective variable against psychological distress [ 14 ].

1.4. Types of Measures

Choosing a conceptual model of EI brings with it the challenge of choosing appropriate measures to estimate it. For this reason, part of the work developed in the field of EI has focused on the creation of objective instruments to evaluate aspects associated with this construct. Most of them have been created around the main conceptualization models described in the previous paragraphs. Ability-based tools indicate people’s ability to understand emotions and how they work. These types of tests require participants to solve problems that are related to emotions and that contain answers deemed correct or incorrect (e.g., participants see several faces and respond by indicating the degree to which a specific emotion is present in the face). These instruments are maximal capacity tests and, unlike trait tests, they are not designed to predict typical behaviour. Ability EI instruments are usually employed in situations where a good theoretical understanding of emotions is required [ 15 ].

Trait-based instruments are generally composed of self-reported measures and are often developed as scales where there are no correct or incorrect answers, but the individual responds by choosing the item which relates more or less to their behaviour (e.g., “Understanding the needs and desires of others is not a problem for me”). They tend to measure typical behaviour, so they tend to provide a good prediction of actual behaviours in various situations [ 5 ]. Trait EI is a good predictor of effective coping styles when facing everyday stressors, both in adults and children, so these instruments are often used in situations characterized by stressors such as educational and employment contexts [ 15 ].

Questionnaires based on the EI mixed conceptualization often measure a combination of traits, social skills, competencies, and personality measures through self-reported modality (e.g., “When I am angry with others, I can tell them”). Some measures typically take 360-degree forms of assessment too (i.e., a self-report along with reports from supervisors, colleagues and subordinates). They are generally used in work environments, since they are often designed to predict and improve workplace performance and are often focused on emotional competencies that correlate with professional success. Despite the different ways of conceptualizing EI, there are some conceptual similarities between most instruments: they are hierarchical (i.e., they produce a total EI score along with scores on the different dimensions) and they have several conceptual overlaps that often include emotional perception, emotional regulation, and adaptive use of emotions [ 15 ].

1.5. Relevance of the Study

The proliferation of EI measures has received a lot of attention. However, this has not been the case in studies that synthesize their psychometric qualities, as well as those that describe their strengths and limitations. Therefore, there is a lack of studies that collect, with a wide review coverage, the instruments developed in recent years. The few reviews that can be found [ 16 , 17 , 18 , 19 ] are limited to describing both the most popular measures (e.g., Mayer–Salovey–Caruso Emotional Intelligence Test [MSCEIT], Emotional Quotient Inventory [EQ-i], Trait Meta-Mood Scale [TMMS], Trait Emotional Intelligence Questionnaire [TEIQue], or Schutte Self-Report Inventory [SSRI]) and those validated only in English, producing an apparent “Tower of Babel” effect (i.e., the over-representation of studies in one language and the under-representation in others) [ 20 ]. This is a problem that is not only more common than is believed, but it is also persistent [ 21 ]. This effect produces a barrier for the complete knowledge of current EI measures, the breadth of their uses in different contexts, and their incorporation into substantive studies relevant to multicultural understanding. In summary, it reduces the commonality of efforts made in different contexts to identify common and communicable objectives [ 22 ], specifically around the study of EI.

Therefore, a systematic review allows us to establish a knowledge base that contributes by (a) guiding and developing research efforts, (b) assisting in professional practice when choosing the most appropriate model in possible practical scenarios, and (c) facilitating the design of subsequent systematic evaluative reviews and meta-analysis of relevant psychometric parameters (e.g., factorial loads, reliability coefficients, correlations, etc.). For this reason, the aim of this article is to provide an updated systematic review of the existing instruments that allow the evaluation of EI in professionals, focusing on the description of their characteristics, as well as on their psychometric properties (reliability and validity). This systematic review is characterized by having a wide coverage (i.e., studies published in languages other than English) and having as a framework a consensus of description and taxonomy of valid evidence (i.e., “Standards”) [ 23 ].

2. Materials and Methods

This work contains a systematic review of the scientific literature published to date that includes measurements of EI. For its preparation, the guidelines proposed in the PRISMA statement [ 24 ] ( Table A1 ) for carrying out systematic reviews have been followed. Regarding the evaluation of the quality of the articles, since our study does not analyse the studies that employ the EI instruments but the instruments themselves, the assessment of the internal or external validity of the studies is not applicable to this research. However, an internationally proposed guide to the study of the validity of instruments, called “Standards”, has also been used [ 23 ]. It presents guidelines for the study of the composition, use, and interpretation of what a test aims to measure and proposes five sources of validity evidence: content, response processes, internal structure, relationship with other variables and the consequences of testing. Likewise, a recently proposed registration protocol [ 25 ] for carrying out systematic reviews, based on the five validity sources of the “Standards”, has also been followed.

2.1. Information Sources

The bibliographic search was carried out in three phases: an initial search to obtain an overview of the current situation, a systematic search applying the inclusion–exclusion criteria, and a manual search to evaluate the results obtained. The search was conducted in February 2021 in the Web of Science (WoS) database, including all articles published from 1900 to 2020 (inclusive). This database was selected to perform the search because (a) it is among the databases that allow for a more efficient and adequate search coverage [ 26 ]; (b) it provides a better quality of indexing and of bibliographic records in terms of accuracy, control and granularity of information compared to other databases [ 27 ]; (c) the results are highly correlated with those of other search engines (e.g., Embase, MEDLINE and Google Scholar) [ 26 ]; (d) it is controlled by a human team specialising in the selection of its content (i.e., it is not fully automated) [ 28 ]; and (e) it has experienced a constant increase in scientific publications [ 29 ].

2.2. Eligibility Criteria

Although no protocol was written or registered prior to the research, the inclusion and exclusion criteria for articles and instruments were previously defined. The search was conducted according to these criteria.

2.2.1. Inclusion Criteria

The inclusion criteria for the studies are made up of the following points: (a) published in peer-reviewed journals, (b) presented as full articles or short communications, (c) containing empirical and quantifiable results on psychometric properties (i.e., not only narrative descriptions), (d) containing cross-sectional or longitudinal designs, (e) written in any language (in order to collect as many instruments as possible, as well as to reduce the “Tower of Babel” effect) [ 20 ], and (f) published from 1900 to 2020 (to maximize the identification of EI measures).

The inclusion criteria for the instruments were: (a) instruments that measure EI, (b) instruments described in the article reporting their original development, (c) instruments aimed at people over 18 years, (d) instruments that can be applied in the workplace.

2.2.2. Exclusion Criteria

On the other hand, research that presented at least one of the following exclusion criteria was discarded: (a) contains synthesis studies (i.e., systematic reviews or meta-analyses), instrument manuals or narrative articles of instrument characteristics, (b) contains only qualitative research designs, (c) published after 2020.

Instruments that presented at least one of the following exclusion criteria were discarded: (a) instruments that were validations of the original one, (b) instruments aimed at people under 18, (c) instruments to be used in areas specifically different from the workplace.
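The inclusion and exclusion criteria above can be read as a simple screening rule. The following Python sketch is only an illustration of that reading, not the procedure the authors ran: the `Record` fields and the category labels are hypothetical names introduced here, and "over 18 years" is assumed to mean a minimum target age of 18.

```python
from dataclasses import dataclass, replace


@dataclass
class Record:
    """One screened record; all field names are hypothetical."""
    peer_reviewed: bool
    doc_type: str                 # e.g. "article", "short_communication", "review"
    quantitative_psychometrics: bool
    design: str                   # e.g. "cross-sectional", "longitudinal", "qualitative"
    year: int
    measures_ei: bool
    original_development: bool    # False for validations of an existing instrument
    min_target_age: int
    workplace_applicable: bool


def is_eligible(r: Record) -> bool:
    """Return True only if both the study and the instrument pass screening."""
    study_ok = (
        r.peer_reviewed
        and r.doc_type in {"article", "short_communication"}
        and r.quantitative_psychometrics
        and r.design in {"cross-sectional", "longitudinal"}
        and 1900 <= r.year <= 2020
    )
    instrument_ok = (
        r.measures_ei
        and r.original_development
        and r.min_target_age >= 18   # assumption: adults, i.e. not under 18
        and r.workplace_applicable
    )
    return study_ok and instrument_ok


# Usage: a record that passes, and variants that fail one criterion each.
base = Record(True, "article", True, "cross-sectional", 2001, True, True, 18, True)
print(is_eligible(base))                                # True
print(is_eligible(replace(base, year=2021)))            # False: published after 2020
print(is_eligible(replace(base, doc_type="review")))    # False: synthesis study
```

Language is deliberately absent from the rule, since studies written in any language were eligible.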

2.3. Search Strategy

All available methods to obtain empirical answers have been included so as to maximize the coverage of the results. The following terms were included: test, measure, questionnaire, scale and instrument. The combinations of terms used were: “emotional intelligence AND test”, “emotional intelligence AND measure”, “emotional intelligence AND questionnaire”, “emotional intelligence AND scale”, and “emotional intelligence AND instrument”. Only article-type studies were selected.
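As a small illustration, the five term combinations listed above can be generated from the topic and the list of instrument terms. This is a sketch only; the function and variable names are hypothetical, and no database-specific field tags or document-type filters are assumed.

```python
TOPIC = "emotional intelligence"
INSTRUMENT_TERMS = ["test", "measure", "questionnaire", "scale", "instrument"]


def build_queries(topic=TOPIC, terms=INSTRUMENT_TERMS):
    """Combine the topic with each instrument term using a Boolean AND."""
    return [f"{topic} AND {term}" for term in terms]


for query in build_queries():
    print(query)
```

Keeping the terms in one list makes it straightforward to rerun the same search with an additional term or in another database.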

In the selection process, the title, abstract and keywords of the studies identified in the search were reviewed with the aforementioned criteria. This was carried out by only one of the authors.

2.4. Data Collection

The data to be extracted from each of the instruments were also defined in advance, ensuring that the information was extracted in a uniform manner. The selected documents were then recorded in a Microsoft Excel spreadsheet to check for duplicate records.
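The duplicate check was done in a spreadsheet; the same idea can be sketched in a few lines of Python. This is a hedged illustration, not the authors' procedure: records are assumed to be dictionaries with a `title` and an optional `doi` key (hypothetical names), and two records are treated as duplicates when they share a DOI or, failing that, a normalised title.

```python
import re


def normalise_title(title: str) -> str:
    """Lower-case the title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def drop_duplicates(records):
    """Keep the first record seen for each DOI (or normalised title)."""
    seen = set()
    unique = []
    for rec in records:
        # Prefer the DOI as key; fall back to the title when no DOI is given.
        key = (rec.get("doi") or "").lower() or normalise_title(rec["title"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


sample = [
    {"doi": "10.1/x", "title": "A measure of EI"},
    {"doi": "10.1/X", "title": "A measure of EI (copy)"},
    {"title": "Some  Title!"},
    {"title": "some title"},
]
print(len(drop_duplicates(sample)))  # 2
```

A limitation of this sketch is that a record with a DOI and a copy of it without one would not be matched; a manual check would still be needed for such cases.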

Thus, the name of the instrument and its acronym, the language and country in which it was created, and its structural characteristics (i.e., type of measurement, number of items, dimensions and items of which they were composed, and theoretical model) were extracted together with relevant psychometric information (i.e., reliability and validity). This procedure was carried out by the same author who performed the selection. Articles that used different versions of the original EI instrument were accepted, but only the original versions were analysed. Instruments whose original manuscripts were inaccessible were discarded ( n = 10), but they are presented at the end of the results. All those articles that were duplicated or that had used measures aimed at people under 18 or for contexts specifically different from the professional area (e.g., school contexts, sports contexts, etc.) were eliminated. The search process and the number of selected and excluded results can be seen in Figure 1 . Regarding the ethical standards, no ethical approval or participant consent is required for this type of research (i.e., systematic review).

Figure 1. Flowchart according to PRISMA.

3. Results

A total of 40 instruments were found ( Table 1 shows a synthesis of all of them). Below, each instrument is briefly described, grouped according to the theoretical model it uses (i.e., ability-based model, trait-based model, mixed approach model, and others that do not correspond to any of them), and its psychometric properties are explained.

Table 1. Main characteristics of the included instruments.

TMMS: Trait Meta-Mood Scale, LOT: Life Orientation Test, CES-D: Center for Epidemiologic Studies Depression Scale; SSRI: Schutte Self-Report Inventory, BFP: Big Five Personality, TAS: Toronto Alexithymia Scale, ZDS: Zung Self-Rating Depression Scale, BIS: Barratt Impulsiveness Scale; MEIS: Multifactor Emotional Intelligence Scale; MSCEIT: Mayer-Salovey-Caruso Emotional Intelligence Test, MSCEIT 2.0: Mayer-Salovey-Caruso Emotional Intelligence Revised Version, MSCEIT-YV: Mayer-Salovey-Caruso Emotional Intelligence Youth Version, MSCEIT-TC: Mayer-Salovey-Caruso Emotional Intelligence Chinese Version; PIEMO: Profile of Emotional Intelligence; WLEIS: Wong and Law’s Emotional Intelligence Scale, EQ-i: Emotional Quotient Inventory; WEIP-3: Workgroup Emotional Intelligence Profile-3, WEIP-S: Workgroup Emotional Intelligence Profile-Short Version, IRI: Interpersonal Reactivity Index, JABRI: Job Associate-Bisociate Review Index; MEIA: Multidimensional Emotional Intelligence Assessment, JPI-R: Jackson Personality Inventory-Revised, MEIA-W: Multidimensional Emotional Intelligence Assessment-Workplace, MEIA-W-R: Multidimensional Emotional Intelligence Assessment-Workplace-Revised; EmIn: Emotional Intelligence Questionnaire; IIESS-R: Sojo and Steinkopf Emotional Intelligence Inventory-Revised Version; SREIS: Self-Rated Emotional Intelligence Scale; EISDI: Emotional Intelligence Self-Description Inventory; GEIS: Greek Emotional Intelligence Scale, SSI: Social Skills Inventory, EES: Emotion Empathy Scale, SWLS: Satisfaction with Life Scale, PANAS: Positive and Negative Affect Schedule, ASSET: An Organisational Stress Screening Tool; STEM: Situational Test of Emotion Management; OCEANIC-20: Openness Conscientiousness Extraversion Agreeableness Neuroticism Index Condensed 20-item version, STEM-B: Situational Test of Emotion Management-Brief Version; STEU: Situational Test of Emotional Understanding, STEU-B: Situational Test of Emotional Understanding-Brief Version; ESCQ: Emotional Skills and Competence Questionnaire; AVEI: Audiovisual Test of Emotional Intelligence; GERT: Geneva Emotion Recognition Test, GERT-S: Geneva Emotion Recognition Test-Short Version, GECo: Geneva Emotional Competence Test; TIE: Test of Emotional Intelligence, SIE-T: Emotional Intelligence Scale-Faces, NEO-FFI: NEO Five-Factor Inventory; EIQ-SP: Self-Perception of Emotional Intelligence Questionnaire; TEIFA: Three-Branch Emotional Intelligence Forced-Choice Assessment; TEIRA: Three-Branch Emotional Intelligence Rating Scale Assessment; NEAT: North Dakota Emotional Abilities Test, DANVA 2-AF: Diagnostic Analysis of Nonverbal Accuracy-Adult Faces; IIEP: Perceived Emotional Intelligence Inventory; MEIT: Mobile Emotional Intelligence Test; RAVEN: Raven’s Progressive Matrices; EIT: Emotional Intelligence Test; EQ-i: S: Emotional Quotient Inventory Short Version, EQ-i: 2.0: Emotional Quotient Inventory Revised Version, EQ-i: 360°: Emotional Quotient Inventory-360-degree version; EQ-i: YV: Emotional Quotient Inventory-Youth Version, EQ-i: YVS: Emotional Quotient Inventory Youth Short Version; ECI 2.0: Emotional Competence Inventory 2.0, ECI-U: Emotional Competence Inventory University Version; EIQ: Emotional Intelligence Questionnaire; 16PF: Sixteen Personality Factor Questionnaire, OPQ: Occupational Personality Questionnaire, BTR: Belbin Team Roles; EIA: Emotional Intelligence Appraisal; EIS: Emotional Intelligence Scale; USMEQ-I: USM Emotional Quotient Inventory; TEIQue: Trait Emotional Intelligence Questionnaire, TEIQue-SF: Trait Emotional Intelligence Questionnaire-Short Form, TEIQue-360°: Trait Emotional Intelligence Questionnaire-360-degree version, TEIQue-AF: Trait Emotional Intelligence Questionnaire Adolescent Form, TEIQue-CF: Trait Emotional Intelligence Questionnaire-Child Form; REIS: Rotterdam Emotional Intelligence Scale, PEC: Profile of Emotional Competence.

3.1. Ability-Based Measures

The first category includes those instruments based on the ability model, mainly that of Mayer and Salovey [ 4 ]. The first instrument created under this conceptualization is the Trait Meta-Mood Scale (TMMS) [ 30 ], a self-report scale designed to assess people’s beliefs about their own emotional abilities. It measures three key aspects of perceived EI: attention to feelings, emotional clarity and repair of emotions. It shows very good reliability [ 80 ] and convergent validity with various instruments, although the authors recommend the use of a later 30-item version. A widely used 24-item version [ 31 ] has also been validated in many countries.

Three years later, the Schutte Self-Report Emotional Intelligence (SSRI) test was developed [ 33 ]. This questionnaire is answered through a five-point Likert scale and is composed of one factor that is divided into three categories: appraisal and expression of emotion in the self and others, regulation of emotion in the self and others and utilization of emotions in solving problems. It shows excellent internal consistency. It presents negative correlations with instruments that measure alexithymia, depression and impulsivity among others, which confirms its convergent validity. There is a modified version [ 34 ] and an abbreviated version [ 35 ], and it has been translated into many languages.

The Multifactor Emotional Intelligence Scale (MEIS) [ 37 ] is another tool developed by the authors that originally defined and conceptualized EI. The MEIS is a scale made up of 12 different tasks that contains 402 items and it has been translated into several languages. However, it has strong limitations such as its length and the low internal consistency offered by some of the tasks (e.g., “blends” and “progressions”; α = 0.49 and 0.51, respectively). These authors developed, years later, the Mayer–Salovey–Caruso Emotional Intelligence Test (MSCEIT) [ 38 ]. The items developed for the MEIS served as the starting point for the MSCEIT. This measure is composed of a five-point Likert scale and multiple response items with correct and incorrect options, which comprise eight tasks. Each of the four dimensions is assessed through two tasks. It presents an adequate internal consistency. It currently has a revised version by the same authors, and another validated in a young population. In addition, it has been translated into many languages. This instrument has detractors. Its convergent validity has been questioned since no correlation has been found between the emotional perception scale of MSCEIT and other emotional perception tests [ 81 ]. As can be seen in Table 1 , the MSCEIT has two different approaches to construct the score (consensus score and expert score). In the case of EI, it is difficult to classify an answer as correct or incorrect, so if a person responds in a different way to the experts or the average, it might mean that they have low emotional capacity or present a different way of thinking [ 81 ].
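The α values quoted above are internal-consistency coefficients. Assuming they are Cronbach's alpha (the usual meaning of α in this context, although the text does not name the coefficient), the statistic can be computed from a respondents-by-items score matrix as k/(k − 1) multiplied by (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below uses made-up toy data purely to show the calculation; it does not reproduce any value reported in the review.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    items = list(zip(*scores))                      # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: items that rise and fall together give a high alpha,
# items that vary independently give an alpha near zero.
consistent = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
unrelated = [[1, 1], [1, 2], [2, 1], [2, 2]]
print(round(cronbach_alpha(consistent), 3))
print(round(cronbach_alpha(unrelated), 3))
```

Read this way, values of about 0.5, such as those reported for the “blends” and “progressions” tasks, mean that the item variances account for a large share of the total-score variance, which is why those tasks are described as having low internal consistency.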

In the same year, three more instruments based on this conceptualization were developed in different countries. The first one, the Profile of Emotional Intelligence (PIEMO) [ 40 ], is an inventory developed in Mexico. Each of its items consists of a statement that represents a paradigmatic behaviour trait of EI and is answered as true or false. It is composed of eight independent dimensions that together constitute a profile. Its internal consistency is excellent and its validity has been tested by a confirmatory factor analysis and expert consultations on the items.

The second instrument is Wong and Law’s Emotional Intelligence Scale (WLEIS) [ 41 ]. It was developed in China to measure EI in a brief way in leadership and management studies. It has an adequate internal consistency and has positive correlations with the TMMS and the EQ-i. Subsequent studies have shown its predictive validity in relation to life satisfaction, happiness or psychological well-being, and its criterion validity with respect to personal well-being. Measurement equivalence of scores in different ethnic and gender groups has also been tested [ 82 ]. It has been translated into a multitude of languages and it is currently one of the most widely used instruments.

The third instrument is the Workgroup Emotional Intelligence Profile-3 (WEIP-3) [ 43 ]. It is a scale designed in Australia as a self-report to measure the EI of people in work teams. It has very good internal consistency and presents correlations with several instruments that prove its convergent validity. The authors made a particularly interesting finding in their study. Teams that scored lower in the WEIP-3 performed at lower levels in their work than those with high EI. This instrument has a short version and has been translated into different languages.

The Multidimensional Emotional Intelligence Assessment (MEIA) [ 45 ] was developed in the USA. The authors state that the test takes only 20 min. It has very good internal consistency. Its validity has been tested in different ways. Content validity was tested by independent experts who considered each element as representative of its target scale. Convergent validity was tested by significant correlations between the scores and personality tests. Finally, the lack of correlation between the MEIA and theoretically unrelated personality tests proved the divergent validity. It has a version for the work context.

The Sojo and Steinkopf Emotional Intelligence Inventory—Revised version (IIESS-R) [ 47 ] was developed in Venezuela to measure the three dimensions that compose it. It presents 34 phrases that describe the reactions of people with high EI, as well as contrary behaviours. It has excellent internal consistency and its content has been validated through expert judgment. It shows correlations with some scales of similar instruments and its internal structure has been tested by exploratory analysis and PCA.

In the original article of the Emotional Intelligence Questionnaire (EmIn), created for the Russian population [ 46 ], its author proposes his own model of ability-based EI that differs in some aspects from that proposed by Mayer and Salovey. Accordingly, he designed a questionnaire to measure the participants’ beliefs about their emotional abilities under this model. It is composed of two dimensions answered using a 4-point Likert scale. Its scales have good internal consistency, but its validity has not been tested beyond the factor analysis of its internal structure. Years later, this same author developed the Videotest of Emotion Recognition [ 59 ], an instrument that uses videos as stimuli. It was also designed in Russia to obtain precision indexes in the recognition of the types of emotions, as well as the sensitivity and intensity of the observed emotions. It has 15 scales, each of which measures one of the emotions recorded by the instrument through a single item. Its internal consistency is good. It is correlated with MSCEIT and EmIn, which proves its convergent validity.

Another instrument based on the Mayer and Salovey model is the Self-Rated Emotional Intelligence Scale (SREIS) [ 49 ]. It was developed across three studies that used the MSCEIT as a comparison. The first one did not show a very high correlation between the scores of both tools. In the second one, only men’s MSCEIT scores correlated with perceived social competence after personality measures were held constant. Finally, in the third, only the MSCEIT predicted social competence, and again only for males. Internal consistency also varied across the three studies, with α values of 0.84, 0.77, and 0.66, respectively. Its internal structure was tested by a confirmatory factor analysis and the content of each item was validated by the judgment of students familiar with the Mayer and Salovey model. It has been translated into several languages.

The Emotional Intelligence Self-Description Inventory (EISDI) [ 49 ] is also a short instrument, consisting of four dimensions designed to assess EI in the workplace. It has an excellent internal consistency. It presents correlations with instruments such as the WLEIS and the SREIS and a discriminant validity with the Big Five Personality. The same year, the Greek Emotional Intelligence Scale (GEIS) [ 51 ] was developed in Greek to assess four basic dimensions of EI. Its internal consistency is very good, as well as its test–retest value. Its internal structure was verified by a PCA, and its convergent and divergent validity were tested by a series of studies with 12 different instruments.

MacCann and Roberts [ 51 ] developed two instruments to assess EI according to the ability-based model: the Situational Test of Emotion Management (STEM) and the Situational Test of Emotional Understanding (STEU). Both are made up of three dimensions and a similar number of items. The first one measures the management of emotions such as anger, sadness and fear, and it can be administered in two formats: multiple choice response and rate-the-extent (i.e., test takers rate the appropriateness, strength, or extent of each alternative, rather than selecting the correct alternative). The STEU presents a series of situations in three contexts (context-reduced, personal-life, and workplace), each of which provokes a main emotion that is the correct answer to be chosen by the participant among other incorrect ones. Both instruments have similar internal consistency for the multiple response format, while for the rate-the-extent format it is much higher. Both present criterion and convergent validity and have an abbreviated version.

The Emotional Skills and Competence Questionnaire (ESCQ) [ 53 ] is an instrument developed in Croatia that measures EI through three basic dimensions using a five-point Likert scale. The subscales have a reliability that varies between good and excellent, and they correlate with other EI and personality instruments. The ESCQ has been translated into several languages.

The Audiovisual Test of Emotional Intelligence (AVEI) [ 55 ] is an Israeli instrument aimed at educational settings related to care-centred professions. Its items are developed from primary and secondary emotions, both positive and negative. Each one consists of short videos generated by researchers with training in psychology and visual arts. Respondents must choose the correct answer among 10 alternatives, and the test takes between 12 and 18 min to complete. It requires computers equipped with audio. The internal consistency was calculated using ICC coefficients. It has content validity, established through expert consultations on the items, and criterion validity, since it correlates with measures traditionally related to EI.

The Geneva Emotion Recognition Test (GERT) [ 57 ] is a German test composed of 14 scales. The stimuli are, as in the AVEI, short image and audio videos recorded by five men and five women of different ages. Thus, people must choose which of the 14 emotions is being expressed by the actors, with the responses labelled as correct or incorrect. The reliability of the test is considered excellent, and the ecological and construct validity of the instrument has been tested.

The Test of Emotional Intelligence (TIE) [ 58 ] was developed in Poland. It consists of the same four dimensions as the MSCEIT. Participants are presented with different emotional problems and must indicate which emotion is most likely to occur or choose the most appropriate action. The score is based on expert judgment. It has a very good internal consistency. It has convergent validity, since it correlates with the SSEIT, and its construct validity was argued on the grounds that women scored higher than men.

The Self-Perception of Emotional Intelligence Questionnaire (EIQ-SP) [ 60 ] is an instrument designed in Portugal and composed of the four dimensions belonging to Mayer and Salovey’s ability-based model. Its scales have good internal consistency and are correlated with each other.

The Three-Branch Emotional Intelligence Rating Scale Assessment (TEIRA) [ 61 ] and the Three-Branch Emotional Intelligence Forced-Choice Assessment (TEIFA) [ 61 ] were developed in 2015. The first is made up of three scales and is answered by a six-point Likert scale. It presents internal consistency between good and excellent and convergent validity with STEU-B and STEM-B. On the other hand, TEIFA presents a format of forced choice in order to avoid the problem of social desirability in the rating scales. In this format, participants must choose among several positive statements and therefore they cannot simply rate themselves highly on everything (e.g., “Which one is more like you: I know why my emotions change or I manage my emotions well”). It consists of the same items and dimensions as the TEIRA. The study does not report the reliability of TEIFA, as the reliability of the forced-choice tests is artificially high. It presents convergent validity with the SSRI.

A year later, the North Dakota Emotional Abilities Test (NEAT) [ 62 ] was developed in the USA to assess the ability to perceive, understand and control emotions in the workplace. It contains items that describe scenarios of work environments, in which the person must rate the extent of certain emotions that the protagonist would experience in a certain situation. The internal consistency of its scales varies between good and excellent and its internal structure has been tested by a confirmatory factor analysis. In addition, the predictive validity of the instrument has also been tested.

The Inventory of Perceived Emotional Intelligence (IIEP) [ 63 ] was developed in Argentina. It measures different components of intrapersonal and interpersonal EI. This inventory is answered using a five-point Likert scale and it has reliable dimensions. Its content validity has been tested through consultations with judges to evaluate the items.

The last of the instruments in this category is the Emotional Intelligence Test (EIT) [ 65 ]. It was developed in Russia and has four dimensions that assess EI in the workplace. It has excellent internal consistency and convergent validity tested by correlations with the MSCEIT 2.0. No information regarding the items that compose it has been found.

3.2. Measures Based on the Mixed Model

The second category includes those instruments based on the mixed EI model, mainly the Bar-On model [ 7 ] and the Goleman model [ 8 ]. The first instrument of this model is the Emotional Quotient Inventory (EQ-i) [ 7 ]. Its author was the first to define EI as a mixed concept between ability and personality trait. It is a self-report measure of behaviour that provides an estimate of EI and social intelligence. Its items are composed of short sentences that are answered using a five-point Likert scale. It takes about 30 min to complete, so other shorter versions have been developed, as well as a 360-degree version and a version for young people. It has been translated into more than 30 languages. It has an internal consistency between good and very good and its construct validity has been tested by correlations with other variables.

Emotional Competence Inventory 2.0 (ECI 2.0) [ 67 ], also called ESCI, is a widely used instrument. It was developed in the USA by another of the authors who conceptualized the mixed model of EI. It was designed in a 360-degree version to assess the emotional competencies of individuals and organizations. The internal consistency of others’ ratings is good, while that of oneself is questionable, and it shows positive correlations with constructs related to the work environment. It has a version for university students and has been translated into several languages.

The Emotional Intelligence Questionnaire (EIQ) [ 68 ] is another tool designed to measure EI in the workplace. It has face, content, construct, and predictive validity, although the internal consistency of its scales varies between good and not very acceptable. Years later, the Emotional Intelligence Inventory [ 69 ] was developed in India. It was also designed to measure EI using a mixed concept in the workplace. It is made up of 10 dimensions, which have an internal consistency between acceptable and excellent. It has correlations with several related scales and with the number of promotions achieved and success in employment, which is proof of its predictive validity.

The Emotional Intelligence Appraisal (EIA) [ 70 ] is a set of surveys that measures EI in the workplace using the four main components of the Goleman model. Its items have been evaluated by experts. It has an internal consistency between very good and excellent. It has three versions: an online self-report, an online multi-rater report (which is combined with responses from co-workers), and another one that has anonymous ratings from several people to get an EI score for the whole team. The Emotional Intelligence Scale (EIS) [ 71 ] is another tool based on the Goleman model. It is composed of three dimensions and it has excellent internal consistency. The content of the items has been validated by expert evaluations.

The USM Emotional Quotient Inventory (USMEQ-i) [ 72 ] is a tool developed in Malaysia. It consists of a total of seven dimensions composed of 46 items. Seven of these items make up the “faking index items”, that measure the tendency of respondents to manifest social desirability and have a very good internal consistency ( α = 0.83). The reliability of the total instrument yields excellent values.

The Indigenous Scale of Emotional Intelligence [ 73 ] is a Pakistani instrument developed in the Urdu language. The final items were selected from an initial set after passing through the judgment of four experts based on the fidelity to the construct: clarity, redundancy, reliability, and compression. It has excellent internal consistency. Additionally, it presents construct validity (as women obtain higher scores than men) and correlations with the EQ-i.

Years later, the Mobile Emotional Intelligence Test (MEIT) was developed [ 64 ]. It is a Spanish instrument used to measure EI online in work contexts. It is made up of seven tasks: perceptive and identification tasks, which assess the emotional perception of others and of oneself, respectively; a face task, in which the photograph that best matches the requested emotion must be chosen; three comprehension tasks (composition, deduction and retrospective); and a story task, in which participants must choose the best action to manage feelings in a given story. It presents excellent internal consistency and convergent validity.

3.3. Trait-Based Measures

This category is composed of trait-based instruments. The Trait Emotional Intelligence Questionnaire (TEIQue) [ 6 ] is the main instrument of this model. It is a tool widely used in many countries. It has excellent internal consistency and it shows significant correlations with the Big Five Personality. It has a short version, a 360-degree version, a version for children and another one for teenagers. It has been translated into many languages.

Years later, the Rotterdam Emotional Intelligence Scale (REIS) [ 75 ], the other instrument belonging to this category, was developed. It is a self-report instrument designed in Dutch. It has a very good internal consistency, it presents correlations with WEIS, TEIQue and PEC, and its criterion validity has also been tested.

3.4. Measures Based on Other Models

Some instruments cannot be included within these categories since they have been conceptualized under different models. The first one is the Genos Emotional Intelligence Inventory [ 76 ], previously known as SUEIT. It is based on an original model. It was specifically designed for use in the workplace, but it does not measure EI per se, but rather the frequency with which people display a variety of emotionally intelligent behaviours in the workplace. It presents very good reliability and convergent and predictive validity. In addition, it has two reduced versions.

The Profile of Emotional Competence (PEC) [ 77 ] is based on the model of Mikolajczak [ 83 ], which replicates the four dimensions proposed by Mayer and Salovey but separates the identification from the expression of the emotions and distinguishes the intrapersonal aspect from the interpersonal aspect of each dimension. It contains two main scales, and has excellent internal consistency and convergent, divergent and criterion validity. The original one was developed in French, but it has been translated into several languages.

The last of the instruments identified is the Group-level Emotional Intelligence Questionnaire [ 79 ]. It was designed in the USA to assess EI in work groups under Ghuman’s theoretical model [ 79 ]. This model conceives EI as a two-component construct: group relationship capability (GRC) and group emotional capability (GEC). All of them have very good internal consistency.

Regarding the framework of the Standards, differences were found among them, resulting in an unequal distribution throughout the articles. The percentages of each type of validity can be seen in Table 2 .

Number of studies and percentages for each validity test.

The instruments whose original sources could not be retrieved are cited in Table 3 . The main reasons were that they were articles from books to which the authors did not have access, unpublished documents or documents with restricted access.

Information of the non-accessible instruments.

4. Discussion

The main aim of this study is to offer an updated systematic review of EI instruments in order to provide researchers and professionals with a list of tools that can be applied in the professional field with their characteristics, psychometric properties and versions, as well as a brief description of the instrument. For this purpose, a systematic review of the scientific literature on EI has been carried out using the WoS database through a search of all articles published between 1900 and the present.

The number of instruments developed has been increasing in recent years. In the 1990s barely any instruments were developed and their production was limited to approximately one per year and to practically one country (i.e., the USA). This may be due to the recent conceptualisation of EI, as well as to the difficulty that researchers found in constructing emotion-centred questions with objective criteria [ 15 ]. However, over the years, the production of instruments to measure EI has been increasing and, in addition, it has been extended to other geographical areas. This may be due to the importance that EI has reached over the years in multiple areas (e.g., health, organizational, educational, etc.). With the passage of time, and the introduction of new technologies, multimedia platforms have begun to be used to present stimuli to participants. Recent research in EI has determined that emotions are expressed and perceived through visual and auditory signals (i.e., the tone of voice and the dynamic movements of the face and body) [ 94 ]. Thus, a meta-analysis revealed that video-based tests tend to have a higher criterion-related validity than text-based stimuli [ 95 ].

Regarding the results, a total of 40 instruments produced from 1995 to 2020 have been located. The instruments registered in a greater number of studies, and that have been most used over the years are EQ-i, SSRI, MSCEIT 2.0, TMMS, WLEIS, and TEIQue. These tools have the largest number of versions (e.g., reduced or for different ages or contexts) and are the ones that have been validated in more languages. The most recent instruments hardly have translations apart from their original version, and they have been tested on very few occasions. Most of the articles have not been developed for a specific context.

On the other hand, as can be seen in the results, most of the instruments are grouped under the three main conceptual models described in the introduction (ability, trait and mixed). These models are structured around the construct of EI. However, they differ in the way they conceptualize it and, therefore, in the way they measure it. For example, the ability-based concept of EI is measured by maximum performance tests while trait-based EI is measured by self-report questionnaires. This may, in itself, lead to different outcomes, even if the underlying model used is the same [ 96 , 97 ].

The ability model, introduced by Mayer and Salovey, is composed of other hierarchically ordered abilities, in which the understanding and management dimensions involve higher-order cognitive processes (strategic), and are based on perception and facilitation, which involve instantaneous processing of emotional information (experiential) [ 4 ]. This model has received wide recognition and has served as a basis for the development of other models. However, it has been questioned through factor analysis that does not support a hierarchical model with an underlying global EI factor. Furthermore, emotional thought facilitation (second dimension) did not arise as a separate factor and was found to be empirically redundant with the other branches [ 96 ].

Intelligence and personality researchers have questioned the very existence of ability EI, and they suggest that it is nothing more than intelligence. This view is supported by the high correlations found between ability-based EI and the intelligence quotient [ 15 , 96 ]. On the other hand, in EI measurement there is a general risk that respondents falsify the results by responding strategically for the purpose of social desirability. However, one of the advantages of the ability model is that the results of maximum performance tests cannot be adulterated in this way. This is because participants must choose the answer they think is correct to get the highest possible score. Another advantage is that these types of instruments tend to be more attractive because they are made up of tests that require participants to solve problems or puzzles, perform comprehension tasks or choose images [ 15 ].

The Petrides and Furnham model [ 5 ] emerged as an alternative to the ability-based model and is related to dispositional tendencies, personality traits, or self-efficacy beliefs that are measured by self-report tests. The tools based on this model are not exempt from criticism. These instruments present a number of disadvantages, the most frequently cited being vulnerability to faking and social desirability [ 96 ]. The participant can obtain a high EI profile by responding in a strategic and socially desirable way, especially when they are examined in work contexts by supervisors or in job interviews. People are not always good judges of their emotional abilities [ 98 ], and may tend to unintentionally underestimate or overestimate their EI. Another criticism of self-report tools is their ecological validity (i.e., external validity that analyses the test environment and determines how much it influences the results) [ 96 ].

On the contrary, the fact that such tools do not present correct or incorrect answers can be advantageous in certain cases. High trait EI scores are not necessarily adaptive, nor are low scores necessarily maladaptive. That is, self-report tools give rise to emotional profiles that simply fit better and are more advantageous in some contexts than in others [ 97 ]. On the other hand, trait-based tools have demonstrated good incremental validity over cognitive intelligence and personality compared to ability-based EI tests [ 99 ]. Furthermore, they tend to have very good psychometric properties, have no questionable theoretical basis, and correlate moderately and significantly with a large set of outcome variables [ 15 ].

One aspect observed in this systematic review is that the main measure of the estimated reliability in the analysed studies has been internal consistency. However, this estimate is not interchangeable with other measurement error estimates. This coefficient gives a snapshot of the measurement error and does not include variability over time. There are other reliability indicators (e.g., stability or test–retest) that are more relevant for social intervention purposes [ 100 ], and that, depending on the estimation design, can be differentiated into trait variability or state variability, that is, stability and dependability, respectively [ 101 ]. It has been found that the use of stability measures as a reliability parameter is not frequent. In methodological and substantive contexts, reproducibility is essential for the advancement of knowledge. For this reason, it is necessary to identify measures that can be used as parameters to compare the results of different studies [ 102 ]. On the other hand, the standard coefficient of internal consistency has been coefficient α [ 103 ]. This measure has been questioned because it is often applied without regard to its restrictions [ 104 , 105 , 106 ]; Cronbach himself highlighted its limited applicability [ 104 ]. Other reliability measures have been recommended (e.g., ω) [ 107 ], and the reliability estimation practice in the creation of EI measurements needs to be updated. Usually, ω estimation is integrated into the modelling-based estimation, where SEM or IRT methodology is required to corroborate the internal structure of the score [ 108 , 109 , 110 ] and extract the parameters used to calculate reliability (i.e., factor loadings).
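As an illustration of the two coefficients discussed above, the following minimal sketch computes coefficient α from raw item scores and ω from the loadings of a single-factor model. It assumes a unidimensional scale and, for ω, a standardized solution in which each item's unique variance equals one minus its squared loading. The function names and example values are hypothetical and do not come from any of the reviewed instruments.

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a respondents x items matrix of scores."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                               # number of items
    item_variances = scores.var(axis=0, ddof=1).sum()  # sum of item variances
    total_variance = scores.sum(axis=1).var(ddof=1)    # variance of the total score
    return k / (k - 1) * (1 - item_variances / total_variance)

def mcdonald_omega(loadings, uniquenesses=None):
    """Omega for a one-factor model, computed from factor loadings.

    Assumption: if unique variances are not supplied, a standardized
    solution is assumed, so each uniqueness is 1 - loading**2.
    """
    loadings = np.asarray(loadings, dtype=float)
    if uniquenesses is None:
        uniquenesses = 1 - loadings ** 2
    common = loadings.sum() ** 2
    return common / (common + np.sum(uniquenesses))

# Hypothetical data: three respondents answering three perfectly parallel items.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])

# Hypothetical loadings; in practice they would be estimated with SEM or factor analysis.
omega = mcdonald_omega([0.7, 0.7, 0.7])
```

The sketch makes the dependence explicit: α needs only the observed item scores, whereas ω requires loadings from a fitted measurement model, which is why its estimation is usually embedded in SEM or IRT analyses.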

Another methodological aspect to highlight is that predominantly, the construction of EI measures was based on linear modelling or classical test theory. In contrast, the least used approach was item response theory (IRT), which provides other descriptive and evaluative parameters of the quality of the score measurement, such as the information function or the characteristic curves of the options, among others.
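The information function mentioned above can be sketched for the two-parameter logistic (2PL) model, one common IRT model chosen here purely as an example. In this model, an item with discrimination a and difficulty b contributes information a² P(θ)(1 − P(θ)) at ability level θ, where P(θ) is the probability of a correct response. The function names and parameter values are hypothetical.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def total_information(theta, items):
    """Information of a set of items: the sum of the item informations."""
    return sum(item_information(theta, a, b) for a, b in items)

# Hypothetical (discrimination, difficulty) pairs for a two-item scale.
items = [(1.0, 0.0), (2.0, 0.0)]
info_at_zero = total_information(0.0, items)
```

Unlike a single reliability coefficient, the information function shows at which trait levels a score is measured precisely, since each item is most informative where θ is close to its difficulty.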

On the other hand, it is striking that some of the articles found argue for the construct validity of their instruments on the grounds that women obtain higher EI scores than men [ 56 , 58 , 73 ]. This has also been seen in the scientific literature and in research such as that of Fischer et al. [ 111 ], in which it was found that women tend to score higher in EI tests or empathy tests than men, especially, but not only, if it is measured through self-report. Also striking is the study by Molero et al. [ 112 ], in which significant differences were observed among the various EI components between men and women. However, this is not the case in all the articles analysed in this study, nor in all the most current scientific literature. This fact has led to the development of different hypotheses about how far, why, and under what circumstances women could outperform men. Several theories have emerged. One claims that these differences could be related to different modes of emotional processing in the brain [ 113 , 114 ]. Another points to possible differences in emotional perception, suggesting that women are more accurate than men in this process when facial manifestations of emotion are subtle, but not when stimuli are highly expressive [ 115 ]. A further one points out that the expression of emotions is consistent with sex, which may be influenced by contextual factors, including the immediate social context and broader cultural contexts [ 116 ]. However, other variables such as age or years of experience in the position should also be taken into account. For example, the study by Miguel-Torres et al. [ 117 ] showed a better ability to feel, express, and understand emotional states in younger nurses, while the ability to regulate emotions was greater in those who had worked for more years.
For this reason, nowadays firm conclusions cannot be drawn and it must be taken into account that the differences found are generally small. Thus, more research is needed on the differences that may exist between men and women in the processes of perception, expression and emotional management before establishing possible social implications of these findings.

4.1. Limitations

This study is not without limitations. Some are inherent in this type of study, such as publication bias (i.e., the non-publication of studies with results that do not show significant differences) that could have resulted in a loss of articles that have not been published and that used instruments other than those found. In addition, instruments that could not be accessed from their original manuscript could not be included in the systematic review. On the other hand, despite the advantages of WoS, the fact that the search was conducted in a single database may lead to some loss of literature. Furthermore, the systematic review was restricted to peer-reviewed publications and thus different studies may be presented in other information sources, such as books or grey literature. Articles that were in press and those that may have been published in the course of the compilation of this study have not been collected either. Additionally, the entire process of searching for references, as well as data extraction, was carried out by only one investigator, so an estimate of inter-judge reliability cannot be made. There are many aspects of the PRISMA statement that, due to the purpose of our research, our study does not include (visible as NA in Table A1 ). However, it is necessary to develop a protocol for recording the inclusion and exclusion criteria of the primary studies to prevent bias (e.g., bias in the selection process). There are also some methodological aspects to be improved, such as the lack of methods used to assess the risk of bias in the included studies, the preparation or synthesis of the data, or the certainty in the body of evidence of a result. In future research it is necessary to take into account and develop these aspects in order to improve the replicability and methodological validity of the study, and to facilitate the transparency of the research process.
In contrast to the above, one of the strengths of this study was the effort to minimize the presence of biases that could alter the results. To minimize language bias, articles published in any language were searched for and accepted to avoid over-representation of studies in one language and under-representation in others [ 20 ]. In addition, this study takes into account and reports five sources of evidence of validity of the instruments through the Standards: content, response processes, internal structure, relationship with other variables and the consequences of testing. Other aspects to be improved in the future include performing the same search in other databases such as EBSCO and Scopus to obtain possible articles not covered in WoS. A manual search for additional articles would also be useful, for example, in the references of other articles or in the grey literature.

4.2. Practical Implication

The relationship between EI and personal development has been of great interest in psychological research over time [ 8 ]. A good study of the instruments that measure constructs such as EI can be of great help both in the field of prevention and psychological intervention in social settings. The revision of EI instruments is intended to contribute to facilitating work in the general population in a way that the development of EI is promoted and antisocial behaviours are reduced. In addition, since it correlates with variables that serve as protectors against psychological distress, this work also contributes to improving, in some cases, the general level of health.

Through this systematic review, we can see the great effort that has been made by researchers not only to improve existing EI measurement instruments, but also in the construction of new instruments that help professionals in the educational, business and health fields, as well as the general population. However, given the rapid changes that society is experiencing, partly due to the effects of modernization and technology, there is a demand to go beyond measurement. For example, from educational and business institutions and from family and community organizations it is necessary to promote activities, support and commitment towards actions oriented to EI under the consideration that this construct can be improved at any age and that it increases with experience.

5. Conclusions

The results of this study show that numerous instruments are available to measure EI in professionals. Over the years, the production of instruments to measure EI has increased and has spread to other geographical areas. The most recent instruments have rarely been translated beyond their original version and have seldom been tested. For future research to benefit from these new instruments, they should be applied more often, in larger samples and in other contexts.

In addition, most of the instruments are grouped under the three main conceptual models described in the introduction (ability, trait and mixed). Each model has advantages and disadvantages. In the ability model, results cannot be distorted by strategic responding and the tests tend to be more attractive; however, factor analyses do not support a hierarchical model with an underlying global EI factor. The trait-based model, on the other hand, employs measures that have no right or wrong answers, so they yield emotional profiles that are more advantageous in some contexts than in others, and they tend to have very good psychometric properties. However, they are susceptible to faking and social desirability.

It is also necessary to identify measures that can serve as benchmarks for comparing the results of different studies. In addition, the standard coefficient of internal consistency has been the α coefficient, whose use has been questioned because it is often applied without regard to its restrictive assumptions. It would be advisable to use other reliability measures and to update reliability estimation practice in the development of EI measures.
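For readers unfamiliar with the α coefficient mentioned above, the following is a minimal sketch of how Cronbach's α is commonly computed from item scores, using the textbook formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the toy data are illustrative assumptions and are not taken from the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration only; uses sample variances (n - 1).
    """
    n_respondents = len(scores)
    k = len(scores[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")

    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Toy data (assumed): three respondents, three perfectly consistent items.
toy_scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
alpha = cronbach_alpha(toy_scores)
```

A single α value says nothing about whether the assumptions behind the coefficient hold for a given instrument, which is the concern raised in the paragraph above.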

Finally, some of the articles found test the construct validity of their instruments by checking whether women obtain higher EI scores than men. Different hypotheses have been developed about to what extent, why and under what circumstances women would outperform men; the differences may be related to different modes of emotional processing in the brain, to possible differences in emotional perception or to the influence of contextual factors. However, further investigation of the differences that may exist between men and women, and consideration of other factors such as age or years of experience, would be advisable before establishing possible practical implications.

Acknowledgments

The authors thank the casual helpers for their aid with information processing and searching.

PRISMA 2020 checklist.

NA = Not applicable.

Author Contributions

Conceptualization, L.M.B.-L. and M.M.-V.; methodology, L.M.B.-L.; validation, L.M.B.-L.; formal analysis, L.M.B.-L.; investigation, L.M.B.-L.; data curation, L.M.B.-L.; writing—original draft preparation, L.M.B.-L.; writing—review and editing, L.M.B.-L., M.M.-V., C.M.-S. and J.L.C.-S. All authors have read and agreed to the published version of the manuscript.

This research received no external funding.

Institutional Review Board Statement

Informed consent statement, data availability statement, conflicts of interest.

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

COMMENTS

  1. (PDF) Social Intelligence

    While definitions will vary, social intelligence (SI) is a set of abilities that enable a person to build and maintain healthy relationships, whether in dyads, teams or large groups. Most simply ...

  2. Social Intelligence: What It Is and Why We Need It More than Ever

    The definition of social intelligence has evolved over time (see Kihlstrom & Cantor, 2011, in press, for a much more detailed history of this evolution). Social intelligence was first mentioned and described by Dewey as the ability to observe and understand social circumstances as part of the ultimate goal of moral education. Later on, the concept of social intelligence was included in one of ...

  3. Social intelligence: What it is and why we need it more than ever before

    In this chapter, we discuss social intelligence and why it is of crucial importance to the world today. We open by defining social intelligence. Then we discuss whether social intelligence should be separated from general intelligence. Then we discuss the role of nonverbal communication in social intelligence. Finally, we discuss how social intelligence fits into a broader notion of adaptive ...

  4. (PDF) Social Intelligence

    Social Intelligence. May 2016. DOI: 10.1007/978-3-319-31816-5_2393-1. In book: Global Encyclopedia of Public Administration, Public Policy, and Governance (pp.1-5) Authors: Daniel Belton. Arizona ...

  5. Social Intelligence and Academic Achievement as Predictors of

    Social Intelligence and Popularity. The relationship between social intelligence and popularity appears to be positive for both boys and girls. Sociometrically popular students are prosocial and helpful to their peers (Coie and Kupersmidt 1983). They have a behavioral repertoire (social problem-solving skills, positive social actions, prosocial traits) that promotes success in friendships ...

  6. Social Intelligence

    Social intelligence, or street smarts or people skills, is the ability to understand and manage men and women, boys and girls, and to act wisely in human relations or social situations. It is about figuring out the best way to get along with others. It is the ability to adequately understand and evaluate one's own behavior and the behavior of ...

  7. Social Intelligence

    Social intelligence is a major building block of developing and maintaining social relationships. Thorndike originally explained social intelligence to be a facet of generalized intelligence and defined it as the ability to understand humans and act wisely in human interactions. Snow further expanded upon Thorndike's definition by describing that social intelligence is the accumulation of ...

  8. The Role of Intelligence in Social Learning

    We measure fluid intelligence and study experimentally how individuals learn from observing the choices of a demonstrator in a 2-armed bandit problem with changing probabilities of a reward ...

  9. PDF Social Intelligence, Study Habits and Academic Achievements of College

    Abstract. This study was undertaken to study the social intelligence, study habits and academic achievement of college students of district Pulwama (J and K). The sample for the study was 410 including 193 male and 217 female college students by using random sampling technique. Chadha and Ganesan Social Intelligence Scale (1986), Palsane and ...

  10. Social Intelligence (Chapter 31)

    Summary. This chapter reviews the literature on social intelligence (SI) as it has evolved over the century since Thorndike (1920) popularized the concept. Most research on SI has been guided by an ability view, and an analogy to IQ, as exemplified by the George Washington University Social Intelligence Test, and the "behavioral" contents ...

  11. PDF DO STUDENTS EXPERIENCE "SOCIAL INTELLIGENCE," LAUGHTER, AND ...

    Graduate students in online and blended programs at Texas Tech University and the University of Memphis were surveyed about how often they laughed, felt other emotions, and expressed social intelligence. Laughter, chuckling, and smiling occurred "sometimes," as did other emotions (e.g., anticipation, interest, surprise).

  12. Types of Intelligence and Academic Performance: A Systematic Review and

    In recent years (2000-2020), the relationship between types of intelligence and academic performance at different educational stages has been studied in depth. In total, the meta-analysis ( Table 1) consists of 27 studies with k = 47 samples from Europe, Asia, Africa, America and Oceania. According to Bonett 's ( 2006) criteria, the sample ...

  13. Relationship between Emotional Intelligence, Social Skills and Peer

    1.1. Emotional Intelligence. EI is a construct that has gained enormous interest in recent decades. One of its main areas of study has been the influence it has on interpersonal relationships by contributing to optimal social functioning []. There are two major models of EI: the mixed model [] and Bar-On [], and the ability model, represented mainly by Mayer, Salovey and Caruso [].

  14. PDF Role of Social Intelligence in Student S Educational Development

    "Social intelligence shows itself abundantly in the nursery, on the playground, in barracks and factories and salesrooms, but it eludes the formal standardized conditions of the testing laboratory." Now, almost a century later, "social intelligence" has become ripe for rethinking as neuroscience begins to map the brain areas that

  15. The Relevance of Social Intelligence for Effective Optimization of

    Social Intelligence can also be said as the traditional wisdom we possess or the modern-day smartness required to deal various situations. If a person has social intelligence he or she can also deal effectively with social myths and prejudices. We can say that this kind of intelligence is a social trail which enhances our generosity and ...

  16. Toward a Sociology of Artificial Intelligence: A Call for Research on

    This article outlines a research agenda for a sociology of artificial intelligence (AI). The authors review two areas in which sociological theories and methods have made significant contributions to the study of inequalities and AI: (1) the politics of algorithms, data, and code and (2) the social shaping of AI in practice.

  17. [2403.08715] SOTOPIA-$π$: Interactive Learning of Socially Intelligent

    Humans learn social skills through both imitation and social interaction. This social learning process is largely understudied by existing research on building language agents. Motivated by this gap, we propose an interactive learning method, SOTOPIA-$π$, improving the social intelligence of language agents. This method leverages behavior cloning and self-reinforcement training on filtered ...

  18. Social Intelligence

    Social intelligence is a major building block of developing and maintaining social relationships. Thorndike originally explained social intelligence to be a facet of generalized intelligence and defined it as the ability to understand humans and act wisely in human interactions. Snow further expanded upon Thorndike's definition by describing that social intelligence is the accumulation of ...

  19. How Sensitive Are the Free AI-detector Tools in Detecting AI-generated

    For example, in a study, two essays were written by ChatGPT and then paraphrased using AI; the "real" percentage of applying the GPT-2 Output Detector markedly changed from 0.02% to 99.5% in essay one and from 61.96% to 99.8% in essay two. 7 Similarly, the results of our analysis on rephrasing the ChatGPT 3.5 version generated text using ...

  20. Emotional Intelligence Measures: A Systematic Review

    Emotional intelligence (EI) refers to the ability to perceive, express, understand, and manage emotions. Current research indicates that it may protect against the emotional burden experienced in certain professions. This article aims to provide an updated systematic review of existing instruments to assess EI in professionals, focusing on the ...