PLOS ONE

Determinants of content marketing effectiveness: Conceptual framework and empirical findings from a managerial perspective

Clemens Koob

Department of Health and Nursing, Katholische Stiftungshochschule München, Munich, Germany

Associated Data

All relevant data are within the paper and its Supporting Information files.

Abstract

Content marketing has gained momentum around the world and is steadily gaining importance in the marketing mix of organizations. Nevertheless, it has received comparatively little attention from the scientific community. In particular, very little is known about the effectiveness, optimal design and implementation of content marketing. In this study, the authors conceptualize content marketing as a set of activities that are embedded in and contingent on the specific organizational context. Based on this framework, the authors empirically investigate the context features determining content marketing effectiveness from a managerial perspective, using multiple regression analysis on primary data collected from senior marketers in 263 organizations from various sectors and across different size categories. The empirical results indicate that clarity of and commitment to the content marketing strategy, as well as content production aligned with the target groups’ content needs and with normative journalistic quality criteria, are context factors associated with higher content marketing effectiveness. The outcomes also reveal that regularly measuring content marketing performance and using the data obtained as guidance for improving content offerings positively influence content marketing effectiveness, as do structural specialization and specialization-enabling processes and systems. The insights provided in this study could offer important theoretical contributions for research on content marketing and its effectiveness and may help practitioners optimize the design and implementation of content marketing initiatives.

Introduction

In times when consumers are becoming increasingly skeptical of traditional advertising, organizations need, more than ever, effective alternatives to traditional marketing communications. In these circumstances, content marketing (CM) has gained momentum around the world and is steadily gaining importance in the marketing mix of organizations, complementing traditional marketing instruments [e.g., 1 ]. CM investments have increased substantially. In the German-speaking area, for example, investments have risen from € 4.4b in 2010 to € 9.4b in 2019 and are forecast to grow further to € 12.5b by 2023 [ 2 ].

Content marketing refers to the creation and distribution of relevant, valuable brand-related content to current or prospective customers or other target groups (e.g. jobseekers, employees or investors) via digital platforms or print media to drive strategic business objectives [ 3 – 5 ]. Unlike traditional advertising, which typically denotes a form of communication designed to persuade or even push target groups to take some action, now or in the future [ 6 ], content marketing focuses on adding value to their lives, for instance by educating them, helping them solve problems, entertaining them or supporting them in making well-informed decisions. Thus, content marketing is based on the social exchange theoretical principle that when an organization delivers valuable content to a target group, the target group rewards the organization in exchange with positive attitudes (e.g. brand trust) or behaviors (e.g. brand-related interactions).

However, despite content marketing’s growing importance, it has received comparatively little attention from the scientific community [ 3 , 5 ]. So far, research has primarily focused on definitions and conceptualizations of content marketing [e.g. 3 , 5 , 7 , 8 ] and potential consumer- and firm-based consequences. In addition, a limited number of exploratory analyses have investigated the effectiveness of content marketing in specific sectors and for specific types of media. Wang et al. [ 4 ], for example, found CM effectiveness in the B2B domain to depend on the frequency of customers’ content consumption. Taiminen and Ranaweera [ 9 ] identified specific helpful brand actions, i.e. approaching content marketing with a problem-solving orientation, as increasing the effectiveness of B2B content marketing. With respect to consumers and branded social content, Ashley and Tuten [ 10 ] identified frequent updates, incentives for participation, as well as experiential, image and exclusivity messages to be associated with effectiveness. Chwialkowska’s study [ 11 ] revealed that customer-centric as opposed to brand-centric social content is more effective. Also, Liu and colleagues [ 12 ] provided evidence that short video clips can be effective in driving usage of other branded online content. However, apart from such rather focused studies, we have very little overall knowledge about the effectiveness of content marketing. In particular, as Hollebeek and Macky [ 3 ] noted, still “little is known regarding its optimal design and implementation”. The question “what are the key factors for effectiveness” has long been an important theme in the marketing communications literature, but academic understanding of the determinants of content marketing effectiveness lags behind to date [ 3 ], generating an important knowledge gap that we address in this paper.

To investigate this gap, we conceptualize content marketing from an activity-based perspective. In line with the activity-based perspective of marketing [ 13 , 14 ], we propose to view content marketing as a set of specific activities, comprising content marketing strategizing, content production, content distribution, content promotion, performance measurement and content marketing organization. Referring to the concept of embeddedness [ 15 , 16 ], we further assume that these content marketing activities are rooted in and contingent on the specific organizational context, and that particular context features are potential determinants of content marketing effectiveness. Based on this framework, we will empirically investigate the features driving content marketing effectiveness.

Our contribution is as follows: As far as we know, the determinants of content marketing effectiveness have not yet been empirically investigated from a broader perspective. We therefore first provide a theoretical framework for analyzing content marketing effectiveness. Second, we offer empirical insights that could help marketers improve the design and implementation of their content marketing initiatives, which researchers have called for [ 3 , 5 ]. Third, in doing so, we might help to move research on content marketing effectiveness beyond the prevailing anecdotal level to an evidence-based one. Fourth, for scholars, this research could offer a platform for further studies into the drivers of content marketing effectiveness. Taken together, these advances could extend current academic and managerial discussions of how to achieve effective marketing communications.

Theoretical framework and derivation of hypotheses

Any empirical investigation of the determinants of content marketing effectiveness requires a proper conceptualization of CM effectiveness. Hence, the next section proposes such a conceptualization. After that, we propose that content marketing activities take place in an organizational context [ 15 , 16 ] affecting their effectiveness. Context refers to the specific intra-organizational circumstances, environments and constellations of forces shaping the character of the content marketing activities and their outcome [ 17 ]. We outline the potentially relevant context dimensions, namely content marketing strategizing, content production, content distribution, content promotion, content marketing performance measurement, and content marketing organization.

Content marketing effectiveness

Based on a literature review ([ 3 – 5 , 8 , 18 – 23 ], see S1 Appendix for details), content marketing activities can be considered effective if they trigger superior levels of cognitive, emotional and behavioral customer engagement at the appropriate points throughout the customer journey; strengthen customers’ brand trust and induce favorable brand attitudes; and increase customers’ perceived value of a brand. This leads to more favorable responses to the brand and its communications, and thus helps the focal organization reach its strategic business objectives.

CM effectiveness and CM strategizing

Porter and McLaughlin [ 15 ] conclude that there is no universally agreed-upon set of components that comprise the relevant organizational context dimensions. However, they point to the strategizing context as one of them, i.e. the constellations under which strategizing in the sense of ‘doing of strategy’ unfolds [ 15 ]. Strategy research supports the idea that strategic clarity is one aspect of the strategizing context that plays a key role regarding effectiveness since it gives direction and provides orientation [ 24 , 25 ]. This is also in line with goal setting theory, which posits that specific and well-defined challenging goals lead to higher performance [ 26 ]. Strategy research also suggests strategy commitment, which can be defined as the extent to which managers and employees comprehend and support the goals and objectives of a strategy [ 27 , 28 ], as an essential aspect, as it is known to affect strategy-supportive behavior. We assume these two factors to be pivotal for content marketing effectiveness, too. In the content marketing domain, strategizing comprises, e.g., the crafting of a content marketing mission and vision, the definition of objectives, the identification and prioritization of target groups, the specification of the unique value an organization is looking to provide through its content, the clarification of key stories to be communicated, or decisions regarding the platforms that will be used to disseminate content [e.g., 5 ]. A clearly defined content marketing strategy that is communicated and understood within the organization might positively influence CM effectiveness, because it allows the organization to select those CM projects which promise a high strategy contribution. When commitment to a content marketing strategy is high, all managers and employees might show vigor, get engaged and take personal responsibility for the successful realization of the content marketing initiative. Thus, we expect:

  • Hypothesis 1: Content marketing is more effective when organizations have a stronger CM strategizing context characterized by strategic clarity and commitment.

CM effectiveness and content production

Furthermore, we suggest a strong content production context will be positively related to CM effectiveness. By this, we refer to content production environments in which high quality content can be created [ 5 ]. The necessity to create and provide quality content is widely acknowledged in the CM literature [e.g. 5 ], as it is assumed that quality content is more likely to be interacted with. However, this raises the question of what constitutes quality content. Uses and gratifications theory supports the idea that people seek out media that satisfy their needs and lead to gratification [ 29 , 30 ]. From this perspective, consumers may select content for functional (e.g. learning about brands, self-education), hedonic (e.g. entertainment, diversion, relaxation) or authenticity motives (e.g. identity construction, self-assurance) [ 3 , 30 ]. In addition, research proposes that ‘quality content’ not only has to meet consumers’ subjective standards, but also certain objective specifications or normative principles. The criteria mentioned in the literature typically include aspects like timeliness, objectivity, accuracy, or diversity of viewpoints [ 31 – 36 ]. Hence, we believe:

  • Hypothesis 2: A strong content production context, characterized by efforts to optimize customer-perceived content value and to adhere to normative quality criteria, should be associated with higher content marketing effectiveness.

CM effectiveness and content distribution

We assume a specific content distribution context will also be positively related to CM effectiveness. The content distribution context refers to the conditions under which content is distributed and particularly includes the media platforms (e.g. customer magazines, digital magazines, blogs, podcasts, social media, chatbots etc.) used [ 3 , 5 , 8 ]. Research generally supports the idea that communications efforts using multiple media platforms are more effective than initiatives using only a single medium [e.g. 37 , 38 ]. According to Voorveld et al. [ 39 ], two psychological processes play a role in explaining these effects. First, forward encoding implies that the exposure to content in the first medium primes interest in the content in the second medium, which in turn stimulates deeper processing and easier encoding of the second content piece, resulting in multiple content retrieval cues and higher effectiveness. Second, multiple source perception refers to the effect that consumers perceive cross-media communications as more expensive, leading to the belief that the communicating brand must be popular and successful, also resulting in more positive communications results. Furthermore, benefits from combining multiple media distribution platforms might arise from accompanying prospects and customers with the appropriate content platforms at the different points in their consideration and buying processes [ 40 ]. On the other hand, it could be argued that investment in too many media distribution properties might attenuate the power of communications, because it prevents an organization from focusing its resources on the most suitable platforms [ 38 ]. Reactance theory also suggests that communication across multiple media platforms could have negative consequences, as customers might associate a brand’s omnipresence across various platforms with increasing pressure from the firm’s communications attempts, which could be perceived as obtrusive [ 41 ].
Based on these considerations we believe:

  • Hypothesis 3a: Content marketing is more effective when the content distribution context is characterized by the usage of an intermediate number of media platforms.

Content marketers continually look for new opportunities to reach customers and, over time, have shifted content distribution budgets away from print media such as customer magazines to digital media such as digital magazines, blogs, social media and the like [ 2 ]. The question is whether and to what extent this shift is beneficial for improving CM effectiveness. Communications theory implies that for effective communication, the sender should match the channel that the receiver prefers [ 42 ]. Based on this recommended practice of media matching, organizations ought to be cognizant of customers’ media platform preferences as well as actual media use and adjust their channel choices accordingly. With regard to media preferences, research has repeatedly revealed a high level of consumer conservatism, indicating that established media channels, especially print media, retain favored attributes such as trust, high perceived value, intimacy or visual power, whereas digital media are, e.g., more strongly associated with speed, convenience and efficiency [ 42 , 43 ]. Considering media use, two models predict different relationships between new and established media. The displacement model assumes increases in new media use will go along with declines in the use of established media (e.g. due to functional advantages of new media or limited time budgets [ 44 , 45 ]). The complementary model hypothesizes new media usage has no or even a positive effect on established media use within a content domain, as people “interested in procuring information in a particular content area expose themselves to a multitude of media outlets to optimize the information on that particular content area” [ 46 ]. Recent studies [ 45 , 47 ] have provided evidence that adoption of new platforms is reducing the consumption of established media, but that established media will not be fully displaced. Other theoretical accounts also caution against neglecting print media in favor of digital media.
Psychological ownership theory implies that print media, being physical goods, might have a greater capacity to garner an association with the self than digital media, leading to greater value ascribed to them [ 48 ]. Regarding text-based content, educational research points to the fact that reading on paper leads to significantly better content comprehension than reading digitally [ 49 ], possibly due to better spatial mental representation of the content and more visual and tactile cues fostering immediate overview of the content. Consequently, we expect:

  • Hypothesis 3b: Content marketing is more effective when the content distribution context is characterized by a joint deployment of print and digital media platforms.

CM effectiveness and content promotion

Furthermore, we propose the content promotion context is key for CM effectiveness. Content promotion refers to any paid measures an organization takes to draw attention to its content or to stimulate interest in or usage of its content, typically with the help of or on third-party platforms, with the aim of optimizing content reach. Instruments include, amongst others, influencer marketing, social media and search engine advertising, or classic public relations [ 50 ]. Research has repeatedly suggested an attention economy [e.g., 51 ], denoting a world where people are awash in content, and where people’s available time and attention spans are limited, creating an environment in which content competes for customers’ time and attention as scarce resources. Under these circumstances, we expect that paid content promotion measures can help to accentuate content and draw attention to potentially relevant and valuable content pieces, so that these pieces can break through the “content clutter” [ 52 ].

Furthermore, the power law of practice and cognitive lock-in theory [ 43 , 53 ] state that when people practice specific tasks, the repetition of these tasks increases efficiency, which induces familiarity, from which in turn people are inclined to get cognitively locked in to the respective media environment. Cognitive lock-in thus denotes a condition wherein a consumer has learned how to use a specific media environment, thanks to multiple interactions with it, with the effect that more familiarity decreases their propensity to search for and switch to competing media alternatives. Research has demonstrated these effects for websites [ 53 , 54 ], as well as for print media [ 43 ]. We believe this thinking may be applicable to a broad range of media environments, and applying it to the content marketing context leads us to believe that if customers are already accustomed to using specific content offerings, they see no need to switch to a new content offering. Under these conditions, paid content promotion measures might help to stimulate customers to try a focal organization’s content offer, potentially breaking up existing and initiating new cognitive lock-in processes, thereby supporting the organization’s attempt to transition customers to its own content offerings. Hence:

  • Hypothesis 4: Content marketing is more effective when organizations have a stronger content promotion context characterized by comprehensive paid content promotion measures.

CM effectiveness and CM performance measurement

We also propose that a strong content marketing performance measurement context within an organization will be positively related to CM effectiveness. Content marketing performance measurement (CMPM) can be defined as establishing metrics related to the organization’s content marketing objectives and measuring and evaluating performance relative to these objectives, for the purpose of providing evidence for the effectiveness and efficiency of content marketing activities and optimizing these activities. Previous studies have shown positive performance implications of marketing performance measurement in contexts other than content marketing [e.g. 55 – 57 ]. We believe, for four reasons, that this also applies to the content marketing domain. First, the attention-based view of the firm accentuates that one of the key characteristics of measurement systems is their property to focus and direct the attention of organizational members to important issues [ 58 ]. By directing minds at what needs to be done, chances increase that it will get done. Thus, we expect that content marketing performance measurement will get an organization to attend to essential content marketing objectives and activities. We believe that the presence of CMPM activates managers and employees and causes them to achieve coordinated action and to orient their efforts to succeeding on the measured content marketing aspects. Second, previous research [ 59 ] has shown that producing measurements is not enough to get the organization to act, but that organizations are also sensitive to what issues are internally discussed. We argue that CMPM sparks discussions about important content marketing issues, which helps to summon attention and resources for acting, ultimately improving content marketing effectiveness. Third, performance measurement usually allows organizations to monitor the performance of marketing activities, be it relative to prior objectives, similar activities in the past, or other benchmarks, lowering uncertainty about the performance of decisions and about whether the decisions were the right ones, which in turn supports learning and the planning of marketing activities that produce desired outcomes [ 56 ]. We thus expect that CMPM will nurture learning, which in turn will improve content marketing decisions, and thus content marketing effectiveness. Fourth, performance measurement usually includes performance feedback, and previous studies have consistently shown that performance feedback is positively associated with work engagement [ 60 ]. Higher work engagement in turn implies that managers and employees invest more energy into their work roles, leading to superior work outcomes [ 61 ]. Thus, we expect that CMPM energizes organizational actors to act in desired ways to meet the organization’s goals. Hence:

  • Hypothesis 5: Content marketing is more effective when organizations have a stronger content marketing performance measurement context.

CM effectiveness and content marketing organization

Finally, we expect a strong content marketing organization will be positively related to CM effectiveness. Porter and McLaughlin [ 15 ] indicate that organizational structures and processes are one of the major components contextualizing activities within an organization. Research on marketing organization also highlights the importance of organizational structures and processes for marketing performance [ 62 , 63 ]. It is widely acknowledged in the marketing literature that organizations face dynamic and complex marketing communications environments, e.g. in terms of the development and transformation of technology and media or consumer behavior evolving at an increasingly rapid pace [ 6 ]. Under these conditions, specialization and autonomy seem to be favorable characteristics of organizational structure [ 64 ]. Specialization denotes the level to which activities in the organization are differentiated into unique elements, while autonomy refers to the level to which employees have control in executing those activities. Organizations high in specialization and autonomy have a high share of specialist employees who direct their efforts to a clearly defined set of activities, and as experts with specialized knowledge in their particular work areas, they enjoy substantial autonomy to determine the best approach to carry out their tasks [ 65 ]. According to prior research, the combination of specialization and autonomy enables an organization to assign tasks to those employees who are best able to perform them, it enhances the organization’s knowledge base, and it promotes the development of innovative ideas and solutions [ 62 , 63 , 66 ]. However, research has also indicated that specialized organizational structures with high degrees of autonomy need the support of adequate processes and systems to function properly [ 62 ].

The application of this thinking to content marketing leads us to two considerations: First, we believe that, also in this domain, structural specialization coupled with autonomy could be beneficial. It could allow an organization to assign content marketing tasks to managers and employees who are best prepared to tackle them. Further, specialization could enhance an organization’s content marketing knowledge base, foster the development of innovative content marketing ideas and solutions and enable the organization to quickly respond to upcoming communication needs. An example of such a structure could be a dedicated content marketing unit with a high share of task- and skill-specialized content marketing experts who have control over how they organize their work and significant autonomy in making decisions. Second, we assume that an increase in content marketing specialization and autonomy within an organization also demands processes and information technology systems with a proper fit [ 67 ]. We believe that processes and systems are required that enable and support interaction and collaboration between content marketing specialists, between content marketing experts and further marketing functions, and also between content marketing experts and other relevant organizational entities. To sum up, we posit:

  • Hypothesis 6: Content marketing is more effective when organizations have a stronger content marketing organization.

Fig 1 provides a summary of the proposed theoretical framework.

[Fig 1: pone.0249457.g001.jpg]

Data collection and sample

We gathered data from organizations with over 250 employees in the German-speaking area, that is Germany, Switzerland, and Austria. Regarding industry characteristics, organizations from all sectors in line with the business registers of the three countries, comprising a broad range of industrial, services, finance and trade sectors, were eligible to take part in the investigation. We targeted medium- and large-sized organizations because they are more likely to employ complex marketing practices such as content marketing. All data were collected using an online survey with the sample drawn from an online panel provider. Prior research provides sound evidence that online panels can deliver high-quality data [ 68 ]. Porter et al. [ 68 ] recommend using online panel data particularly for studies requiring access to specific populations. Following this guidance, we deliberately chose online panel data and the online panel provider Norstat for this study, because it required access to a very specific population of key informants: senior marketing or communications directors, and people in equivalent positions, responsible for their firms’ content marketing activities. The online panel provider was capable of recruiting this hard-to-reach sample. This group of managers was identified as key informants because they are organizational members who can provide reliable data on the organizations’ content marketing activities and effectiveness. Data collection was carried out in accordance with further recommendations compiled from the literature by Porter et al. [ 68 ] regarding participant recruitment, selection and information and data quality measures.
We captured participants’ managerial positions and involvement in content marketing activities in a screener survey to verify key informant appropriateness and reduce potential key informant bias, used attention checks and applied lower and upper limits of survey completion time to ensure high-quality responses, and captured IP addresses to control for potential multiple responses from the same managers.

Before the study was carried out, the University Ethics Review Board’s regulations indicated that a research ethics review was not required. The reasons for this decision are that the investigation does not include any manipulations or vulnerable groups, and that participants were guaranteed their data would be treated anonymously. Moreover, the data were collected consistent with the ethical guidelines of the Academy of Marketing Science and in accordance with the EU General Data Protection Regulation. All participants provided informed consent by clicking on the link to start the study, participation was completely voluntary, and only data from participants who fully completed the study were used.

In total, data collection yielded 319 responses. Of these, 53 came from managers of organizations that do not apply content marketing practices, and 3 came from executives who failed to pass the aforementioned data quality checks. We therefore eliminated these respondents, leaving a final sample of 263 organizations.

The characteristics of respondents were in line with our expectations of key informants. We were successful in getting senior-level marketing and communications executives as respondents: 131 were board members such as CMOs, 56 were marketing vice presidents or directors, 38 were corporate communications vice presidents or directors, 36 were vice presidents or directors of a dedicated content marketing unit, and the remaining 2 were senior executives in other marketing communications functions. Of the 263 organizations in our sample, 125 were from the services sector, 67 from the industrial sector, 51 from the finance sector, and 20 from the trade sector. Regarding size, 69 organizations had between 250 and 499 employees, 58 had 500 to 999 employees, 72 had between 1,000 and 4,999 employees, and 64 employed a workforce of 5,000 or more people.

For collecting data, we relied on a structured questionnaire. Whenever possible, we used measures from previous research and modified them for our study. All questions were asked in German. The measures of the main variables are displayed in S1 Table.

Dependent variable

Content marketing effectiveness (CMEFFECT) . To capture the degree of achieved content marketing effectiveness, we asked senior marketing and communications executives for their evaluations. For assessing attained customer engagement as an aspect of content marketing effectiveness, we adapted three items from the consumer brand engagement scale developed by Hollebeek et al. [ 69 ]. These questions capture the managerial assessment of the extent to which focal content marketing activities foster positive brand-related cognitive, affective and conative activity, i.e. consumers’ brand processing, affection, and activation. To assess content marketing’s effects on brand attitudes and perceived brand value as further aspects of content marketing effectiveness, we adapted four perceptual items drawn from Sirdeshmukh et al. [ 70 ] and Sengupta and Johar [ 71 ]. These questions capture the managerial assessment of the degree to which the respective organization’s content marketing activities trigger brand trust in terms of credibility (expectancy that a promise made by the brand can be relied upon) and benevolence (confidence in the brand’s motives) and contribute to favorable brand evaluations. Responses to all items of content marketing effectiveness were given on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree). An exploratory factor analysis delivered a one-factor solution; thus, we averaged all items to calculate the overall index of content marketing effectiveness. Cronbach’s alpha coefficient for content marketing effectiveness was .88, exceeding the recommended minimum of .70 and indicating very good reliability [ 72 ].
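The index construction described above (averaging the seven items and checking internal consistency with Cronbach’s alpha) can be sketched as follows. This is a minimal illustration, not the authors’ actual analysis code; the response matrix `responses` is invented for demonstration (rows = respondents, columns = the seven CMEFFECT items on 5-point scales).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) response matrix."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Invented example data: 5 respondents, 7 items, 5-point agreement scales
responses = np.array([
    [4, 5, 4, 4, 5, 4, 4],
    [3, 3, 2, 3, 3, 3, 2],
    [5, 5, 5, 4, 5, 5, 5],
    [2, 2, 3, 2, 2, 3, 2],
    [4, 4, 4, 5, 4, 4, 4],
], dtype=float)

# Overall CMEFFECT index: mean across all items, as described in the paper
cmeffect = responses.mean(axis=1)
alpha = cronbach_alpha(responses)
```

With a one-factor solution, averaging the items is equivalent (up to scaling) to an unweighted sum score, which is what the alpha coefficient evaluates.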

Independent variables

Content marketing strategizing context (CMSTRAT) . The content marketing strategizing context was assessed using a four-item scale that measured whether the organization had a defined, comprehensible, long-term content marketing strategy and to what extent managers and employees support the strategic direction. The items for strategic clarity and strategy commitment were adapted from related scales developed by Bates et al. [ 73 ] and Noble and Mokwa [ 74 ]. Responses were given on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree).

Content production context (CPROD) . We assessed the content production context using a three-item scale. The items rest on previous research by Hollebeek and Macky [ 3 ], Urban and Schweiger [ 35 ] and Chen and colleagues [ 75 ] and include an organization’s efforts to optimize customer-perceived content value, to adhere to normative content quality criteria, and to plan and create content systematically. Responses were given on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree).

Content distribution context / intermediate number of media platforms (CDIST1) . In line with previous research by Kabadayi and colleagues [ 76 ], we used a single item to measure the number of media platforms the organizations used for content distribution purposes. We presented our respondents with the following seven media platform alternatives and asked them to mark the ones used by their organizations: customer magazines or newspapers, corporate books, company reports, owned digital media (websites, apps, newsletters, blogs), organic social media, paid social media and emerging platforms (e.g. chatbots, voice assistants). We developed this list on the basis of a review of the academic and trade literature combined with prestudy interviews of content marketing executives. Although we intended the list to be comprehensive, we asked respondents with media platforms not included in the list to add those platforms in a space that was provided. The measure of platform number was simply the number of platforms that each organization used. The range on this item was 1 to 7 platforms. Based on this item, we calculated our measure so that the usage of the intermediate number of four media platforms was assigned the maximum value 4, while lower or higher numbers of platforms were assigned values in the range between 1 and 3.
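The paper does not report the exact functional form of this recoding; one simple symmetric mapping consistent with the description (our assumption, with a hypothetical function name) is:

```python
def cdist1_score(n_platforms: int) -> int:
    """Recode the number of media platforms used (1-7) into the CDIST1
    measure: four platforms receive the maximum score of 4, and scores
    decline symmetrically toward 1 for very low or very high counts.
    This symmetric rule is an illustrative assumption."""
    if not 1 <= n_platforms <= 7:
        raise ValueError("number of platforms must be between 1 and 7")
    return 4 - abs(n_platforms - 4)
```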

Content distribution context / joint deployment of print and digital media platforms (CDIST2) . To operationalize the joint deployment of print and digital media platforms in content distribution, we asked respondents–as done in prior research [ 77 ]–how much of their content distribution budgets their organizations were allocating to print or digital media platforms, respectively, with the percentages summing up to 100 percent. We used this information to construct the joint deployment score for each organization and assigned values between zero (print or digital only) and fifty (balanced budget shares) to reflect joint platform usage.
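This scoring corresponds to taking the smaller of the two budget shares; an illustrative sketch (the function name is ours):

```python
def cdist2_score(print_share: float) -> float:
    """Joint-deployment score from the print share of the content
    distribution budget (digital share = 100 - print share):
    0 = print-only or digital-only, 50 = perfectly balanced."""
    if not 0 <= print_share <= 100:
        raise ValueError("budget share must be a percentage between 0 and 100")
    return min(print_share, 100 - print_share)
```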

Content promotion context (CPROM) . To measure the weight organizations attached to content promotion, respondents were requested to state the share of overall content marketing investments that their organizations allocated to content promotion measures. We adapted this approach from Fam and Yang [ 77 ] because marketing executives are usually sensitive to budget information, hence they would feel more comfortable in providing the relative weight of content promotion budgets rather than an absolute figure, leading to more accurate data. The range on this item was 0 to 100 percent.

Content marketing performance measurement context (CMPERME) . We assessed the CM performance measurement context using a three-item scale. The items rest on previous research by O’Sullivan and colleagues [ 55 ] and Mintz and Currim [ 56 ]. They capture content marketing performance measurement frequency regarding deployed print and digital content platforms as well as actual performance measurement data use in terms of the employment of data as guidance for continuously improving content offerings. Responses were given on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree).

Content marketing organization (CMORG) . To capture structural specialization and autonomy in the content marketing domain and specialization-enabling processes and systems, we used four questions based on prior research by Olson et al. [ 63 ], Walker and Ruekert [ 66 ], Barclay [ 78 ] and Škrinjar and Trkman [ 79 ]. These questions capture the presence of dedicated content marketing units, task- and skill-specialized, autonomous content marketing experts, and processes and information technology systems that enable collaboration of specialized staff and units. Responses were given on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree).

Control variables

In addition to the above variables, we considered control variables in our analyses. We followed recommendations for control variable use in the literature that suggest a focused use of controls to not unnecessarily lose available degrees of freedom and statistical power [ 80 , 81 ]. We also opted for a focused approach to avoid an increase in questionnaire length, because this commonly leads to higher response burden [ 82 ], which is associated with lower response rates and more response biases. First, we included organizational size (SIZE) as a control variable. Size is known to potentially confound marketing practices [ 83 ] and organizational performance measures [ 84 ]. For example, compared to larger organizations, smaller organizations were found to be more informal with regard to marketing planning and to use fewer ways to measure performance [ 83 ]. Thus, organizational size may relate to an organization’s content marketing activities and CM effectiveness. Organizational size was measured by asking the key informants for the number of full-time employees, referring to four size categories. Three dummy variables were used, concerning organizations with 500 to 999, 1,000 to 4,999, and 5,000+ employees, respectively. Organizations with 250–499 employees served as the comparative category. Second, we also controlled for an organization’s sector affiliation (SECTOR) . A dummy-coded variable (0 = industrial sector and 1 = services sector) was assigned to the participating organizations. The rationale for selecting sector affiliation as a control was that it is well established that sector characteristics, in particular differences between industry and services, play an important role for organizational behavior and outcomes [ 85 ]. Examples of sector-specific features are legal restrictions, competitive specifics, ethical concerns, or customer specifics [ 86 ].
In content marketing, for example, creating attractive, compelling content may be harder for organizations in industrial sectors.

Measure validation and analytical approach

Measure validation.

As our data met sample size recommendations [ 87 ], we assessed the validity of our measures using confirmatory factor analysis. The analysis was performed using the lavaan package in R. We estimated a measurement model with the seven reflective constructs in our study (CMSTRAT, CPROD, CDIST1, CDIST2, CPROM, CMPERME and CMORG). Regarding the inclusion of the three single-indicator latent variables (CDIST1, CDIST2, CPROM) in the analysis, we followed the recommendations in the literature [ 88 , 89 ] to fix loadings at “.95 * variance” and to calculate error variance as “sample variance of the indicator * (1 - .85)”, thus separating the single indicators from the latent variables. We used the robust Satorra-Bentler MLM estimator, since the multivariate normality assumption was not met (Mardia statistics: skew = 41.95, p < .01 and kurtosis = 374.90, p < .01). The results indicate adequate levels of fit (CFI = 0.97, SRMR = 0.04, RMSEA = 0.05, χ²/df = 145.5/101), in accordance with the guidelines provided by Hu and Bentler [ 90 ].

We assessed convergent validity of the measures by examining factor loadings. The analysis indicated that all factor loadings are high (ranging from 0.58 to 0.92), in line with the guidelines of Hair et al. [ 91 ], and significant. Cronbach’s alphas of all of the measures range from 0.71 to 0.86, surpassing the acceptable level of 0.70, and composite reliabilities also surpass the acceptable level of 0.60 suggested by Fornell and Larcker [ 92 ]. Average variance extracted (AVE), reflecting the amount of variance in the indicators that is accounted for by the latent construct, is a more conservative estimate of the validity of a measurement model [ 92 ], and was also calculated for each construct. With the exception of CPROD (0.45), the AVE for each construct is greater than the 0.50 level recommended by Fornell and Larcker [ 92 ]. In sum (see table in S2 Table ), these results indicate convergent validity of the measures.
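For reference, composite reliability and AVE can be derived from standardized factor loadings via the standard Fornell-Larcker formulas; the sketch below is an illustrative Python reimplementation, not the study's original code:

```python
import numpy as np

def composite_reliability(loadings) -> float:
    """Composite reliability from standardized loadings, with indicator
    error variances taken as 1 - loading**2."""
    lam = np.asarray(loadings, dtype=float)
    error_var = 1 - lam ** 2
    return lam.sum() ** 2 / (lam.sum() ** 2 + error_var.sum())

def average_variance_extracted(loadings) -> float:
    """AVE: the mean squared standardized loading of a construct."""
    lam = np.asarray(loadings, dtype=float)
    return float((lam ** 2).mean())
```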

To test for discriminant validity , we calculated the difference between one model, which constrained the correlations between the constructs (with multiple indicators) to unity (i.e. perfectly correlated), and another model, which allowed the correlations between the constructs to be estimated freely [ 93 ]. This was done for one pair of constructs at a time. For example, in testing CPROD and CMPERME, the chi-square difference test between the two models (χ²d(1) = 362.69, p < .001) affirmed the discriminant validity of these constructs. Similar results were obtained for the other chi-square difference tests, indicating discriminant validity.
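Each pairwise test compares the chi-square values of the constrained and the free model with one degree of freedom; a minimal sketch (our own illustration; for df = 1 the chi-square survival function has a closed form):

```python
import math

def chi2_diff_test(chi2_constrained: float, chi2_free: float):
    """Chi-square difference test (df = 1) between a model fixing a
    construct correlation to unity and a freely estimated model.
    Returns (difference, p-value); a significant difference supports
    discriminant validity of the construct pair."""
    diff = chi2_constrained - chi2_free
    # For 1 df: P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(diff / 2))
    return diff, p_value
```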

To assess content marketing effectiveness, we drew on subjective measures . Part of the literature on performance measurement tends to conclude that subjective measures, compared with objective measures, are less appropriate for performance assessments. It has been argued that managers may tend to overrate their organization’s performance [e.g., 94 ], and that using subjective measures can be problematic when explanatory variables of performance are measured using the same informant, as this can introduce common method bias [ 95 ]. However, as done in prior research [ 96 ], we deliberately decided to rely on managers’ subjective evaluations because of the lack of generally accepted and comparable objective content marketing effectiveness indicators. Moreover, Singh et al. [ 96 ] have demonstrated that carefully collected subjective performance measures can yield reliable and valid data. To alleviate common method concerns, we first used procedural remedies in line with recommendations provided by Podsakoff et al. [ 95 ]. We divided the questionnaire into various subsections, so respondents were required to pause and carefully read instructions for each set of questions, contributing to the psychological separation of predictor and criterion measures. We relied on different scale types to reduce common scale properties. In addition, we kept items specific and labeled every point on the response scales to minimize item ambiguity. We also guaranteed anonymity to diminish the tendency to respond in a socially desirable manner, and we kept the questionnaire as short as possible to maintain motivation to respond accurately. In addition to these procedural remedies, we used the regression-based marker variable technique proposed by Siemsen et al. [ 97 ] to statistically control for potential method bias.
According to this approach, common method bias can be effectively reduced when estimating a regression equation by adding a marker variable that is largely uncorrelated with the substantive variables of interest and suffers from some type of method bias. Hence, we deliberately included impression management , i.e. the conscious attempt to present oneself positively, as a potentially ideal marker variable in our study, based on the expectation that this measure is theoretically unrelated and similarly vulnerable to common method variance relative to other study variables. We measured the impression management form of social desirability via the three-item scale described by Winkler et al. [ 98 ]. Items were on 5-point agreement scales (1 = strongly disagree and 5 = strongly agree). Analysis of our data exhibited only negligible to small bivariate correlations (< .15) of the impression management marker ( IMM ) with the substantive variables of interest, supporting the assumed unrelatedness. Thus, we added the marker variable to our regression analysis, described in more detail below, to control for potential common method bias.

The study variables were on different response scales. Hence, we followed the recommendation from Cohen et al. [ 99 ] to put research findings into common, easily understandable metrics, and used simple linear transformations of the original scale units to convert the scores of all variables into standardized units of 0 to 100 (0, 100 for dichotomous variables), representing the percent of maximum possible (POMP) scores for each scale. This approach simplifies interpretability for example by giving immediate meaning to summary statistics such as means and measures of variability or by facilitating comparisons of scores across constructs.
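The POMP transformation is a simple linear rescaling of each scale to a 0-100 range; as a sketch:

```python
def pomp(raw_score: float, scale_min: float, scale_max: float) -> float:
    """Percent of maximum possible: linearly rescale a raw scale score
    to standardized units of 0 to 100."""
    return (raw_score - scale_min) / (scale_max - scale_min) * 100

# A 5-point agreement scale maps 1 -> 0, 3 -> 50, 5 -> 100;
# a dichotomous 0/1 variable maps onto 0 and 100.
```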

We used linear multiple regression analysis for hypotheses testing in which all variables entered the regression equation on the same step. With regard to Hypothesis 3a, which predicts that content marketing is more effective when an intermediate number of media platforms is used, we categorized, as described above in the measures section, the originally continuous predictor variable so that an intermediate number of media platforms used was assigned the maximum value. Though such categorization is accompanied by loss of information, this allowed us to investigate whether CM effectiveness at an intermediate number of platforms used differed from when more or fewer platforms were used, without resorting to a quadratic function. We proceeded analogously with regard to the analysis of Hypothesis 3b. Statistical analyses were performed using SPSS Statistics 24.0.0.1 software; reporting adheres to the SAMPL guidelines [ 100 ]. Prior to the main analysis, the assumptions of regression analysis were tested. To check linearity between the dependent and the independent variables, we employed partial residual plots of independent variables [ 101 ]. The plots exhibited only minor deviations from linear relations. Hence, we concluded that there was no major problem with the linearity assumption. Regarding multicollinearity, the highest value of the variance-inflation factor was 2.81, and the highest value of the condition index equaled 24.90. Since these values are below the recommended thresholds of 10 and 30, respectively [ 72 ], there is no indication of collinearity concerns. A Shapiro-Wilk test of the residuals (W(263) = 0.985, p < .01) found some evidence of nonnormality and a Koenker test (K = 29.97, p < .01) indicated presence of heteroscedasticity in the residuals. We therefore used the generalized information matrix (GIM) test described by King and Roberts [ 102 ] to detect potential model misspecification.
Since the value (GIM = 1.375) is below the recommended threshold of 1.5, denoting that robust standard errors are not 1.5 times larger than classic standard errors, there is no indication for misspecification. Hence, we proceeded with our model, and to account for nonnormality and heteroscedasticity, we followed the recommendation of Dudgeon [ 103 ] to use HC3 as robust standard error estimator in our regression. Multiple regression with robust standard errors was carried out using the SPSS macro by Daryanto [ 104 ]. A p-value of < .05 was considered significant.
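The HC3 estimator rescales each squared residual by the squared complement of its leverage value; the following Python sketch is our own minimal illustration of the estimator, not the SPSS macro used in the study:

```python
import numpy as np

def ols_hc3(X: np.ndarray, y: np.ndarray):
    """OLS coefficients with HC3 heteroscedasticity-consistent standard
    errors; X must already include an intercept column of ones."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    residuals = y - X @ beta
    # Leverage values h_i = x_i' (X'X)^-1 x_i
    leverage = np.einsum('ij,jk,ik->i', X, XtX_inv, X)
    omega = residuals ** 2 / (1 - leverage) ** 2  # HC3 weights
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Noise-free illustration: y = 2 + 3x is recovered exactly
x = np.arange(20.0)
X = np.column_stack([np.ones_like(x), x])
y = 2 + 3 * x
beta, se = ols_hc3(X, y)
```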

Descriptive statistics

Table 1 lists the means, standard deviations, correlations, and Cronbach’s alphas of the study variables. In line with expectations, CMEFFECT related positively to CMSTRAT (r = .66, p < .001), to CPROD (r = .68, p < .001), to CMPERME (r = .61, p < .001), and to CMORG (r = .62, p < .001). Notably, CMEFFECT was not correlated with CDIST1, CDIST2, and CPROM.

Notes: N = 263. POMP scores for all variables.

a Dummy coded. All |r| > .11 are significant at p < .05, all |r| > .19, p < .01. Cronbach’s alphas for multi-item measures are in italics on the diagonal in the correlation matrix.

Hypothesis testing

Results of the multiple regression analysis with CMEFFECT as dependent variable are presented in Table 2 . The study variables explained a substantial proportion of variance in content marketing effectiveness (R² = .61, F(12, 250) = 36.71, p < .001). In Hypothesis 1, we expected that there would be a positive association between a strong content marketing strategizing context, characterized by strategic clarity and strategy commitment, and content marketing effectiveness. The regression coefficient indicates that, as we hypothesized, CMSTRAT is significantly and positively associated with CMEFFECT (β = .23, t(250) = 2.94, p < .01). Therefore, the data support Hypothesis 1.

Note: N = 263.

With regard to Hypothesis 2, we predicted that a strong content production context, characterized by efforts to optimize customer-perceived content value and to adhere to normative quality criteria, should be associated with higher content marketing effectiveness. Results showed that CPROD was positively related to CMEFFECT (β = .37, t(250) = 5.05, p < .001). Thus, Hypothesis 2 cannot be rejected. Hypotheses 3a and 3b predicted that two aspects of content distribution, the usage of an intermediate number of media platforms and a joint deployment of print and digital media platforms, each affect content marketing effectiveness. However, results showed that CDIST1 (β = .01, t(250) = .29, p = .77) and CDIST2 (β = -.02, t(250) = -.50, p = .62) were not significantly related to CMEFFECT. Therefore, Hypotheses 3a and 3b are not supported by our data. Related to Hypothesis 3a, we conducted two exploratory post-hoc analyses to examine whether there might be (a) a linear relationship between the number of content distribution platforms used and content marketing effectiveness, or (b) an inverted U‐shaped relationship between the number of content distribution platforms used and content marketing effectiveness. With regard to (b), we introduced the square of the number of media platforms used as a new variable in the regression model in addition to the number of platforms used. With respect to Hypothesis 3b, we also conducted (a) a post-hoc analysis to test an alternative model that included the potential effect of focusing on print or digital media platforms on content marketing effectiveness, and (b) an analysis testing for a U‐shaped relationship between the share of content distribution budget allocated to digital media platforms and content marketing effectiveness. With regard to (b), we introduced the square of the budget share as a new variable in the regression model in addition to the budget share. 
However, none of these post-hoc analyses yielded significant effects. In Hypothesis 4, we predicted that there would be a positive relation between a strong content promotion context in terms of paid content promotion budgets and content marketing effectiveness. With respect to this hypothesis, CPROM was not found to have a significant impact on CMEFFECT (β = .02, t(250) = .41, p = .69). Hence, we find no support for Hypothesis 4. To further evaluate the relationship between content promotion and content marketing effectiveness, we conducted an additional exploratory post-hoc analysis. We tested an alternative model that assessed whether the number of content promotion measures is positively related to content marketing effectiveness. The number of measures was also not linked to content marketing effectiveness. Hypothesis 5 stated that content marketing is more effective when organizations have a stronger content marketing performance measurement context. Regarding this hypothesis, the regression coefficient indicates that CMPERME is significantly and positively associated with CMEFFECT (β = .18, t(250) = 2.69, p < .01). This is the hypothesized outcome, and therefore the data support Hypothesis 5. Furthermore, a specialized content marketing organization with supporting processes and information technology systems (CMORG) was found to have a positive effect on content marketing effectiveness (CMEFFECT) (β = .14, t(250) = 1.97, p < .05), as we hypothesized in Hypothesis 6. Consequently, Hypothesis 6 cannot be rejected.

Finally, we conducted a robustness check of our results by adding the respective organization’s annual content marketing budget to the model. Including this variable into our model did not change our findings, all the variables that were significant remained significant, while the overall annual budget was not significant (β = -.04, t(245) = -0.70, p = .48).

This study examined whether and how the organizational context in which content marketing activities are embedded determines content marketing effectiveness. We conceptualized and empirically tested a model proposing that strong content marketing strategizing, content production, content distribution, content promotion, content marketing performance measurement, and structural and processual contexts drive content marketing effectiveness.

Summary of findings and theoretical implications

Considered together, our analysis of the data reveals that context features have a substantial impact on the effectiveness of content marketing activities. Table 3 summarizes the findings.

Notes: + = a positive hypothesized relationship. Yes = the hypothesis was supported. No = the hypothesis was not supported.

Regarding the strategizing context, we found that a well-defined content marketing strategy that is clearly communicated, thoroughly understood by managers and employees, and widely supported within the organization positively influences content marketing effectiveness. The demonstration of this link between strategic clarity and strategy commitment on the one hand and content marketing effectiveness on the other hand adds to the theoretical and empirical elaboration of the determinants of content marketing effectiveness while incorporating insights from strategy research [ 24 , 25 , 27 , 28 ] into the content marketing domain.

In addition, we found that a strong content production context, characterized by the optimization of customer-perceived content value and adherence to normative content quality criteria, has a significant, positive impact on content marketing effectiveness. Our results support the line of reasoning in the uses-and-gratifications as well as the information quality literature [ 29 – 32 ] that providing content aligned with a target group’s subjective judgement of usefulness will increase the likelihood that content is interacted with, in turn positively influencing content marketing effectiveness. While prior content marketing research focused on this argument [e.g., 3 ], we also introduce the compliance with normative content quality criteria (such as diversity of viewpoints or impartiality) as a novel content production context factor that positively influences content marketing effectiveness. From this perspective, the integration of research on journalistic quality into theories about content marketing effectiveness is essential for the progress of knowledge about content marketing effectiveness.

With regard to the content distribution context, we did not find that the usage of an intermediate number of media platforms has a positive influence on content marketing effectiveness. This finding is noteworthy since research on integrated marketing communications generally assumes that using multiple media platforms will increase the effectiveness of communications efforts but that deploying too many media properties will attenuate effectiveness [ 37 , 38 , 40 , 41 ]. One reason for our result could be that the assumption of reactance theory underlying our hypothesis, that, from a certain point, the negative consequences of using an increasing number of media platforms outweigh the positive effects [ 41 ], does not hold. This explanation would be supported by a positive linear association between the number of content distribution platforms used and content marketing effectiveness. However, our post hoc analysis did not provide any evidence for this kind of relationship. Contrary to expectations, we also did not find a positive influence of a joint deployment of print and digital media platforms on content marketing effectiveness. In addition, post hoc analyses showed no significant effects of focusing on print or digital platforms only on CM effectiveness. These findings suggest that there is no general difference in effectiveness between these two kinds of media platforms, a result similar to the conclusion by Kwon and colleagues [ 105 ]. Heterogeneity of preferences theory suggests one interpretation for this [ 41 ], positing that media platform preference is idiosyncratic and that heterogeneity in individual platform preferences influences customer response to content marketing activities. 
Taking the aforementioned results together, the present study advances research on content marketing effectiveness by suggesting that effectiveness may be less a question of how many or whether print or digital content distribution vehicles are used, but more of utilizing precisely those media platforms that are best aligned with the respective organization’s target groups’ preferences. Following up on this, further research on the effects of using various content distribution platforms on content marketing effectiveness is warranted.

The present study did not find a positive relationship between paid content promotion budgets and content marketing effectiveness. This is not what we expected. However, empirical evidence from the field of advertising effectiveness research suggests an interpretation of the finding that more paid media investments are not always consistent with higher performance. According to respective descriptive knowledge [ 106 ], a metric that determines the level of performance is excess share of voice, defined as a brand’s share of voice minus share of market. Arguably, then, the amount invested in paid content promotion by a brand would have to be related to the total amount invested in paid content promotion in the brand’s category, and to the brand’s market position. Also, the contribution of paid content promotion to content marketing effectiveness could be shaped by the balance between paid promotion and owned content distribution platforms (e.g., [ 107 ]). This research therefore highlights that further work is needed to untangle the conditions under which paid content promotion measures might positively influence content marketing effectiveness.

Our theoretical elaboration and empirical investigation also provided evidence that core elements of the content marketing performance measurement context–regularly measuring the performance of print and digital content platforms and actually using the data obtained as guidance for continuously improving content offerings–positively influence content marketing effectiveness. Though previous research has shown positive performance implications of performance measurement in contexts other than content marketing [e.g., 55 – 57 ], this is the first study to successfully demonstrate this relationship for the content marketing domain. Our research thus expands previous research on CM effectiveness by incorporating performance measurement as a central element of a model of content marketing effectiveness. This finding might also have implications for future research, e.g. regarding the optimal configuration of content marketing performance measurement systems.

Finally, our work extends previous research on content marketing effectiveness by including structural specialization and specialization enabling processes and information technology systems as a new factor that positively influences content marketing effectiveness. The demonstration of the link between organizational structural and processual design elements on the one hand and content marketing effectiveness on the other hand lends support to researchers, such as Lee et al. [ 62 ], who have called for a new perspective of structural marketing, recognizing the importance of using organizational design elements to achieve marketing outcomes.

Overall, the aforementioned findings are important given the centrality of empirical insights regarding the optimal design and implementation of content marketing initiatives to current academic interest [ 3 , 5 , 8 ].

Management implications

The present study has important implications for practice as well. It clearly identifies four context factors that positively influence content marketing effectiveness. However, it is noteworthy that the strength of the relationship between each of these factors and content marketing effectiveness varies. This implies that managers, if necessary due to budget or attention restrictions, could prioritize improvements in the content marketing context factors in line with the order of importance for effectiveness found in this study: (1) content production context, (2) content marketing strategizing context, (3) content marketing performance measurement context and (4) content marketing organization. Nevertheless, efforts to drive improvement in a single context domain are less beneficial than a comprehensive effort to establish strong content marketing context conditions across the entire range of content marketing activities.

In the following sections, we present individual management recommendations, based on the order in which the various context areas in this study were found to be important.

We first advise managers to constitute a strong content production environment. To do so, we encourage content marketing executives to systematically evaluate and optimize customer-perceived content value, which means putting the audience and its needs and wants first while at the same time keeping an eye on the organization’s communications objectives without becoming self-centered. Moreover, our findings provide a powerful argument that organizations should not compromise on the journalistic quality of their content, but instead strive for creating content pieces that stand out regarding journalistic aspects such as narrative perspective, originality, diversity of viewpoints, accuracy, comprehensibility, or compliance with ethical standards.

Our findings also suggest that a strong content marketing strategizing context is associated with higher content marketing effectiveness. In this respect, managers should work towards establishing strategic clarity. To do so, crafting a compelling content marketing purpose and vision, formulating clear content marketing goals and objectives, defining content creation principles and standards, clarifying key stories and main topics, developing customer personas, carefully considering which content formats are most appropriate for the audiences being targeted, or planning content that is matched to customers’ buying processes would be beneficial for marketers. In addition, our findings suggest that practitioners should pursue strengthening commitment to the content marketing strategy within the organization. Possible measures to enhance comprehension and backing of the content marketing strategy include regularly communicating its core pillars, rigorously and openly addressing areas of concern, explaining strategic decisions, continuously training employees, or fostering strategic conversations (e.g., [ 108 ]).

Third, we highly recommend establishing a strong content marketing performance measurement context, as our findings indicate that it is associated with a higher level of content marketing effectiveness. Establishment of a strong content marketing performance measurement context requires content marketers to shift part of their content marketing budgets from actual content marketing initiatives to measurement and analytic efforts. Doing so would be counterproductive if it did not enhance content marketing effectiveness. Our research supports exactly such a reallocation of resources, demonstrating that it can positively affect content marketing effectiveness.

Fourth, our investigation implies that shaping the structural and processual context of content marketing activities is a central task of managers since a specialized organizational context unfolds positive effects on content marketing effectiveness. One promising way to advance structural specialization is setting up organizational platforms offering shared and specialized working environments, often referred to as brand newsrooms or content factories. Such platforms could include various desks dedicated to specific topics, media, and target groups, teams devoted to strategy, project management, and further service areas such as graphics, video, or analytics, and an editorial board ensuring integration. To unleash agility, these structures should be supported by processes and underlying information technology solutions enabling interaction and collaboration between content marketing specialists as well as integration with further marketing functions and other relevant organizational entities.

Finally, our study questions the current high level of practitioner enthusiasm for focusing on digital content distribution platforms and multichannel communications. In light of this study’s findings, it seems beneficial for organizations to utilize precisely those media platforms and systems that are best aligned with the respective organization’s target groups’ preferences. Caution is also advised regarding practitioner enthusiasm for paid content promotion measures. “Pay to play” measures such as influencer marketing, social media advertising or native ads in editorial environments have been presented as indispensable means to boost content marketing reach and thus improve content marketing effectiveness. However, we do not observe any simple and direct positive effect of content promotion budgets on content marketing effectiveness. As this is one of the first investigations to examine the impact of paid content promotion in the content marketing domain, and given that the use and functionality of content promotion measures evolve continuously, our findings are preliminary. Scholars and practitioners need to further explore this emerging field.

Limitations and research directions

As with all empirical research, the present investigation has limitations that call for attention in interpreting its findings. First, the data was cross-sectional, which prohibits unambiguously interpreting the findings as indicating causality. Still, based on the theoretical argumentation provided above, the directions of causality implied in this study are likely. Future research might try to replicate these relationships via longitudinal or experimental study designs. A second limitation is that, though the study included organizations from various sectors and across different size categories, the sample is rather homogeneous with respect to cultural factors, as all participating organizations were located in Germany, Switzerland or Austria. Hence, and given the global nature of content marketing research, scholars could investigate the suggested relationships in other contexts in order to further generalize the current findings. Third, the measurement of content marketing effectiveness is a potential limitation of this investigation, since we relied on subjective ratings rather than objective data. Thus, researchers might validate our findings with objective content marketing performance data. Fourth, the study builds upon the views of a single key informant in every organization. While the key informant approach is common, relying on multiple informants from each organization might provide an even more balanced view. Fifth, as mentioned earlier, the lack of any evidence of effects of the content distribution and content promotion contexts on content marketing effectiveness could be due to the way we framed them in this study. Therefore, other conceptualizations are worth investigating, including considering interactions of these context factors, as each factor’s contribution to content marketing effectiveness might be contingent upon the other. Finally, only a limited number of potential confounders could be taken into account in this study.
We adjusted for potential effects of firm size and industry, controlled for social desirability, and conducted an additional robustness check of our results that included the respective organization’s annual content marketing budget. In the future, researchers could map out the nomological network of the research field in more detail using causal graph analysis [ 81 ], and subsequently conduct studies including further control variables to rule out alternative explanations for the observed relationships.

Beyond addressing limitations, this study offers a number of additional directions for prospective research. For example, given that a strong content marketing performance measurement context offers demonstrable benefits, scholars might consider whether certain findings from the general marketing performance measurement field [e.g., 55 , 109 ] also apply to the content marketing domain. Research might, for example, explicitly take into account whether content marketing performance measurement is comprehensive or selectively focused on particular dimensions, because larger organizations could benefit from more comprehensive and smaller organizations from more focused approaches. Furthermore, future studies may explore the influence of the organizational content marketing context on content marketing effectiveness via structural characteristics other than specialization; other major structural characteristics, such as centralization, formalization, or modularity, might also exert influence on content marketing effectiveness. Importantly, future research might investigate mediating or moderating variables, such as external environmental effects. Market turbulence, for example, may moderate the value of content marketing context factors. Such investigations could further deepen the understanding of the determinants of content marketing effectiveness.

Supporting information

S1 Appendix

Acknowledgments

We gratefully acknowledge the valuable comments of Vanessa Haselhoff in the development of earlier drafts of this article.

Funding Statement

The author received no specific funding for this work.

Data Availability

All relevant data are within the paper and its Supporting Information files.

A systematic literature review: digital marketing and its impact on SMEs

Journal of Indian Business Research

ISSN : 1755-4195

Article publication date: 20 February 2023

Issue publication date: 3 March 2023

Purpose

This study aims to analyze the available literature on the use of digital marketing and its impact on small- and medium-sized enterprises (SMEs). It identifies the digital marketing practices SMEs use and their impact on these firms.

Design/methodology/approach

A systematic literature review has been conducted on digital marketing and its implementation in SMEs. The impact of digital marketing on SMEs’ performance over the past 12 years is observed through the following resources, which were used to search for research publications on the selected topic: Science Direct, Scopus, Springer, IEEE Xplore, ACM Digital Library, Engineering Village, and the ISI Web of Knowledge database.

Findings

Although some SME firms use digital marketing, its impact is not uniform, so no single fixed strategy for applying digital marketing can be recommended. This review provides insight into how digital marketing has evolved over time and how SMEs are adopting it for their sustenance.

Practical implications

This study provides a theoretical analysis of the various benefits SMEs receive from digital marketing in different capacities, helping organizations to raise their productivity. Mind mapping illustrates the impact of digital marketing on SMEs’ performance in rural as well as urban areas. The study also gives digital marketers further scope to approach industries, particularly in rural parts of the nation, to change their marketing operations and increase turnover through the use of digital marketing.

Originality/value

Research on the use of digital marketing by SME firms is still at an embryonic stage in India. This study is a pioneering effort to review the use of digital marketing in SMEs and identify research priorities for scholars and practitioners.

  • Literature review
  • Digital marketing
  • Small and medium enterprises
  • Impact on SMEs

Jadhav, G.G. , Gaikwad, S.V. and Bapat, D. (2023), "A systematic literature review: digital marketing and its impact on SMEs", Journal of Indian Business Research , Vol. 15 No. 1, pp. 76-91. https://doi.org/10.1108/JIBR-05-2022-0129

Copyright © 2023, Emerald Publishing Limited

Extracting marketing information from product reviews: a comparative study of latent semantic analysis and probabilistic latent semantic analysis

  • Original Article
  • Published: 08 April 2023
  • Volume 11 , pages 662–676, ( 2023 )

  • Shimi Naurin Ahmad (ORCID: orcid.org/0000-0002-7266-4506)
  • Michel Laroche


User-generated content (UGC) contains customer opinions which can be used to hear the voice of customers. This information can be useful in market surveillance, digital innovation, or brand improvisation. Automated text mining techniques are being used to understand these data. This study focuses on comparing two common text mining techniques, namely Latent Semantic Analysis (LSA) and Probabilistic Latent Semantic Analysis (pLSA), and evaluates the suitability of the methods in two differing marketing contexts: reviews from a product category and reviews of a single brand from Amazon. The objectives of review summarization are fundamentally different in these two scenarios. The first scenario can be considered market surveillance, where important aspects of the product category are to be monitored by a particular seller. The second scenario examines a single product and is used to monitor in-depth customer opinions of that product. The results support that, depending on the objective, the suitability of the technique differs. Different techniques provide different levels of precision and understanding of the content. The power of machine learning methods, domain knowledge, and marketing objectives needs to come together to fully leverage the strength of this huge user-generated textual data for improving marketing performance.

Introduction

Due to the proliferation of the internet, e-commerce and social media, text data are readily available on the web. These text data can be a great resource for marketers who are eager to listen to customers to better manage the marketing process. Traditionally, marketers send surveys to customers; nowadays, product reviews, blogs and other digitized communication provide information on attributes that are relevant to marketing decision making, such as the adoption of a new product, the possible composition of a consideration set, etc. (Lee and Bradlow 2011 ). Due to the large volume and unstructured nature of text data, sophisticated modeling is warranted, and researchers have been using data analytics, specifically machine learning techniques, to uncover patterns that help the business (Mikalef et al. 2020b ).

A large number of studies have examined the effect of big data analytics on firm performance, and the evidence quite overwhelmingly suggests that data analytics improves a firm’s decision making and innovation (Branda et al. 2018 ; Gupta and George 2016 ), customer relationship management, management of operational risk and efficiency, market performance (Wamba et al. 2017 ) and, ultimately, overall performance (Kiron 2013 ). By providing accessible information to managers, data analytics creates a competitive advantage (Mikalef 2020a ). Studies have also shown a positive association between customer analytics and firm performance (German et al. 2014 ). Customer analytics may tap into different areas of customer experience, ranging from purchasing behavior and prediction of buying trends to product recommendation, co-creation (Acharya et al. 2018 ), and opinion summarization about a specific feature of a product.

The task of summarizing opinions or reviews has become one of the central research areas among the text mining community, mainly in the information retrieval literature (Mudasir et al. 2020 ; Hu et al. 2017 ). The techniques are becoming more sophisticated, and studies are increasingly reporting methods for extracting aspects/topics, textual summaries, etc. (Mudasir et al. 2020 ). The different formats and techniques provide different levels of understanding or precision of the content. Therefore, the users need to adapt these methods according to their own needs.

In an influential paper on automated text analysis in marketing, Berger et al. ( 2020 ) emphasized that regardless of the focus (to make a prediction, to assess impact, or to understand a phenomenon), “doing text analysis well requires integrating skills, techniques, and substantive knowledge from different areas of marketing.” (Berger et al. 2020 , p. 6). Text analysis yields its best results when positivist analysis (factual knowledge gained by a scientific process, usually a quantitative method) is used in combination with qualitative and interpretive analysis. For example, Kubler et al. ( 2017 ) used a tailored marketing dictionary, which allows the analysis to be interpreted in the marketing context rather than in a general context; a word may have a different interpretation depending on the context where it is used. The authors then utilized this exclusive dictionary (tailored for the marketing domain) inside an automatic text-analysis-based sentiment extraction tool, namely a support vector machine (Cui and Curry 2005 ), to uncover different marketing metrics from user-generated content. Berger et al. ( 2020 ) further elaborated this point and explained that quantitative skill helps build the right mathematical model, but behavioral skill relates the findings to underlying psychological processes, and, most importantly for marketers, strategy skill, defined as the ability to understand the findings from big data and convert them into a firm’s actionable items and outcomes, helps reach the firm’s goals. Therefore, these text data can ultimately aid a firm’s marketing decision making and be a great resource, but the combination of the above-mentioned tools seems necessary. In this light, it is very important that marketers build their tools using machine learning techniques as well as the other soft skills, especially marketing-specific knowledge, to get the most out of the data (Ma and Sun 2020 ).
However, the marketing analytics literature is still immature in providing guidance about the suitability of a particular analytics tool in crafting an overall firm strategy (Vollrath and Villegas 2022 ). Machine learning and text mining experts are skilled in building accurate and precise mathematical models, and often their goal is to improve prediction. The “right answer” might be different for different objectives. Therefore, when the goal is to improve overall marketing metrics, it is recommended in the literature that domain knowledge be incorporated in the process (Hair and Sarstedt 2021 ). In a recent paper, Huang and Rust ( 2021 ) elaborated that artificial intelligence use in marketing should proceed in three stages: “Mechanical AI” for repetitive tasks, “Thinking AI” for analyzing data and making decisions, and “Feeling AI” for understanding consumers and interacting with them. The latter two need domain knowledge as input to optimize the goal of improving marketing metrics. This paper responds to that call of integrating quantitative models with goal-specific domain knowledge to better assist managers in taking actions. A firm’s marketing decision making through text analysis is better served when domain knowledge is incorporated rather than borrowing a predefined model invented for a different purpose.

As mentioned before, opinion summarization has been an active area of research in the information retrieval literature for decades now, and marketers need to tailor these methods according to their objectives and needs to leverage the strength of this huge textual data. The strength of statistical methods and the goals of marketers need to come together to fully utilize this opportunity. With this in mind, the current research focuses on comparing two common text mining techniques from a marketer’s perspective. Analyzing the text data of reviews posted on Amazon.com, the current study compares Latent Semantic Analysis (LSA) (Deerwester et al. 1990 ) and Probabilistic Latent Semantic Analysis (PLSA) in extracting useful summarizing information in terms of common themes. In the first context, the reviews are taken from the kitchen appliances category, where the dataset contains different brands and several kitchen products. In the second, reviews of only one brand of product, in the handbag category, are examined. The objectives of review summarization are fundamentally different in these two scenarios when the analysis is intended to provide market research information. The first scenario provides information about the whole market in that product category. It can be considered market surveillance, where important aspects of the product category are monitored by a particular seller to find out which characteristics of the product category are of main concern. These are also the key aspects of the whole customer experience that determine customer satisfaction or dissatisfaction. The insight can be used to improve an offering through innovation or by adding a digital aspect to it, also known as digital innovation (Sahut et al. 2020 ). Since the information comes from consumers’ organic text, it is comparatively free from bias and does not restrict topics, a common problem even in well-crafted surveys (Savage and Burrows 2009 ).
The second scenario examines a single brand. This is useful for brand managers when an in-depth analysis of consumers’ opinions is sought. There have been studies that have looked at the performances of LSA and PLSA (Ke and Luo 2015 ; Kim and Lee 2020). However, to the best of our knowledge, there is no study that compares these two methods in two different scenarios where marketing goals differ and evaluates the suitability of the techniques in these contexts.

The rest of the paper is organized as follows: we review the literature on user-generated content, customer analytics, opinion summarization, and some text mining tools. The experimentation is presented next along with the findings, followed by the discussion and managerial implications.

Literature review

User-generated content

UGC refers to any content created by users or consumers of a product or service, such as product reviews, social media posts, blog articles, and videos. In recent years, UGC has exploded, often in the form of text data such as blogs, reviews, or social media interactions. Scholars have examined a range of issues (Iacobucci et al. 2019 ), such as how and why people make UGC contributions (Braune and Dana 2021 ; Moe and Schweidel 2012 ; Ransbotham et al. 2012 ) and the impacts of UGC (Zhang et al. 2012 ), including review ratings and text (Sallberg et al. 2022 ), among others.

User-generated content (UGC) can benefit firms in several ways, including increased customer engagement (Bijmolt et al. 2010 ), improved brand loyalty (Llopis-Amorós et al. 2019 ), and brand co-creation (Koivisto and Mattila 2020 ). A study by Constantinides and Fountain ( 2008 ) found that UGC can positively impact the credibility and perceived quality of a brand, leading to increased brand loyalty and purchase intentions. Additionally, UGC can enhance the authenticity of a brand by providing real-life examples of product usage and customer experiences. More importantly, UGC can also provide valuable insights into customer preferences, needs, and touch points, which can help firms improve their products and services. In a study by Bernoff and Li ( 2008 ), it was found that UGC can help firms identify customer needs and trends, leading to improved innovation and product development.

In a study of UGC and its impact, Li et al. ( 2021 ) modeled the consumer purchase decision process and found evidence that UGC impacts every stage of this process. UGC can also provide valuable insights and ideas that firms can use to develop new products, services, or marketing strategies (Hanna et al. 2011 ).

Although UGC can be generated in various forms, product review ratings and content are very influential in terms of sales (Mudambi and Schuff 2010 ). The impact of review ratings on product sales has been thoroughly studied across various product categories (Chevalier & Mayzlin 2006 ; Liu 2006 ). The sales of books (Chevalier & Mayzlin 2006 ) and movies (Liu 2006 ) were affected by the ratings of user-generated reviews. Research has also explored the impact of review content on marketing parameters, such as the helpfulness vote (Ghose and Ipeirotis 2011 ), consumer engagement (Yang et al. 2019 ) and digital innovation (Sahut et al. 2020 ). Although these data can provide valuable information about markets and customers, it can be hard to decipher the actual information from the unstructured data (Zhu et al. 2013 ), which gave rise to customer analytics.

Customer analytics

As mentioned before, a large number of studies have investigated the relationship between big data or customer analytics and firm performance, and results suggest that data analytics enhances a firm’s decision making and innovation (Branda et al. 2018 ; Gupta and George 2016 ). To analyze customer-generated text data, which most commonly occur across the web, marketing scholars are using text analysis tools and methods to analyze these data automatically (Kamal 2015 ). These data types and analytical methods vary widely across different branches of marketing analytics (Iacobucci et al. 2019 ). Many cutting-edge methods have been used by marketing scholars to analyze UGC, and consumer reviews in particular.

Ghose and Ipeirotis ( 2011 ) showed strong evidence that consumer reviews affect economic outcomes and product sales, and that aspects of reviews such as subjectivity, informativeness, readability, and linguistic correctness affect potential sales and perceived usefulness. They used a random forest model and text mining to uncover these insights. Netzer et al. ( 2012 ) derived a market-structure perceptual map using consumer review data on diabetes drugs and sedan cars. The authors combined text mining techniques and network analysis to introduce this map.

With a slightly different focus, Hou et al. ( 2022 ) studied the driving factors of web-platform switching behavior using a dataset of both blogging and microblogging activities of the same set of users. The authors applied a sophisticated analysis technique, multistate survival analysis. Skeen et al. ( 2022 ) took a very innovative approach, combining qualitative analysis with natural language processing to design a customer-centered mobile health app.

Given this huge amount of user-generated content, it is quite useful to summarize consumers’ opinions at the aggregate level and derive marketing information from them. Li and Li ( 2013 ) summarized a large volume of microblogs to discover market intelligence. Since our study is closely related to this area, we next review the literature on opinion summarization and sentiment analysis.

Opinion summarization and sentiment analysis in marketing

As the name implies, opinion summarization provides an idea about the whole document collection in brief. There is vast research investigating summarization algorithms using different technical methods (Moussa et al. 2018 ). Among marketing-related opinion summarization techniques, Vorvorean et al. ( 2013 ) introduced a social media analytics method that can decipher the topics of UGC, assess a major event, and ultimately have a useful impact on a marketing campaign.

Sentiment classification is one of the important steps in analyzing text data and can be used as part of opinion summarization. In this process, the orientation of sentences or whole documents is identified. This results in an overall summarization of the documents, as users get an idea of what is being said (positive or negative). Several approaches identify sentiment by finding the adjectives in the text and thus inferring the positivity or negativity of the text (Li et al. 2018 ; Salehan and Kim 2016 ). Salehan and Kim ( 2016 ) used sentiment analysis to see the impact of online consumer reviews in terms of their readership and helpfulness.

Sentiment classification can be used as a simple summary; this method is very useful when a large collection of data is involved and aggregate-level opinion is sought. Some technical studies (Jimenez et al. 2019 ; Kamps and Marx 2001 ) used a WordNet-based approach, using the semantic distance from a word to “positive” and “negative” as a classification criterion between sentiments. Ku et al. ( 2006 ) used term frequency for feature identification and used sentiment words to assign opinion scores. Lu et al. ( 2009 ) used natural language processing techniques to extract K ( K  = any number) interesting aspects and utilized a Bayes classifier for sentiment prediction.
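As a toy illustration of the lexicon-based scoring idea described above, one might count matches against positive and negative word lists; the lists and reviews below are invented for the example, not drawn from WordNet or any published lexicon:

```python
# Minimal lexicon-based sentiment scorer: count positive vs. negative
# opinion words in a review and label the overall orientation.
POSITIVE = {"great", "sturdy", "quiet", "love", "excellent", "durable"}
NEGATIVE = {"noisy", "broke", "flimsy", "terrible", "leaks", "cheap"}

def sentiment(review: str) -> str:
    words = review.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("Great blender, sturdy and quiet"))   # → positive
print(sentiment("The strap broke and the bag leaks")) # → negative
```

A real system would add negation handling, stemming, and a domain-tailored lexicon, as the dictionary-based studies cited above suggest.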

As mentioned before, extracting common themes along with its sentiment from user-generated content can be considered as summarizing the content since it tends to reflect the whole content. Next, we review some of the text analysis techniques that have been used in prior research.

Text analysis tools and methods

Studies have used a wide variety of techniques to analyze texts, and especially to extract themes from texts. One of the foundational techniques for extracting themes from a body of text is Latent Semantic Analysis (LSA). Many studies have used LSA for the purpose of opinion summarization (Steinberger and Ježek 2009 ). Sidorova et al. ( 2008 ) used LSA to uncover the intellectual core of information systems research from published journal papers. The method relies mainly on word co-occurrence and is not based on statistical modeling. Cosine distance in the latent semantic space can also be used to measure topics in the text (Turney and Littman 2003 ).

Another stream of techniques for extracting themes is defined as generative probabilistic modeling and is based on a solid statistical foundation. Vocabulary distribution is used to find the topics of texts. Basically, it first identifies word frequencies and the relations between words (co-occurrences). There are several topic modeling approaches in this family; Probabilistic Latent Semantic Analysis (PLSA) (Hofmann 1999 ) and LDA (Latent Dirichlet Allocation) are the important ones. Table 1 identifies some key literature using these methods:

Comparative studies between LSA and PLSA

Some studies have compared these two techniques (LSA vs. PLSA) in various contexts. One study (Kim et al. 2020 ) compared two text mining techniques to predict blockchain trends by analyzing the abstracts and topics of 231 papers. The techniques were W2V-LSA, an improved version of LSA, and PLSA. The study concluded that the new technique, W2V-LSA, worked better in finding proper topics and in showing a trend. Ke and Luo ( 2015 ) compared LSA and PLSA as automated essay scoring tools. The results showed that the two methods’ performances were correlated, and both did well at the task. Taking a slightly different angle, a study by Cvitanic et al. ( 2016 ) compared the suitability of LDA and LSA in the context of the textual content of patents. The study suggested that more work is needed to recommend one method over the other for analyzing and categorizing patents.

Although along the same lines, the current study does not fully focus on summary presentation; instead, it focuses on the features and their sentiment orientation that are visible in the topics. Summary presentation is often used to make the summary of the reviews more understandable to customers. From a managerial perspective, managers need to know in detail what is being said about a particular feature. Therefore, the current study examines topic extraction and the suitability of these two techniques from a managerial perspective. As mentioned in the previous paragraph, there have been studies in which the performance of these two techniques is compared. Some found evidence of the superiority of one method, some reported similar efficiency, and some recommended further study before concluding. However, to the best of our knowledge, no study has looked at these methods in two different contexts with varying objectives. Given the new understanding of automatic text analysis, where quantitative skill is to be combined with domain knowledge, and the fact that retrieval accuracy is not the focus in marketing, the current study tries to fill this void in the research.

Methods and data

For the purpose of this study, as a starting point for domain-specific tool adaptation, we use two fundamental topic modeling techniques (LSA and PLSA). Both use topic modeling algorithms, and the basic assumptions of this type of modeling algorithm are (a) each document consists of a mixture of topics, and (b) each topic consists of a collection of words. LSA is one of the foundational techniques in topic modeling. LSA takes a document-term matrix and decomposes it into two reduced-dimension matrices: a document-topic matrix and a topic-term matrix. The whole technique is based upon singular value decomposition (SVD) and dimension reduction. pLSA, on the other hand, belongs to another stream of techniques within topic modeling. It is based on a probabilistic method; instead of the SVD used in LSA, pLSA tries to fit a probabilistic model with latent topics which can ultimately reproduce the data. Other topic modeling techniques build on pLSA, such as LDA (Latent Dirichlet Allocation), which is basically a Bayesian version of pLSA and therefore uses Dirichlet priors. Next, we describe the methods in detail:

Latent semantic analysis

Latent Semantic Analysis (LSA) is a text mining technique that extracts concepts hidden in text data. It is based solely on word usage within the documents and does not use an a priori model. The goal is to represent the terms and documents with fewer dimensions in a new vector space (Han and Kamber 2006 ). Mathematically, this is done by applying singular value decomposition (SVD) to a term-by-document matrix X that holds the frequency of terms in all the documents of a given collection. The new vector space is created by retaining a small number of significant factors k, and X is approximated by X ≈ T_k S_k D_k^T (Landauer et al. 1998 ). Term loadings (L_T = T_k S_k) are rotated (varimax rotation is used) to obtain meaningful concepts of the document collection. The algorithm is shown in Fig.  1 . It is implemented using Matlab.

figure 1

Algorithm flow chart (LSA)

Probabilistic latent semantic analysis

Probabilistic Latent Semantic Analysis (pLSA) is another text mining method, developed after LSA (Hofmann 1999 ). Unlike LSA, it is based on a probabilistic method, namely maximum likelihood estimation instead of singular value decomposition. The goal is to recreate the data in the term-document matrix by finding the latent topics. A model P(d,w) is put forward, where document d and word w are in the corpus and P(d,w) corresponds to that entry in the document-term matrix. In this scenario, a document is sampled first; within that document, a topic z is sampled; and based on the topic z, a word w is chosen. Therefore, d and w are conditionally independent given a hidden topic z. This generative process is represented in Fig.  2 :

figure 2

A document is selected from the corpus with probability P(d). Within the selected document, a topic z is chosen from a conditional distribution with probability P(z|d), and a word is selected with probability P(w|z). The model makes two assumptions: first, each joint observation (d,w) is sampled independently; second, and more importantly, words and documents are conditionally independent given the topic.
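This generative story can be made concrete with a small sampler. The distributions below are hypothetical toy values for illustration, not fitted parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy distributions: 2 documents, 2 topics, 3 words (illustrative only)
Pd = np.array([0.6, 0.4])                      # P(d)
Pz_d = np.array([[0.9, 0.1],                   # P(z|d), one row per document
                 [0.2, 0.8]])
Pw_z = np.array([[0.7, 0.2, 0.1],              # P(w|z), one row per topic
                 [0.1, 0.3, 0.6]])

def generate_word():
    """One draw from the pLSA generative story: d -> z -> w."""
    d = rng.choice(len(Pd), p=Pd)              # pick a document with P(d)
    z = rng.choice(Pz_d.shape[1], p=Pz_d[d])   # pick a topic from P(z|d)
    w = rng.choice(Pw_z.shape[1], p=Pw_z[z])   # pick a word from P(w|z)
    return d, z, w

d, z, w = generate_word()
```

Note that w depends on d only through z, which is the conditional-independence assumption the model makes.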

After some mathematical manipulation, the model can be written in the following form:

P(d, w) = P(d) Σ_z P(z|d) P(w|z)

The model parameters are commonly trained using the Expectation–Maximization (EM) algorithm. The equation estimates the odds of finding a certain word within a chosen document: the likelihood of observing the document and then, based on the distribution of topics in that document, the odds of finding the word within each topic. The algorithm is shown in flowchart form in Fig.  3 :

figure 3

Algorithm flow chart (PLSA)
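For readers who want to experiment, the EM training loop for pLSA can be sketched as follows. This is a minimal NumPy implementation in the symmetric parameterization P(z), P(d|z), P(w|z), not the authors' original code:

```python
import numpy as np

def plsa(X, K, n_iter=100, seed=0):
    """Fit pLSA to a document-term count matrix X (docs x words) via EM.

    Returns P(z), P(d|z), P(w|z) in the symmetric parameterization.
    """
    rng = np.random.default_rng(seed)
    D, W = X.shape
    Pz = np.full(K, 1.0 / K)                                   # P(z)
    Pd_z = rng.random((K, D))
    Pd_z /= Pd_z.sum(axis=1, keepdims=True)                    # P(d|z)
    Pw_z = rng.random((K, W))
    Pw_z /= Pw_z.sum(axis=1, keepdims=True)                    # P(w|z)
    for _ in range(n_iter):
        # E-step: posterior P(z|d,w) is proportional to P(z) P(d|z) P(w|z)
        joint = Pz[:, None, None] * Pd_z[:, :, None] * Pw_z[:, None, :]
        posterior = joint / np.maximum(joint.sum(axis=0, keepdims=True), 1e-12)
        # M-step: re-estimate parameters from posteriors weighted by counts
        weighted = posterior * X[None, :, :]
        Nz = np.maximum(weighted.sum(axis=(1, 2)), 1e-12)      # expected topic mass
        Pd_z = weighted.sum(axis=2) / Nz[:, None]
        Pw_z = weighted.sum(axis=1) / Nz[:, None]
        Pz = Nz / Nz.sum()
    return Pz, Pd_z, Pw_z

# toy document-term counts (3 documents, 3 words)
X = np.array([[3., 0., 1.],
              [2., 0., 0.],
              [0., 4., 1.]])
Pz, Pd_z, Pw_z = plsa(X, K=2)
```

Each EM iteration is guaranteed not to decrease the data likelihood; the dense K x D x W posterior array keeps the sketch readable but would need a sparse formulation for large corpora.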

Differences between LSA and PLSA

Both LSA and pLSA can recreate the data content based on the model, but there are important differences between the two methods.

First, in LSA, the SVD-based matrix decomposition is the Frobenius-norm approximation of the term frequency matrix, while pLSA relies on the likelihood function and the prior probability of the latent class (the probability of seeing this class in the data for a randomly chosen record, ignoring all attribute values) and finds the maximum conditional probability of the model.

Second, in LSA the recreated matrix X does not contain a normalized probability distribution, while in pLSA the matrix of the co-occurrence table is a well-defined probability distribution. Both LSA and pLSA perform dimensionality reduction: LSA keeps only K singular values, and pLSA keeps K aspects.

For the purpose of the comparison, in the subsequent sections we need to find the comparable parameters of both models. From a mathematical and interpretation standpoint, the three matrices from the SVD correspond to three probability distributions of pLSA:

The T matrix corresponds to P(w|z) (term to aspect).

The D matrix corresponds to P(d|z) (document to aspect).

The S matrix corresponds to P(z) (aspect strength).

Performance measure

To compare the two techniques, one needs to evaluate the performance of each method. In the analysis section, both quantitative evaluation and qualitative observations (Mei et al. 2007 ; Titov and McDonald 2008 ) are used to analyze the results. Among quantitative measures, the precision/recall curve is the most widely used (Titov and McDonald 2008 ). Precision is defined as the number of relevant words retrieved divided by the number of all words retrieved; it provides a measure of accuracy. The number of irrelevant words retrieved is counted to evaluate lack of accuracy.

Moreover, classifying the retrieved words as relevant or irrelevant helps in the measurement of accuracy. Here, we measured the false positives and compared the two techniques; ideally, false positives should be as low as possible. The measure of recall is applicable when the total number of relevant words is known. Since, for conversational text, it is difficult to compile a list of all relevant words, we did not use recall (or the false-negative count) as a performance measure in this analysis.
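Under these definitions, the precision computation can be expressed as a small helper. This is a sketch with illustrative word lists; the optional `neutral` set mirrors the study's practice of excluding neutral words from both counts:

```python
def topic_precision(retrieved, relevant, neutral=frozenset()):
    """Precision of a topic's retrieved words.

    Words judged neutral are excluded from both the numerator and the
    denominator, as in the study's treatment of neutral words.
    """
    scored = [w for w in retrieved if w not in neutral]
    if not scored:
        return 0.0
    hits = sum(1 for w in scored if w in relevant)
    return hits / len(scored)

# illustrative example: 2 relevant words out of 4 scored -> precision 0.5
retrieved = ["large", "roomy", "nice", "review", "thank"]
relevant = {"large", "roomy", "quality", "price"}
neutral = {"nice"}
p = topic_precision(retrieved, relevant, neutral)
```

The relevant/irrelevant judgments themselves come from manual inspection, so the function only formalizes the counting step.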

To begin, we used a dataset containing reviews of kitchen appliances, downloaded from the publicly available dataset collected by Blitzer et al. ( 2007 ). This dataset also contained book reviews, which we excluded, because the content of the book discussed in a review may confound the topics of the review. In total, 406 kitchen appliance reviews were included, 148 positive and 258 negative. Additionally, we analyzed a second dataset consisting of reviews for a specific brand of handbag, "Rose Handbag by FASH," obtained from Amazon.com in 2011; this dataset contained 389 reviews. We used LSA and pLSA to extract hidden topics and associated words from both datasets and then compared the accuracy of the two methods.

First, we analyze the brand-specific handbag reviews. Reviews with a star rating of 3 or more were classified as positive; reviews with star ratings of 1 or 2 were classified as negative. In the LSA model, three dimensions were retained after SVD. To compare the extracted topics with those from pLSA, we kept three topic groups for pLSA as well (dimensions in LSA are comparable to topics in pLSA, shown in Table 2 ). For the positive reviews, the three topics/factors are named "Leading positive attributes of the product", "Core functionalities", and "Affective", based on the associated words retrieved by both methods. For the negative reviews, the three topics are "Not leather", "Problems", and "Service failure" (shown in Table 3 ).
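The star-rating split described above amounts to a one-liner. A sketch with hypothetical (text, stars) tuples:

```python
def split_by_rating(reviews, threshold=3):
    """Partition (text, stars) pairs: stars >= threshold -> positive."""
    positive = [text for text, stars in reviews if stars >= threshold]
    negative = [text for text, stars in reviews if stars < threshold]
    return positive, negative

# hypothetical reviews for illustration
reviews = [("Love this bag, very roomy", 5),
           ("Not leather as advertised", 1),
           ("Decent for the price", 3),
           ("Strap broke in a week", 2)]
pos, neg = split_by_rating(reviews)
```

The two resulting groups are then fed separately to LSA and pLSA, so that positive and negative topics are extracted independently.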

A comparison of the words associated with each positive topic (Table 2 ) shows that the topics extracted by pLSA are more interpretable and contain more information. For example, for the positive reviews, the words with a high probability of belonging to the topic "Leading positive attributes of the product" are "large", "roomy", "price", and "quality" (colored pink). These important terms (important because they signal the brand's competitive advantage and define the topic) were not picked up by LSA. Moreover, among the words picked up by LSA, "review", "purse", "thank", and "shoulder" (colored orange) are not relevant to this topic. The remaining words in both LSA and pLSA (colored black) contribute to the meaning of the factors (they are either relevant or neutral). By neutral, we mean words that are relevant and contribute to a better interpretation of the factor but do not have unique discriminating power like the pink words in pLSA. For example, "amazing", "beautiful", "nice", etc. contribute to the meaning of "leading positive attributes" and indicate that customers are happy with these attributes of the product, but they do not describe any specific leading attribute. The results show the top 10 terms (by probability for pLSA and by loading for LSA). Comparisons of the relevant and irrelevant words picked up by both methods are presented in Tables 4 and 5 , respectively.

To quantify the performance superiority of one method over the other, the precision of each method is calculated and shown graphically in Figs.  4 and 5 . The number of irrelevant words picked up by a method indicates its inferiority, as shown in Table 5 . To be considered the superior technique, a method needs to yield high precision as well as few irrelevant words. As mentioned before, some words are neutral: neither uniquely relevant nor irrelevant. They do not yield additional information about a topic but help in understanding its meaning. For example, in the positive reviews of the handbag, the words "nice", "beautiful", or "bag" provide no additional information but aid comprehension of the sentiment and topic. Hence, these words are not counted towards the relevancy or irrelevancy of the topic.

figure 4

Precision curve of positive reviews

figure 5

Irrelevant words of positive reviews

For the negative reviews, the same pattern emerges (Tables 6 and 7 ). The words associated with the first topic are almost identical across both methods. In the next topic ("Problems"), pLSA extracts more unique words representing specific problems, such as "rough", "thread", and "material", which are not present in the LSA extraction. Both models convey the information that the product does not "look" like the "picture/photo". Moreover, the "Service failure" topic from pLSA also contains more specifics than that from LSA.

The precision of the two techniques for negative reviews was calculated; the precision curve is shown in Fig.  6 :

figure 6

Precision curve of negative reviews

The percentage of irrelevant words retrieved by each technique is shown in Fig.  7 . The graph shows that pLSA retrieves a much lower percentage of irrelevant words than LSA.

figure 7

Percentage of retrieved irrelevant words in negative reviews

figure 8

Example of positive and negative reviews of the handbag

It is clear from these figures that LSA performs less efficiently than pLSA when analyzing reviews of a particular brand; that is, LSA was not able to extract the specifics to the extent that pLSA did. The real examples of positive and negative reviews (Fig. 8 ) provide support for the superiority of pLSA in this context: LSA was not able to effectively extract the complaints in the negative review or the "large and spacious" component of the positive reviews.

With this in mind, we proceed to the next analysis to see if this pattern holds in another context. We extracted topics from a broader category, "Kitchen Appliances", which contains reviews of various brands and appliances. As before, we divided the reviews into positive and negative groups based on their star ratings and then extracted topics from them. The results are shown in Table 8 . A careful examination of the topics reveals that pLSA formed the topics according to specific appliances: for example, oven, pan, and skillet; then baking needs; then knives. The topics from LSA, in contrast, provide an overall summarization of the important aspects and attributes of this product category.

It can be seen that LSA extracts topics that provide information about attributes of the product category. Looking at the factors extracted by LSA, it can be inferred that customers talk about core functionalities, aesthetics, branding, technical aspects, and affective content in their reviews. The pLSA topics, by contrast, are organized by appliance: the first topic relates to "oven, pan, skillet", the second to "baking", the third to "knives", and then "kettle and tea". Unlike the LSA topics, these do not express core themes of the reviews. Therefore, from a managerial perspective, the information in the topics extracted by pLSA has little to no use. The LSA topics, on the other hand, provide a perspective on what customers generally look for in this broader product category. For example, customers are happy if the appliances have aesthetic appeal in addition to core functionality and technical superiority. Moreover, this category appears to be a popular choice for gift giving, and customers compare different brands when buying in this category. All of this information helps a manager decide which attributes to include in a new product in this category, or how to improve an existing one. Therefore, in this scenario, LSA works better in terms of interpretability. The following reviews support the results obtained from LSA, which were not visible with pLSA.

“An elegantly designed LONG WIDE toaster…..Very clean, modern appearance. Looks great sitting on the kitchen counter, whereas many of the other toaster models today look like ugly chrome spaceships from the 1950's. Personally, I'm not into that kind of retro look……….” (aesthetics).

Or “This ice cream Maker is "GREAT". The fact that I can use an industrial motor (my kitchen Aid mixer) is fantastic…..” (technical aspect).

“….Also makes a fabulous wedding, shower, or housewarming gift. Forget expensive wedding registries—buy the bride a lodge dutch oven and skillet. She'll hand them down to the next generation…..” (gift giving/affective).

Therefore, depending on the objective of topic extraction, either pLSA or LSA becomes the superior method, and the superior performance of pLSA exhibited in the brand-specific reviews does not hold in every scenario. This result can be attributed to the fact that pLSA finds the highest-probability terms that are likely to occur in the documents, whereas LSA infers the topics from word co-occurrences.

We do not produce a performance measure curve for this section. As discussed before, the grouping of words is completely different between the two methods, and a performance measure curve (or a table of relevance measurements) would not provide a meaningful comparison, since there is no overlap of relevant and irrelevant words.

User-generated content is everywhere. It contains information on sentiment and customer experiences with products and services, which makes it very useful and important for market researchers. The use of content analysis goes back several decades in marketing: qualitative content analysis reveals patterns and has long been used in the field (Bourassa et al. 2018 ; Phillips and Pohler 2018 ). However, content found on the web is huge in volume, and it is usually very cumbersome to analyze such unstructured text manually. An intelligent, automated method is needed so that large amounts of data can be analyzed. Research has shown that a firm's competencies in big data analysis predict better performance as measured by innovation, customer relationship management, and other outcomes. Big data analysis can assist in knowledge co-creation, which in turn supports better decision making (Acharya et al. 2018 ). More specifically, research points to the fact that domain knowledge should be incorporated when crafting a model and interpreting its results (Berger et al. 2020 ). Only by breaking down the silos between different knowledge bases can marketing analytics achieve its best results (Petrescu and Krishen 2021 ).

The current study tries to find the best method for extracting managerial information in two different marketing scenarios. Every technique has its own advantages and disadvantages, and the suitability of a technique depends on the context in which it is used. Although computer science researchers have been working in this area for a long time, the marketing discipline only started to investigate it about a decade ago. The knowledge and performance measures of these techniques cannot be directly transferred to the marketing domain, since performance is context specific. For example, from a retrieval perspective (in the information technology literature), success is the system's ability to retrieve similar words, or documents containing the same topic, when a query word is provided. The higher the performance, the higher the rate of finding relevant (similar) words. In this marketing context, by contrast, higher performance means greater retrieval of the terms and documents important to a marketing manager. The current study supports the idea that the choice of a text mining approach should be domain-specific and augmented with domain knowledge.

As mentioned before, the two contexts differed in specificity: one contained customer reviews of only one brand of handbag, while the other contained reviews of different brands and appliances within "Kitchen Products". The results show that, in the former case, pLSA extracted topics that were more meaningful and concrete: they were more interpretable and contained more information. LSA extracted topics reasonably well, but they were not as complete as the pLSA topics. There were cross-loading words, meaning that one word belonged to more than one factor, and there was a higher number of irrelevant words per topic compared with pLSA. Based on the precision and the number of irrelevant words extracted by the two techniques, it can be concluded that pLSA works better in this context.

In the second context, where the goal was to learn the important topics in a product category with many brands and products, LSA outperformed pLSA. Here, too, pLSA extracted meaningful topics, but they were not aligned with important marketing interests: each topic represented an individual appliance in the "kitchen appliances" category. More importantly, pLSA did not group the topics according to the discussion themes of the product category (and hence product attributes), which are of main interest from a marketing manager's perspective. For example, the pLSA-extracted topics (oven, baking, knives, etc.) may not provide a marketing manager with useful insights. It should be noted that from an information retrieval perspective pLSA might have done a fair or even superior job; however, depending on the kind of information needed, pLSA is not the superior technique in this context. LSA, by contrast, grouped the topics according to the discussion themes of the reviews: core functionalities, technical aspects, branding, etc. This information is of interest to the marketing manager. The study therefore concludes that if the goal is to learn about a specific brand and its positive and negative attributes, pLSA reveals more specific information; if the goal is to learn about important aspects of a broader product category, LSA works better. The current study contributes in two ways: first, it responds to the recent call for marketing-specific data analytics tools in which marketing knowledge and goals are incorporated into sophisticated machine learning tools; second, by experimenting in two different marketing scenarios, it examines the suitability and relative strength of two data analytics techniques.

Managerial implications

Managers can benefit greatly from understanding the topics of positive and negative reviews because these provide valuable insights into customer perceptions and preferences. By analyzing the topics that customers mention in their reviews, managers can identify areas of strength and weakness in their products, services, and overall customer experience. Using the right text mining tools, managers can identify areas for improvement: for example, the handbag should be improved in terms of its look (customers were disappointed that it did not look like leather). They can also identify areas of strength: the handbag was stylish and spacious. Managers can highlight these in their marketing messages and product descriptions, potentially driving sales and customer loyalty. Managers may also compare their product with competitors by evaluating competing brands. In a broader product category, the topics may reveal important aspects of the category: for example, LSA revealed that aesthetics and gift giving are important in kitchen appliances, which might not otherwise be evident. Finally, managers can track the topics mentioned in positive and negative reviews over time and thus identify changes in customer perceptions and preferences.

Limitations and future research

Like any other study, this one is not without limitations. First, for performance measurement, the study uses a precision measure, which looks at the number of relevant words among all retrieved words. However, some words are relevant to a topic but not useful: in the positive handbag reviews, for example, the words "nice" and "favorite" provide no additional information, yet they are not irrelevant at all. To be conservative, the present study kept such words out of both the "relevant" and "irrelevant" word counts so that the results are not biased. A count of irrelevant words provides another performance measure used in the current study. The main criticism of this kind of performance measure, however, is the subjectivity of meaning: the precision measure is a binary approach that fails to capture the fuzziness of word meanings. Although the present study used manual inspection to measure precision, this subjectivity can become a problem and may bias the results. To mitigate it to some extent, words with ambiguous meanings were left out when measuring irrelevant words. Another limitation is that the datasets were small. However, their size facilitated the manual coding of relevant/irrelevant words needed to compute the precision measure; a large dataset would introduce more noise, and the results might lack objectivity. As recommended in the literature, automatic text analysis can learn from manual coding of a small dataset, and the model can then be applied to a large dataset for real-life use (Chen et al. 2018 ).

The application of text mining in the marketing domain is a rising phenomenon. A text mining technique that is superior in terms of information retrieval (for representing data, retrieving similar documents, or search purposes) may not be superior from a marketer's point of view. This insight invites marketing researchers to experiment with techniques and establish their suitability for different marketing contexts and needs.

Acharya, A., S. Singh, V. Pereira, and P. Singh. 2018. Big data, knowledge co-creation and decision making in fashion industry. International Journal of Information Management 42: 90–101.

Ansari, A., Y. Li, and J. Zhang. 2018. Probabilistic topic model for hybrid recommender systems: A stochastic variational Bayesian approach. Marketing Science 37 (6): 987–1008.

Berger, J., A. Humphreys, S. Ludwig, W.W. Moe, O. Netzer, and D.A. Schweidel. 2020. Uniting the tribes: Using text for marketing insight. Journal of Marketing 84 (1): 1–25.

Bernoff, J., and C. Li. 2008. Harnessing the power of the oh-so-social web. MIT Sloan Management Review 49 (3): 36–42.

Bijmolt, T.H.A., P.S.H. Leeflang, F. Block, M. Eisenbeiss, B.G.S. Hardie, A. Lemmens, and P. Saffert. 2010. Analytics for customer engagement. Journal of Service Research 13 (3): 341–356.

Blitzer, J., M. Dredze, and F. Pereira. 2007. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, 440–447. Prague: Association for Computational Linguistics.

Bourassa, M.A., P.H. Cunningham, L. Ashworth, and J. Handelman. 2018. Respect in buyer/seller relationships. Canadian Journal of Administrative Sciences 35 (2): 198–213.

Branda, A., V. Lala, and P. Gopalakrishna. 2018. The marketing analytics orientation (MAO) of firms: Identifying factors that create highly analytical marketing practices. Journal of Marketing Analytics 6: 84–94.

Braune, E., and L.P. Dana. 2021. Digital entrepreneurship: Some features of new social interactions. Canadian Journal of Administrative Sciences 39 (3): 237–243.

Chen, N., M. Drouhard, R. Kocielnik, J. Suh, and C. Aragon. 2018. Using machine learning to support qualitative coding in social science: Shifting the focus to ambiguity. ACM Transactions on Interactive Intelligent Systems 8 (2): 1–20.

Chevalier, J., and D. Mayzlin. 2006. The effect of word of mouth on sales: Online book reviews. Journal of Marketing Research 43 (3): 345–354.

Constantinides, E., and S.J. Fountain. 2008. Web 2.0: Conceptual foundations and marketing issues. Journal of Direct, Data and Digital Marketing Practice 9 (3): 231–244.

Cui, D., and D. Curry. 2005. Prediction in marketing using the support vector machine. Marketing Science 24 (4): 595–615.

Cvitanic, T., Lee, B., Song, H. I., Fu, K., and Rosen, D. 2016. LDA v. LSA: A comparison of two computational text analysis tools for the functional categorization of patents. In International Conference on Case-Based Reasoning.

Deerwester, S., S. Dumais, G. Furnas, T. Landauer, and R. Harshman. 1990. Indexing by latent semantic analysis. Journal of the American Society for Information Science 41 (6): 391–407.

Germann, F., G.L. Lilien, L. Fiedler, and M. Kraus. 2014. Do retailers benefit from deploying customer analytics? Journal of Retailing 90: 587–593.

Ghose, A., and P. Ipeirotis. 2011. Estimating the helpfulness and economic impact of product reviews: Mining text and reviewer characteristics. IEEE Transactions on Knowledge and Data Engineering 23 (10): 1498–1512.

Gupta, M., and J. George. 2016. Toward the development of a big data analytics capability. Information & Management 53 (8): 1049–1106.

Hair, J.F., Jr., and M. Sarstedt. 2021. Data, measurement, and causal inferences in machine learning: Opportunities and challenges for marketing. Journal of Marketing Theory and Practice 29 (1): 65–77.

Han, J., and M. Kamber. 2006. Data mining: Concepts and techniques . Burlington: Morgan Kaufmann Publishers.

Hanna, R., A. Rohm, and V. Crittenden. 2011. We’re all connected: The power of the social media ecosystem. Business Horizons 54 (3): 265–273.

Hofmann, T. 1999. Probabilistic latent semantic analysis. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence , Stockholm, Sweden.

Hou, L., Guan, L., Zhou, Y., Shen, A., Wang, W., Luo, A., ... and Zhu, J. J. 2022. Staying, switching, and multiplatforming of user-generated content activities: a 12-year panel study. Internet Research (ahead-of-print).

Hu, H., C. Zhang, Y. Luo, Y. Wang, J. Han, and E. Ding. 2017. WordSup: Exploiting word annotations for character-based text detection. In Proceedings of the IEEE International Conference on Computer Vision, 4940–4949.

Huang, M.H., and R.T. Rust. 2021. A strategic framework for artificial intelligence in marketing. Journal of the Academy of Marketing Science 49 (1): 30–50.

Iacobucci, D., M. Petrescu, A. Krishen, and M. Bendixen. 2019. The state of marketing analytics in research and practice. Journal of Marketing Analytics 7 (3): 152–181.

Jimenez, S., F.A. Gonzalez, A. Gelbukh, and G. Duenas. 2019. Word2set: WordNet-based word representation rivaling neural word embedding for lexical similarity and sentiment analysis. IEEE Computational Intelligence Magazine 14 (2): 41–53.

Kamal, A. 2015. Review mining for feature-based opinion summarization and visualization. arXiv preprint arXiv:1504.03068 .

Kamps, J., and Marx, M. (2001). Words with attitude. In 1st International WordNet Conference. (pp. 332–341).

Ke, X., and Luo, H. (2015, August) Using LSA and PLSA for text quality analysis. In 2015 International Conference on Electronic Science and Automation Control (pp. 289–291). Atlantis Press.

Kim, S., H. Park, and J. Lee. 2020. Word2vec-based latent semantic analysis (W2V-LSA) for topic modeling: A study on blockchain technology trend analysis. Expert Systems with Applications 152: 113401.

Kiron, D. 2013. Organizational alignment is key to big data success. MIT Sloan Management Review 54 (1): 15.

Koivisto, E., and P. Mattila. 2020. Extending the luxury experience to social media–User-Generated Content co-creation in a branded event. Journal of Business Research 117: 570–578.

Ku, L.-W., Liang, Y.-T., and Chen, H.-H. 2006. Opinion extraction, summarization and tracking in news and blog corpora. In AAAI Symposium on Computational Approaches to Analyzing Weblogs (AAAI-CAAW) , (pp. 100–107).

Kubler, R., Colicev, A. and Pauwels, K. 2017. Social media’s impact on consumer mindset: when to use which sentiment extraction tool, Marketing Science Institute Working Paper Series , 17-122-09.

Landauer, T.K., P.W. Foltz, and D. Laham. 1998. Introduction to latent semantic analysis. Discourse Processes 25 (1): 259–284.

Lee, T.Y., and E.T. Bradlow. 2011. Automated Marketing research using online customer reviews. Journal of Marketing Research 48 (5): 881–894.

Li, Y., and T. Li. 2013. Deriving market intelligence from microblogs. Decision Support Systems 55 (1): 206–217.

Li, Z., Y. Fan, W. Liu, and F. Wang. 2018. Image sentiment prediction based on textual descriptions with adjective noun pairs. Multimedia Tools and Applications 77 (1): 1115–1132.

Li, S.G., Y.Q. Zhang, Z.X. Yu, and F. Liu. 2021. Economical user-generated content (UGC) marketing for online stores based on a fine-grained joint model of the consumer purchase decision process. Electronic Commerce Research 21: 1083–1112.

Liu, Y. 2006. Word of mouth for movies: Its dynamics and impact on box office revenue. Journal of Marketing 70 (3): 74–89.

Liu, X., D. Lee, and K. Srinivasan. 2019. Large-scale cross-category analysis of consumer review content on sales conversion leveraging deep learning. Journal of Marketing Research 56 (6): 918–943.

Llopis-Amorós, M.-P., I. Gil-Saura, M. Ruiz-Molina, and M. Fuentes-Blasco. 2019. Social media communications and festival brand equity: Millennials vs Centennials. Journal of Hospitality and Tourism Management 40: 134–144.

Lu, Y., Zhai, C., and Sundaresan, N. (2009). Rated aspect summarization of short comments. In WWW ’09: Proceedings of the 18th international conference on World Wide Web . ACM, New York, NY, USA, (pp. 131–140).

Ma, L., and B. Sun. 2020. Machine learning and AI in marketing —Connecting computing power to human insights. International Journal of Research in Marketing 37 (3): 481–504.

Mei, Q., Ling, X., Wondra, M., Su, H., and Zhai, C. 2007.Topic sentiment mixture: modeling facets and opinions in weblogs. In WWW ’07: Proceedings of the 16th international conference on World Wide Web . ACM, New York, NY, USA, (pp. 171–180).

Mikalef, P., M. Boura, G. Lekakos, and J. Krogstie. 2020a. The role of information governance in big data analytics driven innovation. Information & Management 57 (7): 103361.

Mikalef, P., I.O. Pappas, J. Krogstie, and P.A. Pavlou. 2020b. Big data and business analytics: A research agenda for realizing business value. Information & Management 57 (1): 103237.

Moe, W., and D. Schweidel. 2012. Online product opinions: Incidence, evaluation, and evolution. Marketing Science 31 (3): 372–386.

Moussa, M.E., M.H. Ensaf, and M.H. Haggag. 2018. A survey on opinion summarization techniques for social media. Future Computing and Informatics Journal 3 (1): 82–109.

Mudambi, S., and D. Schuff. 2010. What makes a helpful online review? A study of customer reviews on Amazon.com. MIS Quarterly 34 (1): 185–200.

Mudasir, M., R. Jan, and M. Shah. 2020. Text document summarization using word embedding. Expert Systems with Applications 143 (4): 111–192.

Netzer, O., R. Feldman, J. Goldenberg, and M. Fresko. 2012. Mine your own business: Market-structure surveillance through text mining. Marketing Science 31 (3): 521–543.

Petrescu, M., and M. Krishen. 2021. Focusing on the quality and performance implications of marketing analytics. Journal of Marketing Analytics 9: 155–156.

Phillips, B.J., and D. Pohler. 2018. Images of union renewal: A content analysis of union print advertising. Canadian Journal of Administrative Sciences 35 (4): 592–604.

Ransbotham, S., C. Kane, and N. Lurie. 2012. Network characteristics and the value of collaborative user-generated content. Marketing Science 31 (3): 387–405.

Sahut, J.M., L.P. Dana, and M. Laroche. 2020. Digital innovations, impacts on marketing, value chain and business models: An introduction. Canadian Journal of Administrative Sciences 37 (1): 61–67.

Salehan, M., and D.J. Kim. 2016. Predicting the performance of online consumer reviews: A sentiment mining approach to big data analytics. Decision Support Systems 81: 30–40.

Sällberg, H., S. Wang, and E. Numminen. 2022. The combinatory role of online ratings and reviews in mobile app downloads: an empirical investigation of gaming and productivity apps from their initial app store launch. Journal of Marketing Analytics 5: 8.

Savage, M., and R. Burrows. 2009. Some further reflections on the coming crisis of empirical sociology. Sociology 43 (4): 762–772.

Sidorova, A., N. Evangelopoulos, J. Valacich, and T. Ramakrishnan. 2008. Uncovering the intellectual core of the information systems discipline. MIS Quarterly 32 (3): 467–482.

Skeen, S.J., S.S. Jones, C.M. Cruse, and K.J. Horvath. 2022. Integrating natural language processing and interpretive thematic analyses to gain human-centered design insights on HIV mobile health: Proof-of-concept analysis. JMIR Human Factors 9 (3): e37350.

Steinberger, J., and Ježek, K. (2009). Update summarization based on latent semantic analysis. In I nternational Conference on Text, Speech and Dialogue (pp. 77–84). Springer, Berlin

Timoshenko, A., and J. Hauser. 2019. Identifying customer needs from user-generated content. Marketing Science 38 (1): 1–20.

Titov, I. and McDonald, R. (2008). Modeling online reviews with multi-grain topic models. In WWW ’08; Proceeding of the 17th international conference on World Wide Web . ACM, New York, NY, USA, (pp. 111–120)

Turney, P., and M.L. Littman. 2003. Measuring praise and criticism: Inference of semantic orientation from association. ACM Transaction Information Systematic 21 (4): 315–346.

Vollrath, M., and S. Villegas. 2022. Avoiding digital marketing analytics myopia: Revisiting the customer decision journey as a strategic marketing framework. Journal of Marketing Analytics 10: 106–113.

Vorvoreanu, M., G. Boisvenue, C.J. Wojtalewicz, and E. Dietz. 2013. Social media marketing analytics: A case study of the public’s perception of Indianapolis as Super Bowl XLVI host city. Journal of Direct, Data and Digital Marketing Practice 14 (4): 321–328.

Wamba, F., A. Gunasekaran, S. Akter, S.. Ji.-fan Ren, R. Dubey, and S.J. Childe. 2017. Big data analytics and firm performance: Effects of dynamic capabilities. Journal of Business Research 70: 356–365.

Yang, M., Y. Ren, and G. Adomavicius. 2019. Understanding user-generated content and customer engagement on Facebook business pages. Information Systems Research 30 (3): 839–855.

Yu, X., Y. Liu, X. Huang, and A. An. 2012. Mining online reviews for predicting sales performance: A case study in the movie domain. IEEE Transactions on Knowledge and Data Engineering 4 (4): 720–734.

Zhang, K., T. Evgeniou, V. Padmanabhan, and E. Richard. 2012. Content contributor management and network effects in a UGC environment. Marketing Science 31 (3): 433–447.

Zhong, Ning, and David A. Schweidel. 2020. Capturing changes in social media content: A multiple latent changepoint topic model. Marketing Science 39 (4): 669–686.

Zhu, L., Gao, S., Pan, S. J., Li, H., Deng, D., and Shahabi, C. (2013) Graph-based informative-sentence selection for opinion summarization. In Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (pp. 408–412).


Author information

Authors and affiliations

Department of Business Administration, Earl G. Graves School of Business and Management, Morgan State University, 1700 East Cold Spring Lane, Baltimore, MD, 21251, USA

Shimi Naurin Ahmad

Department of Marketing, John Molson School of Business, Concordia University, 1450 Rue Guy, Montréal, Québec, H3G 1M8, Canada

Michel Laroche


Corresponding author

Correspondence to Shimi Naurin Ahmad .

Ethics declarations

Conflict of interest

The authors have no conflict of interest to disclose.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Ahmad, S.N., Laroche, M. Extracting marketing information from product reviews: a comparative study of latent semantic analysis and probabilistic latent semantic analysis. J Market Anal 11, 662–676 (2023). https://doi.org/10.1057/s41270-023-00218-6


Revised: 13 January 2023

Accepted: 06 March 2023

Published: 08 April 2023

Issue Date: December 2023

DOI: https://doi.org/10.1057/s41270-023-00218-6


Keywords

  • User-generated content (UGC)
  • Text mining
  • Amazon reviews
