published quarterly by the University of Borås, Sweden

vol. 24 no. 1, March, 2019



Proceedings of ISIC: The Information Behaviour Conference, Krakow, Poland, 9-11 October, 2018: Part 2.

A classification scheme for analyses of messages exchanged in online health forums


Carla Teixeira Lopes and Bárbara Guimarães Da Silva


Introduction. Online health forums help to surface and organize patients’ knowledge and make it useful for many. Many people use them to seek advice or to share what they know about health subjects. Because they are an important communication medium, it is important to understand why and how they are used.
Method. In this work we examine and categorize messages of an online health forum, with the purpose of providing a classification scheme that can be used by the research community in future analyses. The definition of the classification scheme was iterative and its inter-rater reliability was assessed twice using Cohen’s Kappa statistic.
Analysis. The classification scheme arose from a content analysis of 3,399 messages from several communities of an online health forum.
Findings. The scheme is divided into four sections of categories, each with several subcategories, for a total of 23 subcategories. The inter-rater agreement assessment of the scheme showed good consistency between coders. The majority of the categories have a Cohen’s Kappa agreement above 0.4.
Conclusion. The proposed classification scheme facilitates the analysis of messages exchanged in online health forums for several purposes, including studies of information seeking.

Introduction

Nowadays, the Web is one of the major sources of health information and is used, for example, to seek information created by other health consumers (Fox and Jones, 2009). According to Fox and Duggan (2013), the Web is used by many as a diagnostic tool, since 59% of U.S. adults have looked online for health information and 35% of the same users have used it to figure out what medical condition they or others have.

The anonymous, fast and free exchange of medical information and personal experiences (Hartzler and Pratt, 2011; Veinot, 2010) makes online health forums very popular and relevant in diagnosing, treating and supporting users (Fox and Duggan, 2013). Many seek support that they have not obtained directly from a health professional (Hartzler and Pratt, 2011) or from family and friends. Others use these forums to share medical experiences, give advice and help others with the same medical condition.

Understanding the interaction dynamics of these forums helps to understand what motivates participation, to identify the type of information these users are searching for, to understand its impact on the health of those involved and to find ways to help people satisfy their needs. Studies with these goals usually require the examination of messages exchanged in the forums. This process can be done manually or automatically, with the latter having the advantage of easily encompassing a larger number of messages. For these studies it is useful to have a classification scheme that may be used to categorise the types of messages users exchange.

Although other classification schemes have already been proposed, they are focused on specific topics such as HIV/AIDS or Huntington’s disease. Here, we propose a generic classification scheme that enables the categorization of messages regardless of the health topic associated with the forum. For that, we examined messages from different communities of a health forum and considered other schemes with similar purposes. Our classification scheme consists of 4 broad categories (Offering Support, Seeking Support, Group Interactions and Emotions) divided into 23 subcategories.

We begin by describing the background and work related to online health forums, the content analysis technique and the challenges of classifying messages exchanged in the forums. Then, we present our methodology to obtain messages, create and redefine categories and how we performed the computation of the inter-coder agreement. After that, we present the classification scheme, with examples. Discussion and future work are presented in the final section.

Background

Online health forums

According to a report by the Pew Research Internet Project (2013), 72% of users use the Internet to find online health information, and 13% started these searches on health-specific sites, such as health forums. The most searched topics according to the same report are diseases and/or specific treatments and health professionals. Honigman (2013) also reports some interesting statistics: over 40% of users indicate that information found on social sites affects the way they deal with their health; 56% of users search on platforms such as WebMD for health information, 13% use blogs and 12% use online health communities; 40% of users also mentioned that information found on social networks affects how they deal with chronic illness and their views on diet and exercise.

The benefits of searching for health information online are reflected in several statistics: 60% of users claim that information found online affected how they decided to treat a particular disease, 56% consider that this information also helped their approach to maintaining their health, and 53% report that this same information led them to ask their doctors new questions or to seek a second opinion (Fox and Jones, 2009). While 86% of users still consult a health professional to discuss their doubts about a health-related subject (Fox and Jones, 2009), there is a significant increase in the preference for online content to compare experiences and get quick answers about certain medical conditions. Previous studies show that the nature of health information commonly exchanged between health consumers is experiential, combining practical strategies and personal stories (Hartzler and Pratt, 2011; Veinot, 2010) and differing from the information conveyed by health professionals (Hartzler and Pratt, 2011). These health forums become more pertinent when searching for information related to rare medical conditions, as it is harder to find someone with the same condition in one’s circle of contacts.

According to Bender, Katz, Ferris and Jadad (2013), little is known about the factors that motivate users to search for online support or even whether participating in these forums on the Web is similar to participating in personal groups. Normally, health forums on the Web are used by those whose health conditions are little known and/or neglected by health professionals (Bender et al., 2013). For Wright (2002), the support generated by these communities has many benefits, including the reduction of stress, an increase in personal self-esteem and the ability to deal with problems more easily. According to Wright (2002), these communities offer two kinds of support: emotional support, the ability to empathize with another user, which is the most common type; and informational support, which refers users to other sites that contain information about a subject, based on the experience of other users.

Because there are so many users on the Internet, there is an abundance of health information available; Solberg (2014) states that more than half of the users of these communities can find answers to their questions and considers these platforms very useful for understanding medical content and for meeting others with similar experiences. These users thus receive great emotional support through these communities (Solberg, 2014), which helps them to achieve their health-related goals and even gives them extra motivation to be part of these communities (Solberg, 2014).

Content analysis

According to Bryman (2008), content analysis is a research methodology that intends to identify the characteristics of documents in as objective a way as possible. It has many practical applications. Web content analysis may help to understand behaviour and user preferences (Kim and Kuljis, 2010). For a less traditional sense of Web content analysis, Herring (2010) considers two types of approaches that have several links to content analysis: computer-mediated discourse analysis and social network analysis. The first technique has been applied to e-mail, discussion forums, instant text messages and blogs. Social network analysis is a more general technique that analyses hyperlinks, also considered as content on websites (Herring, 2010). These approaches show the diverse applicability of content analysis and that, whether or not a more traditional approach is used, the essential characteristics of the technique are preserved.

There are not many examples that demonstrate the applicability of content analysis to health forums; this area has not been explored much yet. We found two studies applying this technique to online health forums. Coulson, Buchanan and Aubeeluck (2007) analyse messages taken from an online forum that supports users with Huntington's disease, with the purpose of creating a classification scheme. The authors adapted the classification scheme of other authors, whose content can be associated with 5 different categories: information support, esteem support, network support, emotional support and tangible assistance. As a result of this study, the authors built a classification scheme with 5 categories and 16 subcategories. Coursaris and Liu (2009) also created a classification scheme, based on the content analysis of a sample of 5,000 messages posted over a 1-year period in a health forum about HIV/AIDS.

The classification schemes of both papers are very similar: both have 5 categories of offering support, since that is their only focus, and both were adapted from the classification scheme of Cutrona and Suhr (1992). However, there are small differences between the schemes that the authors considered relevant to their studies. Firstly, the structure of the classification scheme is different: Coulson et al. (2007) begin their scheme with the Information Support category followed by the Emotional Support category, whereas Coursaris and Liu (2009) begin theirs with the Information category followed by the Esteem category. Some subcategories were also named differently and/or placed in a different main category. For example, the Physical Affection category in the first scheme is named Virtual Affection in the second scheme. Similarly, the Relief of Blame category is placed under the Esteem Support category in the first scheme but under the Emotional category in the second. Coursaris and Liu (2009) also included a different subcategory in the Esteem category, Anchorage, indicating the sharing of personal experiences. In fact, Coulson et al. (2007) indicated that some messages in their health forum conveyed the sharing of experiences, but they considered that those messages were not offering support, so this type of message was not included in their classification scheme.

Classification

The goal of classification for Madden, Ruthven and McMenemy (2013) is to help capture relationships and links between different parts of knowledge. Analysis and content distinction allow the correct classification of certain messages and, consequently, their use.

As a process, classification involves the systematic allocation of each entity to a class, respecting an established number of principles governing the structure of classes and their relationships (Jacob, 2004). For Mai (2011), the classification process is a deliberate act of organizing a set of entities, and there is a set of rules that determine when an entity goes to a particular class.

More traditionally, content is classified in order to maintain the relationships between those entities. For Cosh, Burns and Daniel (2008), this classification is done via a taxonomy. However, developing a taxonomy in a digital environment is more challenging, given the continuing production of content, the multiple languages, the different media involved and the number of people creating it. To solve this problem, the responsibility for creating and maintaining content is considered to lie with the communities in which it is located, which associate tags with it, allowing the semantics to be deduced from those metadata. For Jacob (2004), a classification scheme is a set of mutually exclusive classes, distributed in a hierarchical structure. The same author also considers that it communicates, through the ordering of classes, relevant information about the content. As each class or category has certain essential characteristics, every item placed in a category inherits these characteristics, and therefore the items of a class should be similar to each other. Classes must also be distinct from each other through a set of characteristics that are intrinsic to the class; they are its essence. These features are necessary and form the individuality of classes and of the messages placed in them. Jacob (2004) considers that the scheme itself is artificial and arbitrary: artificial because it is a tool created for the purpose of establishing a meaningful organization, and arbitrary because the criteria used to define a class reflect one perspective of a domain, excluding all others.

Methodology

Figure 1 depicts our methodology to propose the classification scheme.

Figure 1: Methodology


Selection of health forum

We considered several health forums and selected the MedHelp (https://www.medhelp.org/) forum for its number of messages, communities and users. In this forum, communities are predefined and cover topics such as diabetes, pregnancy, women’s and men’s specific health, heart, lung and digestive diseases, children’s health and sleep disorders. Each community has an active tab showing the posts with the most participation and a Newest tab showing the most recent posts.

Extraction of messages

Messages were collected in a 4-day period, more specifically, from February 20th, 2016 to February 24th, 2016. To ensure a comprehensive and current sample, it was decided that the sample should cover the most popular communities. The popularity of a community is defined by the number of its messages and replies. For each of the 185 most popular communities, we selected four messages, the 1st, 3rd, 5th and 7th posts of each community within the active tab. The selected sample contained 3,399 messages.
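
The selection rule described above can be sketched as follows. This is only an illustration: the `Community` record, its fields and the function name are hypothetical, and the sketch does not reproduce how the messages were actually retrieved from MedHelp.

```python
from dataclasses import dataclass, field


@dataclass
class Community:
    """Hypothetical record for one forum community."""
    name: str
    message_count: int  # messages plus replies, used here as the popularity measure
    active_posts: list[str] = field(default_factory=list)  # posts in active-tab order


def select_posts(communities, n_communities=185, positions=(1, 3, 5, 7)):
    """Pick the posts at the given 1-based positions of the active tab
    for the n most popular communities."""
    ranked = sorted(communities, key=lambda c: c.message_count, reverse=True)
    sample = {}
    for community in ranked[:n_communities]:
        posts = community.active_posts
        # Positions beyond the end of a short list are skipped.
        sample[community.name] = [posts[p - 1] for p in positions if p <= len(posts)]
    return sample
```

The sample of 3,399 messages is larger than 185 × 4 = 740, presumably because each selected post contributes its replies as well; the sketch covers only the choice of posts.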

Proposal of a first classification scheme

Using NVivo, we proceeded with the content analysis of the collected sample of messages. The knowledge acquired during this stage and the schemes proposed by Coulson et al. (2007) and Coursaris and Liu (2009), which in turn were based on the one constructed by Cutrona and Suhr (1992), were used as input to the initial proposal of the classification scheme. After examining both schemes and considering the differences between the two, we decided to follow the scheme presented by Coursaris and Liu (2009), as it provides a more detailed description of the categories and examples. However, some subcategories were modified according to the article of Coulson et al. (2007).

Our classification scheme has four main categories, of which only two are based on the previous schemes. While Coursaris and Liu (2009) restricted their scheme to messages offering support, our classification scheme also considers messages seeking support and includes categories for other types of interaction and for the expression of emotions. We considered that messages offering support were not sufficient to characterize the dynamics of the forum.

The categories Offering Support and Group Interactions were based on the previous schemes, including the subcategories and respective definitions. Although the Group Interactions category was not included in the classification schemes mentioned above, we considered that it was an important part of the dynamics of the health forum. The Seeking Support category was included as users frequently initiate contact having doubts or questions. The messages included in this category are mainly the ones that initiate a conversation in the forum. The subcategories in this category were determined by analysing the messages of the sample: we noticed that users seeking information were more inclined to ask a direct question, while users seeking emotional support were more inclined to express doubts and insecurities.

The Emotions category was also determined based on the results of content analysis. These emotions were divided into subcategories, positive and negative. We considered only the most basic emotions for the classification scheme, given that emotions are less expressed than other forms of support. We based our choice of these emotions on the Wheel of Emotions by Plutchik (2001).

Testing initial inter-coder reliability

After defining the first version of the classification scheme, we assessed its inter-coder reliability. For this purpose, we asked four volunteers to use the proposed scheme to classify a set of messages from a specific community. Overall, a set of 133 messages was classified by one of the volunteers and one of the researchers.

Volunteers classified the messages on different days, each taking no more than one hour, although no time limit was imposed. After the classification, we held short interviews with each volunteer to understand their doubts, difficulties and opinions on specific aspects of the classification scheme.

Inter-coder agreement was analysed using Cohen’s Kappa statistic and was computed for each category of the initial scheme. Cohen’s Kappa varied from -0.05 to 0.66 and allowed us to identify potentially problematic categories.
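
For reference, Cohen’s Kappa for two coders and one category compares the observed agreement p_o with the agreement expected by chance p_e, as kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch, assuming each coder's decisions for a category are stored as a list of booleans, one per message. The function name and the example data are hypothetical and do not represent the tool or the data used in this study.

```python
def cohen_kappa(coder_a, coder_b):
    """Cohen's Kappa for two coders' yes/no decisions on the same messages.

    Returns None when chance agreement is 1 (both coders gave every message
    the same single label), because the statistic is then undefined (0/0).
    """
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same, non-empty set of messages")
    n = len(coder_a)
    # Observed agreement: share of messages where the two decisions match.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal rate of using the category.
    rate_a = sum(coder_a) / n
    rate_b = sum(coder_b) / n
    p_e = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_e == 1:
        return None
    return (p_o - p_e) / (1 - p_e)


# Hypothetical decisions for one category over ten messages.
volunteer = [True, True, False, False, True, False, False, False, True, False]
researcher = [True, False, False, False, True, False, False, True, True, False]
print(round(cohen_kappa(volunteer, researcher), 2))  # prints 0.58
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values, such as the -0.05 reported above, indicate agreement below chance.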

Refinement of the classification scheme

Using the volunteers’ feedback after the classification and the analysis of inter-coder agreement in the first version of the scheme, we understood that some refinements had to be made. We focused our attention on the categories with the lowest values of Cohen’s Kappa, the ones the volunteers had difficulties with and the categories that were not used by any of the volunteers. We found that some of these categories were initially misunderstood, misinterpreted and misused. In the refinement, some of these categories were eliminated, described in more detail or merged with others. This led to the final classification scheme that will be presented in the following section.

Final classification scheme

The final classification scheme is shown in Table 1.


Table 1: Category list with definitions
| Main category | Category | Subcategory | Definition |
| --- | --- | --- | --- |
| Offering support | Information support | Advice | Provides the recipient with any kind of advice about his or her situation |
| Offering support | Information support | Recommendation | Provides a documental source of expertise or information that may be helpful to the recipient |
| Offering support | Information support | Teaching | Factual information about a disease or about the skills needed to deal with the situation |
| Offering support | Emotional support | Affection | Express physical contact and affection towards the community |
| Offering support | Emotional support | Sympathy | Expressed pity or sorrow for the distress of others |
| Offering support | Emotional support | Encouragement | Provide hope, confidence, strength and new information that can be helpful to overcome the recipient’s situation |
| Offering support | Emotional support | Prayer | Offers of prayer messages for members who were suffering or in need of help |
| Offering support | Emotional support | Relief of blame | Alleviate another’s feelings of guilt |
| Offering support | Esteem support | Compliment | Positive comments about the recipient |
| Offering support | Esteem support | Validation | Expressed agreement with the recipient’s perspective on the situation, including the person’s beliefs, actions, thoughts, or emotions |
| Offering support | Network support | Access | Provide the recipient with access to new contacts through new communities |
| Offering support | Network support | Presence | Emphasize the presence of listeners and encourage continued use of the support group |
| Offering support | Tangible assistance | Perform direct task | Offer to perform tasks that directly relate to the recipient’s condition |
| Offering support | Tangible assistance | Express willingness | Expressed the poster’s willingness to help without specifying the exact nature of the assistance |
| Seeking support | | Specific question | Ask a question when in need of factual information or suggestions |
| Seeking support | | Reassurance | Expressed need for emotional support to make the recipient less afraid or doubtful |
| Group interactions | | Gratitude | Expressed thankfulness for the previous support and for finding the help they needed |
| Group interactions | | Congratulations | Express joy or acknowledgment of the recipient’s achievement or good fortune |
| Group interactions | | Sharing personal experiences | Straightforward sharing of personal conditions, thoughts and feelings in response to the recipient’s post |
| Emotions | Negative | Anger | Expression of feelings of anger |
| Emotions | Negative | Fear | Expression of feelings of fear |
| Emotions | Negative | Sadness | Expression of feelings of sadness |
| Emotions | Positive | Happiness | Expression of feelings of happiness and/or excitement |

Categories are not mutually exclusive, which means that messages can fall under more than one category.
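
Because a message can carry several codes, agreement between coders is naturally assessed one category at a time, treating each category as a yes/no decision per message. The sketch below illustrates this, assuming that each coder's work is stored as one set of subcategory names per message; the helper name and the example codings are hypothetical.

```python
def to_binary(codings, category):
    """Turn one coder's multi-label codings into yes/no decisions for one category."""
    return [category in codes for codes in codings]


# Hypothetical codings of three messages by one coder; a message may
# carry several subcategories from Table 1 at once, or none.
coder_a = [
    {"Advice", "Sympathy"},
    {"Specific question", "Fear"},
    set(),
]

print(to_binary(coder_a, "Advice"))  # [True, False, False]
print(to_binary(coder_a, "Fear"))    # [False, True, False]
```

Applying the same conversion to a second coder's codings gives two aligned lists per category, which is the input an agreement statistic such as Cohen’s Kappa needs.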

In the next subsections we provide a more detailed explanation of the categories, including information on how they differ from each other. Examples of each are also provided.

Offering support

The support categories were adapted from the schemes mentioned above, aiming to reveal the different kinds of support users offer each other. According to Coursaris and Liu (2009), there are five types of support provided in online health communities, each of which is translated into a subcategory in the scheme.

Information support

In this category are the messages that share knowledge and contribute to reducing uncertainty. Messages of this type were observed in three subcategories: Advice, Recommendation and Teaching. The first subcategory includes messages giving advice on the user’s situation. This advice may include recommendations or suggestions, such as:

I highly recommend seeing a nutritionist that specializes in addiction too. They can recommend supplements that will replace the things that the drugs took from your system and will help the baby replenish what it hasn't been getting.

In the Recommendation subcategory are messages that provide the user with recommendations on health professionals, medical specialties or other informational resources. For example:

Some good books are Diabetes Solutions (by Dr. Richard Bernstein). If he will be on insulin then Using Insulin (by John Walsh), and Think Like a Pancreas, are also very helpful books.

Finally, the Teaching subcategory includes messages that provide factual information, whether on a particular disease, on the skills needed to deal with a particular problem or even on different ways to relieve symptoms:

From what you have written and from what I know by research as well as my own journey it sounds like you do have it.
1) The pill helped with pain, that is an indicator
2) Heavy Bleeding
3) Fainting and vomiting from pain
What can help for now is aleve, a heating pad and rest if your doctor will not give you anything stronger.

Emotional support

The emotional support category includes messages expressing empathy for a user’s emotional expressions and can be observed in five subcategories: Affection, Sympathy, Encouragement, Prayer and Relief of blame.

Messages in the Affection subcategory express physical contact and affection towards the community. Coursaris and Liu (2009) named this category Physical Affection. However, we considered that it should have a more suggestive name, so it was changed to the current one, following the suggestion of Coulson et al. (2007). Examples of such messages are: ‘(((((HUGS)))))’ or ‘Thank you Lynn, you are a good friend, I love you, and I love everyone on here. Bless you all’.

In the subcategory Sympathy, users expressed pity or sorrow for the suffering of others, such as: ‘I'm so sorry to hear (read) that... I hope he gets what's coming to him and I hope you get peace of mind... It will get better’.

The subcategory Encouragement contains messages that provide hope and trust among users, serving also as an extra support, so they can move beyond their current situation:

No, my life isn't perfect! I merely choose to focus more of the good, these days, than the rest. You can have that, too. I'm only trying to encourage you. You don't have to be afraid. That is a choice that YOU must make for yourself, though. Hang in there. Look UP. Don't focus on what's wrong. Focus on what's right.

Messages included in the Prayer subcategory are those that offer direct prayer for a user’s suffering: ‘I will pray for you, your kids and also for your wife’. Finally, the subcategory Relief of Blame seeks to alleviate others’ feelings of guilt, as follows: ‘NOTHING you did at this point caused the miscarriage. Don't beat yourself up about it’. In the Coursaris and Liu (2009) scheme, Relief of Blame was included in the Esteem support category, but we felt it would fit better in the Emotional Support category, as reported by Coulson et al. (2007).

Esteem support

Messages in the esteem support category seek to improve users’ confidence and can be found in two subcategories: Compliment and Validation. The first subcategory contains positive comments about a user’s personality or skills, for example:

You are strong for standing up to it and should not feel embarrassed. Many women are in need of some help during and after pregnancy, it takes a brave and courageous woman to seek help. You can do it I promise.

The Validation subcategory contains messages expressing agreement with a user’s perspective on a situation, for example:

I know how you’re feeling, I have ADD too and I don't do it to the extent that you do but when I get excited or bored I fidget with my hands a lot. I talk with my hands a lot too. It’s definitely normal for us so don't worry, you are not a freak.

Network support

Messages in the network support category allow the user to expand their social contacts and meet users with similar experiences. Messages in this category were examined in two subcategories: Access and Presence. The first subcategory provides the user with access to new contacts and companies. This subcategory differs from the Recommendation subcategory since it suggests ways of increasing users’ social networks (e.g.: through the suggestions of specific groups or communities) and not informative resources. For example: ‘If you haven't already stumbled upon it, the Medhelp forum has a long discussion of B6 Toxicity here’. The Presence subcategory emphasizes the presence of listeners and encourages continued use of the support group: ‘Please keep me posted!!! And good luck,, Us 40 year olds need to stick together!!!!’.

Tangible assistance

The tangible assistance category reflects concrete and physical action in support of the recipient, and two subcategories were identified: Perform Direct Task and Express Willingness.

The Perform Direct Task subcategory refers to messages where users offer to perform activities that are directly related to another user’s situation, for example: ‘If you let me know where you are I can help you find a great Endo surgeon which also deals with fertility and Endo. Let me know’.

The messages of the Express Willingness subcategory express the poster’s willingness to help without specifying the exact nature of the assistance: ‘I'm always here if you just want to vent. My heart goes out to you, and I wish you all the best. Take care’.

Seeking support

This category is not based on any of the schemes mentioned above. Its inclusion was motivated by the content analysis of the 3,399 messages. Messages in this category are more easily found in, although not restricted to, initial messages, i.e., messages that initiate a discussion in a community. These messages express the need for a particular type of support. The subcategories that were examined are: Specific Question and Reassurance. The first subcategory covers questions asked when in need of factual information or suggestions, for example: ‘Where can I find free insulin pump supplies and insulin?’. The second subcategory relates to messages that express a need for emotional support to make the recipient less afraid or doubtful. An example of this subcategory is as follows:

I’m waiting for my scan report and I’m very much worried. If Anybody had any similar experience please share, because I’m in need of some positivity and support.

Group interactions

The category Group Interactions was also adapted from Coursaris and Liu (2009), although they did not include it in their classification scheme because it does not contain direct support. However, we consider it a relevant category for understanding the dynamics of the interaction in the forum and have, therefore, included it in our classification scheme. Its subcategories correspond to expressions of simple interaction among users that contribute to the construction of these dynamics: Gratitude, Congratulations and Sharing Personal Experiences.

Messages in the Gratitude subcategory express thankfulness for the previous support and for finding the help needed, as follows: ‘Thank you everyone! Your answers have helped a lot!’. The Congratulations subcategory expresses joy and acknowledgment for the user’s achievement or good fortune: ‘Congratulations! Hope all goes well :)’.

Lastly, the Sharing Personal Experiences subcategory allows straightforward sharing of personal conditions, thoughts and feelings in response to the recipient’s post, for example: ‘I have the same kind of headache and I'm about one week from the start of my miscarriage. The pounding is terrible!!! I'm looking for answers as well’.

These types of messages can coincide with a form of exposing problems, either spontaneous or forced, but they also help to promote reciprocal communication (Coursaris and Liu, 2009).

Emotions

This category is not based on any previously proposed scheme. We considered it important to include a category for the expression of different types of emotions, since they are inherent to human interaction. This category is divided into two subcategories that correspond to the negative and positive character of emotions. These are further divided into Anger, Fear and Sadness for negative emotions and Happiness for positive emotions. The Fear, Sadness and Anger subcategories express, respectively, feelings of fear, sadness and anger. An example of the Fear subcategory is: ‘I miscarried five days ago. I was 9 weeks pregnant. I was scared to death, I lost so much blood’.

An example of a message in the Sadness category is:

But because of my mental illness, I cannot function on my own. And it is killing me. I want to find a life and be happy. But it seems I have no future or reason to be here. My psychiatric doctor does not help. Nor does my family. I feel like I am drowning.

And an example for the Anger subcategory is:

I can't believe I didn't know all the side effects were occurring with women that had the device. I am reading the horror stories now and praying I can feel like myself soon. I'm angry and want others out there to know that this IUD should be taken off the market.

Finally, the Positive category covers messages with positive emotions, and its subcategory Happiness expresses feelings of happiness and/or excitement, such as: ‘Sounds like you are doing great, Pip! So very happy for you!’.

Final evaluation

After the refinement of the scheme we assessed, once again, its inter-coder reliability. For this assessment, we selected messages from 2 communities that had not been previously classified. We relied on the help of 4 new volunteers and, to avoid bias, we decided to exclude the researchers from this assessment. Each message was, therefore, classified by two volunteers.

Inter-rater agreement was computed using Cohen’s Kappa. In Table 2 we present the simple observed agreement, the Cohen’s Kappa and the 95% confidence interval for Cohen’s Kappa.
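
The three quantities reported per category can be sketched as follows. How the confidence interval was obtained is not stated here, so the sketch assumes the simple large-sample approximation kappa ± 1.96 × SE, with SE = sqrt(p_o (1 - p_o) / (n (1 - p_e)^2)); this is an assumption made for illustration, and the function name is hypothetical. With this approximation the bounds are not constrained to the interval from -1 to 1.

```python
import math


def agreement_summary(coder_a, coder_b, z=1.96):
    """Percent agreement, Cohen's Kappa and an approximate 95% interval.

    Assumption: the interval is kappa +/- z * SE with the simple large-sample
    standard error sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2)).
    Kappa and the interval are None when chance agreement equals 1.
    """
    n = len(coder_a)
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    rate_a = sum(coder_a) / n
    rate_b = sum(coder_b) / n
    p_e = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_e == 1:
        return p_o, None, None
    kappa = (p_o - p_e) / (1 - p_e)
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return p_o, kappa, (kappa - z * se, kappa + z * se)


# Hypothetical case 1: neither coder ever uses the category.
never_a = [False] * 20
never_b = [False] * 20
print(agreement_summary(never_a, never_b))

# Hypothetical case 2: one coder uses the category once, the other never.
once_a = [True] + [False] * 19
print(agreement_summary(once_a, never_b))
```

In the first case the coders agree on every message but Kappa is undefined; in the second, percent agreement is 95% while Kappa is 0. Rarely used categories can therefore combine very high percent agreement with a Kappa that is zero or not available.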


Table 2: Values of agreement and confidence intervals. Bold values for Cohen’s Kappa values significantly different from 0
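| Category | Percent agreement | Cohen’s Kappa | 95% confidence interval for Cohen’s Kappa: minimum | 95% confidence interval for Cohen’s Kappa: maximum |
| --- | --- | --- | --- | --- |
| Advice | 88% | 0.6 | 0.29 | 0.91 |
| Recommendation | 86% | 0.17 | -0.25 | 0.6 |
| Teaching | 84% | 0.62 | 0.38 | 0.87 |
| Affection | 98% | 0.79 | 0.39 | 1.2 |
| Sympathy | 93% | 0.38 | -0.16 | 0.91 |
| Encouragement | 88% | 0.60 | 0.29 | 0.91 |
| Prayer | 98% | 0 | -5.1e-07 | 5.1e-07 |
| Relief of blame | 100% | N/A | N/A | N/A |
| Compliment | 93% | 0.36 | -0.2 | 0.93 |
| Validation | 91% | 0.61 | 0.27 | 0.96 |
| Access | 95% | 0.48 | -0.12 | 1.1 |
| Presence | 77% | 0.40 | 0.11 | 0.7 |
| Perform direct task | 72% | 0.18 | -0.04 | 0.41 |
| Express willingness | 88% | 0.39 | -0.03 | 0.81 |
| Specific question | 70% | 0.38 | 0.11 | 0.64 |
| Reassurance | 81% | 0.32 | -0.04 | 0.68 |
| Gratitude | 86% | 0.34 | -0.06 | 0.73 |
| Congratulations | 98% | 0.79 | 0.39 | 1.2 |
| Sharing personal experiences | 88% | 0.48 | 0.09 | 0.87 |
| Anger | 95% | 0.48 | -0.14 | 1.1 |
| Fear | 86% | 0.58 | 0.28 | 0.88 |
| Sadness | 79% | 0.35 | 0.03 | 0.68 |
| Happiness | 91% | 0.56 | 0.19 | 0.93 |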

After the refinement of the scheme, Cohen’s Kappa values generally increased. Cohen’s Kappa now ranges from 0 to 0.79, with the majority of the categories having an agreement significantly higher than 0.

Conclusion

This work proposes a classification scheme that can be used to characterize the interactions within an online health forum through the classification of its messages. The scheme proposed here is more extensive than existing ones, including not only categories that express or offer support but also categories related to the interactive dynamics of the community and categories related to the underlying emotions.

The inter-rater agreement assessment of the proposed scheme showed good consistency between coders. The majority of the Cohen’s Kappa values are above 0.40. The lower agreement values are due either to a lack of consensus between the coders or to the coders rarely using the category. The lower bound of the confidence interval for Cohen’s Kappa shows that, for a large number of categories, agreement is significantly higher than relatively high values. For example, the Affection and Congratulations categories have Kappa values significantly higher than 0.38.

The increase in Cohen’s Kappa between the two evaluations shows that the refinement of the scheme contributed to a better understanding of the classification scheme. There are, however, some categories that are particularly challenging due to their subjectivity, namely the emotion categories. Yet, even these categories show good agreement values.

Although we have reached good results in the reliability analysis of the classification scheme, a limitation of this study is the lack of validation by content experts such as psychologists. We plan to carry out this validation in the near future. Moreover, to assess the scheme’s validity, we will also consider ontology evaluation techniques such as the ones mentioned by Brank, Grobelnik and Mladenić (2005) and Hicks (2017).

We expect this proposal to open doors to future research focused on studying these types of communication media. As future work, we will develop supervised classifiers to automatically classify messages of an online health forum into the categories proposed in our scheme. Afterwards, we plan to analyse and characterize online health forums regarding the types of messages they contain. It would be interesting to know how communities differ from each other, which types of messages are more popular and which types of messages generate more comments, among others.

Acknowledgements

Work supported by project “NORTE-01-0145-FEDER-000016” (NanoSTIMA), financed by the North Portugal Regional Operational Programme (NORTE2020), under the PORTUGAL 2020 Partnership Agreement, and through the European Regional Development Fund (ERDF).

About the authors

Carla Teixeira Lopes is an Assistant Professor in the Department of Informatics Engineering, University of Porto, Portugal. She has also been a senior researcher at INESC TEC since 2014. She received a PhD in Informatics Engineering from the University of Porto in 2013. Her research interests lie at the intersection of information retrieval and human-computer interaction. She is interested in studying information search behaviour and in developing tools that help people search more successfully. She can be contacted at ctl@fe.up.pt.
Bárbara Silva is a Quality Assurance Analyst at Retail Consult. She received her Master's Degree from University of Oporto. She can be contacted at silva.barbaralia@gmail.com.

References


How to cite this paper

Lopes, C. T. and Da Silva, B. G. (2019). A classification scheme for analyses of messages exchanged in online health forums. In Proceedings of ISIC, The Information Behaviour Conference, Krakow, Poland, 9-11 October: Part 2. Information Research, 24(1), paper isic1827. Retrieved from http://InformationR.net/ir/24-1/isic2018/isic1827.html (Archived by WebCite® at http://www.webcitation.org/76lXRlsME)
