Artificial Intelligence has introduced genuine intelligence into business: from transforming automation to embedding intelligence directly into software, AI has changed how the world develops applications. Much of this software is driven by instruments that can interpret natural language. Advances in AI have produced agents that now assist professionals with activities well beyond automating low-level, static tasks. AI agents are also used to make predictions, detect fraud, generate advertisements, and make delivery processes smarter, replying to customers automatically without consuming much time.
AI has also changed the education industry, tailoring curricula to individual students and making learning more personalized; AI agents are even used in classrooms to improve collaborative discourse among students. As AI continues to develop, its ability to enhance productivity and efficiency across sectors grows immensely. The ability of such models to handle vast amounts of data improves decision-making and problem-solving. As AI becomes more involved in transforming industries, one of the most critical questions it raises is the ethical application of AI, particularly bias and fairness.
Open-source tools such as fairness libraries and interpretability frameworks are becoming more common across AI development workflows, helping developers detect and repair bias early. Even with these promising advances, achieving genuine fairness in AI is a monumental undertaking. Bias can be embedded in data, algorithms, or even social structures, and it shifts constantly. Addressing it requires more than technical solutions: ongoing monitoring, interdisciplinary collaboration between technologists, ethicists, and policymakers, and greater public awareness and engagement.
As AI keeps reshaping our institutions, economies, and day-to-day lives, designing fairness into the fundamental fabric of its construction is both a technological and a societal imperative. Placing equity front and center is the only way to build public confidence and ensure such powerful technologies act in the global common interest: inclusively, fairly, and responsibly. The ethics of AI remains a contentious debate because there is uncertainty about whether it is applied justly in industry, which is why it requires thorough regulation and scrutiny. Lastly, AI's ability to complement human intelligence will transform work in the future.
Ensuring fairness in AI is vital because unfair algorithms have the power to amplify existing inequalities. Bias tends to arise from biased training data, human biases, or improper model assumptions. The consequences of bias in the business world can be severe, especially in industries such as employment, lending, healthcare, and law enforcement, where an AI decision has a direct and tangible impact on individuals' lives. Governments and regulators around the globe are moving toward laws that require artificial intelligence platforms to be developed and deployed with honesty, transparency, and ethical quality. Consequently, firms are embracing strict fairness auditing, adopting advanced bias-mitigation techniques, and employing diverse, representative data so that their AI systems improve in both validity and inclusion.
Following this transformation, it is important to investigate how society is reacting to the advancement of AI, and in particular how fairness and responsibility are being discussed. While much research has focused on developing technical fixes or regulatory solutions, far less is known about how the narrative is being constructed in the public space, especially through the media. News reports often reflect social concerns, inclinations, and aspirations, and are therefore a valuable resource to analyze. This work delves into uncovering how discourses on AI fairness and bias are constructed in mainstream media. It aims to learn the sentiment behind these public news articles in order to better understand how the public perceives AI. Through title and content analysis of thousands of news reports, the study attempts to identify trends, sentiment, and frames that otherwise go unnoticed. This kind of analysis can pick up powerful signals about how the world is thinking about the moral structure around AI, and it can guide future research agendas, policy, and public debate. The goal of the project is to conduct experiments to answer the following research questions:
The data was collected through different sources by means of systematic scraping.
First, a request was made to the Reddit API to scrape comments related to AI bias, AI discrimination, AI fairness, and AI ethics.
The request targeted comments and posts on Reddit pages where individuals openly share thoughts on fairness and bias in AI systems. The information was stored along with its respective labels (400 rows), so that every record was labeled according to its thematic value. An API call was then sent to NewsAPI to fetch news articles on the same topics of discussion (398 rows). The aim was to gather opinions from news sources and compare them with the user forums on Reddit. All articles were stored along with their metadata, such as label, title, description, and, where accessible, the full content. Spanning both news and user-generated content, the dataset provides a fascinating insight into how AI bias and fairness are perceived on different platforms. To further enrich the dataset, additional data were scraped from one webpage using BeautifulSoup. The process involved scraping relevant text, stripping redundant HTML tags, and restructuring the scraped rows to conform to the data already gathered. The newly collected entries were appended to the dataset without disrupting format or label integrity.
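The exact scripts are in the linked code; the sketch below shows roughly what this collection pipeline might look like, assuming the praw, requests, and beautifulsoup4 packages. Credentials, the example URL, and the subreddit scope are placeholders, not the project's actual values.

```python
# Minimal sketch of the three collection paths; credentials and endpoint
# parameters are placeholders, not the ones used in the project.
import praw                     # Reddit API wrapper
import requests                 # NewsAPI calls
from bs4 import BeautifulSoup   # HTML scraping
import pandas as pd

# 1) Reddit: pull posts matching the four query terms.
reddit = praw.Reddit(client_id="CLIENT_ID", client_secret="CLIENT_SECRET",
                     user_agent="ai-fairness-study")
reddit_rows = []
for query in ["AI bias", "AI discrimination", "AI fairness", "AI ethics"]:
    for post in reddit.subreddit("all").search(query, limit=100):
        reddit_rows.append({"label": query, "title": post.title,
                            "content": post.selftext})

# 2) NewsAPI: fetch articles on the same topics, keeping their metadata.
news_rows = []
for query in ["AI bias", "AI discrimination", "AI fairness", "AI ethics"]:
    resp = requests.get("https://newsapi.org/v2/everything",
                        params={"q": query, "language": "en",
                                "apiKey": "NEWSAPI_KEY"})
    for art in resp.json().get("articles", []):
        news_rows.append({"label": query, "title": art["title"],
                          "description": art["description"],
                          "content": art["content"]})

# 3) BeautifulSoup: scrape one additional web page and strip the HTML tags.
page = requests.get("https://example.com/ai-fairness-article")  # placeholder URL
soup = BeautifulSoup(page.text, "html.parser")
scraped_text = " ".join(p.get_text(strip=True) for p in soup.find_all("p"))

df = pd.DataFrame(reddit_rows + news_rows)   # combined, labeled dataset
```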
Finally, the cleaned dataset was transformed into a format suitable for analysis. CountVectorizer (refer fig. 5) and TF-IDF vectorization (refer fig. 6) were applied to convert the text into numerical representations, enabling further analysis of word frequency and its significance in discussions on AI bias and fairness.
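As an illustrative sketch of this vectorization step, continuing from the combined DataFrame sketched above (the parameter choices such as max_features and the stop-word list are assumptions, not the project's exact settings):

```python
# Convert the cleaned text column into numerical features.
# Parameter values here are illustrative, not the exact project settings.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

texts = df["content"].fillna("").tolist()   # df: combined dataset from above

count_vec = CountVectorizer(stop_words="english", max_features=5000)
count_matrix = count_vec.fit_transform(texts)     # raw term counts

tfidf_vec = TfidfVectorizer(stop_words="english", max_features=5000)
tfidf_matrix = tfidf_vec.fit_transform(texts)     # TF-IDF weights

print(count_matrix.shape, tfidf_matrix.shape)     # (n_documents, n_terms)
```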
On the cleaned dataset, an Exploratory Data Analysis (EDA) was conducted to uncover key patterns in discussions surrounding AI bias, discrimination, fairness, and ethics. The analysis involved generating TF-IDF bar charts to highlight the most impactful words for each query, as well as word clouds to visualize frequently occurring terms in user discussions and news articles.
TF-IDF bar charts provided an understanding of the most significant words for each query and the trends in the conversation on AI-related fairness and bias. For instance, the words "bias," "data," and "people" were very important when discussing AI bias (fig. 7 and fig. 8), while "ethical," "intelligence," and "artificial" were prominent in conversations on ethical AI. In the same vein, conversations about AI fairness often highlighted "people," "think," and "make," pointing to concerns about decision-making in AI (see fig. 9 and fig. 10).
Word clouds also illustrated the range of conversations by surfacing the most frequent words within different themes. The use of words like "human," "system," and "work" in debates about AI bias suggests ongoing discussion about how humans interact with AI systems and their perceived fairness (see fig. 10). In discussions of AI ethics, words such as "think," "use," and "need" point to the importance of decision-making and ethical considerations in AI adoption (see fig. 9).
The TF-IDF measures across the different AI-related topics indicate distinctive patterns of word importance. In Fig. 11, the words discrimination, artificial, and intelligence point to significant problems in AI discrimination topics, with people and just among the most important words, indicating fairness-related conversation. Similarly, Fig. 12 on AI bias features words like human, biased, and data, reflecting discussion of human influence on AI fairness and system biases. Fig. 13 on AI fairness interestingly includes French terms (dans, mais, pour), suggesting multilingual effects in the dataset. Lastly, Fig. 14, on AI ethics, has salient words like ethics, artificial, and intelligence, underlining the significance of ethical concerns in AI systems. Collectively, these findings capture the nuanced discussions around AI bias, fairness, discrimination, and ethics, with a strong emphasis on human influence, principles of fairness, and ethical AI deployment.
Overall, these visualizations provided an unambiguous snapshot of the dataset, helping identify salient themes, shared concerns, and linguistic patterns in AI discourses from various sources.
The link to the code and the data generated after cleaning can be found here.
Clustering is an unsupervised machine learning method that finds groups of highly similar tokens within a large set of tokens in a dataset. The algorithm uses an unlabeled dataset to find clusters of similar data using a distance metric such as cosine similarity or cosine distance. The two clustering algorithms used in this project are K-means clustering and hierarchical clustering.
K-means is a learning algorithm that groups data points into k clusters, given the value of k. Each cluster is represented by a centroid, the mean of all data points within that cluster. The algorithm works iteratively: it assigns data points to the nearest centroid and then recalculates the centroids based on the new cluster assignments. A distance metric, e.g., Euclidean distance (refer to fig. 15), determines the proximity of data points to cluster centroids; it measures the direct linear distance between points, acting as an effective similarity score for clustering. The main objective of k-means is to minimize the sum of squared distances between each data point and its assigned cluster centroid, ultimately forming compact clusters within the dataset.
Hierarchical clustering groups data points into a hierarchy of clusters, represented as a tree-like structure (dendrogram), where clusters are progressively merged or split based on similarity. The similarity measure used in this project is cosine similarity (refer to fig. 16), which computes the cosine of the angle between two vectors, giving a robust similarity measure that focuses on direction rather than magnitude. The algorithm either starts with each data point as its own cluster and iteratively merges the closest clusters until a single cluster remains (agglomerative), or starts with all data points in one cluster and iteratively splits it until each data point is in its own cluster (divisive).
In the current project, clustering is used to group the most similar words from the descriptions of the news articles discussing bias and fairness in AI systems. Through clustering, the algorithm is expected to distinguish clearly between the three classes of data and possibly uncover a fourth class in the dataset.
Clustering, as mentioned in the overview, does not require labels because it follows the unsupervised principles of machine learning. The dataset used in this work accordingly excludes labels (refer to fig. 17). The dataset is also cleaned to remove stop words, special symbols, non-English words, and words that are not in the dictionary, producing a clean dataset for processing. The cleaned text is then transformed into a numerical representation using TF-IDF (Term Frequency-Inverse Document Frequency). This method converts each document into a vector where each element represents the importance of a particular word in that document relative to the entire corpus; refer to fig. 18. Sample data before and after cleaning can be found here
The linked code file performs K-means and hierarchical clustering on the dataset. With documents represented as TF-IDF vectors, the KMeans algorithm computes the distance between these points; Euclidean distance on the TF-IDF vectors determines how similar or dissimilar the documents are. Hierarchical clustering groups data points into a hierarchy of clusters, represented as a dendrogram, where clusters are progressively merged or split based on similarity. The code file can be found here
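A condensed sketch of what that code does is shown below; cleaned_docs stands in for the list of cleaned article descriptions, and the cluster counts, random seeds, and linkage choice are illustrative assumptions rather than the project's exact settings.

```python
# Sketch of the two clustering passes on TF-IDF features; cleaned_docs and
# the parameter choices below are assumptions, not the project's exact setup.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.metrics.pairwise import cosine_distances

X = TfidfVectorizer(stop_words="english").fit_transform(cleaned_docs).toarray()

# KMeans: choose k by silhouette score (Euclidean distance on TF-IDF vectors).
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    print(k, silhouette_score(X, labels))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
coords_3d = PCA(n_components=3).fit_transform(X)   # for the 3-D cluster plot

# Hierarchical clustering on cosine distances between the TF-IDF vectors.
dist = cosine_distances(X)
hclust = AgglomerativeClustering(n_clusters=2, metric="precomputed",
                                 linkage="average").fit(dist)
```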
Figure 19 shows the silhouette scores calculated for various numbers of clusters. In our case, the silhouette scores indicate that the data is best represented with three clusters, as reflected by the highest score. To represent the data in three dimensions, PCA was applied, and the new data points are plotted in three dimensions along with their cluster assignments, as seen in figure 20. The figure shows that the data has more or less smoothly separated into three clusters. Although the dataset was collected to represent bias and fairness, it is fascinating to see an additional class emerge. This could potentially be caused by outliers: the cluster shown in purple appears to be an outlier group, containing utterances that have little to do with bias and fairness.
Turning to the hierarchical clustering (hclust) results, figure 21 shows that the 30 most influential words fell into two categories, as expected. Looking at the words in each class, the model appears to have distinguished between fairness and bias well: words such as 'good' and 'happy' in the first class likely represent fairness, while 'bias' in the second class obviously represents bias. Most interestingly, 'data training' falls into the 'bias' class, indicating that people often relate training to bias, which is in fact a primary cause of bias; and happiness being associated with fairness suggests that people are often pleased when AI algorithms behave fairly, implying that fairness in AI algorithms creates happiness among humans.
Comparing the two algorithms, the silhouette analysis for k-means, visualized with PCA, shows that the data is best represented by three clusters, suggesting a nuanced separation in the data. In contrast, hierarchical clustering grouped the 30 most influential words into two clear categories and classified the two intended categories, bias and fairness, more effectively.
From the analysis conducted with the two clustering algorithms, the following observations emerge. Bias and data training occurred together, suggesting a close relationship between the two, as expected. It is also observed that fairness in AI systems creates happiness among humans. Fairness should therefore be a first priority while developing AI models.
Association Rule Mining (ARM) is another unsupervised machine learning technique. ARM discovers relationships and patterns between items in datasets, specifically the associations between items within transactions. Given a set of items, ARM computes the support, confidence, and lift of items occurring together. A quick overview of the terms associated with ARM:
Support: It measures how frequently an itemset appears in the dataset. The measure mainly deals with the frequency of itemset.
Confidence: Indicates how often the consequent (the "then" part of the rule) occurs when the antecedent (the "if" part) is present.
Lift: Compares the observed confidence with the expected confidence, indicating how much more likely the consequent is to occur given the antecedent.
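In standard notation, with A the antecedent itemset, B the consequent, T a transaction, and N the total number of transactions, these three measures are usually written as:

$$
\mathrm{Support}(A \Rightarrow B) = \frac{\lvert\{T : A \cup B \subseteq T\}\rvert}{N}, \qquad
\mathrm{Confidence}(A \Rightarrow B) = \frac{\mathrm{Support}(A \cup B)}{\mathrm{Support}(A)}, \qquad
\mathrm{Lift}(A \Rightarrow B) = \frac{\mathrm{Support}(A \cup B)}{\mathrm{Support}(A)\,\mathrm{Support}(B)}
$$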
The formulas associated with each of these terms are shown in Figure 22. Apriori algorithm: this is the backbone of the rule mining performed in ARM, designed to uncover frequent itemsets within a large dataset. It first generates frequent itemsets and then discards those that do not meet a minimum support count. This iterative approach makes it highly effective for tasks like market basket analysis.
ARM is used in this project to find the words that most frequently occur together; the goal is to identify which words in the domain of fairness and bias co-occur. Through this approach, we aim to understand the relationships between words surrounding bias and fairness.
To perform ARM, the data must be transformed into a transaction dataset. The dataset is first cleaned to remove stop words, special symbols, non-English words, and words that are not in the dictionary. To avoid messy data, a subset of the dataset was selected for ARM. This subset was then cleaned and converted to transaction data; refer to fig. 23 for the processed dataset and figure 17 for a sample pre-processed dataset. Sample data before and after cleaning can be found here
The code associated with ARM can be found here.
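The rule mining itself can be reproduced with a short script along the following lines; this is a sketch assuming the mlxtend library and an already-cleaned list of documents named cleaned_subset, with illustrative thresholds, and the project's own implementation may differ.

```python
# Sketch of Apriori-based rule mining on the transaction data; cleaned_subset
# and the thresholds are assumptions, not the project's exact setup.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each transaction is the list of cleaned tokens from one document.
transactions = [doc.split() for doc in cleaned_subset]

te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

frequent = apriori(onehot, min_support=0.1, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.5)

# Top 15 rules by each measure, as plotted in figures 24-26.
print(rules.nlargest(15, "support")[["antecedents", "consequents", "support"]])
print(rules.nlargest(15, "confidence")[["antecedents", "consequents", "confidence"]])
print(rules.nlargest(15, "lift")[["antecedents", "consequents", "lift"]])
```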
Figures 24, 25, and 26 show the top 15 rules by support, confidence, and lift respectively, and figures 27 and 28 show graphical representations of the top 15 confidence rules and top 15 lift rules. Some rules reach 100% confidence because, for the purpose of this analysis, only a very small subset of the dataset was used to check the relationships between words; the chance of two words always appearing together was therefore higher than usual. {fairness} ⇒ {bias} (support = 0.4): "fairness" and "bias" occur together in 40% of all transactions; with confidence around 57%, whenever "fairness" appears, "bias" is present 57% of the time. {systems} ⇒ {ai} (support = 0.3): "systems" and "ai" occur together in 30% of the transactions; confidence is 100%, so every time "systems" appears, "ai" is also present in the same transaction. {tech} ⇒ {bias} (confidence = 1.0): wherever "tech" appears, "bias" appears in the same transaction. {algorithms} ⇒ {media} records a lift of 10.0 in the sample, meaning the presence of "algorithms" raises the probability of "media" by a factor of 10 compared with the case where the two were independent; such extremely high lift typically occurs when items appear in few transactions but always together. {proposed} ⇒ {new} (lift = 5.0): whenever "proposed" appears, "new" is 5 times more likely than if they were independent.
The investigation of the synthetic bias/fairness dataset shows that certain words, such as "ai," "bias," "fairness," and "systems," occur together almost every time they are mentioned. This shows a very strong relationship between these words in the sample. However, because the dataset is controlled and limited, even a few transactions will make these associations seem unusually strong. In other words, although the results certainly indicate the intended relationships between principal concepts, the findings are influenced by the dataset's small size and makeup. For results that can be relied upon more, it would be preferable to sample a larger and more varied dataset.
Latent Dirichlet Allocation (LDA): Overview: LDA is an unsupervised, probabilistic topic modeling algorithm used to uncover hidden themes or topics within a collection of documents by analyzing word co-occurrence patterns. It assumes each document is a mixture of topics, and each topic is a probability distribution over words.
Topic modeling: LDA quantifies the distribution of topics across documents, revealing patterns and overlaps within the corpus. Topic modeling further helps uncover latent semantic patterns, revealing not just obvious relationships but also hidden ones that emerge in large unstructured datasets (refer fig. 29).
LDA is used in this work to identify topics within the dataset whose words most frequently occur together. Although a fair picture has already emerged from the other two unsupervised learning techniques (ARM and clustering), LDA helps identify the most important topics being discussed in these news articles and how they relate to bias and fairness.
First, the data is pre-processed by converting it to lowercase, removing punctuation and numeric values, and removing common words that carry little value. Then the text is split into individual words by CountVectorizer, and the word frequencies for each document are used to create a document-term matrix. This document-term matrix is used as input for LDA, which identifies the hidden topics by modeling each document as a mix of topics and each topic as a mix of words. Refer to figure 30 for a sample count-vectorized dataset and figure 17 for a sample pre-processed dataset. Sample count-vectorized data can be found here
The code file associated with LDA can be found here
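A minimal sketch of this modeling step, assuming scikit-learn's LatentDirichletAllocation; four topics matches the setup reported below, while the other parameters and the texts variable are assumptions.

```python
# Sketch of topic modeling with scikit-learn's LDA; texts is the list of
# cleaned article texts (assumed), and parameter values are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

vectorizer = CountVectorizer(stop_words="english", lowercase=True,
                             max_features=5000)
doc_term = vectorizer.fit_transform(texts)          # document-term matrix

lda = LatentDirichletAllocation(n_components=4, random_state=42)
doc_topic = lda.fit_transform(doc_term)             # document-topic mixtures

# Top 15 words per topic, as shown in the horizontal bar plots.
terms = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    top = [terms[i] for i in topic.argsort()[-15:][::-1]]
    print(f"Topic {idx}: {', '.join(top)}")
```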
The LDA model, configured to reveal four topics, produced a document-topic matrix where each document is represented as a mixture of the latent topics (see fig. 31). For each topic, the model surfaced a set of important words that describe the theme. The top fifteen words of each topic are presented in horizontal bar plots (refer fig. 32, 33, 34, 35). These plots show that every topic is defined by a distinct collection of terms, indicating the prevalence of different latent themes in the data. For example, one topic is characterized by words such as "ai" and "systems," while another is characterized by "bias" and "fairness."
A t-SNE scatter plot was used to project the document-topic matrix into two dimensions (refer fig. 36). The visualization clearly shows clusters of documents, where each cluster corresponds to the dominant topic of the documents within it. The separation of the clusters indicates that the LDA model successfully uncovered distinct thematic structures in the text. Overall, the results indicate that LDA identified meaningful latent topics and provides a useful picture of the topical diversity in the text.
The study identified a number of distinct themes in the discussions about bias and fairness. One set of topics deals with technology and its pitfalls, and another revolves around fairness and accountability issues. In short, the process uncovered natural clusters in the conversations, illustrating that when people talk about artificial intelligence or systems, they also talk about bias, and when they talk about fairness, they also discuss notions of equity and responsibility. This observation shows that the data represents a diversity of opinions and matters related to both technological issues and social justice, giving a clearer picture of how these topics are intertwined within the public discussion.
The Multinomial Naive Bayes (MNB) algorithm is based on Bayes' theorem (refer Fig. 33), which allows Naive Bayes classifiers to determine the probability of a given set of features belonging to each class. Multinomial Naive Bayes is used in scenarios where the data are presented as counts of categorical features. During training, the model learns the probability of each feature occurring in a given class, developing a probabilistic understanding of the dataset. The probabilities calculated during training are then used during testing, enabling the model to estimate the likelihood of an unseen set of feature values and select the class with the highest probability. The Naive Bayes family assumes independence between features, which is why it earned the name 'naive' Bayes. In this project, the Naive Bayes model is used to classify news text into the bias and fairness categories.
Bernoulli Naive Bayes is another variant of the Naive Bayes algorithm specifically designed to handle binary data. It measures the presence or absence of a particular attribute in a dataset. Bernoulli Naive Bayes also makes the assumption that features are independent of one another. This means that the presence or absence of one feature does not affect the presence or absence of another feature. During training, Bernoulli Naive Bayes calculates the probability of each feature occurring or not occurring in each class. It does this by counting the number of occurrences of each feature in each class and dividing by the total number of documents in that class (refer Fig. 34).
MNB models in this experiment are used to classify text documents extracted from news articles into two classes, bias and fairness. By doing so, we aim to build a classifier that can easily help classify articles into the respective sentiment behind the use of AI in industries.
Smoothing is a process aimed at preventing occurrences of zero probabilities when testing datasets. During testing, there may be instances where a particular feature has no occurrence in the training dataset, while other features may occur simultaneously. This situation would erroneously result in a zero probability for the entire test dataset, despite the chances of occurrence not being truly zero. To address this issue, various smoothing techniques such as Laplace smoothing (add-one smoothing), Lidstone smoothing, or Dirichlet smoothing are employed. By utilizing smoothing methods like Laplace smoothing, we ensure that neither the numerator nor the denominator in probability calculations becomes zero. Fig. 35 provides the formula for the most commonly used Laplace smoothing method.
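In its standard form, the add-one (Laplace) estimate of a word's class-conditional probability is:

$$
P(w_i \mid c) = \frac{\mathrm{count}(w_i, c) + 1}{\sum_{w \in V} \mathrm{count}(w, c) + \lvert V \rvert}
$$

where V is the vocabulary; replacing the 1 with a smaller constant α (such as the 0.2 used in the experiments below) gives the Lidstone variant of the same idea.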
The data preparation stage of Naive Bayes was conducted in three steps:
Fig. 43 Test data sample
The code for the Naive Bayes classifier, implemented in Python, can be found here.
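A minimal sketch of the training and evaluation step is shown below; X_train/X_test stand for the vectorized text features and y_train/y_test for the bias/fairness labels (assumed names), and only alpha = 0.2 is taken from the reported setup.

```python
# Sketch of the Multinomial Naive Bayes run; X_train/X_test and y_train/y_test
# are assumed to be the vectorized text features and bias/fairness labels.
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report, confusion_matrix

mnb = MultinomialNB(alpha=0.2)          # smoothing value reported below
mnb.fit(X_train, y_train)

y_pred = mnb.predict(X_test)
print(classification_report(y_test, y_pred))   # precision/recall/F1 per label
print(confusion_matrix(y_test, y_pred))        # basis for the confusion matrix figure
```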
The Multinomial Naive Bayes model was trained on the training dataset with the alpha parameter set to 0.2 for the implementation of Laplace smoothing. The results were obtained by running the model on the test dataset, as shown in Figure 40. The precision of the model was higher for the 'bias' label, approximately 75%, while the precision for the 'fairness' label was lower at about 60%. Recall, however, was greater for the 'fairness' label at approximately 90%, compared to approximately 65% for the 'bias' label. The mean accuracy, precision, recall, and F1 score for the model were all approximately 73%, as illustrated in Figure 44.
In addition, the confusion matrix (Figure 45) shows a slight imbalance in label prediction. Although the 'bias' label was predicted 79 times, the 'fairness' label experienced more misclassifications. This indicates the model is slightly biased in favor of predicting the 'bias' class, but has nevertheless learned to recognize both categories fairly well.
The results demonstrate that the model was not able to learn all the patterns linking the news text and the bias/fairness labels. It did, however, uncover some patterns, with accuracy, precision, and recall around 73%, and it performed better than random guessing. Still, it falls short of contributing significantly due to the limited data.
These results align with our understanding of Naive Bayes classifiers. The classifier is a basic probabilistic model and lacks the capability to precisely discern complex patterns. To do so, we would need to encode the data using advanced neural network models like LSTM or BERT to understand the text better.
Fig. 40 Sample of Original Data
Fig. 41 Training data sample
Fig. 42 Training data sample
Fig. 44 Model Evaluation Metric
Fig. 45 Naive Bayes Confusion Matrix
Decision trees (refer Fig. 46) are supervised machine learning algorithms suitable for both classification and regression tasks. The algorithm operates by recursively partitioning the dataset into smaller subsets based on the input features, aiming to predict the output feature. The model trains itself by choosing the split at each node, searching for the best split using either the Gini Index or Information Gain.
Entropy measures the amount of disorder or uncertainty in the dataset; the formula is provided in Fig. 48. It is commonly used for making decisions at nodes in decision trees, where the aim is to minimize entropy. Maximum entropy means that the classes are evenly distributed, i.e., maximum disorder.
Entropy formula (Fig. 48)
Information Gain formula (Fig. 49)
Decision Tree Example (Fig. 46)
Gini Index formula (Fig. 47)
Information Gain quantifies how much information a split at a particular node captures; it measures the reduction in entropy achieved by that split. The main aim is to reduce uncertainty so that the model can learn patterns in the dataset: the higher the information gain, the better the split. The formula is given in Fig. 49. While this project uses the Gini Index as the splitting criterion, many applications use Information Gain and entropy to make decisions.
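For reference, the standard forms of these splitting measures (the quantities reproduced in Figs. 47-49) are:

$$
\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i, \qquad
\mathrm{Gini}(S) = 1 - \sum_{i=1}^{c} p_i^{2},
$$
$$
\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{values}(A)} \frac{\lvert S_v \rvert}{\lvert S \rvert}\,\mathrm{Entropy}(S_v)
$$

where p_i is the proportion of class i in node S and S_v is the subset of S with value v for attribute A.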
The process of making predictions for testing is fairly simple: the tree is traversed from the root node by making decisions at each internal node based on the feature values of the data point and the splitting parameter (Gini Index in our case). The prediction is the value obtained at the final node, which is the leaf node.
Decision Tree models in this experiment are used to classify text documents extracted from news articles into two classes, bias and fairness. By doing so, we aim to build a classifier that can easily help classify articles into the respective sentiment behind the use of AI in industries.
The data preparation stage of Decision Tree was conducted in three steps:
Fig. 43 Test data sample
The code for the Decision Tree classifier, implemented in Python, can be found here.
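A sketch of the hyperparameter search this section describes is shown below; the grid values are assumptions apart from the best combination reported later, and X_train/y_train are the assumed vectorized text features and labels.

```python
# Sketch of the decision tree hyperparameter search; grid values are
# illustrative, and X_train/y_train are assumed feature/label arrays.
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.model_selection import GridSearchCV
import matplotlib.pyplot as plt

grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"criterion": ["gini", "entropy"],
                "max_depth": [3, 5, 10],
                "min_samples_split": [2, 5, 10]},
    scoring="f1_weighted", cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)   # e.g. {'criterion': 'gini', 'max_depth': 5, 'min_samples_split': 2}
plot_tree(grid.best_estimator_, filled=True, max_depth=3)   # tree visualizations
plt.show()
```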
Fig.50 Model Evaluation Metric
Fig. 40 Sample of Original Data
Fig. 41 Training data sample
Fig. 42 Training data sample
The Decision Tree model was trained on the training dataset with various hyperparameters, including the splitting criterion, the maximum tree depth, and the minimum samples per split; results were obtained on the testing dataset, and the best model by F1 score is visualized in Figure 50. The best model was obtained with {'criterion': 'gini', 'max_depth': 5, 'min_samples_split': 2}. The trees from the experiments are attached as figures 51, 52 and 53. The trees fit best with a maximum depth of 5 and min_samples_split of 2, without overfitting.
Initial Decision tree (Fig. 51)
Decision Tree 2 (Fig. 52)
Decision Tree 3 (Fig. 53)
Furthermore, examination of the confusion matrix (Fig. 54) from both embeddings indicates an even distribution of both labels throughout the matrix. This uniform distribution suggests that the model learned both classes with roughly equal proficiency.
Confusion Matrix for the best DT (Fig. 54)
In conclusion, the experiment revealed that the decision tree classifier struggled to train effectively on the dataset due to its complexity. Nonetheless, the findings suggest a discernible pattern between the news text and the corresponding bias/fairness labels. Although setting the 'max_depth' parameter to 5 avoided overfitting, it did not improve model accuracy on the test set by much.
This suggests that the decision tree classifier is not the most suitable model for detecting the complex patterns in text documents. In the future, employing more advanced models like CNNs or LSTMs is likely to yield better classification results by capturing the complex linguistic patterns in the text.
Support Vector Machine (SVM) (refer Fig. 53) is a powerful supervised machine learning algorithm, known for its ability to construct decision boundaries between complex classes. SVMs excel when the classes are separable and a clear boundary can be drawn between them. The primary goal of SVM is to find the hyperplane that maximizes the margin between the classes.
Kernels are used in situations where the classes are not linearly separable (refer to Fig. 55). Kernels transform the input data into a higher dimension in the hope of finding a linear separator in that space. They are fascinating functions because they allow SVMs to compute the decision boundary as if the data lived in that higher-dimensional space while working in the original feature space. Specifically, kernel functions compute the required dot product efficiently without explicitly transforming the data into the higher-dimensional space.
The dot product is central to kernels because SVMs only need the dot product between data points in the feature space, avoiding the explicit computation of vectors in a higher-dimensional space. Kernels compute this dot product directly, thereby reducing the complexity of the calculations and allowing efficient computation without the explicit transformation.
SVM classifier on various kernels (Fig. 57)
SVM classifier built on sigmoid kernel (Fig. 57 (b))
SVM Architecture (Fig. 55)
An SVM example for linearly separable (Fig. 56)
The Linear kernel is used when the data is linearly separable, that is, when it can be separated with a single line. The Sigmoid kernel is inspired by neural networks and acts like the activation function of a neuron; it is derived from the hyperbolic tangent function and applies to neural networks and other non-linear classifiers. The Radial Basis Function (RBF) kernel (refer Fig. 56) measures the similarity between vectors based on the Euclidean distance. An example is presented in Fig. 56 illustrating the casting of 2D points into 6D points using a polynomial kernel.
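As a standard illustration of such a 2D-to-6D mapping (the specific example in the referenced figure may differ), the degree-2 polynomial kernel on a point x = (x_1, x_2) can be written as

$$
K(x, z) = (x \cdot z + 1)^2 = \phi(x) \cdot \phi(z), \qquad
\phi(x) = \left(1,\ \sqrt{2}\,x_1,\ \sqrt{2}\,x_2,\ x_1^{2},\ x_2^{2},\ \sqrt{2}\,x_1 x_2\right)
$$

so the kernel evaluates the dot product in the 6-dimensional feature space without ever constructing φ(x) explicitly.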
SVM models in this experiment are used to classify text documents extracted from news articles into two classes, bias and fairness. By doing so, we aim to build a classifier that can easily help classify articles into the respective sentiment behind the use of AI in industries.
Kernels and Example (Fig. 58)
The data preparation stage of SVM was conducted in three steps:
Fig. 43 Test data sample
The code for the SVM classifier, implemented in Python, can be found here.
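A condensed sketch of the kernel/C sweep summarized in Table 1 is shown below; the exact grid of C values is an assumption, and X_train/X_test with y_train/y_test are the same assumed feature and label inputs as in the earlier classifiers.

```python
# Sketch of the kernel/C sweep behind the comparison table; grid values
# are illustrative, and X_train/X_test, y_train/y_test are assumed inputs.
from sklearn.svm import SVC
from sklearn.metrics import precision_recall_fscore_support

results = []
for kernel in ["linear", "rbf", "sigmoid"]:
    for C in [0.1, 1, 10]:
        clf = SVC(kernel=kernel, C=C).fit(X_train, y_train)
        p, r, f1, _ = precision_recall_fscore_support(
            y_test, clf.predict(X_test), average="weighted")
        results.append((kernel, C, p, r, f1))

for row in results:      # compiled into the comparison table
    print(row)
```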
Fig. 59 Model Evaluation Metric
Fig. 59 (b) SVM visualized
Fig. 40 Sample of Original Data
Fig. 41 Training data sample
Fig. 42 Training data sample
The text data collected from the news API was used as input for various SVM models fine-tuned with different kernels and C values. These models were evaluated based on precision, recall and F1-score. The results obtained were compiled into a table, as shown in Table 1. Confusion matrices for the SVM models can be found in Figures 60, 61, 62. Overall, it is observed that the SVM model outperformed other models.
SVM trained on the Sigmoid Kernel with C value of 1 gave the best result among all the models trained on the data. The model had a precision of 66%, recall rate of 66% and F1 score of 65.8%. The results suggest that it is the best model among all the models trained in the past experiments. These results are depicted in Figure 59. The rbf kernel with a C value of 10 performed the next best with a precision of 65.6%, recall rate of 65% and F1 score of 64.6%. The linear kernel performed as well as the rbf kernel with slight variations.
SVM classifier on various kernels (Table. 1)
Sentiment analysis conducted with the best model (Sigmoid kernel, C = 1) produced the following results. The model tends to underfit the negative sentiment class while fitting the positive class quite well, as seen in figure 63. Overall, the SVM model achieved a precision of 70% in classifying utterances into their respective sentiment, which means the sentiment behind the articles is more or less evident and easily discerned. The confusion matrix (refer fig. 64) shows the distribution of the positive and negative classes; there are many more positive-sentiment instances than negative ones, so it is not surprising to observe a skew towards the positive class.
The analysis also reveals that there is more positive sentiment associated with the use of AI, even though the data collected covers occurrences of bias and fairness roughly equally. This could suggest that although bias exists among AI models, it is not upsetting people as much as expected.
Fig. 63 Evaluation metric sentiment analysis
Fig. 64 Confusion Matrix sentiment analysis
Overall, experimenting with SVM models across kernels and C values on text from news stories has been revealing about their performance on the bias/fairness classification problem. Among all the experiments, the Sigmoid kernel with C = 1 performed best, exhibiting good overall performance with balanced precision, recall, and F1-score. The consistent performance of the linear kernel across different values of C reiterates its suitability for this task. It is also observed that the news articles carry a more positive tone even in the presence of bias in AI, and that this trend is recognizable with high precision, implying that even though there is bias in AI models, it is not upsetting people as much as expected.
Neural Network models are large networks of nodes connected to one another (see figure 65) and arranged in layers of neurons in order to learn the context of the data more effectively. NN models are known for their ability to learn different aspects of the data and thereby form better decision boundaries for classification. Neural networks involve several hyperparameters: the number of layers, the number of nodes in each layer, the learning rate, the activation functions, and the final classification layer. All of these hyperparameters are tuned for better overall prediction performance. Neural networks use non-linear activation functions like ReLU, Tanh, or Sigmoid to add expressive power to the model, allowing it to learn patterns beyond the purely linear.
During training, optimization algorithms like Stochastic Gradient Descent (SGD) or Adam adjust the internal parameters of the network by minimizing a chosen loss function (see figure 66). Proper weight initialization, regularization techniques like dropout, and batch normalization are also commonly used to ensure stable training and prevent overfitting. Together, these factors enable neural network models to generalize effectively to new data, making them a powerful machine learning tool for a wide range of tasks.
The data preparation stage of Neural Networks was conducted in three steps:
Fig. 43 Test data sample
The code for the NN and RNN classifiers, implemented in Python, can be found here.
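A minimal sketch of the two architectures is shown below, assuming a Keras/TensorFlow setup with padded token sequences (X_train_seq/X_test_seq are assumed names); layer sizes and vocabulary size are illustrative, while the learning rate of 0.001 and the 10 epochs mirror the run described below.

```python
# Sketch of the simple NN and RNN classifiers; layer sizes, vocab_size, and
# the sequence inputs (X_train_seq/X_test_seq) are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, seq_len = 5000, 100

# Simple feed-forward network on pooled word embeddings.
nn = models.Sequential([
    layers.Embedding(vocab_size, 64, input_length=seq_len),
    layers.GlobalAveragePooling1D(),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),       # bias vs. fairness
])
nn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
           loss="binary_crossentropy", metrics=["accuracy"])
nn.fit(X_train_seq, y_train, epochs=10, validation_data=(X_test_seq, y_test))

# Simple recurrent counterpart.
rnn = models.Sequential([
    layers.Embedding(vocab_size, 64, input_length=seq_len),
    layers.SimpleRNN(32),
    layers.Dense(1, activation="sigmoid"),
])
rnn.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
            loss="binary_crossentropy", metrics=["accuracy"])
rnn.fit(X_train_seq, y_train, epochs=10, validation_data=(X_test_seq, y_test))
```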
Fig. 67 NN Model Evaluation Metric
Fig. 68 NN Confusion Matrix
Fig. 40 Sample of Original Data
Fig. 41 Training data sample
Fig. 42 Training data sample
A simple NN model was trained on the data with a learning rate of 0.001 for 10 epochs. The NN was designed with a few simple layers, and the evaluation scores obtained are shown in Figure 67. As seen from the evaluation scores and the confusion matrix in figures 67 and 68 respectively, the basic neural network performed well, with an average score of 66% across all three evaluation metrics. The confusion matrix also suggests that both classes were predicted equally well, meaning the model did not overtrain on one class as in the earlier experiments. These results suggest it may be one of the best models trained throughout this work.
A simple RNN model was also trained on the data, but it did not perform as well as the simple NN. Although the RNN model had a higher precision of 68%, the recall and F1 score dropped to 62% and 59% respectively, suggesting the model was not as balanced as the simple NN. The evaluation scores and confusion matrix can be found in Figures 69 and 70 respectively.
Fig. 69 RNN Model Evaluation Metric
Fig. 70 RNN Confusion Matrix
As anticipated from the other experiments, the simple NN was able to capture the semantic structure of the sentences better and was therefore able to classify sentences into the two classes more effectively. The scores suggest that, although it outperformed the other models, the NN could not classify sentences with high accuracy. This shows that the line between bias and fairness is very fine and complex; training more sophisticated architectures such as Large Language Models could help learn these patterns better.
The project takes a deep dive into how discussions around fairness and bias in Artificial Intelligence are constructed in public discourse. Through a comprehensive examination of news articles, forum discussions, and scraped text data, the work investigates how AI has become an integral part of human life, not only as a tool for efficiency but as a subject of social reflection. The project begins with data analysis and moves toward a deeper understanding of how society responds to the ethical challenges posed by emerging technologies. Understanding the concerns and conversations surrounding bias and fairness is a crucial aspect of AI.
The findings from the various analyses and models reveal that fairness and bias are not confined to an academic space; they resonate with public anxieties and aspirations alike. These themes surface through repeated associations between concepts like fairness and responsibility, or bias and decision-making. It is evident that concerns around equity, transparency, and accountability are central to how AI is perceived.
Fig. 73 AI and Bias
Fig. 74 Long Title
Despite the limited scale of the dataset, the study emphasizes the significance of public discourse in shaping the future of AI. It is within such debates, disjointed or emotional as they are, that the norms and expectations of ethical AI are being negotiated. The public's voice, channeled through media and forums, is already defining what responsible AI means, long before legislation can catch up.
Fig. 71 Types of Bias
Fig. 72 Fairness in AI
One of the most interesting results to surface from this analysis is the emotion attached to conversations around AI. Fairness in AI often carried associations with relief, trust, and hope, while bias was typically linked to frustration, skepticism, or even fear. This indicates that bias in AI sows significant doubt among people and raises trust issues around the technology. The data showed that fairness is not just a technical metric; it is a social promise people expect AI systems to uphold.
In addition, the analysis demonstrated that there is clear public opinion regarding how AI systems may perpetuate or resist existing inequalities. Conversations consistently reinforced the impact of training data, social structures, and human agency in shaping outcomes. Clustering and association findings showed that participants subconsciously attribute happiness to fairness and associate training processes with bias, reflecting a fine-grained layperson understanding of how AI works behind the scenes.
In conclusion, the work reaffirms that fairness in AI is not a technical endpoint but an ongoing societal dialogue. Fairness in AI must be measured by lived experiences, shared values, and collective imagination as much as by algorithms and metrics. As AI systems become more deeply integrated into the rhythms of everyday life, understanding the human side of these conversations is essential. This work is a small but vital contribution to that broader narrative.
Rohit.Raju@colorado.edu