8 Hashtags

The twinetverse, or more specifically graphTweets, does more than visualise interactions between users: it also lets one build and visualise networks of users and the hashtags they use, as well as networks of hashtag co-mentions. Let’s first look at connecting users to the hashtags they use.

8.1 Collect

Let’s collect some tweets. Since we want to plot relationships between users and hashtags, we’ll search for two hashtags: #python and #rstats. This way we’ll be able to see who uses both and who uses only one. We’ll also narrow the search down to tweets in English to avoid #hashtags we cannot understand.

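# Assumes the twinetverse is already loaded, e.g. with library(twinetverse),
# and that a Twitter token was saved earlier; uncomment the next line to load it.
# TK <- readRDS(file = "token.rds")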
tweets <- search_tweets("#rstats OR #python", n = 1000, token = TK, include_rts = FALSE, lang = "en")
## Searching for tweets...
## Finished collecting tweets!

8.2 Build

Now let’s build a network of users and hashtags to visualise which user tweets which #hashtag.

net <- tweets %>% 
  gt_edges(screen_name, hashtags) %>% 
  gt_nodes() %>% 
  gt_collect()

Let’s inspect the edges first.

head(net$edges)
source target n
_andrew_nkhata #programming 1
_andrew_nkhata #python 1
_es_ther #100daysofcode 1
_es_ther #datascience 1
_es_ther #python 1
_testanic #python 1

The edges include the n variable, which is the number of tweets the #hashtag is found in for that user. Note that, to ensure we can always distinguish between a user and a hashtag, hashtags are preceded by the # sign. Let’s take a look at the nodes.
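To make the shape of the edges table concrete, here is a minimal base R sketch that builds the same kind of table from toy data. It is not how graphTweets implements gt_edges; the users, the hashtags and the toy and toy_edges names are made up for illustration, and the lower-casing of hashtags is an assumption based on the output above.

```r
# Toy data standing in for rtweet output (hypothetical users and hashtags)
toy <- data.frame(
  screen_name = c("alice", "alice", "bob"),
  stringsAsFactors = FALSE
)
# rtweet stores hashtags as a list column, one character vector per tweet
toy$hashtags <- list(c("rstats", "Python"), "rstats", "python")

# One row per (user, hashtag) occurrence, with "#" prepended to hashtags
pairs <- data.frame(
  source = rep(toy$screen_name, lengths(toy$hashtags)),
  target = paste0("#", tolower(unlist(toy$hashtags))),
  stringsAsFactors = FALSE
)

# Count how many times each (user, hashtag) pair occurs
pairs$n <- 1L
toy_edges <- aggregate(n ~ source + target, data = pairs, FUN = sum)
toy_edges
```

Each row of toy_edges is one user-to-hashtag link, and n counts the tweets behind it, which is how the source, target and n columns above can be read.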

head(net$nodes)
nodes type n
_andrew_nkhata user 2
_es_ther user 3
_testanic user 2
#100daysofcode hashtag 41
#100daysofmlcode hashtag 3
#100daysofpython hashtag 1

Nodes also include the type, which is set to either user or hashtag.

8.3 Visualise

Now onto the visualisation. As we did before, we unpack our network, prepare the data to fit sigmajs’ expectations, and colour the nodes according to their type: one colour for #hashtags and another for @users.

c(edges, nodes) %<-% net

nodes <- nodes2sg(nodes)
edges <- edges2sg(edges)

nodes$color <- ifelse(nodes$type == "user", "#0084b4", "#1dcaff")
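The ifelse call works well with two node types. As a hedged alternative, a named vector can act as a lookup table, which extends naturally if more types are ever added. The sketch below runs on toy data; toy_nodes and type_colours are hypothetical names, not part of graphTweets or sigmajs.

```r
# Toy nodes in the same spirit as net$nodes (hypothetical values)
toy_nodes <- data.frame(
  id   = c("alice", "#rstats", "bob"),
  type = c("user", "hashtag", "user"),
  stringsAsFactors = FALSE
)

# Lookup table: one colour per node type
type_colours <- c(user = "#0084b4", hashtag = "#1dcaff")

# Index the lookup table by type; unname() drops the carried-over names
toy_nodes$color <- unname(type_colours[toy_nodes$type])
toy_nodes
```

A type missing from the lookup table yields NA rather than silently falling into the "else" colour, which makes unexpected types easier to spot.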

In the visualisation we add sg_neighbours to highlight the neighbours of the clicked node. Try it.

sigmajs() %>% 
  sg_nodes(nodes, id, size, color, label) %>% 
  sg_edges(edges, id, source, target) %>% 
  sg_layout(layout = igraph::layout_components) %>% 
  sg_settings(
    edgeColor = "default",
    defaultEdgeColor = "#d3d3d3"
  ) %>% 
  sg_neighbours()

It is interesting to see that few users actually tweet both hashtags. Since the two would not even have to appear in the same tweet, this is somewhat surprising.
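The overlap can also be checked numerically rather than by eye. The following base R sketch uses a toy edges table in the same shape as net$edges; the users and counts are invented for illustration, so the resulting share says nothing about the real data.

```r
# Toy edges in the same shape as net$edges (hypothetical values)
toy_edges <- data.frame(
  source = c("alice", "alice", "bob", "carol"),
  target = c("#rstats", "#python", "#python", "#rstats"),
  n      = c(2L, 1L, 1L, 3L),
  stringsAsFactors = FALSE
)

# Users linked to each hashtag
rstats_users <- unique(toy_edges$source[toy_edges$target == "#rstats"])
python_users <- unique(toy_edges$source[toy_edges$target == "#python"])

# Users who tweeted both hashtags, and users who tweeted at least one
both   <- intersect(rstats_users, python_users)
either <- union(rstats_users, python_users)

# Share of users who used both hashtags
share_both <- length(both) / length(either)
share_both
```

To run this on the collected tweets, replace toy_edges with net$edges; the original edges table should be used here, before edges2sg reshapes it for sigmajs.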