I often encounter myths and misunderstandings about political data, whether it's in the classes I teach or broader news coverage.
Polling problems were widely discussed after the 2016 election, when polls missed Donald Trump's victory. But much less attention has been given to ongoing problems with political social media metrics - assessments of public opinion on platforms like Facebook or Twitter.
You've likely seen the headlines, from "Bernie Sanders Is Running for President, and Twitter Is Exploding" to "Joe Biden Returns to Instagram and Draws 1 million Followers."
As with the public's obsession with polling data, this coverage is often driven by anything from the size of a candidate's following to something as limited as a few random negative tweets.
Social media metrics matter for many reasons, but two are particularly meaningful.
First, online discussion can influence what - or who - the news media, or the broader public, are talking about.
Second, social media are often used by journalists, as well as political campaigns, to assess public opinion.
At the broadest level, social media metrics, like coverage of polling, are used to determine which candidates are popular. But in 2016, I found that Ben Carson, of all the candidates, was outpacing every other candidate on Facebook. Obviously, he never came close to being president.
Even more nuanced analyses can miss broader realities. For example, a 2016 Forbes article noted Bernie Sanders' stronger position over Trump in terms of social media engagement.
Coverage such as this can lead to false perceptions about which candidates and issues should be covered, as well as about broader public opinion.
As I see it, there are a few simple explanations for why the public should be wary of using social media posts or data as a measure of broader reality.
1. Filter bubbles
If you're a political junkie, there's a good chance you like reading the news or watching TV shows about politics.
But there's also a good chance that the vast majority of people's media lives don't include traditional sources of news. Let that sink in for a second.
Some of these same limitations apply to social media, because of the algorithms that filter people's feeds.
While tech companies have discussed changing how they operate, their business is still largely based on giving you relevant content - in other words, creating a bubble that can limit your view of broader reality.
A research team at Stanford University found that social media echo chambers tend to mute moderate voices during debates about highly topical issues, like gun control. This can cause problems for people as they try to sort through information.
It's also an issue that affects journalists and their broader coverage. The same algorithms that limit the public's view of the world limit theirs. For example, researchers found that, when journalists cite Twitter, they tend to overemphasize "elite" sources, such as politicians or celebrities.
2. Twitter bias
One study showed that, through 2016, Twitter was used as a source 12,323 times by The New York Times and 23,164 times by The Guardian. By comparison, Facebook was cited 6,846 times and 7,000 times, respectively.
There's a big difference between Facebook and Twitter. While Facebook has been used by nearly 70% of Americans, the Pew Research Center found that only 22% of Americans use Twitter.
Thus, one of the key platforms driving U.S. political coverage is used by only about one-fifth of the population.
Furthermore, Twitter users are far from representative of their parties. For example, a study done by The New York Times found that Democratic voters on Twitter were far more progressive and liberal than the average Democratic voter.
Not only do Twitter metrics fail to capture most Americans, but the users they do capture tend to be farther from the center than their parties as a whole.
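To put the figures above side by side, here is a minimal Python sketch. It uses only the citation counts and usage shares quoted in this section. The names `CITATIONS`, `USAGE_SHARE`, `citation_ratio` and `usage_ratio` are made up for illustration, and treating "nearly 70%" as exactly 0.70 is an assumption.

```python
# Citation counts through 2016, as quoted above from the study.
CITATIONS = {
    "The New York Times": {"Twitter": 12_323, "Facebook": 6_846},
    "The Guardian": {"Twitter": 23_164, "Facebook": 7_000},
}

# Share of Americans using each platform. "Nearly 70%" is rounded to 0.70
# here, which is an assumption made for illustration.
USAGE_SHARE = {"Twitter": 0.22, "Facebook": 0.70}


def citation_ratio(counts):
    """Twitter citations per Facebook citation for one outlet."""
    return counts["Twitter"] / counts["Facebook"]


def usage_ratio(shares):
    """Twitter users per Facebook user among Americans."""
    return shares["Twitter"] / shares["Facebook"]


if __name__ == "__main__":
    for outlet, counts in CITATIONS.items():
        ratio = citation_ratio(counts)
        print(f"{outlet}: {ratio:.2f} Twitter citations per Facebook citation")
    print(f"Audience: {usage_ratio(USAGE_SHARE):.2f} Twitter users per Facebook user")
```

Under these assumptions, both outlets cite Twitter more often than Facebook, even though Twitter's audience is less than a third the size of Facebook's.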
3. The older voter blind spot
This data gap grows more pronounced when you zoom out to social media behavior more broadly.
Traditional polls try to sample a public that looks like the people who actually vote. But social media are a different story.
It's predicted that 23% of voters in 2020 will be over the age of 65. As Pew notes, this would be "the highest such share since at least 1970."
And yet, guess who still doesn't use social media?
While social media use has expanded among those over the age of 65 over the last few years, no platform is used by more than 46% of adults over 65.
Seven percent of citizens over 65 use Twitter. Usage of Reddit, another politics-centric platform, is at just 1%.
There is a big gap between those who are most likely to use social media and those who are most likely to vote. That causes major problems when comparing broader voter dynamics to social media metrics.
4. The younger and diverse voter blind spot
There's another problem: Voters aged 18 to 24 are just as likely to use Instagram or Snapchat as they are Facebook.
Since journalists rely on platforms like Facebook and Twitter, they may be missing what's important to, and being discussed by, the youngest of eligible voters.
Ignoring social media data can mean missing out on some helpful insights into voters. But any assessment of social data needs to be careful not to misread what the data are really saying about the public. Blind spots abound when analyzing social media data - and pollsters need to think critically about which voters they're actually trying to find answers about.
So, don't assume that what you see in media or social media matches the voter dynamics among likely voters, let alone those in certain states, counties or demographics.