Connectionists: SBP-BRiMS 2017: Grand Interdisciplinary Data-Science Challenge.

Donald Adjeroh Donald.Adjeroh at mail.wvu.edu
Tue Feb 21 12:48:11 EST 2017


Apologies if you receive multiple copies ...




SBP-BRiMS 2017 Grand Interdisciplinary Data-Science Challenge.
George Washington University, Washington DC, USA
(as part of the SBP-BRiMS 2017 conference - July 5-July 8, 2017)

Social networking websites let users post and share opinions on the web ‘freely’. The authors of posts are not required to verify the authenticity of the content they create. This has led to the spread of misinformation: false or misleading information that has not been verified and cannot be substantiated by facts is sometimes circulated virally on social media, giving an impression of authenticity and sometimes leading to undesirable consequences.

In this year’s SBP-BRiMS challenge problem, we ask participants to consider the circulation of fake news and propaganda on online media for personal or political gain. Using at least one of the suggested datasets (discussed later), and any other relevant publicly accessible dataset, participants should address an issue of interest to them that relates to one or both of the following topics: fake news and propaganda.

The broad questions of interest are:

  *   How can we use openly available data to find fake news?
  *   How is propaganda used on online social media and news outlets?
  *   How are bots used to propagate fake news or propaganda?
  *   What interventions might slow the spread of fake news or propaganda?

These questions are only intended to give a rough idea of what might be an interesting topic to explore for this challenge problem, and are by no means the only questions of interest. All entries must have both a strong social theory, political theory, or policy perspective and a strong methodology perspective.

Rules
●  Participants may work individually or in teams.
●  Participants must use at least one of the suggested datasets described later.
●  Participants should address a social science theory or policy-relevant issue and should employ one or more methodologies appropriate for the empirical assessment of big data (e.g., computational algorithms, machine learning, computer simulation, social network analysis, text mining).
●  Each participating team may prepare only one entry.
●  Entries must represent original work that has not been previously published or submitted to other challenges.
●  Each participating team must send at least one member to the SBP-BRiMS 2017 conference to present a poster describing their entry.  At least one team member must register for and attend the conference.
●  Participants are encouraged to use a second data set.  This data set must be either publicly available, or the participants must make it publicly available by submitting it along with their challenge paper.
●  At the conference, all entries will be judged by the community using a participant voting system.
●  The individual or group that submits the winning entry will be invited to present a talk at the conference in 2018, and will contribute a full-length paper describing their challenge solution to an SBP-BRiMS special issue of the journal Computational and Mathematical Organization Theory.

Guidelines
A strong entry will have one or more of these components:
●  Employ multiple data sets.
●  Include at least one high-quality visualization (note that participants will be allowed to display dynamic visualizations via some form of electronic media, e.g., by hanging a tablet from the poster; however, please note that tables will not be provided).
●  Account for biases in the data across time, space, topics, and sources.
●  Provide a new metric, simulation, or algorithm, such as:

  1.  A new spatial, temporal, or network analytic methodology or algorithm, or a simulation model, that can cope with the vast scale of open source data (e.g., Twitter data) and support answering a key social or policy issue.
  2.  A new spatial analytic methodology that can better take into account change over time and non-spatial distances (such as co-occurrences and semantic similarity between locations).
  3.  A new network methodology that better incorporates the diversity of actor and relationship types in the data or spatio-temporal information, or that supports constructing edges from the data and distributing actor and edge attributes onto the graph.
  4.  A novel simulation that supports reasoning about the spread of fake news or propaganda and uses empirical data to either instantiate the model or confirm some of the results.
●  Generate a new empirical finding that challenges or provides novel support for existing social or political theory, or provides information of policy relevance.  Note that the results of computer simulation are viewed as empirical findings.
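To give a feel for the simulation guideline above, a spread model can start from something as small as the following sketch: a toy independent-cascade simulation of a story spreading over a follower graph. The graph, seed user, and share probability here are illustrative assumptions, not a calibrated model of any real platform.

```python
# Sketch: toy independent-cascade model of fake-news spread on a
# follower graph. All values below are illustrative assumptions.
import random

followers = {          # who sees each user's posts (hypothetical graph)
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
}

def simulate_cascade(graph, seed, p_share, rng):
    """Return the set of users a story reaches, starting from `seed`.

    Each exposed follower independently re-shares with probability
    `p_share`; already-reached users are not re-activated.
    """
    reached, frontier = {seed}, [seed]
    while frontier:
        user = frontier.pop()
        for follower in graph[user]:
            if follower not in reached and rng.random() < p_share:
                reached.add(follower)
                frontier.append(follower)
    return reached

print(simulate_cascade(followers, "a", 1.0, random.Random(0)))
```

A real entry would replace the hand-written graph with an empirically constructed one and sweep `p_share` (or estimate it from data) to reason about interventions that slow the spread.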

In addition, a strong entry should be well-written and provide some level of creativity in its use of or combination of data.
Submitting an Entry
What to Submit
You need to submit three things: an extended abstract, a PDF of your poster, and a PowerPoint promotion slide.  All three of these will go in the online proceedings.

Extended Abstract: A 2-page abstract describing the project.  This should define:
●      What social/policy question was asked or challenge addressed?
●      Why is this question important or the challenge critical in the context of fake news and/or propaganda?
●      What data sets were used?
●      What is the novel contribution?
●      What is the key methodology or methodologies used?
●      What is the key policy issue or theory being addressed?
●      Who is the team? Provide names, emails, and institutions.

Poster PDF: A PDF of the poster.  This will be put online.
You are, however, responsible for printing and bringing your own poster to the conference. An easel will be provided, but not posterboard.
The poster should be either 4’x3’ or 3’x4’.

Promotion Slide
This is a single PowerPoint slide.  The purpose of this slide is to entice people to come to your poster.  This slide will also be put online.  You will be given one minute to present this slide at the conference to encourage people to come and see your poster. This slide should contain:

  *   Title of project
  *   Names of all team members
This slide may contain:

  *   Any word or image or idea that you think will promote your poster
  *   Logos for your group, company or organization

When to Submit
The extended abstract is due on June 12, 2017.
The poster PDF is due on July 5, 2017.
The one-page promotion slide is due on July 5, 2017.
How to Submit
All abstracts should be submitted through the EasyChair link at https://easychair.org/conferences/?conf=sbpbrims2017 for the challenge on the SBP-BRiMS 2017 website. Please be sure to select the Challenge track.

The poster PDF and the one-page PowerPoint slide are to be sent to sbp-brims at andrew.cmu.edu.

What to present
All teams must send at least one member to SBP-BRiMS 2017, registered for the conference by the <<early registration deadline>>, to present their poster in the poster session on July 6.  The poster should be 4’x3’ or 3’x4’.  Participants may bring additional props to enhance their presentation.  In addition, the team spokesperson should be ready to present a one-minute talk using the promotion slide, to encourage people to come to their poster.
How entries will be judged
Entries will be judged by community voting at the poster session.

Who is eligible
Anyone with an interest in using this data to address a social or policy issue. Entries are accepted from single individuals or teams.
Winning Entry
The winning entry will be invited to send a team member to present the project at SBP-BRiMS 2018.  In addition, the winning entry and selected other entries will be invited to submit a full paper to a special issue of Computational and Mathematical Organization Theory.
Data Sets
We invite participants to explore one or more of these data sets: Kaggle Fake News, Kaggle Propaganda, ClickBait, Wikipedia, or GDELT, in conjunction with a second data set, which may or may not be from this list. The GDELT data set consists of records of individual politically relevant events, from statements of support to military attacks; each record includes information on the source and target of each action, its date and location, and information about the event itself, machine-coded from media reports.  It is important to note that the sites listed as “fake” are those that appear fake to the coding algorithm or human coder, and may or may not actually be fake news sites.

Kaggle Fake News:
Source: https://www.kaggle.com/mrisdal/fake-news
Kaggle hosts a dataset on fake news. The dataset was added by Megan Risdal<https://www.kaggle.com/mrisdal> and shares data collected by the ‘BS Detector’ Chrome extension created by Daniel Sieradski<https://github.com/selfagency>. The data contains text and metadata from 244 different websites, comprising 12,999 posts.

Please check the website https://www.kaggle.com/mrisdal/fake-news for more information.
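For orientation, a minimal sketch of summarizing records shaped like this dataset might look as follows. The column names (`site_url`, `type`) are assumptions based on the dataset description, so check the schema on the Kaggle page; in practice you would load the full CSV (e.g., with `csv.DictReader` or pandas) rather than the inline sample used here.

```python
# Sketch: per-site and per-label tallies over records shaped like the
# Kaggle fake-news CSV. Column names and sample rows are assumptions.
from collections import Counter

sample_posts = [
    {"site_url": "example-news.com", "type": "fake", "text": "..."},
    {"site_url": "example-news.com", "type": "bias", "text": "..."},
    {"site_url": "other-site.net", "type": "fake", "text": "..."},
]

def posts_per_site(posts):
    """Count how many posts each site contributed."""
    return Counter(p["site_url"] for p in posts)

def label_distribution(posts):
    """Count how many posts carry each BS Detector label."""
    return Counter(p["type"] for p in posts)

print(posts_per_site(sample_posts))
print(label_distribution(sample_posts))
```

Tallies like these are a natural first step for checking the source and label biases that the guidelines ask entries to account for.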

Kaggle Propaganda:
Source: https://www.kaggle.com/kzaman/how-isis-uses-twitter
Kaggle also has a dataset on how ISIS uses Twitter. The dataset contains over 17,000 pro-ISIS tweets posted since the November 2015 Paris attacks.
For more information, please check this website http://blog.kaggle.com/2016/06/03/dataset-spotlight-how-isis-uses-twitter/ .


ClickBait[1]:
This dataset contains four categories of news sources: a) fake b) misleading c) clickbait-y d) satirical.

Check this link http://theduran.com/updated-list-of-false-misleading-clickbait-y-andor-satirical-news-sources/ for more information.


Wikipedia Fake News
https://en.wikipedia.org/wiki/List_of_fake_news_websites

GDELT:
The GDELT database (http://www.gdeltproject.org/) is divided into three core data streams, capturing physical activity, counts of key incidents such as deaths, and a graph structure connecting the latent and physical aspects of global news into a single computable network. GDELT’s event data consists of a quarter-billion geo-referenced dyadic “event records” covering all countries in the world from 1979 to the present, capturing who did what to whom, when, and where, in the CAMEO taxonomy. GDELT’s count data (2013-present) records mentions of counts of things with respect to a set of predefined categories, such as the number of protesters, the number killed, or the number displaced or sickened. GDELT’s Global Knowledge Graph (2013-present) is an attempt to connect the people, organizations, locations, counts, themes, news sources, and events appearing in the news media across the globe each day into a single massive network that captures what is happening around the world, what its context is, who is involved, and how the world is feeling about it, each day.
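As a rough illustration of working with dyadic who-did-what-to-whom records, the following sketch aggregates a few hypothetical, heavily simplified event tuples into a weighted directed network. Real GDELT event files are tab-separated with many more columns, so consult the GDELT codebook for the actual layout; the actor labels and CAMEO codes below are illustrative only.

```python
# Sketch: turning simplified GDELT-style dyadic event records into a
# weighted directed who-did-what-to-whom network. The four-field tuple
# layout here is an assumption, not the real GDELT schema.
from collections import Counter

# (actor1, CAMEO event code, actor2, year) -- hypothetical sample records
events = [
    ("GOV", "010", "MED", 2016),
    ("GOV", "010", "MED", 2017),
    ("MED", "051", "GOV", 2017),
]

def edge_weights(records):
    """Aggregate events into directed (source, target) edge counts."""
    return Counter((a1, a2) for a1, _, a2, _ in records)

print(edge_weights(events))
```

An edge-count network like this can feed directly into the social network analysis methodologies the rules suggest, e.g., filtering by event code or time window before computing centralities.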
Challenge Committee
Kathleen M. Carley
Sumeet Kumar
Brandon Oselio
Justin Sampson
Submit Questions Regarding Challenge
All questions and concerns can be sent to sbp-brims at andrew.cmu.edu


Some useful references:
Social Media and Fake News in the 2016 Election
http://www.nber.org/papers/w23089.pdf

When facts and news diverge http://docs.rwu.edu/cgi/viewcontent.cgi?article=1464&context=law_pubs_blogs

How to spot fake news
http://www.factcheck.org/2016/11/how-to-spot-fake-news/
________________________________
[1] The list appears to be very subjective (the author’s personal opinion): the author has marked BBC as ‘click-bait’ and Yahoo News as ‘Fake’. You should definitely check the website yourself.
